How BarkCard Uses AI to Detect Dog Vocalizations
BarkCard uses a machine learning model called YAMNet to classify audio in real time — entirely on your device. Here's how it works.
What Is YAMNet?
YAMNet is a pre-trained deep learning model developed by Google. It was trained on millions of audio samples from the AudioSet dataset and can classify 521 different sound categories, including dog barking, howling, whimpering, growling, and more.
BarkCard runs YAMNet through TensorFlow.js, which means the model executes inside your browser. No audio is ever sent to a server.
How Detection Works
- Audio capture: BarkCard captures audio from your microphone in 1-second chunks.
- Preprocessing: Each chunk is resampled to 16 kHz mono, the input format YAMNet expects.
- Classification: The chunk is fed through YAMNet, which outputs a confidence score for each of its 521 sound categories.
- Filtering: BarkCard looks at the scores for dog-related categories (bark, howl, whimper, growl, yip) and triggers a detection if any of them exceeds the sensitivity threshold.
- Event recording: Detections are logged with timestamp, type, and confidence scores.
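The preprocessing step can be sketched in a few lines. This is a minimal illustration, not BarkCard's actual code: the function names `downmixToMono` and `resampleLinear` are hypothetical, and linear interpolation is an assumed, deliberately simple resampling method that applies no anti-aliasing filter.

```typescript
// Hypothetical helpers for illustration; BarkCard's real preprocessing is not shown in this article.
const TARGET_RATE = 16000; // YAMNet expects 16 kHz mono input

/** Average all channels into a single mono channel. */
function downmixToMono(channels: Float32Array[]): Float32Array {
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

/** Resample by linear interpolation (simple, but with no anti-aliasing filter). */
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number = TARGET_RATE
): Float32Array {
  if (fromRate === toRate) return input.slice();
  const outLength = Math.round((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples consumed per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}
```

With a 1-second stereo chunk recorded at 48 kHz, `resampleLinear(downmixToMono([left, right]), 48000)` returns 16,000 mono samples. In a browser, the Web Audio API's `OfflineAudioContext` is another way to get the same result without hand-written resampling.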
This entire pipeline runs at roughly 10 frames per second (~10 fps) on a modern laptop, which is fast enough for real-time monitoring.
Why On-Device?
Running AI locally has several advantages:
- Privacy: Your audio never leaves your computer. Period.
- Low latency: Classification happens in milliseconds, not seconds, because there is no network round trip.
- Works offline: No internet connection needed.
- No usage limits: No API calls, no rate limits, no per-minute charges.
The tradeoff is that the model is smaller than what you'd run on a cloud GPU. YAMNet is designed for efficiency: it's built on the MobileNet architecture and runs well on consumer hardware. For dog vocalization detection, we've found it accurate enough in practice.
Accuracy
YAMNet wasn't specifically trained for dog separation anxiety monitoring — it's a general audio classifier. But it performs well for our use case because:
- "Dog bark" is one of its strongest categories (very common in training data)
- It distinguishes between barks, howls, and whimpers reasonably well
- The confidence scoring lets users tune sensitivity to their environment
BarkCard also stores the full confidence breakdown for each detection, so you can verify the classification and see how certain the model was.
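The filtering and event-recording steps, including the stored confidence breakdown, might look something like the sketch below. Everything here is an assumption for illustration: the `DetectionEvent` shape, the `detect` function, and the short label names in `DOG_LABELS` are hypothetical, and YAMNet's real class names and BarkCard's storage format may differ.

```typescript
// Hypothetical types and labels; not BarkCard's actual schema or YAMNet's exact class names.
type Scores = Record<string, number>;

interface DetectionEvent {
  timestamp: string;  // ISO 8601 time of the analysed chunk
  type: string;       // highest-scoring dog-related label
  confidence: number; // score of that label
  breakdown: Scores;  // scores for every dog-related label
}

const DOG_LABELS = ["bark", "howl", "whimper", "growl", "yip"];

/** Return an event if any dog-related score exceeds the threshold, otherwise null. */
function detect(scores: Scores, threshold: number, at: Date): DetectionEvent | null {
  const breakdown: Scores = {};
  let best = "";
  let bestScore = -Infinity;
  for (const label of DOG_LABELS) {
    const score = scores[label] ?? 0; // a missing label counts as zero confidence
    breakdown[label] = score;
    if (score > bestScore) {
      best = label;
      bestScore = score;
    }
  }
  if (bestScore <= threshold) return null; // nothing exceeded the threshold
  return { timestamp: at.toISOString(), type: best, confidence: bestScore, breakdown };
}
```

Keeping the whole `breakdown` rather than only the winning label is what makes it possible to review a detection later, for example to notice that a "bark" at 0.41 also scored 0.38 as "howl".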
The Sensitivity Slider
Every environment is different. A quiet apartment needs different sensitivity than a house near a busy road. BarkCard's sensitivity slider adjusts the threshold for triggering a detection:
- Lower threshold = more detections (catches quiet whimpers, but may have false positives from TV or ambient noise)
- Higher threshold = fewer detections (only confident classifications, but might miss subtle vocalizations)
We recommend starting at the default (0.30) and adjusting based on your first session's results.
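To make the tradeoff concrete, here is a small sketch that counts how many chunks would trigger a detection at different thresholds. The `countDetections` function and the session scores are made up for illustration; only the 0.30 default comes from the text above.

```typescript
const DEFAULT_THRESHOLD = 0.30; // default value stated in this article

/** Count how many chunks would trigger a detection at a given threshold. */
function countDetections(
  peakDogScores: number[],
  threshold: number = DEFAULT_THRESHOLD
): number {
  return peakDogScores.filter((score) => score > threshold).length;
}

// Made-up peak dog-related scores for six chunks from one session.
const session = [0.05, 0.22, 0.31, 0.48, 0.67, 0.90];

for (const t of [0.15, DEFAULT_THRESHOLD, 0.60]) {
  console.log(`threshold ${t.toFixed(2)}: ${countDetections(session, t)} detections`);
}
// threshold 0.15: 5 detections
// threshold 0.30: 4 detections
// threshold 0.60: 2 detections
```

Replaying a first session's stored scores this way is one possible approach to picking a threshold: lower it until quiet whimpers are caught, then raise it again if TV or street noise starts to slip through.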