hms-detection
Real-time YOLO object detection, event recording, and AI vision context for your security cameras. Built in C++20 with FFmpeg, ONNX Runtime, and Drogon — fast, low-latency, and Home Assistant native.
What it does
Real-time YOLO inference
ONNX Runtime inference supporting YOLOv8, YOLOv9, YOLO11, and YOLO26. The output format is auto-detected at runtime, and custom-trained models with non-COCO classes are supported.
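One plausible way to auto-detect the output format is to look only at the shape of the output tensor. The sketch below is an illustration under stated assumptions, not the project's actual logic: it assumes YOLOv8, YOLOv9, and YOLO11 exports emit a channels-first head shaped [1, 4 + num_classes, num_anchors] (for example [1, 84, 8400] for COCO), and that NMS-free exports such as YOLO26 emit [1, max_detections, 6]. The names YoloLayout, detect_layout, and num_classes are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical layout labels; the project's own names are not documented here.
enum class YoloLayout {
    ChannelsFirst,  // assumed: [1, 4 + num_classes, num_anchors], e.g. [1, 84, 8400]
    EndToEnd,       // assumed: [1, max_detections, 6] rows of x1, y1, x2, y2, score, class
    Unknown
};

// Classify an output tensor by its shape alone.
YoloLayout detect_layout(const std::vector<std::int64_t>& shape) {
    if (shape.size() != 3) return YoloLayout::Unknown;
    const std::int64_t d1 = shape[1];
    const std::int64_t d2 = shape[2];
    // End-to-end heads have exactly six values per detection row.
    if (d2 == 6 && d1 > 6) return YoloLayout::EndToEnd;
    // Channels-first heads have far more anchors than channels.
    if (d1 >= 5 && d2 > d1) return YoloLayout::ChannelsFirst;
    return YoloLayout::Unknown;
}

// Class count implied by a channels-first head: four box coordinates
// precede the per-class scores, so a custom 3-class model has 7 channels.
std::int64_t num_classes(const std::vector<std::int64_t>& shape) {
    return shape[1] - 4;
}
```

With ONNX Runtime's C++ API, the shape vector would typically come from the output value's GetTensorTypeAndShapeInfo().GetShape(). Deriving the class count from the shape, rather than hard-coding 80, is what would let custom-trained models with non-COCO classes work without extra configuration.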
Event recording with pre-roll
A lock-free ring buffer per camera captures pre-roll frames, then an FFmpeg muxer writes the full clip with post-roll plus a best-frame snapshot annotated with bounding boxes.
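The pre-roll idea can be illustrated with a fixed-capacity ring that overwrites its oldest frame once full. This is a minimal single-threaded sketch: it deliberately leaves out the lock-free synchronisation the real buffer uses, and the class name PreRollRing and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Single-threaded sketch of an overwrite-oldest ring buffer.
// NOT lock-free: the atomics needed for concurrent use are omitted.
template <typename T>
class PreRollRing {
public:
    explicit PreRollRing(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Store a frame, overwriting the oldest one once the ring is full.
    void push(T frame) {
        slots_[head_] = std::move(frame);
        head_ = (head_ + 1) % slots_.size();
        if (size_ < slots_.size()) ++size_;
    }

    // Copy out the buffered frames, oldest first, e.g. to hand them
    // to a muxer when an event starts.
    std::vector<T> snapshot() const {
        std::vector<T> out;
        out.reserve(size_);
        const std::size_t start = (head_ + slots_.size() - size_) % slots_.size();
        for (std::size_t i = 0; i < size_; ++i) {
            out.push_back(slots_[(start + i) % slots_.size()]);
        }
        return out;
    }

    std::size_t size() const { return size_; }

private:
    std::vector<T> slots_;
    std::size_t head_ = 0;  // next slot to write
    std::size_t size_ = 0;  // number of valid frames
};
```

The capacity sets how much pre-roll is available: as an assumed example, a 15 fps stream with 5 seconds of pre-roll would need 75 slots. When motion starts, snapshot() yields the frames from before the trigger, and live frames are appended after them for the post-roll portion of the clip.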
AI vision context
LLaVA via Ollama turns detections into natural-language scene descriptions, and every event is logged to PostgreSQL alongside its detections and recording URLs.
Get it running
Quick start (Docker)
cp config.yaml.example config.yaml
# Run with docker-compose (includes go2rtc)
docker compose up -d
# Check health
curl http://localhost:8000/health
# Get a snapshot
curl http://localhost:8000/api/cameras/patio/snapshot -o snap.jpg
Pull the container
docker pull ghcr.io/hms-homelab/hms-detection:latest
docker run -d --network host \
-v ./config.yaml:/app/config/config.yaml:ro \
-v /mnt/ssd/events:/mnt/ssd/events \
-v /mnt/ssd/snapshots:/mnt/ssd/snapshots \
ghcr.io/hms-homelab/hms-detection:latest
Trigger and integrate
Trigger detection over MQTT
mosquitto_pub -h localhost \
-t "camera/event/motion/start" \
-m '{"camera_id": "patio", "post_roll_seconds": 5}'
Publish to camera/event/motion/start (and .../stop) to drive the detect → record → snapshot pipeline.
Home Assistant MQTT topics
yolo_detection/{cam}/detected # binary ON/OFF
yolo_detection/{cam}/result # detections + URLs
yolo_detection/{cam}/context # LLaVA description
yolo_detection/status # online/offline (retained)
The detected topic maps cleanly to a Home Assistant binary sensor; result and context carry the rich detection payloads.
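For a client that subscribes to these topics, the per-camera names can be built by substituting the camera ID for {cam}. The helper below is a hypothetical sketch of that substitution and is not part of hms-detection.

```cpp
#include <string>

// Hypothetical helper: expands "yolo_detection/{cam}/<leaf>" for one camera.
// leaf is one of the documented suffixes: "detected", "result", "context".
std::string camera_topic(const std::string& cam, const std::string& leaf) {
    return "yolo_detection/" + cam + "/" + leaf;
}

// The service-wide availability topic has no camera component.
std::string status_topic() {
    return "yolo_detection/status";
}
```

For the patio camera from the quick start, camera_topic("patio", "detected") gives the topic a Home Assistant binary sensor would listen on.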
Open source and self-hosted
Run it on your own hardware, point it at your cameras, and keep your footage private. Build from source or pull the prebuilt image.