hms-detection

Real-time YOLO object detection, event recording, and AI vision context for your security cameras. Built in C++20 with FFmpeg, ONNX Runtime, and Drogon — fast, low-latency, and Home Assistant native.

YOLO · C++20 · RTSP · MQTT · Home Assistant

What it does

Real-time YOLO inference

ONNX Runtime inference supporting YOLOv8, YOLOv9, YOLO11, and YOLO26. The output format is auto-detected at runtime, and custom-trained models with non-COCO classes are supported.
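
One way runtime auto-detection could work is to inspect the shape of the model's output tensor. The sketch below is an assumption about the approach, not hms-detection's actual code: `YoloLayout`, `guess_layout`, and `class_count` are hypothetical names. It covers only two layouts: the common channels-first export (for example [1, 84, 8400] for 80 COCO classes) and an end-to-end export with one six-value row per detection.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical layout labels; illustrative only, not taken from hms-detection.
enum class YoloLayout {
    ChannelsFirst,  // [1, 4 + classes, anchors], e.g. [1, 84, 8400]; needs NMS
    EndToEnd,       // [1, max_dets, 6]: x1, y1, x2, y2, score, class per row
    Unknown
};

// Guess the layout from the output tensor shape alone.
inline YoloLayout guess_layout(const std::vector<std::int64_t>& shape) {
    if (shape.size() != 3 || shape[0] != 1) return YoloLayout::Unknown;
    const std::int64_t a = shape[1];
    const std::int64_t b = shape[2];
    if (b == 6 && a > 6) return YoloLayout::EndToEnd;
    if (a >= 5 && b > a) return YoloLayout::ChannelsFirst;
    return YoloLayout::Unknown;
}

// Class count implied by a channels-first shape:
// 4 box coordinates plus one score per class.
inline std::int64_t class_count(const std::vector<std::int64_t>& shape) {
    assert(guess_layout(shape) == YoloLayout::ChannelsFirst);
    return shape[1] - 4;
}
```

With ONNX Runtime's C++ API, the shape could be read from the output value via `GetTensorTypeAndShapeInfo().GetShape()`. Under this heuristic, a custom model with three classes would appear as [1, 7, 8400] and `class_count` would return 3, which is how non-COCO models can be handled without hard-coding 80 classes.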

Event recording with pre-roll

A lock-free ring buffer per camera captures pre-roll frames, then an FFmpeg muxer writes the full clip with post-roll plus a best-frame snapshot annotated with bounding boxes.
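
To illustrate the pre-roll idea, here is a minimal fixed-capacity ring buffer that always holds the newest N frames. It is a single-threaded sketch under stated assumptions: `PreRollBuffer` and its methods are hypothetical names, the frame type must be default-constructible, and the atomic head/tail handling that a real lock-free variant needs is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Keeps only the newest `capacity` items.
// Single-threaded sketch: no atomics, so this is NOT lock-free or thread-safe.
template <typename Frame>
class PreRollBuffer {
public:
    explicit PreRollBuffer(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Store a frame, overwriting the oldest one once the buffer is full.
    void push(Frame frame) {
        slots_[head_] = std::move(frame);
        head_ = (head_ + 1) % slots_.size();
        if (count_ < slots_.size()) ++count_;
    }

    // Copy out the buffered frames, oldest first (e.g. when an event starts).
    std::vector<Frame> snapshot() const {
        std::vector<Frame> out;
        out.reserve(count_);
        const std::size_t start =
            (head_ + slots_.size() - count_) % slots_.size();
        for (std::size_t i = 0; i < count_; ++i)
            out.push_back(slots_[(start + i) % slots_.size()]);
        return out;
    }

    std::size_t size() const { return count_; }

private:
    std::vector<Frame> slots_;
    std::size_t head_ = 0;   // next write position
    std::size_t count_ = 0;  // number of valid frames
};
```

Sizing is simple arithmetic: at an assumed 15 fps, a 5-second pre-roll needs a capacity of 75 frames. When a motion event arrives, `snapshot()` yields the frames that precede it, which a muxer could write before the live and post-roll frames.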

AI vision context

LLaVA via Ollama turns detections into natural-language scene descriptions, and every event is logged to PostgreSQL alongside its detections and recording URLs.

Get it running

Quick start (Docker)

# Create your config from the example
cp config.yaml.example config.yaml

# Run with docker-compose (includes go2rtc)
docker compose up -d

# Check health
curl http://localhost:8000/health

# Get a snapshot
curl http://localhost:8000/api/cameras/patio/snapshot -o snap.jpg

Pull the container

docker pull ghcr.io/hms-homelab/hms-detection:latest

docker run -d --network host \
  -v "$(pwd)/config.yaml:/app/config/config.yaml:ro" \
  -v /mnt/ssd/events:/mnt/ssd/events \
  -v /mnt/ssd/snapshots:/mnt/ssd/snapshots \
  ghcr.io/hms-homelab/hms-detection:latest

Trigger and integrate

Trigger detection over MQTT

mosquitto_pub -h localhost \
  -t "camera/event/motion/start" \
  -m '{"camera_id": "patio", "post_roll_seconds": 5}'

Publish to camera/event/motion/start and camera/event/motion/stop to drive the detect → record → snapshot pipeline.

Home Assistant MQTT topics

yolo_detection/{cam}/detected   # binary ON/OFF
yolo_detection/{cam}/result     # detections + URLs
yolo_detection/{cam}/context    # LLaVA description
yolo_detection/status            # online/offline (retained)

The detected topic maps cleanly to a Home Assistant binary sensor; result and context carry the rich detection payloads.
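
A subscriber that listens to all cameras needs to split an incoming topic into its camera ID and message kind. The helper below is a small sketch of that step, based only on the topic layout listed above. `CameraTopic` and `parse_camera_topic` are hypothetical names, and the helper is not tied to any particular MQTT client library.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type: which camera and which per-camera topic.
struct CameraTopic {
    std::string camera;  // e.g. "patio"
    std::string leaf;    // "detected", "result", or "context"
};

// Parse "yolo_detection/{cam}/{leaf}". Returns nullopt for anything else,
// including the service-wide "yolo_detection/status" topic.
inline std::optional<CameraTopic> parse_camera_topic(std::string_view topic) {
    constexpr std::string_view prefix = "yolo_detection/";
    if (topic.substr(0, prefix.size()) != prefix) return std::nullopt;
    topic.remove_prefix(prefix.size());

    const auto slash = topic.find('/');
    if (slash == std::string_view::npos) return std::nullopt;

    const std::string_view cam = topic.substr(0, slash);
    const std::string_view leaf = topic.substr(slash + 1);
    if (cam.empty()) return std::nullopt;
    if (leaf != "detected" && leaf != "result" && leaf != "context")
        return std::nullopt;

    return CameraTopic{std::string(cam), std::string(leaf)};
}
```

For example, `parse_camera_topic("yolo_detection/patio/result")` yields camera `patio` and leaf `result`, while the status topic is rejected so it can be handled separately as an availability signal.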

Open source and self-hosted

Run it on your own hardware, point it at your cameras, and keep your footage private. Build from source or pull the prebuilt image.