# Visual Reasoning Playground

[![Moondream](https://img.shields.io/badge/Powered%20by-Moondream-blue)](https://moondream.ai)
[![PTZOptics](https://img.shields.io/badge/Compatible-PTZOptics-orange)](https://ptzoptics.com)
[![StreamGeeks](https://img.shields.io/badge/By-StreamGeeks-red)](https://streamgeeks.com)
[![Get the Book](https://img.shields.io/badge/Get%20the%20Book-VisualReasoning.ai-green)](https://visualreasoning.ai/book)

**AI-powered visual reasoning tools for broadcast, live streaming, and ProAV professionals.**

17 ready-to-use tools demonstrating real-world applications of Vision Language Models (VLMs) using [Moondream](https://moondream.ai). From PTZ camera auto-tracking to multimodal audio+video automation.

> 🚀 **[Try All Tools Online Now](https://streamgeeks.github.io/visual-reasoning-playground/)** - No installation required!
>
> 🎮 **Playground Mode**: All tools work without a camera! Sample videos included for testing.

> **From the book**: *Visual Reasoning AI for Broadcast and ProAV* by Paul Richards
>
> **Author**: Paul Richards - Co-CEO at [PTZOptics](https://ptzoptics.com) | Chief Streaming Officer at [StreamGeeks](https://streamgeeks.com)

---

## Why Visual Reasoning?

Traditional computer vision requires training custom models for each task. **Visual Reasoning** uses pre-trained Vision Language Models that understand natural language - just describe what you want to detect.

```
Old way: Train a model on 10,000 images of "person at podium"
New way: Just ask "Is there a person standing at the podium?"
```

**Perfect for:**
- Live streaming & broadcast automation
- PTZ camera control & auto-tracking
- Smart conference rooms
- Security & monitoring
- Content creation workflows
- OBS & vMix integration

---

## The Tools

### 👁️ Tool 1: Scene Describer - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/01-scene-describer/)

Natural language descriptions of any scene in real-time.
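A minimal sketch of how one frame might be packaged for a caption request. The endpoint URL, the `X-Moondream-Auth` header, and the `image_url` and `length` fields are assumptions drawn from Moondream's public REST API and are not confirmed by this repository. The tools themselves go through `shared/moondream-client.js`, so check the Moondream documentation before relying on any of these names.

```python
import base64
import json
import urllib.request

# Assumed endpoint; verify against the Moondream documentation.
CAPTION_URL = "https://api.moondream.ai/v1/caption"


def build_caption_request(jpeg_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Package one JPEG frame as a JSON POST request (nothing is sent here)."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    payload = {
        "image_url": "data:image/jpeg;base64," + encoded,  # assumed field name
        "length": "short",                                  # assumed field name
    }
    return urllib.request.Request(
        CAPTION_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Moondream-Auth": api_key,  # assumed header name
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_caption_request(b"placeholder-frame-bytes", "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
```

The function only builds the request, so it can be inspected or logged before anything is sent with `urllib.request.urlopen`.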
```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Camera    │────▶│  Moondream API  │────▶│  "A person at   │
│   Frame     │     │    /caption     │     │   a desk with   │
└─────────────┘     └─────────────────┘     │   a laptop..."  │
                                            └─────────────────┘
```

📁 `01-scene-describer/`

---

### 📦 Tool 2: Detection Boxes - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/02-detection-boxes/)

Draw bounding boxes around any object you describe.

```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Camera    │────▶│  Moondream API  │────▶│   Video Feed    │
│   Frame     │     │    /detect      │     │   + Colored     │
└─────────────┘     │ "person","mug"  │     │  Bounding Boxes │
                    └─────────────────┘     └─────────────────┘
```

📁 `02-detection-boxes/`

---

### ✋ Tool 3: Gesture OBS Control - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/03-gesture-obs/)

Control OBS scene switching with hand gestures.

```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Camera    │────▶│  Moondream API  │────▶│  OBS WebSocket  │
│   Frame     │     │ "thumbs up?" →  │     │  Scene Switch   │
└─────────────┘     │     YES/NO      │     └─────────────────┘
                    └─────────────────┘              │
                                                     ▼
                                            ┌─────────────────┐
                                            │   OBS Studio    │
                                            │   Scene 1 → 2   │
                                            └─────────────────┘
```

> 🔌 **OBS Script Available!** Install directly in OBS Studio: [moondream-gesture-control.py](https://github.com/streamgeeks/visual-reasoning-playground/blob/master/03-gesture-obs/moondream-gesture-control.py)

📁 `03-gesture-obs/`

---

### 🔢 Tool 5: Smart Counter - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/05-smart-counter/)

Count objects entering or exiting across a virtual line.

```
                    ┌─────────────────┐
                    │   Define Line   │
                    │  ─ ─ ─ ─ ─ ─ ─  │
                    └────────┬────────┘
                             │
┌─────────────┐     ┌────────▼────────┐     ┌─────────────────┐
│   Camera    │────▶│  Track Objects  │────▶│     IN: 12      │
│   Frame     │     │   Across Line   │     │     OUT: 8      │
└─────────────┘     └─────────────────┘     │    TOTAL: +4    │
                                            └─────────────────┘
```

📁 `05-smart-counter/`

---

### 🔍 Tool 6: Scene Analyzer - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/06-scene-analyzer/)

Ask questions about what the camera sees.
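Automation usually needs a number or a yes/no rather than a full sentence. A minimal sketch, assuming the answer comes back as free text like the example in the diagram below; `first_integer` and `is_yes` are hypothetical helpers, not part of the repository.

```python
import re
from typing import Optional


def first_integer(answer: str) -> Optional[int]:
    """Return the first whole number written as digits in the answer, or None."""
    match = re.search(r"\d+", answer)
    return int(match.group()) if match else None


def is_yes(answer: str) -> bool:
    """Treat an answer as affirmative only if its first word is 'yes'."""
    words = re.findall(r"[a-z]+", answer.lower())
    return bool(words) and words[0] == "yes"


if __name__ == "__main__":
    reply = "Yes, there are 3 people in the room"
    print(is_yes(reply), first_integer(reply))
```

Counts spelled out as words ("three people") are not handled here; asking the model to answer with digits keeps the parsing this simple.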
```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Camera    │────▶│  Moondream API  │────▶│  "Yes, there    │
│   Frame     │     │     /query      │     │   are 3 people  │
└─────────────┘     └─────────────────┘     │   in the room"  │
                             ▲              └─────────────────┘
                    ┌────────┴────────┐
                    │   "How many     │
                    │    people?"     │
                    └─────────────────┘
```

📁 `06-scene-analyzer/`

---

### 🚧 Tool 7: Zone Monitor - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/07-zone-monitor/)

Draw custom zones, get alerts when objects enter.

```
┌─────────────────────────────────┐
│          Camera View            │
│    ┌───────────┐                │
│    │  ZONE A   │  ○ person      │
│    │ (alert!)  │    enters      │
│    └───────────┘    │           │
└─────────────────────┼───────────┘
                      ▼
              ┌───────────────┐
              │   Webhook     │────▶ Alert!
              │   Trigger     │
              └───────────────┘
```

📁 `07-zone-monitor/`

---

### 🎨 Tool 10: Color Matcher - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/10-color-matcher/)

Match your camera's color settings to a reference image.
```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Reference  │────▶│    Moondream    │     │   Suggested     │
│   Image     │     │  Analyze Both   │────▶│  Adjustments:   │
└─────────────┘     └─────────────────┘     │   WB: +200K     │
                             ▲              │   Sat: -10      │
┌─────────────┐              │              │   Exp: +0.5     │
│   Camera    │──────────────┘              └─────────────────┘
│    Feed     │
└─────────────┘
```

📁 `10-color-matcher/`

---

### 🔊 Tool 12: Multimodal Fusion - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/12-multimodal-fusion/)

Combine audio + video for intelligent automation.

```
┌─────────────┐
│   Camera    │────┐
│   (Video)   │    │     ┌─────────────────┐     ┌─────────────┐
└─────────────┘    ├────▶│  Fusion Engine  │────▶│   Trigger   │
                   │     │  Video + Audio  │     │  Automation │
┌─────────────┐    │     │ Confidence: 95% │     └─────────────┘
│ Microphone  │────┘     └─────────────────┘
│  (Speech)   │
└─────────────┘

Example: "Start meeting" + people visible = HIGH confidence → trigger
```

📁 `12-multimodal-fusion/`

---

### 📸 Tool 13: Smart AI Photographer - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/13-smart-photographer/)

Auto-capture photos when AI detects your target.
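One way to keep a positive detection from firing the shutter on every frame is a cooldown between captures. A minimal sketch of that idea; `CaptureGate` and its 3-second default are hypothetical and not taken from the tool's code.

```python
import time
from typing import Callable, Optional


class CaptureGate:
    """Allow a capture only if enough time has passed since the last one."""

    def __init__(self, cooldown_s: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._last_capture: Optional[float] = None  # time of the latest capture

    def should_capture(self, target_found: bool) -> bool:
        """Return True when the target is present and the cooldown has elapsed."""
        if not target_found:
            return False
        now = self.clock()
        if self._last_capture is not None and now - self._last_capture < self.cooldown_s:
            return False  # still cooling down from the previous capture
        self._last_capture = now
        return True


if __name__ == "__main__":
    gate = CaptureGate(cooldown_s=3.0)
    print(gate.should_capture(True))   # first detection triggers a capture
    print(gate.should_capture(True))   # an immediate repeat is suppressed
```

The clock is injectable so the gate can be exercised without waiting in real time.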
```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Camera    │────▶│  Moondream API  │────▶│  Target Found?  │
│   Frame     │     │    /detect      │     │    YES → 📸     │
└─────────────┘     │ "person smiling"│     └────────┬────────┘
                    └─────────────────┘              │
                                                     ▼
                                            ┌─────────────────┐
                                            │  Photo Gallery  │
                                            │   + Download    │
                                            └─────────────────┘
```

📁 `13-smart-photographer/`

---

### 🎯 Featured: PTZ Auto-Tracker - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/PTZOptics-Moondream-Tracker/)

Autonomous PTZ camera tracking using AI vision.

```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  PTZOptics  │────▶│  Moondream API  │────▶│    Calculate    │
│   Camera    │     │    /detect      │     │    Pan/Tilt     │
└─────────────┘     │  "red shirt"    │     │    Commands     │
      ▲             └─────────────────┘     └────────┬────────┘
      │                                              │
      │             ┌─────────────────┐              │
      └─────────────│  PTZOptics API  │◀─────────────┘
                    │   Move Camera   │
                    └─────────────────┘
```

📁 `PTZOptics-Moondream-Tracker/`

---

### ⚡ Tool 14: Tracking Comparison - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/14-tracking-comparison/)

Compare MediaPipe (local CV) vs Moondream (cloud VLM) for PTZ tracking.
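Whichever detector is used, tracking comes down to turning a detected box into a camera move, as in the PTZ Auto-Tracker diagram above. A minimal sketch, assuming boxes arrive as normalized `x_min`/`y_min`/`x_max`/`y_max` values between 0 and 1 with the origin at the top-left of the frame. The dead-zone size and the direction names are hypothetical, and no camera API is called.

```python
from typing import Dict, Tuple


def pan_tilt_command(box: Dict[str, float], dead_zone: float = 0.1) -> Tuple[str, str]:
    """Map a normalized bounding box to (pan, tilt) direction strings.

    Pan is "left", "right", or "hold"; tilt is "up", "down", or "hold".
    The subject counts as centered while its box center stays within
    dead_zone of the frame center (0.5, 0.5) on each axis.
    """
    center_x = (box["x_min"] + box["x_max"]) / 2
    center_y = (box["y_min"] + box["y_max"]) / 2
    offset_x = center_x - 0.5  # positive: subject is right of center
    offset_y = center_y - 0.5  # positive: subject is below center

    if abs(offset_x) <= dead_zone:
        pan = "hold"
    else:
        pan = "right" if offset_x > 0 else "left"

    if abs(offset_y) <= dead_zone:
        tilt = "hold"
    else:
        tilt = "down" if offset_y > 0 else "up"

    return pan, tilt


if __name__ == "__main__":
    subject = {"x_min": 0.7, "y_min": 0.4, "x_max": 0.9, "y_max": 0.6}
    print(pan_tilt_command(subject))
```

The dead zone matters more as detection latency grows: with a slower detector, moving on every small offset tends to overshoot, so a wider dead zone keeps the camera steadier.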
```
┌─────────────┐     ┌──────────────┐
│   Camera    │────▶│  MediaPipe   │──── Local: ~10ms ────┐
│   Frame     │     │  (Browser)   │                      │
└─────────────┘     └──────────────┘                      ├──▶ Compare!
      │             ┌──────────────┐                      │
      └────────────▶│  Moondream   │──── Cloud: ~200ms ───┘
                    │    (API)     │
                    └──────────────┘
```

> 🧪 **See the tradeoffs**: latency, accuracy, and flexibility side-by-side.

📁 `14-tracking-comparison/`

---

### 🏆 Tool 4: Scoreboard Extractor - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/04-scoreboard-extractor/)

Extract scores from physical scoreboards using AI vision.

```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Scoreboard  │────▶│  Moondream API  │────▶│    HOME: 24     │
│   Camera    │     │  "Read score"   │     │    AWAY: 18     │
└─────────────┘     └─────────────────┘     │    QTR: 3       │
                                            └────────┬────────┘
                                                     │
                                            ┌────────▼────────┐
                                            │    Graphics     │
                                            │    Overlay      │
                                            └─────────────────┘
```

📁 `04-scoreboard-extractor/`

---

### 📝 Tool 4b: Scoreboard OCR - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/04b-scoreboard-ocr/)

Extract scores using local Tesseract.js OCR, with no API key needed.
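OCR on scoreboard digits often confuses letters with numbers, so the raw text usually needs a cleanup pass before it can drive a graphic. A minimal sketch of that step, assuming each scoreboard region yields one short raw string. The substitution table and `parse_score` are hypothetical, not the tool's actual code, and are written in Python rather than the tool's JavaScript.

```python
from typing import Optional

# Look-alike characters mapped to the digit they usually stand for on a scoreboard.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"})


def parse_score(raw: str) -> Optional[int]:
    """Convert raw OCR text from one region to an integer, or None if unreadable."""
    cleaned = raw.strip().translate(LOOKALIKES)
    digits = "".join(ch for ch in cleaned if ch in "0123456789")
    return int(digits) if digits else None


if __name__ == "__main__":
    for sample in [" 24 ", "l8", "--"]:
        print(repr(sample), parse_score(sample))
```

Returning `None` for an unreadable region lets the caller keep the last known score instead of flashing a wrong one on air.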
```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Scoreboard  │────▶│  Tesseract.js   │────▶│    HOME: 24     │
│   Camera    │     │  (Local OCR)    │     │    AWAY: 18     │
└─────────────┘     │  Region-based   │     │    QTR: 3       │
                    └─────────────────┘     └─────────────────┘
```

> 🔄 **Compare approaches!** Use this alongside Tool 4 to see VLM vs OCR tradeoffs.

📁 `04b-scoreboard-ocr/`

---

### 🖼️ Tool 8: Framing Assistant - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/08-framing-assistant/)

AI-powered framing suggestions for PTZ cameras.

```
┌─────────────────────────────────┐
│          Camera View            │
│                                 │
│    ┌ ─ ─ ─ ─ ─ ┐                │
│    │ Suggested │   ○ subject    │
│    │   Frame   │                │
│    └ ─ ─ ─ ─ ─ ┘                │
└─────────────────────────────────┘
                 │
                 ▼
  "Move camera UP 5°, zoom IN 10%
   for better composition"
```

📁 `08-framing-assistant/`

---

### 🎛️ Tool 9: PTZ Color Tuner - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/09-ptz-color-tuner/)

Direct PTZ camera color control via API with AI-assisted adjustments.
```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  PTZOptics  │────▶│  Moondream AI   │────▶│   Recommended   │
│   Camera    │     │  Analyze Scene  │     │   Adjustments   │
└─────────────┘     └─────────────────┘     └────────┬────────┘
      ▲                                              │
      │             ┌─────────────────┐              │
      └─────────────│  PTZOptics API  │◀─────────────┘
                    │ Apply Settings  │
                    └─────────────────┘
```

📁 `09-ptz-color-tuner/`

---

### 🎬 Tool 11: Multimodal Studio - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/11-multimodal-studio/)

Full production automation: PTZ + OBS + Audio + AI.

```
┌─────────────┐
│  PTZOptics  │────┐
│   Camera    │    │
└─────────────┘    │     ┌─────────────────┐     ┌─────────────┐
                   ├────▶│   Multimodal    │────▶│  PTZ Move   │
┌─────────────┐    │     │     Studio      │     ├─────────────┤
│ Microphone  │────┤     │   Controller    │────▶│  OBS Scene  │
│   (Voice)   │    │     └─────────────────┘     ├─────────────┤
└─────────────┘    │                             │   Webhook   │
                   │                             └─────────────┘
┌─────────────┐    │
│     OBS     │────┘
│   Studio    │
└─────────────┘

Voice: "Camera 2, close up" → PTZ moves + OBS switches
```

📁 `11-multimodal-studio/`

---

### 🎙️ Tool 15: Voice Triggers - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/15-voice-triggers/)

Speech-to-text automation with Whisper AI running entirely in-browser.

```
┌─────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Microphone  │────▶│   Whisper AI    │────▶│   "switch to    │
│   Input     │     │  (In-Browser)   │     │   camera two"   │
└─────────────┘     └─────────────────┘     └────────┬────────┘
                                                     │
                    ┌─────────────────┐              │
                    │  Trigger Rules  │◀─────────────┘
                    │ phrase → action │
                    └────────┬────────┘
                             │
                    ┌────────▼────────┐
                    │ Execute Action  │
                    │ (Log/Alert/OBS) │
                    └─────────────────┘
```

**Key Features:**
- **No API key needed** - Whisper runs locally via WebGPU/WASM
- **~40MB model** - Downloads once, cached in browser
- **Trigger rules** - Map phrases to actions
- **Privacy-first** - Audio never leaves your device

📁 `15-voice-triggers/`

---

### 🔌 OBS Plugin: Visual Reasoning AI - [Try it now](https://streamgeeks.github.io/visual-reasoning-playground/obs-visual-reasoning/)

Complete AI control panel as an OBS Browser Dock.
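The auto-switch rules shown in the diagram below pair a keyword with a scene. A minimal sketch of that lookup, assuming a rule fires when its keyword appears in the current scene description. The rule table mirrors the diagram, and `pick_scene` is a hypothetical helper that does not talk to OBS.

```python
from typing import List, Optional, Tuple

# Keyword -> OBS scene name, checked in order (first match wins).
RULES: List[Tuple[str, str]] = [
    ("whiteboard", "Whiteboard Cam"),
    ("standing", "Full Body Shot"),
]


def pick_scene(description: str, rules: List[Tuple[str, str]] = RULES) -> Optional[str]:
    """Return the scene for the first rule whose keyword occurs in the description."""
    text = description.lower()
    for keyword, scene in rules:
        if keyword in text:
            return scene
    return None  # no rule matched; leave the current scene alone


if __name__ == "__main__":
    print(pick_scene("A presenter writing on a whiteboard"))
    print(pick_scene("An empty room"))
```

Keeping the rules in an ordered list makes the priority explicit when a description matches more than one keyword.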
```
┌─────────────────────────────────────────────────────┐
│                  OBS BROWSER DOCK                   │
├─────────────────────────────────────────────────────┤
│  ┌─────────┬───────────┬────────────┐               │
│  │Gestures │ Describe  │ Auto-Switch│  ← Tabs       │
│  └─────────┴───────────┴────────────┘               │
│                                                     │
│  ┌─────────────────────────────────┐                │
│  │         Camera Preview          │                │
│  │       [Gesture Detection]       │                │
│  └─────────────────────────────────┘                │
│                                                     │
│  👍 Thumbs Up    →  Scene: Wide Shot                │
│  👎 Thumbs Down  →  Scene: Close Up                 │
│                                                     │
│  Auto-Switch Rules:                                 │
│    "whiteboard" → Whiteboard Cam                    │
│    "standing"   → Full Body Shot                    │
└─────────────────────────────────────────────────────┘
                           │
                           ▼
                  ┌─────────────────┐
                  │   OBS Studio    │
                  │  Scene Switch   │
                  │ Start/Stop Rec  │
                  └─────────────────┘
```

📁 `obs-visual-reasoning/`

---

## Quick Start

### Option A: Try Online Instantly (Recommended)

1. **Get Your API Key** - Sign up at [moondream.ai](https://moondream.ai) (free tier available)
2. **Open Any Tool** - Visit the [Visual Reasoning Playground](https://streamgeeks.github.io/visual-reasoning-playground/)
3. **Enter Your API Key** - Paste it once, and you're ready to go!
### Option B: Run Locally

> **Important:** Clone the **full repository**. Individual tool folders won't work alone because they depend on shared libraries in `shared/`.

```bash
git clone https://github.com/streamgeeks/visual-reasoning-playground.git
cd visual-reasoning-playground
python server.py
```

Then open `http://localhost:8000` and select any tool. The included `server.py` enables CORS so sample videos work with AI detection.

---

## Use Cases

Every tool includes both **business** and **personal** examples:

| Tool | Business Use | Personal Use |
|------|--------------|--------------|
| Scene Describer | Patient fall detection | Fridge inventory for recipes |
| Detection Boxes | Manufacturing QA | "Where are my keys?" |
| PTZ Auto-Tracker | Speaker tracking at events | Pet cam follows your dog |
| Smart Counter | Retail foot traffic analytics | Count kids going outside |
| Scene Analyzer | Security: "Anyone in restricted area?" | "Is my garage door open?" |
| Zone Monitor | Warehouse safety alerts | Driveway arrival notifications |
| Color Matcher | Multi-cam color matching | Match YouTuber's style |
| Multimodal Fusion | Smart conference room | Voice-controlled smart home |

---

## Integration Ready

These tools are designed to integrate with your existing workflow:

| Platform | Integration |
|----------|-------------|
| **OBS Studio** | WebSocket triggers, scene switching, **native Python script** |
| **vMix** | HTTP API commands, input control |
| **PTZOptics** | Full API 2.0 support for all PTZ cameras |
| **NDI** | Works with NDI video sources |
| **Webhooks** | Trigger any HTTP endpoint |
| **Home Assistant** | Smart home automation |

---

## OBS Studio Plugin

### Moondream Gesture Control Script

Control OBS scenes with hand gestures - runs natively inside OBS Studio!

**Installation:**
1. Download [`moondream-gesture-control.py`](https://github.com/streamgeeks/visual-reasoning-playground/blob/master/03-gesture-obs/moondream-gesture-control.py)
2. In OBS: **Tools → Scripts → + → Select the .py file**
3. Configure your Moondream API key and gesture mappings
4. Enable detection and start gesturing!

**Features:**
- 👍 Thumbs up → Switch to Scene A
- 👎 Thumbs down → Switch to Scene B
- Configurable detection interval and cooldown
- Debug mode for troubleshooting
- No browser required - runs entirely within OBS

**Requirements:**
- OBS Studio 28.0 or later
- Moondream API key ([get one free](https://moondream.ai))
- Webcam

> 💡 **Try before installing:** Use the [web demo](https://streamgeeks.github.io/visual-reasoning-playground/03-gesture-obs/) to test gesture detection before installing the OBS script.

---

## Architecture

All tools follow a consistent pattern: **Video → AI → Action**

**Shared utilities** in `shared/`:
- `moondream-client.js` - Unified API client with detect, caption, query, point methods
- `video-source-adapter.js` - Toggle between live camera and sample videos
- `api-key-manager.js` - Secure API key storage and validation
- `styles.css` - Consistent dark theme UI components

---

## Project Structure

```
visual-reasoning-playground/
├── index.html                      # Landing page with all tools
├── server.py                       # Local dev server (CORS enabled)
├── shared/                         # Reusable utilities for all tools
│
├── 01-scene-describer/             # Natural language scene descriptions
├── 02-detection-boxes/             # Bounding box visualization
├── 03-gesture-obs/                 # Gesture-based OBS control
├── 04-scoreboard-extractor/        # Score extraction (VLM approach)
├── 04b-scoreboard-ocr/             # Score extraction (Tesseract OCR)
├── 05-smart-counter/               # Object counting across line
├── 06-scene-analyzer/              # Visual Q&A chat
├── 07-zone-monitor/                # Zone-based alerts
├── 08-framing-assistant/           # PTZ framing suggestions
├── 09-ptz-color-tuner/             # PTZ color control
├── 10-color-matcher/               # Color matching to reference
├── 11-multimodal-studio/           # Full PTZ+OBS+voice automation
├── 12-multimodal-fusion/           # Audio+video fusion engine
├── 13-smart-photographer/          # Auto-capture on detection
├── 14-tracking-comparison/         # MediaPipe vs Moondream test
├── 15-voice-triggers/              # Voice command automation
│
├── PTZOptics-Moondream-Tracker/    # Featured PTZ auto-tracking
├── obs-visual-reasoning/           # OBS Browser Dock plugin
├── 00-visual-reasoning-harness/    # Harness pattern documentation
│
└── assets/                         # Sample videos & color profiles
    ├── sample-videos/              # Demo videos for playground mode
    └── color-profiles/             # Reference images for color tool
```

> See [CONTRIBUTING.md](CONTRIBUTING.md) for details on adding new tools.

---

## API Cost Guide

Moondream charges per API call. Control costs with the rate slider in each tool. Calls per hour are the detection rate multiplied by 3,600:

| Detection Rate | API Calls/Hour | Best For |
|----------------|----------------|----------|
| 0.5/sec | 1,800 | Static scenes, budget-conscious |
| 1.0/sec | 3,600 | General use (default) |
| 2.0/sec | 7,200 | Active scenes |
| 3.0/sec | 10,800 | Fast action, sports |

---

## Requirements

**All Tools:**
- [Moondream API Key](https://moondream.ai) (free tier available); not needed for Scoreboard OCR or Voice Triggers, which run locally
- Modern browser (Chrome recommended)
- Local web server (only when running locally instead of using the online playground)

**Tool-Specific:**
- **PTZ Auto-Tracker, Framing Assistant, Color Tuner**: [PTZOptics camera](https://ptzoptics.com) with network access
- **Multimodal Studio, Multimodal Fusion, Voice Triggers**: Microphone for speech recognition
- **Gesture OBS Control, OBS Plugin**: [OBS Studio](https://obsproject.com) with WebSocket Server enabled

---

## Learn More

### Get the Book

**[Visual Reasoning AI for Broadcast and ProAV](https://visualreasoning.ai/book)** by Paul Richards covers:
- Complete theory behind Vision Language Models
- Step-by-step tool building tutorials
- Production deployment strategies
- Industry-specific applications

**Get your copy at [VisualReasoning.ai/book](https://visualreasoning.ai/book)**

### Official Resources

- [VisualReasoning.ai](https://visualreasoning.ai) - Book, online course, and free tools
- [Moondream Documentation](https://docs.moondream.ai) - API reference & guides
- [PTZOptics API 2.0](https://ptzoptics.com/api) - Camera control documentation
- [StreamGeeks Academy](https://streamgeeks.com) - Live streaming education

### Community

- [StreamGeeks Discord](https://discord.gg/streamgeeks) - Get help, share projects
- [PTZOptics Support](https://ptzoptics.com/support) - Camera-specific questions

---

## Contributing

Found a bug? Have an idea? PRs welcome!

1. Fork this repo
2. Create a feature branch
3. Submit a pull request

---

## License

MIT License - Use freely in personal and commercial projects.

---

Built by Paul Richards
Co-CEO at PTZOptics | Chief Streaming Officer at StreamGeeks