Edge Vision-Language Model for Real-Time Scene Awareness
Last Updated: May 19, 2026
Edge AI Scene Descriptor is a modular, event-driven edge AI application. It captures video frames from a local media store (/srv/sr/media) and processes them through an on-device Florence-based AI model (scene-descriptor, a FastAPI service on port 8000) to generate rich textual scene descriptions. The resulting structured events are routed asynchronously through RabbitMQ to an ingestion and API service (event-reader), which persists the data and exposes it via REST on port 8001 for external consumption. A background cleanup service (edge-event-data-purger) enforces data retention policies. The entire pipeline is self-contained, lightweight, and purpose-built for real-time scene intelligence at the edge.
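The flow described above (describe a frame, publish an event, ingest it, purge old records) can be sketched in a few lines of Python. This is an illustrative sketch only, not the product's implementation. It uses the standard library alone: a `queue.Queue` stands in for RabbitMQ, `describe_frame` is a placeholder for the Florence-based model call, and the `SceneEvent` fields, function names, and retention window are all hypothetical, since the actual event schema and retention settings are not documented here.

```python
import json
import queue
import time
from dataclasses import dataclass, asdict
from pathlib import Path

MEDIA_DIR = Path("/srv/sr/media")  # media store path named in the text


@dataclass
class SceneEvent:
    """Hypothetical event schema; the real field names are an assumption."""
    frame_path: str
    description: str
    created_at: float


def describe_frame(frame_path: str) -> str:
    """Placeholder for the on-device model call (scene-descriptor role)."""
    return f"placeholder description for {Path(frame_path).name}"


def publish(broker: queue.Queue, frame_path: str, now: float) -> None:
    """Producer side: describe a frame and enqueue the event as JSON."""
    event = SceneEvent(frame_path, describe_frame(frame_path), now)
    broker.put(json.dumps(asdict(event)))


def ingest(broker: queue.Queue, store: list) -> int:
    """Consumer side (event-reader role): drain the queue into the store."""
    count = 0
    while True:
        try:
            body = broker.get_nowait()
        except queue.Empty:
            return count
        store.append(json.loads(body))
        count += 1


def purge(store: list, max_age_s: float, now: float) -> int:
    """Purger role: drop events older than max_age_s; return how many were removed."""
    before = len(store)
    store[:] = [e for e in store if now - e["created_at"] <= max_age_s]
    return before - len(store)


if __name__ == "__main__":
    broker = queue.Queue()  # in-memory stand-in for the RabbitMQ queue
    store = []              # in-memory stand-in for the persisted event store
    t0 = time.time()
    publish(broker, str(MEDIA_DIR / "cam1_0001.jpg"), t0 - 7200)  # two hours old
    publish(broker, str(MEDIA_DIR / "cam1_0002.jpg"), t0)
    print("ingested:", ingest(broker, store))
    print("purged:", purge(store, max_age_s=3600, now=t0))
    print("remaining:", [e["frame_path"] for e in store])
```

Because the producer and consumer share only the queue and a JSON payload, either side can be restarted or replaced without touching the other, which is the main benefit of the event-driven layout the text describes.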





