Edge Vision-Language Model for Real-Time Scene Awareness

Last Updated: May 19, 2026

Edge AI Scene Descriptor is a modular, event-driven edge AI application for real-time scene intelligence. It captures video frames from a local media store (/srv/sr/media) and processes them through an on-device Florence-based AI model (scene-descriptor, served with FastAPI on port 8000) to generate rich textual scene descriptions. The resulting structured events are routed asynchronously through RabbitMQ to an ingestion and API service (event-reader), which persists the data and exposes it via REST on port 8001 for external consumption. A background cleanup service (edge-event-data-purger) enforces data retention policies. The entire pipeline is self-contained and lightweight, so it can run at the edge.
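
The flow described above (describe a frame, publish an event, ingest it, purge old records) can be illustrated with a minimal in-process sketch. Everything in it is an assumption made for illustration: the `SceneEvent` fields, the function names, and the retention window do not come from the reference architecture, a standard-library `queue.Queue` stands in for RabbitMQ, and `describe_frame` is a placeholder for the request to the scene-descriptor service rather than a real model call.

```python
import json
import queue
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SceneEvent:
    # Hypothetical event schema; the source does not specify field names.
    frame_path: str
    description: str
    captured_at: str  # ISO 8601 timestamp in UTC


def describe_frame(frame_path: str) -> str:
    # Placeholder for the call to the scene-descriptor service (port 8000).
    return f"placeholder description for {frame_path}"


def publish(broker: queue.Queue, event: SceneEvent) -> None:
    # Serialize the event to JSON, as one might for a message broker body.
    broker.put(json.dumps(asdict(event)))


def ingest(broker: queue.Queue, store: list) -> None:
    # Drain the queue and persist each event (here: append to a list),
    # standing in for the event-reader service.
    while True:
        try:
            body = broker.get_nowait()
        except queue.Empty:
            break
        store.append(SceneEvent(**json.loads(body)))


def purge(store: list, now: datetime, retention: timedelta) -> list:
    # Keep only events captured within the retention window,
    # standing in for edge-event-data-purger.
    cutoff = now - retention
    return [e for e in store
            if datetime.fromisoformat(e.captured_at) >= cutoff]


if __name__ == "__main__":
    now = datetime(2026, 5, 19, 12, 0, tzinfo=timezone.utc)
    broker, store = queue.Queue(), []
    for name, age_days in [("old.jpg", 10), ("new.jpg", 1)]:
        path = f"/srv/sr/media/{name}"
        captured = (now - timedelta(days=age_days)).isoformat()
        publish(broker, SceneEvent(path, describe_frame(path), captured))
    ingest(broker, store)
    kept = purge(store, now, retention=timedelta(days=7))
    print([e.frame_path for e in kept])
```

Because the stages share only a serialized message, each one could in principle be replaced independently, for example by swapping the in-memory queue for a real broker client, which is the property the event-driven design relies on.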
