
Architecture

Metrica is three moving parts: a CV worker at the edge, a FastAPI backend in the cloud, and React apps for owners and admins. The worker produces raw IN/OUT events; the backend stores them and derives KPIs; the apps read those KPIs.

          EDGE                                CLOUD
┌──────────────────────┐       ┌─────────────────────────────────┐
│ CV worker (per cam)  │       │ FastAPI backend (FastAPI Cloud) │
│ ┌────────────────┐   │ POST  │  ┌───────────┐   ┌────────────┐ │
│ │ YOLO+ByteTrack │   │ events│  │ /events   │──▶│    Neon    │ │
│ │ LineCounter    │───┼──────▶│  │ ingestion │   │  Postgres  │ │
│ │ EventPoster    │   │       │  └───────────┘   │ (Frankfurt)│ │
│ │ Heartbeat/Snap │   │       │  ┌───────────┐   └────────────┘ │
│ └────────────────┘   │◀──────┼──│ /cameras  │         ▲        │
└──────────────────────┘ config│  │ /config   │         │ derive │
                               │  └───────────┘   ┌────────────┐ │
                               │  ┌───────────┐   │ /live      │ │
Owner dashboard (React) ───────┼─▶│ Supabase  │──▶│ /dashboard │ │
Admin app (React) ─────────────┼─▶│ JWT auth  │   │ /admin     │ │
                               │  └───────────┘   └────────────┘ │
                               └─────────────────────────────────┘

Each camera runs one worker/run.py process. Concerns are split into small, independently testable pieces:

| Module | Role |
| --- | --- |
| run.py | The CV loop: YOLO tracking, line resolution, orchestration |
| line_counter.py | Pure geometry: which side of the line a point is on, and crossing detection. No CV, no I/O |
| poster.py | EventPoster: buffers events and POSTs them; retains the buffer on transient failure, drops on permanent (4xx) rejection |
| heartbeat.py | HeartbeatSender: best-effort periodic POST /cameras/{id}/heartbeat; never raises |
| snapshot.py | Uploads a frame so an admin can draw the counting line over it |

Line crossing is detected with a signed 2D cross product. LineCounter remembers the last side (+1/-1) per tracker_id; when the sign flips, the track has crossed the line. A negative-to-positive flip counts as IN (before invert is applied) and the opposite flip as OUT. A point exactly on the line is ignored, so the track keeps its previous side.
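The side test and sign-flip rule can be sketched in a few lines of Python. This is a minimal illustration, not the contents of line_counter.py: the names side_of_line and update, the invert attribute, and the choice to treat the line as infinite (rather than a bounded segment) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

Point = tuple[float, float]


def side_of_line(a: Point, b: Point, p: Point) -> int:
    """Sign of the 2D cross product (b - a) x (p - a): +1, -1, or 0 on the line."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


@dataclass
class LineCounter:
    a: Point
    b: Point
    invert: bool = False  # assumed flag: swaps IN and OUT
    _last_side: dict[int, int] = field(default_factory=dict)

    def update(self, tracker_id: int, p: Point) -> Optional[str]:
        """Return "IN" or "OUT" when this track flips sides, else None."""
        side = side_of_line(self.a, self.b, p)
        if side == 0:
            return None  # exactly on the line: ignored, previous side is kept
        prev = self._last_side.get(tracker_id)
        self._last_side[tracker_id] = side
        if prev is None or prev == side:
            return None  # first sighting, or still on the same side
        event = "IN" if (prev < 0 and side > 0) else "OUT"
        if self.invert:
            event = "OUT" if event == "IN" else "IN"
        return event
```

With a line from (0, 0) to (10, 0), a track moving from (5, -1) through (5, 0) to (5, 1) produces a single IN on the last update, because the on-line point does not overwrite the remembered side.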

Line source precedence (resolve_line): an explicit CLI --line (used for calibration) wins; otherwise the worker fetches the camera’s saved config from the backend; otherwise it defaults to a horizontal line at mid-frame height. Lines are stored normalized (0..1 fractions of the frame) so they are resolution-independent; the worker denormalizes to pixels on its first frame.
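The precedence chain and the denormalization step might look like the sketch below. The tuple layout (x1, y1, x2, y2), the fetch_saved callable, and the assumption that a CLI line is also given in normalized coordinates are illustrative choices, not details confirmed by the worker code.

```python
from typing import Callable, Optional

Line = tuple[float, float, float, float]  # x1, y1, x2, y2 as 0..1 fractions

# Fallback: a horizontal line at mid-frame height.
DEFAULT_LINE: Line = (0.0, 0.5, 1.0, 0.5)


def resolve_line(
    cli_line: Optional[Line],
    fetch_saved: Callable[[], Optional[Line]],
) -> Line:
    """CLI value wins, then the saved backend config, then the default."""
    if cli_line is not None:
        return cli_line
    saved = fetch_saved()  # assumed to return None when nothing is saved
    if saved is not None:
        return saved
    return DEFAULT_LINE


def denormalize(line: Line, width: int, height: int) -> tuple[int, int, int, int]:
    """Convert normalized fractions to pixel coordinates for one frame size."""
    x1, y1, x2, y2 = line
    return (round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height))
```

For a 1280x720 frame the default line denormalizes to (0, 360, 1280, 360). Because the stored form is resolution-independent, the same saved config keeps working if the camera stream is later switched to a different resolution.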

The FastAPI app is assembled in backend/app/main.py, with one router included per concern:

| Router | Endpoints (selected) | Auth |
| --- | --- | --- |
| auth | GET /me, GET /stores | Supabase JWT |
| events | POST /events | Worker key |
| live | GET /live/{store_id} | Supabase JWT + ownership |
| dashboard | GET /dashboard/{store_id} | Supabase JWT + ownership |
| cameras | GET/PUT /cameras/..., GET /cameras/{id}/config, snapshot, heartbeat | Owner (reads) / worker key (config, snapshot, heartbeat) |
| admin | /admin/stores, /admin/cameras, /admin/health | Superadmin |

Cross-cutting middleware: a CatchServerErrorsMiddleware that converts unhandled exceptions into a clean 500 (so the response still passes through CORS), plus the standard CORSMiddleware. Database access is async SQLAlchemy; the session commits on successful request return.
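The idea behind the error-catching middleware can be shown as a dependency-free ASGI wrapper. This is a sketch under assumptions, not the project's CatchServerErrorsMiddleware: the class name CatchServerErrors, the JSON body, and the logger name are invented for illustration. It assumes the wrapper is installed inside the CORS layer, so the 500 it produces is an ordinary response that CORS can still add headers to.

```python
import json
import logging

logger = logging.getLogger("metrica.sketch")  # hypothetical logger name


class CatchServerErrors:
    """Pure-ASGI sketch: turn an unhandled exception into a JSON 500."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        started = False

        async def tracking_send(message):
            # Remember whether the response headers have already gone out.
            nonlocal started
            if message["type"] == "http.response.start":
                started = True
            await send(message)

        try:
            await self.app(scope, receive, tracking_send)
        except Exception:
            logger.exception("unhandled error in request")
            if started:
                raise  # too late to replace a partially sent response
            body = json.dumps({"detail": "Internal Server Error"}).encode()
            await send({
                "type": "http.response.start",
                "status": 500,
                "headers": [
                    (b"content-type", b"application/json"),
                    (b"content-length", str(len(body)).encode()),
                ],
            })
            await send({"type": "http.response.body", "body": body})
```

Without a layer like this, an unhandled exception is typically turned into a 500 by the framework's outermost error handler, outside CORS, and the browser then reports an opaque CORS failure instead of the real status code.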

The dashboard (React 19 · Vite · TanStack Router · shadcn) reads live and period KPIs. The admin app (React · Vite · shadcn) manages tenants and cameras and watches camera health. Both talk only to the backend HTTP API.

There are two distinct callers, verified differently:

  • Owners present a Supabase-issued ES256 JWT as a Bearer token. The backend verifies the signature against the project JWKS (cached in-process for 10 minutes and refreshed early when a token carries an unknown kid, as happens after key rotation), then checks the issuer, the audience (authenticated), and the expiry. A local users row mirrors the Supabase identity; see the data model for how id (our immutable key) is kept separate from auth_id (the provider subject).
  • The worker presents a shared secret in the X-Worker-Key header on ingestion endpoints. This is a service-to-service credential, not a user login.
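The JWKS caching policy for owner tokens (10-minute lifetime, early refresh on an unknown kid) can be sketched with the standard library alone. The class JwksCache, its fetch callable, and the injected clock are hypothetical names used for illustration; signature verification itself is left to a JWT library and is not shown.

```python
import time
from typing import Callable, Optional

JWKS_TTL_SECONDS = 600  # 10 minutes


class JwksCache:
    """In-process key cache keyed by kid, with TTL and refresh-on-miss."""

    def __init__(
        self,
        fetch: Callable[[], dict],          # assumed to return {kid: key}
        now: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._now = now
        self._keys: dict = {}
        self._fetched_at: Optional[float] = None

    def _refresh(self) -> None:
        self._keys = dict(self._fetch())
        self._fetched_at = self._now()

    def get_key(self, kid: str):
        stale = (
            self._fetched_at is None
            or self._now() - self._fetched_at >= JWKS_TTL_SECONDS
        )
        # An unknown kid usually means the provider rotated its keys,
        # so refetch once before giving up.
        if stale or kid not in self._keys:
            self._refresh()
        if kid not in self._keys:
            raise KeyError(f"unknown signing key: {kid}")
        return self._keys[kid]
```

One design caveat of refresh-on-miss: a client sending random kid values triggers a fetch per request, so a production version would likely rate-limit forced refreshes.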

Tenancy is enforced by owned_store(): every owner-facing read resolves the store and returns 404 (never 403, to avoid leaking existence) unless store.owner_id matches the caller. Superadmin endpoints under /admin require the caller’s email to be on the platform superadmin list.
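The 404-not-403 rule is easiest to see in code. The sketch below replaces the database and the HTTP layer with a dict and a plain exception; Store, NotFound, and the function signature are stand-ins, and the real owned_store() presumably takes a session and the authenticated user instead.

```python
from dataclasses import dataclass


class NotFound(Exception):
    """Stand-in for an HTTP 404 response."""
    status_code = 404


@dataclass
class Store:
    id: int
    owner_id: int


def owned_store(stores: dict[int, Store], store_id: int, user_id: int) -> Store:
    """Return the store only if the caller owns it; otherwise raise 404."""
    store = stores.get(store_id)
    # A missing store and someone else's store take the same branch,
    # so the caller cannot tell which store IDs exist.
    if store is None or store.owner_id != user_id:
        raise NotFound("Store not found")
    return store
```

Collapsing both failure cases into one branch is the whole point: a separate 403 path would let any authenticated owner enumerate other tenants' store IDs.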

  1. The worker’s LineCounter emits IN / OUT; EventPoster.add() buffers a payload (store ID, camera ID, tracker_id, event_type, UTC timestamp).
  2. Every ~30 frames flush() POSTs the buffer to /events with X-Worker-Key. On a 4xx the payload is dropped (retrying a rejected payload can never succeed); on a 5xx or network error the buffer is kept for the next flush.
  3. POST /events inserts one events row per crossing and returns the count.
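The buffering and retry policy from the steps above can be sketched as follows. This is an illustration, not poster.py: the injected post callable (returning an HTTP status code and raising OSError on network failure) and the pending property are assumptions made so the logic can be shown without an HTTP client.

```python
from typing import Callable


class EventPoster:
    """Buffer events; drop them on 2xx or 4xx, keep them on 5xx or network error."""

    def __init__(self, post: Callable[[list], int]):
        self._post = post  # hypothetical transport: returns the HTTP status
        self._buffer: list = []

    @property
    def pending(self) -> int:
        return len(self._buffer)

    def add(self, event: dict) -> None:
        self._buffer.append(event)

    def flush(self) -> None:
        if not self._buffer:
            return
        batch = list(self._buffer)
        try:
            status = self._post(batch)
        except OSError:
            return  # network error: keep the buffer for the next flush
        if 200 <= status < 300 or 400 <= status < 500:
            # Accepted, or permanently rejected: retrying cannot help,
            # so remove exactly the events that were sent.
            del self._buffer[:len(batch)]
        # Any other status (5xx): leave the buffer untouched and retry later.
```

In the worker loop this would be driven by something like calling flush() once every 30 frames. A kept buffer grows while the backend is down, so a real implementation may also want an upper bound on its size.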
  • GET /live/{store_id} counts today’s IN minus OUT (clamped at ≥ 0) for the store, over the local day, considering only cameras where is_counting_line is true. That is current occupancy.
  • GET /dashboard/{store_id}?period=day|week|month pulls the ordered event stream for the current window and the matching previous window, then derives KPIs.

The system is layered so each concern can fail or be recomputed without breaking the next:

EVENTS ──(business rules)──▶ SESSIONS ──(rollup)──▶ METRICS
(CV output,                  derive a visit         aggregate to
what's stored)               (entry→exit, dwell)    daily/hourly KPIs
  • Events are the source of truth — the only thing the worker writes and the only footfall data currently persisted.
  • Sessions (a full visit: entry, exit, dwell time) are a Phase 1.5 concept.
  • Metrics (precomputed daily/hourly rollups) are a Phase 2 optimization for when on-the-fly queries get slow.

Today, sessions and metrics are not materialized. Instead services/metrics.py::compute_window() derives them directly from the event stream using Little’s Law: average visit time = area under the occupancy curve ÷ entries. It walks the ordered events, keeping a running occupancy balance (IN minus OUT, clamped at ≥ 0), accumulates the time-weighted area (including the tail from the last event to now for visitors still inside), and buckets entries by hour (for day) or by date (for week/month). It returns total_entries, total_exits, peak_occupancy, avg_visit_seconds, a series, and a peak bucket — plus the same four headline numbers for the previous period for comparison.

All timestamps are stored in UTC. They are converted to the store’s timezone only when rolling up daily/hourly windows and computing peak_hour. services/timewindows.py computes the UTC instant of local midnight (and of the start of the local week/month, and the previous period) so day/week/month filters line up with what the shop owner calls “today”. Getting this wrong would put peak-hour analytics in the wrong bucket.
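Computing the UTC instant of local midnight can be sketched as below. The function names and the choice of "the full previous local day" as the comparison window are assumptions; services/timewindows.py may define its windows differently. The tz argument is any tzinfo, typically a zoneinfo.ZoneInfo built from the store's timezone name.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def local_day_start_utc(now_utc: datetime, tz: tzinfo) -> datetime:
    """UTC instant at which the store's current local day began.

    now_utc must be timezone-aware.
    """
    local_now = now_utc.astimezone(tz)
    local_midnight = datetime.combine(local_now.date(), time.min, tzinfo=tz)
    return local_midnight.astimezone(timezone.utc)


def day_windows_utc(now_utc: datetime, tz: tzinfo):
    """Return ((start, end) for today so far, (start, end) for the previous local day)."""
    start = local_day_start_utc(now_utc, tz)
    # One second before today's start is inside the previous local day,
    # which stays correct on 23- and 25-hour DST days.
    prev_start = local_day_start_utc(start - timedelta(seconds=1), tz)
    return (start, now_utc), (prev_start, start)
```

For a store at UTC+2, an event at 23:30 UTC on 1 June already belongs to 2 June locally, so "today" starts at 22:00 UTC on 1 June. Filtering on the UTC date instead would assign that event to the wrong day and shift the peak-hour bucket by two hours.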

| Concern | Runs at the edge (worker) | Runs in the cloud (backend) |
| --- | --- | --- |
| Video decode, detection, tracking | ✅ YOLO + ByteTrack | |
| Line-crossing logic | LineCounter | |
| Raw frames leave the store | ❌ never (only events + calibration snapshots) | |
| Event persistence | | events table |
| KPI derivation | | compute_window |
| Auth / tenancy / admin | | ✅ |

Performance target: 5–8 FPS per camera (CPU is enough for a single entrance camera), with 5–10 s of acceptable end-to-end latency — the dashboard polls and the edge batches events, so this is near-real-time rather than streaming.