Product overview

Metrica is a real-time retail analytics platform that runs computer vision on a store’s existing CCTV cameras to measure foot traffic, dwell time, and store performance. It is privacy-safe by design — anonymous counts only, no face recognition, no identity storage — and built Tunisia/MENA-first.

Retail owners — especially in Tunisia and similar markets — run on intuition and sales receipts. They have no visibility into the traffic side of the business:

  • Foot traffic — how many people actually enter
  • Peak hours — when the rush really happens
  • Dwell time — how long customers stay
  • Staffing decisions — who to schedule, and when

Sales data tells them what was bought, not how many walked in, browsed, and left without buying. That gap is where staffing waste and missed sales hide.

Metrica turns cameras a store already owns into a traffic sensor. Using computer vision, it:

  • Counts visitors entering and exiting the store
  • Measures dwell time per visit
  • Serves real-time dashboards
  • Generates actionable retail KPIs

Nothing new is installed on the customer’s cameras, and in the MVP the only hardware dependency is a processing box.

The pipeline runs from a camera stream through to a dashboard:

Camera Stream (RTSP / Video)
Frame Processor
YOLO Object Detection (detect people, filter non-humans)
Multi-Object Tracking (ByteTrack / BoT-SORT — stable IDs)
Event Generator (IN / OUT at a virtual line)
Backend API (FastAPI)
Database (PostgreSQL)
Frontend Dashboard (React)

Step by step:

  1. Video input — the CV worker reads an RTSP stream from an IP camera or NVR. It also accepts a recorded video file (MVP fallback, and the demo/sales path).
  2. Detection — a YOLO model detects people frame by frame and filters out non-human objects.
  3. Tracking — each person gets a temporary track ID held stable across frames, bridging short occlusions.
  4. Counting — a virtual line at the entrance detects direction of travel and emits an IN or OUT event.
  5. Dwell — one IN followed by one OUT forms a session; dwell_time = exit_time − entry_time.
  6. Storage & API — anonymous events (entry timestamp, exit timestamp, dwell time, track ID) are sent to a FastAPI backend and stored in PostgreSQL.
  7. Dashboard — a React web app shows live counters and historical charts.
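
Steps 4 and 5 above can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the `Line` type, the `crossing` and `dwell_sessions` helpers, and the choice of which side of the line counts as "inside" are all assumptions made for the example. It also assumes a track ID survives for the whole visit, which a single entrance camera may not guarantee in practice.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Line:
    """Virtual counting line from (x1, y1) to (x2, y2) in pixel coordinates."""
    x1: float
    y1: float
    x2: float
    y2: float


def side(line: Line, point: Point) -> float:
    """Signed side of a point relative to the (infinite) line, via a 2D cross product."""
    x, y = point
    return (line.x2 - line.x1) * (y - line.y1) - (line.y2 - line.y1) * (x - line.x1)


def crossing(line: Line, prev: Point, curr: Point) -> Optional[str]:
    """Return "IN", "OUT", or None for one track moving from prev to curr.

    Assumption: the non-negative side of the line is the store interior.
    """
    before, after = side(line, prev), side(line, curr)
    if before < 0 <= after:
        return "IN"
    if after < 0 <= before:
        return "OUT"
    return None


def dwell_sessions(
    events: Iterable[Tuple[int, str, float]]
) -> List[Tuple[int, float, float, float]]:
    """Pair each IN with the next OUT of the same track ID.

    events: (track_id, kind, timestamp_seconds), sorted by time.
    Returns (track_id, entry_time, exit_time, dwell_time) tuples.
    """
    open_entries = {}
    sessions = []
    for track_id, kind, ts in events:
        if kind == "IN":
            open_entries[track_id] = ts
        elif kind == "OUT" and track_id in open_entries:
            entry = open_entries.pop(track_id)
            sessions.append((track_id, entry, ts, ts - entry))
    return sessions
```

With a horizontal line at y = 100, a track centroid moving from (50, 90) to (50, 110) changes sign and yields an IN event; the reverse movement yields OUT. An OUT with no matching IN is ignored rather than turned into a session.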

Primary users

  • Small and medium retail store owners
  • Local shops in Tunisia
  • Franchise stores

Secondary users

  • Shopping malls
  • Supermarkets
  • Security system integrators

The wedge is deliberately narrow: the ICP is a store owner who wants to know how many people came in and when the rush is — a number they cannot get today. Dwell time, heatmaps, and multi-store rollups are upsell, not the opening pitch.

Tier               Metrics
Real-time          People currently inside the store; current inflow rate
Historical         Daily visitor count; peak traffic hours; average dwell time; hourly heatmaps
Future / advanced  Staff vs customer estimation; conversion rate (via POS integration); returning-visitor tracking
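
As a rough illustration of how the real-time and historical metrics could be derived from IN/OUT events, here is a small sketch. The event shape `(kind, timestamp)` and the function names are assumptions for the example; the actual backend queries are not described in this document.

```python
from collections import Counter
from datetime import datetime
from statistics import mean
from typing import List, Optional, Sequence, Tuple

Event = Tuple[str, datetime]  # ("IN" or "OUT", event timestamp)


def occupancy(events: Sequence[Event]) -> int:
    """People currently inside: INs minus OUTs, never below zero."""
    ins = sum(1 for kind, _ in events if kind == "IN")
    outs = sum(1 for kind, _ in events if kind == "OUT")
    return max(ins - outs, 0)


def visitors_per_hour(events: Sequence[Event]) -> Counter:
    """Visitor count per hour of day, counting IN events only."""
    return Counter(ts.hour for kind, ts in events if kind == "IN")


def peak_hour(events: Sequence[Event]) -> Optional[int]:
    """Hour of day with the most entries, or None when there are no entries."""
    counts = visitors_per_hour(events)
    return max(counts, key=counts.get) if counts else None


def average_dwell_seconds(dwell_times: List[float]) -> Optional[float]:
    """Mean dwell time over completed sessions, or None when there are none."""
    return mean(dwell_times) if dwell_times else None
```

Clamping occupancy at zero is a design choice for the sketch: a missed IN would otherwise push the live counter negative.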

Privacy is a product feature and a trust line in a surveillance-wary market:

  • No face recognition
  • No personal identity storage
  • Only anonymized tracking data — a track ID is temporary and is not a real identity
  • With edge processing, footage never leaves the store; only tiny JSON events go upstream
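
To make the "tiny JSON events" point concrete, the sketch below builds an upstream payload from a whitelist of fields. The field names (`track_id`, `entry_time`, `exit_time`, `dwell_seconds`) are assumptions based on the event contents listed earlier, and `to_payload` is a hypothetical helper, not part of the product's API.

```python
import json

# Assumed whitelist: only these anonymous fields may leave the store.
ALLOWED_FIELDS = frozenset({"track_id", "entry_time", "exit_time", "dwell_seconds"})


def to_payload(record: dict) -> str:
    """Serialize a tracker record to JSON, dropping everything not whitelisted.

    Raises ValueError when a required field is missing, so incomplete
    events are caught on the edge box instead of upstream.
    """
    missing = ALLOWED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    clean = {key: record[key] for key in sorted(ALLOWED_FIELDS)}
    return json.dumps(clean)
```

Anything else the CV worker holds in memory, such as bounding boxes or image crops, is never copied into the payload because it is not on the whitelist.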

In scope

  • Single-camera support
  • YOLO-based person detection
  • Object tracking (ByteTrack / BoT-SORT or equivalent)
  • Entry/exit counting at a virtual line
  • Basic dashboard (live counters + hourly/daily charts)

Out of scope (later phases)

  • Staff recognition AI
  • Multi-camera fusion
  • Face recognition
  • Predictive analytics
  • POS integrations

Non-functional requirements

  • Performance — 5–8 FPS per camera is enough for entrance counting; CPU is sufficient for one camera. (The dashboard-facing spec is deliberately modest; see principles on why chasing 10–15 FPS wastes money.)
  • Latency — 5–10 seconds is acceptable. The dashboard polls REST and the edge batches events over HTTP, so the whole transport layer stays simple request/response.
  • Reliability — recover from camera disconnection and avoid double counting after reconnect (requires a local event queue on the edge box).
  • Scalability — multi-store support is a later phase; the design is horizontal-scaling ready.

Assumptions

  • Cameras are fixed at store entrances.
  • Camera quality and angle are sufficient for person detection.
  • Internet is available, or an edge device can be used.
  • Store owners grant access to the camera stream.
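
The latency and reliability points above mention batching events over HTTP and a local event queue on the edge box. One possible reading is sketched here: events get a deterministic ID so a re-emitted or retried event is not counted twice, and a batch is removed from the queue only after the upload succeeds. `EventQueue`, `event_id`, and the `send` callback are hypothetical names. The queue is in-memory for brevity; a real edge box would presumably persist it to disk.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, List


def event_id(camera_id: str, track_id: int, kind: str, ts: float) -> str:
    """Deterministic ID: the same event always hashes to the same value."""
    raw = f"{camera_id}|{track_id}|{kind}|{ts:.3f}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]


class EventQueue:
    """Local buffer that survives failed uploads and ignores duplicate events."""

    def __init__(self, batch_size: int = 50) -> None:
        self.batch_size = batch_size
        self._pending: "OrderedDict[str, dict]" = OrderedDict()

    def put(self, camera_id: str, track_id: int, kind: str, ts: float) -> None:
        eid = event_id(camera_id, track_id, kind, ts)
        # setdefault keeps the first copy, so a repeated event is a no-op.
        self._pending.setdefault(
            eid, {"id": eid, "track_id": track_id, "kind": kind, "ts": ts}
        )

    def flush(self, send: Callable[[List[dict]], bool]) -> int:
        """Try to upload one batch; return how many events were delivered."""
        batch = list(self._pending.values())[: self.batch_size]
        if not batch or not send(batch):
            return 0  # nothing to send, or upload failed: keep events queued
        for event in batch:
            del self._pending[event["id"]]
        return len(batch)

    def __len__(self) -> int:
        return len(self._pending)
```

Because the ID travels with each event, the backend can also reject an ID it has already stored, which covers the case where an upload succeeded but the acknowledgement was lost.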

The MVP is successful if:

  • 1–3 pilot stores are deployed
  • Accuracy is >85% in a controlled environment
  • The real-time dashboard is operational
  • There is at least 1 paying customer
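
The accuracy criterion above is not defined further in this overview. One plausible way to measure it, assumed here purely for illustration, is to compare the system's entry count against a manual count over the same period: accuracy = 1 − |system − manual| / manual.

```python
def counting_accuracy(system_count: int, manual_count: int) -> float:
    """Accuracy as 1 minus relative count error, floored at 0.

    Assumed definition; manual_count is the human ground-truth tally.
    """
    if manual_count <= 0:
        raise ValueError("manual_count must be positive")
    error = abs(system_count - manual_count) / manual_count
    return max(0.0, 1.0 - error)


def meets_target(system_count: int, manual_count: int, target: float = 0.85) -> bool:
    """True when accuracy is strictly above the target (85% by default)."""
    return counting_accuracy(system_count, manual_count) > target
```

For example, 184 counted entries against 200 manually observed gives an accuracy of about 0.92, which clears the 85% bar. Comparing totals can hide offsetting errors (a missed entry cancelled by a double count), so a stricter check would compare shorter intervals.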

Phase          Focus
Phase 1 (MVP)  Single-store deployment; basic analytics dashboard
Phase 2        Multi-store SaaS platform; edge device support
Phase 3        AI insights (predictive analytics); staff behavior modeling; business recommendations engine

  • Glossary — every technical term used in the system
  • Principles — product principles and the honest constraints and risks