Product principles

The principles below are what Metrica optimizes for. They are paired with the honest constraints and risks surfaced in the pre-build review — the things that decide whether Metrica becomes a business, most of which are not stack decisions at all.

The core cost and install advantage. Metrica runs on CCTV the store already owns, with no new cameras or sensors to install. This is a hard line in the sales pitch: incumbents (RetailNext, V-Count, Xovis) require expensive dedicated sensors; Metrica undercuts them by reusing existing infrastructure.

Anonymous counts only. No face recognition, no personal identity storage, only anonymized tracking data. With edge processing, footage never leaves the shop — only tiny JSON events go upstream. In a surveillance-wary market this privacy line is also a trust line, and it belongs on slide one.

The CV worker runs on a box inside the store. This is a product principle, not just an optimization — it is what makes the unit economics work (see the margin trap below). Video stays local; cloud cost stays roughly flat as stores grow.

4. Near-real-time is good enough — keep the plumbing boring

A latency budget of seconds to minutes lets the whole transport layer shrink to plain HTTP: the dashboard polls a REST endpoint every few seconds, and the edge box batches events into HTTP POST requests. That removes WebSockets and Redis from the MVP.
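
As a rough illustration of the edge side of this transport, the sketch below groups count events into fixed-size batches and serializes each batch as the JSON body of one HTTP POST. The `CountEvent` fields, the `max_size` default, and the `{"events": [...]}` envelope are assumptions made for the example, not Metrica's actual schema, and the HTTP call itself is left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterator


@dataclass
class CountEvent:
    # All field names here are hypothetical.
    event_id: str   # unique per event, so the server can deduplicate
    camera_id: str
    direction: str  # "in" or "out"
    ts: float       # Unix timestamp in seconds


def batches(events: list[CountEvent], max_size: int = 50) -> Iterator[list[CountEvent]]:
    """Yield consecutive slices of at most max_size events."""
    for start in range(0, len(events), max_size):
        yield events[start:start + max_size]


def to_payload(batch: list[CountEvent]) -> bytes:
    """Serialize one batch as a UTF-8 JSON body for a single POST."""
    return json.dumps({"events": [asdict(e) for e in batch]}).encode("utf-8")
```

Each payload is a few kilobytes at most, which is what keeps the upstream traffic small compared with shipping video.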

The wedge is “how many people came in, and when’s your rush.” Don’t pitch AI analytics. Dwell time, heatmaps, and multi-store are upsell later. Narrow wins.

Small retailers may not value data for its own sake. Sell dollars, not dashboards: “you staffed 3 during a dead hour and 1 during your rush — that’s labor cost and missed sales.”
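
To make the "dollars, not dashboards" line concrete, here is a small sketch that turns entry timestamps into hourly counts and then into visitors per staff member. The function names and the staffing input are hypothetical, and hours are bucketed in UTC for simplicity; a real deployment would presumably use the store's local time.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_entries(timestamps: list[float]) -> dict[int, int]:
    """Count entries per hour of day (UTC) from Unix timestamps."""
    counts: Counter = Counter()
    for ts in timestamps:
        counts[datetime.fromtimestamp(ts, tz=timezone.utc).hour] += 1
    return dict(counts)


def visitors_per_staff(entries_by_hour: dict[int, int],
                       staff_by_hour: dict[int, int]) -> dict[int, float]:
    """Visitors handled per staff member, for each staffed hour."""
    return {
        hour: entries_by_hour.get(hour, 0) / staff
        for hour, staff in staff_by_hour.items()
        if staff > 0
    }
```

An hour with 4 entries and 3 staff next to an hour with 30 entries and 1 staff is exactly the mismatch the pitch describes.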

Metrica is a decision tool for store owners, not a consumer app. Simple, understandable dashboards beat feature breadth.

The pre-build review graded the core choices as sound. What follows is the verified verdict, not marketing.

Area | Verdict | Note
CV model | Keep | YOLO11-s + BoT-SORT. Use s, not nano, and quantize.
Backend | Keep | FastAPI → Render. Same language as the CV worker.
Database + Auth | Keep | Neon (DB) + Supabase (Auth). Resolved.
Where CV runs | Change | Edge box, not cloud. Docs that say “on the API host” are wrong.
Real-time layer | Change | Poll + batched POST. No WebSockets, no Redis in the MVP.
#1 product risk | Watch | Camera angle: no stack decision fixes it.

Camera angle is the risk nobody’s docs mention, and no infrastructure decision fixes it.

Mitigations, cheapest first:

  1. Curate pilots to cameras with a sane angle.
  2. Use head / overhead-oriented weights if the angle is steep.
  3. Fine-tune on a few hundred labeled frames of the real store.

That third mitigation is where the “>85% accuracy” claim is actually won — which is why the first real store clip is the most important asset in the project.

The margin trap: cloud CV prices you out of your own market

Edge processing is the answer: because Metrica ships only events, marginal cloud cost per camera is ~$0 and the whole cloud stack runs on free tiers through the first several pilots.

Reliability: the edge box must buffer events

Store internet will drop. Without a local event queue, counts are lost and the “no double-counting after reconnect” requirement breaks. A reconnect-safe local queue on the edge box is the one piece of real-time engineering that matters early.
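
One way such a queue could look is sketched below, using SQLite on the edge box. It assumes every event carries a unique `event_id` and that the cloud side ignores IDs it has already stored; the class names, table layout, and the `send` callback are all hypothetical. Events are marked as sent only after `send` returns, so a dropped connection causes a retry, and the ID check on the receiving side keeps that retry from being counted twice.

```python
import sqlite3


class EdgeQueue:
    """Durable local queue; pass a file path instead of ':memory:' on a real box."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "event_id TEXT PRIMARY KEY, body TEXT NOT NULL, "
            "sent INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def put(self, event_id: str, body: str) -> None:
        # INSERT OR IGNORE keeps a re-enqueued event from being stored twice.
        self.db.execute(
            "INSERT OR IGNORE INTO events (event_id, body) VALUES (?, ?)",
            (event_id, body),
        )
        self.db.commit()

    def pending(self, limit: int = 50) -> list:
        return self.db.execute(
            "SELECT event_id, body FROM events WHERE sent = 0 "
            "ORDER BY rowid LIMIT ?", (limit,)
        ).fetchall()

    def mark_sent(self, event_ids: list) -> None:
        self.db.executemany(
            "UPDATE events SET sent = 1 WHERE event_id = ?",
            [(eid,) for eid in event_ids],
        )
        self.db.commit()


def flush(queue: EdgeQueue, send) -> int:
    """Try to deliver one batch; return how many events were acknowledged."""
    batch = queue.pending()
    if not batch:
        return 0
    try:
        send(batch)  # hypothetical uploader; raises ConnectionError when offline
    except ConnectionError:
        return 0     # nothing is marked sent, so the batch is retried later
    queue.mark_sent([eid for eid, _ in batch])
    return len(batch)


class Ingest:
    """Stand-in for the cloud side: a set makes repeated IDs harmless."""

    def __init__(self):
        self.seen = set()

    def receive(self, batch) -> None:
        for event_id, _body in batch:
            self.seen.add(event_id)
```

The awkward case is a batch that reaches the server while its acknowledgement is lost: the edge box resends it, and only the server-side ID check prevents a double count.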

Don’t over-spec speed — you pay for it per store

Chasing 10–15 FPS directly raises hardware cost on every box shipped, for near-zero gain.

Lever | Naïve / doc | What’s actually needed | Why it’s fine
FPS | 10–15 | 5–8 | A person crosses a doorway over 1–2 s; BoT-SORT bridges frames. 15 FPS is 2× compute for ~0 gain.
Resolution | 1080p | 640px | The entrance is close-range; 640 is YOLO’s native size. Full-res is wasted cycles.
Model | yolo-n, fp32 | yolo-s + INT8 | OpenVINO/TensorRT quantization gives 2–4× speedup, which turns a $400 box into a $150 one.
Latency | <5s target | trivially met | Count is computed on the box instantly; only the edge→cloud push adds sub-second delay.
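
The FPS row can be checked with simple arithmetic. The sketch below assumes inference cost scales linearly with the number of frames processed, which is a simplification, and the function names are illustrative only.

```python
def frames_per_crossing(fps: float, crossing_seconds: float) -> float:
    """How many frames the tracker sees while one person crosses the doorway."""
    return fps * crossing_seconds


def relative_compute(fps: float, baseline_fps: float) -> float:
    """Compute cost relative to a baseline, assuming cost is linear in frame rate."""
    return fps / baseline_fps


# Worst case from the table: 5 FPS and a fast 1 s crossing.
worst_case_frames = frames_per_crossing(5, 1.0)   # 5 frames
# Best case: 8 FPS and a slow 2 s crossing.
best_case_frames = frames_per_crossing(8, 2.0)    # 16 frames
# 15 FPS against a 7.5 FPS midpoint of the 5-8 range.
cost_ratio = relative_compute(15, 7.5)            # 2.0
```

Even the worst case leaves the tracker several observations per crossing, which is the basis for the claim that the extra frames buy little.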

Counting should use a two-line tripwire or thin entry zone, not a single line — it kills the loiter/group failure modes cheaply.
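
A minimal sketch of the two-line idea follows. It assumes the tracker supplies a stable `track_id` and a centroid `y` per frame, that both lines are horizontal in image coordinates, and that `y` grows toward the inside of the store; all names are hypothetical. A track is counted only after it has crossed both lines in order, so someone hovering over a single line is never counted.

```python
from dataclasses import dataclass, field


@dataclass
class TwoLineCounter:
    outer_y: float  # line nearer the street
    inner_y: float  # line nearer the shop floor (inner_y > outer_y assumed)
    entries: int = 0
    exits: int = 0
    _last_y: dict = field(default_factory=dict, init=False)
    _first: dict = field(default_factory=dict, init=False)  # first line a track crossed

    def update(self, track_id: int, y: float) -> None:
        prev = self._last_y.get(track_id)
        self._last_y[track_id] = y
        if prev is None:
            return  # first sighting of this track: nothing to compare yet
        lines = [("outer", self.outer_y), ("inner", self.inner_y)]
        if y < prev:
            lines.reverse()  # moving outward: the inner line is met first
        for name, line_y in lines:
            if (prev < line_y) == (y < line_y):
                continue  # this line was not crossed in this step
            first = self._first.get(track_id)
            if first is None or first == name:
                # Crossing (or re-crossing) one line only arms the track.
                self._first[track_id] = name
            else:
                # The other line was crossed earlier: one complete pass.
                if first == "outer":
                    self.entries += 1
                else:
                    self.exits += 1
                del self._first[track_id]
```

Feeding each tracked centroid through `update` once per processed frame is enough; a fast walker who skips over both lines in a single step is still counted in the right direction because the lines are checked in order of travel.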

Edge hardware: Jetson is the reflex answer and usually wrong

For entrance counting in a price-sensitive market, per-box cost hits margin directly.

Option | ≈ Cost / store | Verdict
Pi 5 + Hailo-8L NPU | $120–200 | Best $/perf. ~13 TOPS at 5 W, which is plenty, twice over. Default choice.
Mini-PC (Intel N100 + OpenVINO) | $150–350 | Familiar, easy remote support.
Jetson Orin Nano | $250–500 | Headroom you won’t use; eats margin.
Store’s own PC | $0 | Support nightmare. Avoid.

Technical

  • Occlusion — people blocking each other
  • ID switching in tracking
  • Poor camera angles / lighting conditions

Business

  • Low awareness of value in the small-retail market — mitigate by tying the number to money
  • Privacy concerns from store owners — mitigate with the “video never leaves the shop” posture
  • Difficulty accessing CCTV systems

MEA retail analytics is ~$1.26B in 2025, growing ~10%/yr, with an explicit trend toward affordable solutions for smaller retailers. Less-developed markets, however, have real connectivity/IT gaps and low data literacy. That shapes a narrow, money-anchored, local-support motion.

Pricing to the customer: a hardware/install fee (covers the box) plus $20–50/store/mo SaaS. The upfront box is the adoption barrier — kill it by renting the box bundled into the subscription, or subsidizing it against a 12-month contract.
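
A back-of-envelope check on the bundling option: the sketch below estimates how many months of subscription it takes to recover the cost of a box. It uses only the per-store hardware range and SaaS range quoted in this document, treats marginal cloud cost as zero by default, and ignores install labor and support; the function name and parameters are illustrative.

```python
def payback_months(box_cost: float, monthly_fee: float,
                   monthly_cloud_cost: float = 0.0) -> float:
    """Months of subscription needed to recover the box cost."""
    margin = monthly_fee - monthly_cloud_cost
    if margin <= 0:
        raise ValueError("monthly fee must exceed monthly cloud cost")
    return box_cost / margin


# Slowest case: the priciest default box with the cheapest plan.
slowest = payback_months(box_cost=200, monthly_fee=20)   # 10.0 months
# Fastest case: the cheapest box with the top plan.
fastest = payback_months(box_cost=120, monthly_fee=50)   # about 2.4 months
```

Under these assumptions, even the slowest case recovers the box inside a 12-month contract, though with little room left for install or support costs.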