Product principles

The principles below are what Metrica optimizes for. They are paired with the honest constraints and risks surfaced in the pre-build review — the things that decide whether Metrica becomes a business, most of which are not stack decisions at all.

Principles

1. Existing cameras, not new sensors

The core cost and install advantage. Metrica runs on CCTV the store already owns — no new hardware on their cameras. This is a hard line in the sales pitch: incumbents (RetailNext, V-Count, Xovis) require expensive dedicated sensors; Metrica undercuts them by reusing existing infrastructure.

2. Privacy by design

Anonymous counts only. No face recognition, no personal identity storage, only anonymized tracking data. With edge processing, footage never leaves the shop — only tiny JSON events go upstream. In a surveillance-wary market this privacy line is also a trust line, and it belongs on slide one.

3. Edge-first, not cloud-centralized

The CV worker runs on a box inside the store. This is a product principle, not just an optimization — it is what makes the unit economics work (see the margin trap below). Video stays local; cloud cost stays roughly flat as stores grow.

4. Near-real-time is good enough — keep the plumbing boring

A latency budget of seconds-to-minutes collapses the entire transport layer. The dashboard polls REST every few seconds and the edge batches events over HTTP POST. That deletes WebSockets and Redis from the MVP.

5. Sell one number, not a “platform”

The wedge is “how many people came in, and when’s your rush.” Don’t pitch AI analytics. Dwell time, heatmaps, and multi-store are upsell later. Narrow wins.

6. Tie the number to money

Small retailers may not value data for its own sake. Sell dollars, not dashboards: “you staffed 3 during a dead hour and 1 during your rush — that’s labor cost and missed sales.”

7. Operational clarity over features

Metrica is a decision tool for store owners, not a consumer app. Simple, understandable dashboards beat feature breadth.

The stack is (mostly) right

The pre-build review graded the core choices as sound. What follows is the verified verdict, not marketing.

Area	Verdict	Note
CV model	Keep	YOLO11-s + BoT-SORT. Use s, not nano, and quantize.
Backend	Keep	FastAPI → Render. Same language as the CV worker.
Database + Auth	Keep	Neon (DB) + Supabase (Auth). Resolved.
Where CV runs	Change	Edge box, not cloud. Docs that say “on the API host” are wrong.
Real-time layer	Change	Poll + batched POST. No WebSockets, no Redis in the MVP.
#1 product risk	Watch	Camera angle — no stack decision fixes it.

Honest constraints and risks

The #1 product risk: camera angle

This is the risk nobody’s docs mention, and no infrastructure decision fixes it.

Mitigations, cheapest first:

Curate pilots to cameras with a sane angle.
Use head / overhead-oriented weights if the angle is steep.
Fine-tune on a few hundred labeled frames of the real store.

That third mitigation is where the “>85% accuracy” claim is actually won — which is why the first real store clip is the most important asset in the project.

The margin trap: cloud CV prices you out of your own market

Edge processing is the answer: because Metrica ships only events, marginal cloud cost per camera is ~$0 and the whole cloud stack runs on free tiers through the first several pilots.

Reliability: the edge box must buffer events

Store internet will drop. Without a local event queue, counts are lost and the “no double-counting after reconnect” requirement breaks. A reconnect-safe local queue on the edge box is the one piece of real-time engineering that matters early.

Don’t over-spec speed — you pay for it per store

Chasing 10–15 FPS directly raises hardware cost on every box shipped, for near-zero gain.

Lever	Naïve / doc	What’s actually needed	Why it’s fine
FPS	10–15	5–8	A person crosses a doorway over 1–2s; BoT-SORT bridges frames. 15 FPS is 2× compute for ~0 gain.
Resolution	1080p	640px	The entrance is close-range; 640 is YOLO’s native size. Full-res is wasted cycles.
Model	yolo-n, fp32	yolo-s + INT8	OpenVINO/TensorRT quantization gives 2–4× speedup — turns a $400 box into a $150 one.
Latency	<5s target	trivially met	Count is computed on the box instantly; only edge→cloud push adds sub-second.

Counting should use a two-line tripwire or thin entry zone, not a single line — it kills the loiter/group failure modes cheaply.

Edge hardware: Jetson is the reflex answer and usually wrong

For entrance counting in a price-sensitive market, per-box cost hits margin directly.

Option	≈ Cost / store	Verdict
Pi 5 + Hailo-8 NPU	$120–200	Best $/perf. ~13 TOPS at 5 W — plenty, twice over. Default choice.
Mini-PC (Intel N100 + OpenVINO)	$150–350	Familiar, easy remote support.
Jetson Orin Nano	$250–500	Headroom you won’t use; eats margin.
Store’s own PC	$0	Support nightmare. Avoid.

Business & technical risks (from the PRD)

Technical

Occlusion — people blocking each other
ID switching in tracking
Poor camera angles / lighting conditions

Business

Low awareness of value in the small-retail market — mitigate by tying the number to money
Privacy concerns from store owners — mitigate with the “video never leaves the shop” posture
Difficulty accessing CCTV systems

Market context

MEA retail analytics is ~$1.26B in 2025, growing ~10%/yr, with an explicit trend toward affordable solutions for smaller retailers — but real connectivity/IT gaps and low data-literacy in less-developed markets. That shapes a narrow, money-anchored, local-support motion.

Pricing to the customer: a hardware/install fee (covers the box) plus $20–50/store/mo SaaS. The upfront box is the adoption barrier — kill it by renting the box bundled into the subscription, or subsidizing it against a 12-month contract.

Product overview — problem, solution, and MVP scope
Glossary — every technical term used in the system