ProVideo QA Lab · release discipline

CHAOS.

One Go engine, many consumers — vancanalyze grows into the full QA suite, into production probes, into an AI analyst over MCP. Analysis lives once, in the engine. The lab runs on a five-phase release loop, applied to every build.

Legend: shipped · planned (v1.3) · AI analyst · external hardware
The discipline — five phases, run as a loop on every release
C
Capture
Every customer game is a regression asset. Session bundles record manifest · segments · marks · events · network pcap · SNMP snapshot. Real games become permanent tests.
Ph 37
H
Harness
Three-tier client farm — real iPads · simulators · headless Go workers — across a two-segment stadium-realistic network. A scripted fault catalog drives the impairment matrix. The environment we test in.
Ph 34 · 36 · 39
A
Assert
Recovery-SLA bounds · VMAF thresholds · frame-loss limits · A/V-sync drift · clip-timing accuracy. Quantified pass/fail — the build fails the gate if any assertion breaches. A bar, not a dashboard.
Ph 38 · 40 · 41
O
Observe
Six correlated layers — Prometheus · JSONL · OpenTelemetry · pcap · VMAF · scenario reports — on one NTP/PTP-synced timeline. Not "what broke?" but "which layer broke first?"
Ph 35
S
Ship
The loop's purpose. CHAOS exists to enable releases, not block them. A test that doesn't tell you whether you can ship isn't a test — it's a number on a dashboard.
Ph 41 · 42
↻ Run on every release — Capture → Harness → Assert → Observe → Ship — and shipped games re-enter as new Capture assets, so the corpus grows with every game. Naming: CHAOS (the discipline) contains ChaosService (the Ph 36 netem / fault-injection service powering Harness).
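The Assert phase above reduces to a gate function: every assertion is a measured value checked against a bound, and one breach fails the build. The following is a minimal sketch under that reading. The `Assertion` type, the metric names, and every number are illustrative assumptions, not the lab's actual thresholds or API.

```go
package main

import "fmt"

// Assertion is one quantified bound, e.g. "VMAF must be at least 90".
// Hypothetical shape; names and numbers are illustrative only.
type Assertion struct {
	Name     string
	Measured float64
	Bound    float64
	AtLeast  bool // true: Measured >= Bound passes; false: Measured <= Bound passes
}

// Holds reports whether the measured value is within the bound.
func (a Assertion) Holds() bool {
	if a.AtLeast {
		return a.Measured >= a.Bound
	}
	return a.Measured <= a.Bound
}

// Gate returns the names of breached assertions. The build passes only
// when the returned slice is empty.
func Gate(as []Assertion) []string {
	var breached []string
	for _, a := range as {
		if !a.Holds() {
			breached = append(breached, a.Name)
		}
	}
	return breached
}

func main() {
	run := []Assertion{
		{Name: "vmaf_mean", Measured: 93.2, Bound: 90, AtLeast: true},
		{Name: "reconnect_seconds", Measured: 4.8, Bound: 3, AtLeast: false},
		{Name: "av_sync_drift_ms", Measured: 12, Bound: 40, AtLeast: false},
	}
	if breached := Gate(run); len(breached) > 0 {
		fmt.Println("GATE FAIL:", breached)
		return
	}
	fmt.Println("GATE PASS")
}
```

Returning the breached names rather than a bare boolean keeps the verdict explainable, which is the difference between a bar and a dashboard.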
Shipping timeline — v1.3 QA-lab milestone
34 · Harness skeleton + control-plane · shipped
gRPC contract (proto/qa/v1) + buf breaking gate · ScenarioService YAML-DAG runner · qa-client 500-client headless Go fleet · known-bad corpus.
35 · Observability + field-test readiness · shipped
Six-layer correlated telemetry on a <1 ms NTP-synced timeline; snmp_exporter; field-kit provisioning. Observe pillar.
36 · ChaosService + Pi netem bridge · shipped
gemodel netem profiles · per-direction IFB · NET-03 NIC gate · CBS350 SPAN/port-toggle · fault catalog · runner-enforced idempotent lab reset. Hardware proofs deferred to Ph 41 HIL.
▸ now — Phase 37 next
37 · Session bundle + ReportService · planned
Portable per-game bundle (manifest + SHAs · segments · marks · ws-events · pcap) — the Capture keystone + replay's input.
38 · Level-1 metadata replay · planned
Deterministic monotonic-T0 replay through real APIs; 20+ run flake-hunt. The replay keystone.
39 · iPad QA probe + DeviceControl · planned
ProVideoQAClient (grpc-swift) · DEBUG+admin QA menu · standalone Probe app · Osprey Talon4k control.
40 · Level-2 SRT replay + VMAF · planned
SRT re-feed through ingest; SHA-256 passthrough / vmaf_v0.6.1neg scoring; per-team report card.
41 · Release gate + CI + scenario suite · planned
8-category suite · hosted-smoke vs self-hosted-HIL CI · soak · MCP capstone. The Ship gate.
42 · QA reporting (TestRail + Jira) · planned
Thin reporters off the verdict stream: TestRail run/result push + Jira auto-defect with dedup lifecycle. Engine stays tool-agnostic.
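The Phase 37 bundle is described as a manifest plus SHAs over the recorded artifacts. A minimal sketch of that idea follows, assuming a JSON manifest with one SHA-256 per file. The `Manifest` and `Entry` shapes, the field names, and the file names are hypothetical, and the files are held in memory so the example runs without a filesystem.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// Entry and Manifest are hypothetical shapes, not the real bundle schema.
type Entry struct {
	Path   string `json:"path"`
	SHA256 string `json:"sha256"`
	Bytes  int    `json:"bytes"`
}

type Manifest struct {
	GameID  string  `json:"game_id"`
	Entries []Entry `json:"entries"`
}

// BuildManifest hashes every file and sorts entries by path so the
// manifest is byte-for-byte reproducible for the same inputs.
func BuildManifest(gameID string, files map[string][]byte) Manifest {
	m := Manifest{GameID: gameID}
	for path, data := range files {
		sum := sha256.Sum256(data)
		m.Entries = append(m.Entries, Entry{
			Path: path, SHA256: hex.EncodeToString(sum[:]), Bytes: len(data),
		})
	}
	sort.Slice(m.Entries, func(i, j int) bool { return m.Entries[i].Path < m.Entries[j].Path })
	return m
}

// Verify returns the paths that are missing or whose content no longer
// matches the recorded hash.
func Verify(m Manifest, files map[string][]byte) []string {
	var bad []string
	for _, e := range m.Entries {
		data, ok := files[e.Path]
		sum := sha256.Sum256(data)
		if !ok || hex.EncodeToString(sum[:]) != e.SHA256 {
			bad = append(bad, e.Path)
		}
	}
	return bad
}

func main() {
	files := map[string][]byte{
		"marks.jsonl":     []byte(`{"t":0,"mark":"kickoff"}`),
		"ws-events.jsonl": []byte(`{"t":12,"event":"clip_created"}`),
	}
	m := BuildManifest("game-001", files)
	out, _ := json.MarshalIndent(m, "", "  ")
	fmt.Println(string(out))

	files["marks.jsonl"] = []byte("tampered")
	fmt.Println("mismatched:", Verify(m, files))
}
```

Checking hashes before replay is what lets a bundle serve as a permanent regression asset: a replay that runs against silently altered input proves nothing.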
One engine, many consumers — none re-analyze
① Consumers — humans via gRPC · AI analyst via MCP
shipped
WinUI 3 · Windows
VancUi (27.3); extend with QA panels.
planned
SwiftUI · macOS
grpc-swift; EmbeddedServerManager spawn.
optional
Web / Wails
zero-install cross-platform shell.
capstone
AI analyst
Watches the event stream, explains anomalies, auto-investigates via MCP tools, recommends gated actions. Above the data plane, never in it.
🔁 Flywheel: the deterministic engine catches the known → the agent reasons over the unknown residue → its diagnosis becomes a new fingerprint the engine catches deterministically next time. Replay (Ph 38) trains/validates the agent on real-game bundles offline before it's trusted live.
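The flywheel can be sketched as a dictionary lookup with a feedback path: known signatures are diagnosed deterministically, unknown ones form the residue, and a diagnosis for the residue is written back as a new fingerprint. This is an illustration only. The `Event` and `Dictionary` types and the signature strings are assumptions, and the agent is replaced by a stub.

```go
package main

import "fmt"

// Event is a hypothetical flagged event; Signature stands in for whatever
// normalized key the real fingerprint dictionary uses.
type Event struct {
	Signature string
	Detail    string
}

// Dictionary maps a known signature to its diagnosis.
type Dictionary struct {
	known map[string]string
}

func NewDictionary() *Dictionary { return &Dictionary{known: map[string]string{}} }

// Learn records a fingerprint so the next run catches it deterministically.
func (d *Dictionary) Learn(signature, diagnosis string) { d.known[signature] = diagnosis }

// Triage splits events into those the dictionary can diagnose and the
// unknown residue left for the agent.
func (d *Dictionary) Triage(events []Event) (diagnosed map[string]string, residue []Event) {
	diagnosed = map[string]string{}
	for _, e := range events {
		if diag, ok := d.known[e.Signature]; ok {
			diagnosed[e.Signature] = diag
		} else {
			residue = append(residue, e)
		}
	}
	return diagnosed, residue
}

func main() {
	dict := NewDictionary()
	dict.Learn("srt.reconnect.storm", "upstream link flap")

	events := []Event{
		{Signature: "srt.reconnect.storm", Detail: "14 reconnects in 10 s"},
		{Signature: "vanc.gap.after.splice", Detail: "no VANC for 42 frames"},
	}

	_, residue := dict.Triage(events)
	fmt.Println("residue before learning:", len(residue))

	// Stub for the agent: in the described design a diagnosis would be
	// produced over MCP tools and validated on replay before it lands here.
	for _, e := range residue {
		dict.Learn(e.Signature, "agent diagnosis: "+e.Detail)
	}

	_, residue = dict.Triage(events)
	fmt.Println("residue after learning:", len(residue))
}
```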
▲ gRPC (shells) · MCP (agent) ▼
② API surfaces — one engine, two contracts
gRPC · proto/qa/v1 — for shells
VancService · ScenarioService · TelemetryService · ChaosService · DeviceControlService · ReplayService · ReportService
MCP surface — for the agent (atop existing video-mcp)
query_events · pull_telemetry_window · run_tshark_extract · fetch / replay_bundle · lookup_fingerprint · propose_action (gated)
The same investigate-loop a human runs, handed to the agent as tools. Orange chips = shipped services atop VancService.
▲ in-process ▼
③ Go engine — vanc-server + pkg/vancanalyze + QA packages
shipped
Analysis core
vancanalyze — accumulator · triage · events · fingerprint dictionary · report. Source plugins: decklink · file · udp. Agent-authored fingerprints land here, growing the deterministic layer.
34 · 35 · 36
Orchestrator · Telemetry · Chaos
Scenario → timed steps → pass/fail · encoder+capture+net+analysis correlated to one timeline · netem (tc) + Cisco port-toggle (SNMP/SSH) + process-kill.
37–40
Replay · Reporter · Device control
Record / deterministic playback · results + VMAF + bundle export + recovery-SLA assertions · Osprey REST as test steps.
40
Quality + ref decoder
Decode self-describing markers (frame# / QR / LTC) → frame-accuracy · A/V-sync ms · clip-trim; freeze / black / macroblock; LUFS / dropout / channel-map; multi-cam drift.
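Once the reference decoder has recovered a frame number from each captured frame, frame accuracy becomes arithmetic on that sequence. A minimal sketch follows, assuming frame numbers never wrap and that every number between the lowest and highest observed should have arrived exactly once. The `Stats` type and the counting rules are illustrative, not the engine's definitions.

```go
package main

import "fmt"

// Stats summarizes a decoded frame-number sequence. Hypothetical shape.
type Stats struct {
	Lost       int // numbers inside the observed range that never arrived
	Duplicated int // arrivals of a number already seen
	Reordered  int // first arrivals that came after a higher number
}

// Analyze walks the frame numbers in arrival order.
// Assumption: numbers do not wrap around within one capture.
func Analyze(frames []uint64) Stats {
	var s Stats
	if len(frames) == 0 {
		return s
	}
	seen := make(map[uint64]bool, len(frames))
	lo, hi := frames[0], frames[0]
	for _, n := range frames {
		switch {
		case seen[n]:
			s.Duplicated++
		case n < hi:
			s.Reordered++
		}
		seen[n] = true
		if n < lo {
			lo = n
		}
		if n > hi {
			hi = n
		}
	}
	// Every number in [lo, hi] was expected once; whatever is absent is lost.
	s.Lost = int(hi-lo+1) - len(seen)
	return s
}

func main() {
	// Frame 3 repeats, frame 4 arrives late, frames 6 and 7 never arrive.
	fmt.Printf("%+v\n", Analyze([]uint64{1, 2, 3, 3, 5, 4, 8}))
}
```

Frames dropped before the first or after the last observed number are invisible to this check. Catching those needs the expected start and end from the scenario, which is outside the scope of the sketch.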
▲ drive / capture ▼
④ Sources · devices · network — lab (scripted) + production (passive)
hw
AJA Corvid 44
SDI playout w/ VANC + capture.
go
talon-sim / aja-vanc-source
synthetic SEI/VANC feeds.
★ built
QA reference source
Self-describing signal: burned-in frame#/clock + 1 Hz flash↔beep now; QR-luma index + LTC next. Synthetic OR real-footage overlay.
hw
Osprey Talon4k
API-controlled; SRT/UDP out.
hw
Pi netem bridge
latency / loss / jitter, per-direction.
hw
Cisco CBS350 ×2
port up/down + SPAN/mirror; two-segment topology.
prod
Pi field probe
Real-game passive capture on SPAN/tap → Telemetry correlator. Wired analog of the Ph 39 iPad probe.
Outputs / artifacts
Session bundle
pcapng + correlated metrics + events + report — the portable record of a run or a real game.
→ Wireshark
manual deep-dive on the same packets; engine hands you the window + a ready-made display filter. Downstream consumer, never wired in.
→ Replay
deterministic playback; trains/validates the agent + regression-tests fingerprints offline (Ph 38 keystone).
→ TestRail / Jira
thin reporters off the verdict stream (scenario_run_id · PASS/FAIL · evidence): TestRail run/result push + Jira auto-defect with dedup lifecycle. Engine stays tool-agnostic. (Ph 42)
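The "auto-defect with dedup lifecycle" idea can be sketched as a small state machine over the verdict stream: the first failure opens a defect, repeat failures update it, and a later pass resolves it. The source does not spell out the lifecycle, so this is one plausible reading. The `Verdict` fields beyond scenario_run_id, PASS/FAIL, and evidence, the choice of scenario name as dedup key, and the `Action` values are all assumptions, and no real TestRail or Jira call is made.

```go
package main

import "fmt"

// Verdict mirrors the stream fields named above; Scenario is an assumed
// extra field used as the dedup key.
type Verdict struct {
	ScenarioRunID string
	Scenario      string
	Pass          bool
	Evidence      string
}

// Action is what a thin reporter would ask the tracker to do.
type Action string

const (
	Open    Action = "open"    // first failure: file a new defect
	Update  Action = "update"  // repeat failure: add an occurrence, no duplicate
	Resolve Action = "resolve" // pass after failure: close the open defect
	None    Action = "none"    // pass with nothing open
)

// Deduper tracks open defects by scenario, with an occurrence count.
type Deduper struct {
	open map[string]int
}

func NewDeduper() *Deduper { return &Deduper{open: map[string]int{}} }

// Handle decides the tracker action for one verdict.
func (d *Deduper) Handle(v Verdict) Action {
	n, isOpen := d.open[v.Scenario]
	switch {
	case !v.Pass && !isOpen:
		d.open[v.Scenario] = 1
		return Open
	case !v.Pass && isOpen:
		d.open[v.Scenario] = n + 1
		return Update
	case v.Pass && isOpen:
		delete(d.open, v.Scenario)
		return Resolve
	default:
		return None
	}
}

func main() {
	d := NewDeduper()
	stream := []Verdict{
		{ScenarioRunID: "run-1", Scenario: "wifi-roam", Pass: false, Evidence: "bundle-1"},
		{ScenarioRunID: "run-2", Scenario: "wifi-roam", Pass: false, Evidence: "bundle-2"},
		{ScenarioRunID: "run-3", Scenario: "wifi-roam", Pass: true},
	}
	for _, v := range stream {
		fmt.Println(v.ScenarioRunID, "->", d.Handle(v))
	}
}
```

Because the reporter only consumes verdicts and emits actions, the engine stays unaware of which tracker sits behind it.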
Quality asserted — "was the output good?", not just "did the pipe stay up?"
Video integrity
freeze · black · macroblock (no-reference)
Audio
LUFS loudness · dropouts/silence · channel-map
A/V sync
flash↔beep + TC↔LTC, in ms
Multi-cam sync
cross-camera frame-offset under stress
Frame / clip accuracy
frames lost/dup/reordered · clip-trim offset
Recovery SLAs
reconnect · clip-gap · time-to-slate — bounded + asserted
Trust layer: golden-reference ground truth · "test the tester" (inject a known fault, assert it's detected) · game-length soak · trend tracking across runs · CI tiering (software gate vs hardware-in-the-loop).
Capability → engine service → phase
Capability | Engine service / package | Phase
VANC / trigger analyze | VancService · pkg/vancanalyze | 27.x ✓
Scenario control-plane (run / headless) | ScenarioService + Orchestrator | 34 ✓
Observability / correlated metrics | TelemetryService + correlator | 35 ✓
Network impairment + packet forensics | ChaosService (netem / tshark) | 36 ✓
Session bundle export (pcapng + report) | ReplayService (record) + ReportService | 37
Metadata replay (keystone) | ReplayService (playback) | 38
iPad QA probe | probe → TelemetryService source | 39
SRT video replay + quality score | ReplayService + ReportService (VMAF) | 40
Release-gate CI scenario suite | ScenarioService (headless / CI) | 41
QA reporting / defect integration | TestRail + Jira reporters off the verdict stream | 42
Encoder control (Osprey) | DeviceControlService · osprey-ctl | feeds scenarios
On-the-fly AI analysis / diagnosis | Agent over MCP (events + tools) | capstone
Self-describing reference signal | QA reference source + Quality/ref decoder | ★ built
A run / a real game = one correlated timeline
configure encoder · DeviceControl
start capture + analyze
signal / live feed · AJA / probe
fault @ T1 · injected (lab) / real (game)
⚡ agent flags + diagnoses
recover @ T2
stop → report · pass / fail
encoder counters + capture metrics + network/pcap events + VANC analysis, time-aligned. Engine auto-flags known patterns; the agent reasons over the rest in seconds.
Differentiated value = combination + correlation + repeatability, plus an AI analyst that turns flagged events into plain-language diagnoses. It does not come from controlling any single box. The discipline that keeps it sound: analysis lives once, in the engine; the deterministic core stands alone (the agent is an enhancement, never a dependency, which matters when the stadium network is the thing failing); the agent is advisory by default when live, and autonomous only for proven low-risk actions; Wireshark and the Pi probes consume the engine's artifacts and are not wired into it.
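The advisory-by-default rule can be sketched as an allowlist check in front of every agent proposal: anything not proven low-risk is routed to a human. This illustrates the policy only. The `Proposal` and `Policy` types and the action names are hypothetical, and how an action earns its place on the allowlist is left open.

```go
package main

import "fmt"

// Proposal is what the agent submits through a gated tool call.
// Hypothetical shape.
type Proposal struct {
	Action string
	Reason string
}

// Decision is the gate's answer.
type Decision string

const (
	Execute       Decision = "execute"        // proven low-risk: may run autonomously
	NeedsApproval Decision = "needs_approval" // default: advisory, a human decides
)

// Policy holds the allowlist of actions proven to be low-risk.
type Policy struct {
	LowRisk map[string]bool
}

// Decide is deliberately default-deny: unknown actions are never executed.
func (p Policy) Decide(pr Proposal) Decision {
	if p.LowRisk[pr.Action] {
		return Execute
	}
	return NeedsApproval
}

func main() {
	policy := Policy{LowRisk: map[string]bool{"collect_pcap_window": true}}

	proposals := []Proposal{
		{Action: "collect_pcap_window", Reason: "retransmit burst at T1"},
		{Action: "toggle_switch_port", Reason: "suspected flapping uplink"},
	}
	for _, pr := range proposals {
		fmt.Printf("%-22s -> %s\n", pr.Action, policy.Decide(pr))
	}
}
```

Since the gate sits outside the agent, the deterministic core keeps working unchanged when the agent is unavailable.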