Control Surface Guide

Guide Maps

graph LR
  params["params.yaml"] --> split["split"]
  params --> training["training"]
  params --> decision["decision"]
  split --> compare["Comparable review"]
  training --> compare
  decision --> compare

flowchart LR
  question["What changed and is it still comparable?"] --> params["Read the control surface"]
  params --> metrics["Compare metrics with the declared change"]
  metrics --> publish["Escalate to publish review only if the comparison stays meaningful"]

Use this guide when params.yaml and metrics.json are visible but what they mean for review is still implicit. The goal is to make the capstone's declared control surface and its comparable outcomes explicit.

Declared control surface

| Section | Meaning | Why it matters |
| --- | --- | --- |
| split | deterministic train/eval partition rules | keeps evaluation rows comparable over time |
| training | optimization behavior for the reference model | explains why the learned scoring function changed |
| decision | threshold for converting probabilities into escalated predictions | explains why promoted predictions and confusion totals changed |

Review rules

  • compare metrics only after naming the parameter change that produced them
  • treat threshold changes as decision-policy changes, not model-quality changes alone
  • treat split changes as population changes that can weaken baseline comparability
  • treat training changes as model-behavior changes that still need release-boundary evidence before promotion
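The first rule, naming the parameter change before comparing metrics, can be sketched as a small diff over two parsed params.yaml files. This is a minimal illustration, not the capstone's own tooling: only the section names split, training, and decision come from this guide, while the keys inside each section (seed, eval_fraction, learning_rate, threshold), the values, and the function names are hypothetical.

```python
# Review meaning of each declared section, paraphrasing the review rules above.
SECTION_MEANING = {
    "split": "population change: baseline comparability may be weakened",
    "training": "model-behavior change: needs release-boundary evidence before promotion",
    "decision": "decision-policy change: not a model-quality change alone",
}


def flatten(params, prefix=""):
    """Flatten nested dicts into {"section.key": value} pairs."""
    flat = {}
    for key, value in params.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def name_changes(baseline, candidate):
    """Return (path, old, new, meaning) for each parameter that differs.

    A key that is missing on one side is reported with the value None.
    """
    old, new = flatten(baseline), flatten(candidate)
    changes = []
    for path in sorted(old.keys() | new.keys()):
        if old.get(path) != new.get(path):
            section = path.split(".", 1)[0]
            meaning = SECTION_MEANING.get(section, "undeclared section: review manually")
            changes.append((path, old.get(path), new.get(path), meaning))
    return changes


# Hypothetical parameter sets; in practice these would be parsed from params.yaml.
baseline = {
    "split": {"seed": 7, "eval_fraction": 0.2},
    "training": {"learning_rate": 0.1},
    "decision": {"threshold": 0.5},
}
candidate = {
    "split": {"seed": 7, "eval_fraction": 0.2},
    "training": {"learning_rate": 0.1},
    "decision": {"threshold": 0.6},
}

for path, before, after, meaning in name_changes(baseline, candidate):
    print(f"{path}: {before} -> {after} ({meaning})")
```

An empty result means no declared parameter changed, so any metric difference needs another explanation before the comparison is treated as meaningful.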

Run make threshold-review when the decision threshold is the main point of contention and you want to inspect borderline promoted predictions rather than aggregate metrics alone.
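To make "borderline" concrete, the sketch below lists the rows whose decision flips when the threshold moves. It does not describe what make threshold-review actually does. The function name, the scores, and the convention that a probability at or above the threshold counts as promoted are all assumptions made for illustration.

```python
def borderline_rows(probabilities, old_threshold, new_threshold):
    """Return (row index, probability, old decision, new decision) for rows that flip.

    Assumes a row is promoted when its probability is >= the threshold.
    """
    flipped = []
    for index, p in enumerate(probabilities):
        was_promoted = p >= old_threshold
        now_promoted = p >= new_threshold
        if was_promoted != now_promoted:
            flipped.append((index, p, was_promoted, now_promoted))
    return flipped


# Hypothetical evaluation scores from an unchanged model.
scores = [0.12, 0.48, 0.55, 0.61, 0.93]

for index, p, was, now in borderline_rows(scores, old_threshold=0.5, new_threshold=0.6):
    print(f"row {index}: p={p} promoted {was} -> {now}")
```

Because the scores are identical on both sides, every flipped row here is attributable to the decision policy rather than to the model, which is the distinction the review rules ask for.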

Best companion guides

  • read EXPERIMENT_GUIDE.md when the next question is how to run and compare a changed candidate
  • read RELEASE_REVIEW_GUIDE.md when the next question is whether a changed candidate deserves downstream trust
  • read PUBLISH_CONTRACT.md when the next question is which control surfaces are preserved in publish/v1/