Proof Routes, Selftests, and Clean-Room Confirmation

Production confidence should not come from one feeling:

it worked on my machine once.

Module 03 replaces that feeling with proof routes. The idea is simple:

  • choose the smallest honest command for the question
  • keep stronger routes available when the claim grows
  • let the repository explain why each route exists

That is how operational trust becomes teachable.

Not every question needs the strongest proof

A common bad habit is to jump straight to the heaviest confirmation command for every question.

That wastes time and teaches the wrong lesson. The real skill is proportion:

  • dry-run for planning questions
  • profile comparison for policy questions
  • executed verification for contract questions
  • clean-room confirmation for stewardship questions

This module is about learning that ladder, not only memorizing the top rung.
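
The ladder above can be sketched as a small lookup. This is only an illustration: the `Route` enum, the question labels, and the `routes_for` helper are hypothetical names, not part of the capstone.

```python
from enum import IntEnum


class Route(IntEnum):
    """Proof routes ordered from lightest to strongest (hypothetical names)."""
    DRY_RUN = 1
    PROFILE_AUDIT = 2
    EXECUTED_VERIFICATION = 3
    CLEAN_ROOM = 4


# Assumed question labels, mirroring the four bullets above.
ROUTE_FOR_QUESTION = {
    "planning": Route.DRY_RUN,
    "policy": Route.PROFILE_AUDIT,
    "contract": Route.EXECUTED_VERIFICATION,
    "stewardship": Route.CLEAN_ROOM,
}


def routes_for(question_kinds):
    """Return one route per distinct question, lightest first."""
    if not question_kinds:
        raise ValueError("name the question before picking the route")
    return sorted({ROUTE_FOR_QUESTION[kind] for kind in question_kinds})
```

Returning one route per question, rather than only the strongest, keeps each claim tied to the route that proves it.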

A practical proof ladder for Module 03

1. Dry-run

Use this when the question is:

  • what would run
  • how does one profile differ from another at planning time
  • has the workflow meaning stayed stable at the DAG level

Typical commands:

snakemake --profile profiles/local -n
snakemake --profile profiles/ci -n
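
To compare what two profiles would plan, one option is to capture both dry-run outputs and diff them. The sketch below assumes `snakemake` is on `PATH` and that the profile directories exist. The helper names are hypothetical. Dry-run output may include run-specific lines, so treat the diff as a starting point for review rather than a verdict.

```python
import difflib
import subprocess


def dry_run_command(profile_dir):
    """Build the dry-run command for one profile directory."""
    return ["snakemake", "--profile", profile_dir, "-n"]


def run_dry_run(profile_dir):
    """Run the dry-run and return (exit_code, stdout). Needs snakemake installed."""
    result = subprocess.run(
        dry_run_command(profile_dir), capture_output=True, text=True
    )
    return result.returncode, result.stdout


def diff_plans(local_text, ci_text):
    """Return unified-diff lines between two captured dry-run outputs."""
    return list(
        difflib.unified_diff(
            local_text.splitlines(),
            ci_text.splitlines(),
            fromfile="profiles/local",
            tofile="profiles/ci",
            lineterm="",
        )
    )
```

An empty list from `diff_plans` means the two captured plans are textually identical.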

2. Profile audit

Use this when the question is:

  • which settings differ across local, CI, and scheduler contexts
  • are those differences intended policy, or semantic drift

Typical command:

make profile-audit

This is a human-review route, not just a run route.
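
What `make profile-audit` actually produces is not shown here. As a rough sketch of the idea, the function below compares profile settings and separates differences a reviewer has already accepted as policy from those that still need a human look. The `POLICY_KEYS` set and the function name are assumptions for illustration. Profiles are passed as plain dictionaries to keep the example dependency-free.

```python
# Assumed allow-list: settings a reviewer has agreed may vary by context.
POLICY_KEYS = {"cores", "jobs", "latency-wait", "printshellcmds"}


def audit_profiles(profiles):
    """Split differing settings into (policy, needs_review).

    `profiles` maps a context name (for example "local") to its settings dict.
    Each result maps a differing key to {context: value}.
    """
    all_keys = set()
    for settings in profiles.values():
        all_keys.update(settings)

    policy, needs_review = {}, {}
    for key in sorted(all_keys):
        values = {name: settings.get(key) for name, settings in profiles.items()}
        if len({repr(value) for value in values.values()}) <= 1:
            continue  # identical everywhere: nothing to audit
        bucket = policy if key in POLICY_KEYS else needs_review
        bucket[key] = values
    return policy, needs_review
```

Anything that lands in `needs_review` is a candidate for semantic drift and should be explained in writing, not waved through.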

3. Executed verification

Use this when the question is:

  • does the workflow execute and leave behind the expected evidence
  • are the published artifacts complete and aligned

Typical commands in the capstone:

make verify
make verify-report
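
The internals of the capstone's verify targets are not shown on this page. A minimal sketch of the underlying idea, assuming a hypothetical manifest that maps each published file to its expected SHA-256 digest, might look like this:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify_published(publish_dir, manifest):
    """Check a publish directory against a manifest (assumed format).

    `manifest` maps a relative POSIX path to its expected SHA-256 digest.
    Returns a sorted list of problems; an empty list means aligned.
    """
    publish_dir = Path(publish_dir)
    problems = []
    for rel_path, expected in manifest.items():
        target = publish_dir / rel_path
        if not target.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_of(target) != expected:
            problems.append(f"checksum mismatch: {rel_path}")
    # Files on disk that the manifest does not mention are also a finding.
    for found in publish_dir.rglob("*"):
        rel = found.relative_to(publish_dir).as_posix()
        if found.is_file() and rel not in manifest:
            problems.append(f"unlisted: {rel}")
    return sorted(problems)
```

The check covers both directions: every listed artifact must exist with the expected content (complete), and nothing unlisted may sit in the publish directory (aligned).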

4. Clean-room confirmation

Use this when the question is:

  • can the repository still prove itself through its strongest built-in route
  • would another maintainer trust the workflow as a working specimen

Typical command:

make confirm

This is not the first proof route. It is the strongest one.
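
What `make confirm` does internally is not described on this page. One common reading of "clean room" is to run the proof from a fresh copy that carries no leftover local state. The sketch below illustrates that reading only. The ignore patterns and the function name are assumptions, and the capstone's target may work differently.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Assumed list of local state that must not leak into the clean copy.
STALE_STATE = shutil.ignore_patterns(".snakemake", "results", "__pycache__")


def run_in_clean_copy(repo_dir, command):
    """Copy the repository to a fresh temp directory and run `command` there.

    Returns (exit_code, stdout). The copy is deleted afterwards.
    """
    with tempfile.TemporaryDirectory() as tmp:
        clean = Path(tmp) / "repo"
        shutil.copytree(repo_dir, clean, ignore=STALE_STATE)
        result = subprocess.run(
            command, cwd=clean, capture_output=True, text=True
        )
        return result.returncode, result.stdout
```

A call such as `run_in_clean_copy(".", ["make", "confirm"])` would then exercise the strongest route without help from cached outputs in the working tree.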

Why selftests matter

A selftest is the repository testing its own workflow contract, not only the scientific or analytical code inside one rule.

Good selftests usually check things like:

  • stable outputs across different core counts
  • workflow lint cleanliness
  • drift detection surfaces
  • basic contract alignment of published artifacts

That makes selftest a workflow-level proof surface.

The capstone selftest teaches the right habit

The capstone selftest compares a normalized published summary across different core counts.

That is valuable because it asks a real production question:

does the workflow keep the same meaning when the execution context changes in an allowed way?

That is stronger than merely asking whether both runs exited successfully.
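
A comparison of this kind can be sketched as follows. The capstone's real normalization rules are not shown here, so the `VOLATILE_KEYS` set and both function names are assumptions. The idea is to strip fields that are allowed to vary with execution context and compare what remains.

```python
import json

# Assumed fields that may legitimately differ between runs.
VOLATILE_KEYS = {"timestamp", "runtime_seconds", "cores", "hostname"}


def normalize(summary):
    """Drop volatile fields at every nesting level, then serialize with sorted keys."""
    def strip(value):
        if isinstance(value, dict):
            return {k: strip(v) for k, v in value.items() if k not in VOLATILE_KEYS}
        if isinstance(value, list):
            return [strip(item) for item in value]
        return value

    return json.dumps(strip(summary), sort_keys=True)


def same_meaning(summary_a, summary_b):
    """True when two published summaries agree after normalization."""
    return normalize(summary_a) == normalize(summary_b)
```

Two runs with different core counts pass only if everything outside the volatile fields is identical, which is a stricter test than two zero exit codes.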

One helpful mental model

flowchart TD
  question["What are you trying to prove?"] --> dryrun["dry-run or profile diff"]
  dryrun --> audit["profile audit"]
  audit --> verify["executed verification"]
  verify --> confirm["clean-room confirmation"]

The direction here is important:

  • move to a stronger route only when the claim changes
  • do not use the strongest route as a substitute for understanding the smaller one

Common proof-route mistakes

| Mistake | Why it hurts | Better repair |
| --- | --- | --- |
| using confirm for every small question | slows review and hides which claim is actually under test | start with the narrowest honest route |
| trusting dry-run for publish integrity | planning alone cannot prove executed contract surfaces | escalate to executed verification when the claim requires it |
| treating selftest as only a program test | workflow-level drift stays unchecked | include workflow invariants such as determinism or contract alignment |
| collecting evidence without a review question | bundles grow but comprehension stays weak | name the question before picking the route |
| comparing local and CI behavior only by intuition | policy drift becomes anecdotal | package the difference in a profile-audit surface |

The explanation a reviewer trusts

Strong explanation:

dry-run answers the planning question, make profile-audit answers the policy-difference question, make verify answers the executed-contract question, and make confirm answers the strongest clean-room confidence question; each route exists because it proves a different claim.

Weak explanation:

we usually just run the biggest target so we know everything is fine.

The strong version teaches proportion. The weak version teaches ritual.

End-of-page checkpoint

Before leaving this page, you should be able to:

  • name one question that only needs dry-run
  • name one question that needs a profile audit bundle
  • explain why selftests are workflow proofs rather than only program tests
  • describe when clean-room confirmation is proportionate and when it is not