Module 06: Publishing and Downstream Contracts
Page Maps
```mermaid
graph LR
  family["Reproducible Research"]
  program["Deep Dive Snakemake"]
  section["Publishing Downstream Contracts"]
  page["Module 06: Publishing and Downstream Contracts"]
  capstone["Capstone evidence"]
  family --> program --> section --> page
  page -.applies in.-> capstone
```

```mermaid
flowchart LR
  orient["Orient on the page map"] --> read["Read the main claim and examples"]
  read --> inspect["Inspect the related code, proof, or capstone surface"]
  inspect --> verify["Run or review the verification path"]
  verify --> apply["Apply the idea back to the module and capstone"]
```
A workflow run is not automatically a downstream contract.
That distinction matters because a workflow usually produces many files that help it run, debug, or explain itself, but only a smaller set of files should be safe for another human, notebook, or pipeline to trust as published outputs.
This module is about drawing that line on purpose.
You will learn how to:
- separate internal workflow state from public deliverables
- version a publish boundary so downstream expectations stay stable
- use manifests, checksums, reports, and provenance without confusing their jobs
- review publish drift before downstream trust is damaged
The capstone corroboration surface for this module is the versioned bundle under
`publish/v1/`, especially `summary.json`, `summary.tsv`, `report/index.html`,
`manifest.json`, `provenance.json`, and `discovered_samples.json`.
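
To make the role of a manifest concrete, here is a minimal sketch of how one could be generated for a bundle directory. It records a SHA-256 checksum and byte size for every file except the manifest itself. The function names and the manifest layout (`version`, `files`, `sha256`, `bytes`) are assumptions made for illustration; the capstone's actual `manifest.json` schema may differ.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(publish_dir: Path, version: str = "v1") -> dict:
    """Describe every file in the bundle except the top-level manifest.

    The layout returned here is hypothetical, not the capstone schema.
    """
    files = {}
    for path in sorted(publish_dir.rglob("*")):
        rel = path.relative_to(publish_dir).as_posix()
        if path.is_file() and rel != "manifest.json":
            files[rel] = {
                "sha256": sha256_of(path),
                "bytes": path.stat().st_size,
            }
    return {"version": version, "files": files}


def write_manifest(publish_dir: Path, version: str = "v1") -> Path:
    """Write the manifest next to the files it describes."""
    target = publish_dir / "manifest.json"
    manifest = build_manifest(publish_dir, version)
    target.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
    return target
```

Calling `write_manifest(Path("publish/v1"))` would then produce a file that a downstream consumer can check the bundle against. Sorting paths and keys keeps the output stable between runs, which makes manifest diffs easier to review.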
Why this module exists
Many workflows fail at the exact moment they seem most useful: when someone tries to use their outputs downstream.
Typical failure patterns look like this:
- downstream consumers read from `results/` because no publish contract is visible
- report files are treated as the contract even though they are meant for humans
- manifests exist but do not clearly defend the bundle
- published paths drift without a deliberate version change
This module repairs those problems by teaching publish surfaces as contracts, not as accidental folders.
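
As an illustration of catching path drift before it reaches consumers, the sketch below compares the published paths of two bundle snapshots. The function name `review_drift` and the policy it encodes are assumptions: removed or renamed paths are treated as breaking (so they call for a new publish version), while purely added paths are treated as compatible. A real project may draw that line differently.

```python
from typing import Iterable


def review_drift(old_paths: Iterable[str], new_paths: Iterable[str]) -> dict:
    """Compare two sets of published relative paths.

    Assumed policy (hypothetical): a path that disappears breaks
    downstream readers; a path that only appears does not.
    """
    old_set, new_set = set(old_paths), set(new_paths)
    removed = sorted(old_set - new_set)
    added = sorted(new_set - old_set)
    return {
        "removed": removed,
        "added": added,
        # A rename shows up as one removal plus one addition.
        "needs_new_version": bool(removed),
    }


if __name__ == "__main__":
    before = ["summary.json", "summary.tsv", "report/index.html"]
    after = ["summary.json", "summary_table.tsv", "report/index.html"]
    print(review_drift(before, after))
```

In this example the rename of `summary.tsv` is reported as a removal, so the check flags the change as one that should not land inside the same published version.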
Study route
```mermaid
flowchart LR
  overview["Overview"] --> core1["Core 1: internal vs public"]
  core1 --> core2["Core 2: versioned publish boundary"]
  core2 --> core3["Core 3: manifest and integrity evidence"]
  core3 --> core4["Core 4: machine vs human publish surfaces"]
  core4 --> core5["Core 5: publish drift review"]
  core5 --> example["Worked example"]
  example --> practice["Exercises and answers"]
  practice --> glossary["Glossary"]
```
Read the module in that order if the publish boundary still feels fuzzy.
If you already know the basic problem, use this shortcut:
- open Core 2 if your question is mostly about versioning and compatibility
- open Core 3 if your question is mostly about manifests, checksums, and validation
- open Core 5 if your question is mostly about review and downstream risk
Module map
| Page | Purpose |
|---|---|
| Overview | explains the module promise and study route |
| Internal Results versus Public Contracts | teaches the first and most important split |
| Versioned Publish Boundaries and Compatible Change | teaches versioning, compatibility, and path stability |
| Manifests, Checksums, and Bundle Integrity | teaches how a publish bundle defends itself |
| Reports, File APIs, and Human versus Machine Surfaces | teaches which artifacts are for machines, humans, or both |
| Reviewing Publish Drift and Downstream Risk | teaches how to review publish changes before trust is lost |
| Worked Example: Promoting Results into a Versioned Publish Bundle | walks through a concrete publish-boundary design |
| Exercises | gives five mastery exercises |
| Exercise Answers | explains model answers and review logic |
| Glossary | keeps the module vocabulary stable |
What should be clear by the end
By the end of this module, you should be able to explain:
- why `results/` and `publish/v1/` are different promises
- when a publish change requires a version change
- why a manifest is not the same thing as a report
- how provenance and checksums support downstream trust
- how to review a published bundle as a contract rather than a convenience folder
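
The first point can be sketched as an explicit promotion step: only files on an allow-list move from the internal results directory into the versioned publish directory, and everything else stays private. The names `PUBLIC_FILES` and `promote`, and the assumption that the internal files share the published relative paths, are hypothetical and chosen only for illustration.

```python
import shutil
from pathlib import Path
from typing import List, Sequence

# Hypothetical allow-list: the only files promised to downstream users.
PUBLIC_FILES = ("summary.json", "summary.tsv", "report/index.html")


def promote(
    results_dir: Path,
    publish_dir: Path,
    public_files: Sequence[str] = PUBLIC_FILES,
) -> List[Path]:
    """Copy allow-listed files from internal results into the bundle.

    Fails before copying anything if a promised file is missing, so a
    partial bundle is never written by this function.
    """
    missing = [rel for rel in public_files if not (results_dir / rel).is_file()]
    if missing:
        raise FileNotFoundError(f"cannot publish, missing: {missing}")

    promoted = []
    for rel in public_files:
        target = publish_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(results_dir / rel, target)
        promoted.append(target)
    return promoted
```

Under this design, logs, intermediate tables, and debug output in the results directory never reach the publish directory unless someone deliberately adds them to the allow-list, which turns every new public file into a reviewable decision.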
Capstone route
Use the capstone only after the local module ideas are already legible.
Best corroboration surfaces for this module:
- `capstone/workflow/rules/summarize_report.smk`
- `capstone/workflow/rules/publish.smk`
- `capstone/workflow/contracts/FILE_API.md`
- `capstone/publish/v1/`
- Capstone Review Worksheet
Useful proof route:
```bash
snakemake -n
snakemake publish/v1/manifest.json
python scripts/verify_publish.py --publish publish/v1
```
The point of that route is not just to run the workflow. It is to inspect whether the published bundle still deserves downstream trust.
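
To show what such an inspection might check, here is a minimal sketch of a bundle verifier. It is not the capstone's `scripts/verify_publish.py`, and the manifest layout it expects (a `files` mapping from relative path to an entry with a `sha256` field) is an assumption made for this example. It reports files that are missing, files whose checksum no longer matches, and files present in the bundle but not listed.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def verify_bundle(publish_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means the bundle matches.

    Assumes a hypothetical manifest of the form
    {"files": {"<relative path>": {"sha256": "<hex digest>"}}}.
    """
    manifest = json.loads((publish_dir / "manifest.json").read_text())
    problems = []

    # Every listed file must exist and match its recorded checksum.
    for rel, entry in sorted(manifest["files"].items()):
        path = publish_dir / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
        elif hashlib.sha256(path.read_bytes()).hexdigest() != entry["sha256"]:
            problems.append(f"checksum mismatch: {rel}")

    # Every file in the bundle must be listed (the manifest is exempt).
    listed = set(manifest["files"]) | {"manifest.json"}
    for path in sorted(publish_dir.rglob("*")):
        rel = path.relative_to(publish_dir).as_posix()
        if path.is_file() and rel not in listed:
            problems.append(f"unlisted: {rel}")

    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see the full extent of any drift in a single pass.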