Exercise Answers¶
Page Maps¶
graph LR
family["Reproducible Research"]
program["Deep Dive Snakemake"]
section["Publishing Downstream Contracts"]
page["Exercise Answers"]
capstone["Capstone evidence"]
family --> program --> section --> page
page -.applies in.-> capstone
flowchart LR
orient["Orient on the page map"] --> read["Read the main claim and examples"]
read --> inspect["Inspect the related code, proof, or capstone surface"]
inspect --> verify["Run or review the verification path"]
verify --> apply["Apply the idea back to the module and capstone"]
These answers are model explanations, not the only acceptable wording.
What matters is whether the reasoning protects downstream trust.
Answer 1: Separate internal state from the public bundle¶
What should remain internal:
results/sample-a/qc.jsonresults/sample-a/trim.jsonresults/sample-b/qc.jsonlogs/summarize.log
Why:
These files support workflow execution, detailed inspection, or debugging. They are not yet a stable downstream-facing surface.
What could belong in the published contract:
publish/summary.tsvpublish/report.html
What is still missing:
- a versioned publish boundary such as
publish/v1/ - a manifest and checksums
- provenance
- a clear explanation of artifact roles and stability
The main lesson is that copying a few files into publish/ is not enough by itself.
Answer 2: Decide whether a change is compatible¶
Likely compatible within v1:
- adding a new optional field to
summary.json
Likely versioning discussion:
- renaming the file to
published-summary.json - removing an existing field that consumers may rely on
Why:
An additive field can preserve the old contract if existing consumers still work. Path changes and field removals often alter the public promise directly.
Answer 3: Diagnose a weak integrity surface¶
What the bundle cannot answer well:
- whether the bundle is complete
- whether files were altered or corrupted
- what software context produced the outputs
Why the report is not enough:
- it is a human-readable interpretation surface, not an inventory or identity surface
What to add first:
manifest.jsonwith published paths- checksums for the published files
provenance.jsonfor software-context evidence
The report should remain useful, but it should not be the entire trust story.
Answer 4: Clarify artifact roles¶
A strong response would say:
Scraping HTML is a weak machine contract because presentation markup is not a stable API. Machine consumers should read a structured artifact such as
summary.jsonorsummary.tsv. The report should help humans inspect and interpret the run, not serve as the only machine-facing surface.
Why:
- machine consumers need structured, stable data
- reports are valuable for humans but brittle as programmatic interfaces
Answer 5: Review publish drift¶
Risks:
- the report path change may break downstream users
- the manifest is no longer aligned with the published bundle
- downstream code reading from
results/means the public boundary is not doing enough work
What should be repaired before approval:
- either restore the stable report path or treat the rename as a contract change
- update the manifest so it inventories the real bundle
- repair the publish contract so downstream consumers do not depend on
results/
Does this require versioning discussion, repair, or both:
- both
The manifest mismatch is a repair inside the current review. The path rename and internal file dependency both raise versioning and contract adequacy questions.
Self-check¶
If your answers consistently explain:
- what is internal versus public
- what changed in the contract
- which artifact carries which responsibility
- what evidence supports downstream trust
then you are using the module correctly.