Software Provenance, Drift, and Rebuild Evidence
Page Maps
graph LR
family["Reproducible Research"]
program["Deep Dive Snakemake"]
section["Software Boundaries Reproducible Rules"]
page["Software Provenance, Drift, and Rebuild Evidence"]
capstone["Capstone evidence"]
family --> program --> section --> page
page -.applies in.-> capstone
flowchart LR
orient["Orient on the page map"] --> read["Read the main claim and examples"]
read --> inspect["Inspect the related code, proof, or capstone surface"]
inspect --> verify["Run or review the verification path"]
verify --> apply["Apply the idea back to the module and capstone"]
Software boundaries are only half the story.
The other half is proving what changed when the software changes.
That matters because a workflow can become non-reproducible even when the file graph still looks correct.
If a helper script, package function, environment file, or container image changes, the repository needs a credible way to explain:
- what changed
- which outputs are now questionable
- what should be rebuilt
- what evidence will travel with the rebuilt results
That is the job of software provenance.
Drift does not begin only with input files
Learners often first meet Snakemake through file timestamps and declared dependencies.
That is a useful start, but it is incomplete.
Drift can also come from:
- edits to workflow scripts
- changes in package code under src/
- runtime declaration changes in environment files
- container image revisions
- tool upgrades that alter behavior without changing rule syntax
This is why software boundaries must stay explicit. You cannot review or rebuild what you cannot name.
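One way to make a named software surface checkable is to reduce it to a single fingerprint. The sketch below is illustrative only: the function name `fingerprint_surface` and the glob patterns in the usage note are assumptions, not part of the capstone.

```python
import hashlib
from pathlib import Path


def fingerprint_surface(root: Path, patterns: list) -> str:
    """Return one SHA-256 digest covering every file matched under root.

    `patterns` are glob patterns relative to `root` (hypothetical examples:
    "src/**/*.py", "workflow/scripts/*.py").
    """
    digest = hashlib.sha256()
    # Deduplicate and sort so the result does not depend on filesystem order.
    files = sorted(
        {p for pattern in patterns for p in root.glob(pattern) if p.is_file()}
    )
    for path in files:
        # Hash the relative path too, so renames change the fingerprint.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Calling `fingerprint_surface(Path("."), ["src/**/*.py", "workflow/scripts/*.py"])` before and after a change gives two values to compare. If they differ, the software surface has drifted even when no input file has been touched.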
Reproducibility needs change evidence
A strong repository can answer questions like:
- did a software change invalidate published outputs?
- which rules are affected by a helper-code edit?
- what runtime changed between two runs?
- can we explain the software surface that produced this artifact?
Those are not luxury questions. They are the practical questions of trust.
One useful review loop
flowchart LR
change["software change"] --> detect["detect affected rules or outputs"]
detect --> rebuild["rebuild what is no longer trustworthy"]
rebuild --> record["record provenance with the new artifact"]
record --> review["review what changed and why"]
This loop is what turns reproducibility from a slogan into a working practice.
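The "detect" step of the loop can be sketched as a lookup from changed files to the rules that depend on them. Everything named here is hypothetical: the rule names, the `RULE_SURFACES` mapping, and the paths are placeholders, and a real workflow might derive this mapping from its rule definitions rather than maintain it by hand.

```python
from fnmatch import fnmatch

# Hypothetical mapping from rule name to the software surfaces it relies on.
RULE_SURFACES = {
    "summarize": ["workflow/scripts/summarize.py", "src/capstone/*.py"],
    "report": ["src/capstone/reporting.py", "workflow/envs/report.yaml"],
}


def affected_rules(changed_paths, rule_surfaces):
    """Map each affected rule to the changed paths that touch its surface."""
    hits = {}
    for rule, patterns in rule_surfaces.items():
        matched = sorted(
            path
            for path in changed_paths
            if any(fnmatch(path, pattern) for pattern in patterns)
        )
        if matched:
            hits[rule] = matched
    return hits
```

Given a list of changed paths (for example, from a version-control diff), the result names the rules whose outputs should be re-evaluated and records why, which is the input the "rebuild" and "review" steps need.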
What counts as software evidence
Good software evidence often includes:
- the git revision or release identifier of the repository
- the environment or container declaration used for execution
- tool or interpreter versions that materially affect behavior
- a provenance artifact that travels with published results
In the capstone, workflow/scripts/provenance.py points toward this idea. A workflow step
can emit an artifact that explains the software context of the publication outputs.
That is much stronger than relying on memory or a comment in a pull request.
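As a rough illustration of such an artifact, the sketch below collects the evidence listed above into a JSON record. It is not the contents of the capstone's provenance.py; the function names, the record fields, and the output path in the usage note are assumptions chosen for the example.

```python
import hashlib
import json
import platform
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def git_revision(repo: Path) -> str:
    """Return the current commit hash, or "unknown" if git cannot report one."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return "unknown"
    return result.stdout.strip()


def build_provenance(repo: Path, env_file: Optional[Path] = None) -> dict:
    """Collect repository, interpreter, and environment-file identity."""
    record = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "git_revision": git_revision(repo),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "environment_file": None,
    }
    if env_file is not None and env_file.is_file():
        # Hash the declaration so two runs can be compared without diffing files.
        record["environment_file"] = {
            "path": env_file.as_posix(),
            "sha256": hashlib.sha256(env_file.read_bytes()).hexdigest(),
        }
    return record


def write_provenance(record: dict, destination: Path) -> Path:
    """Write the record as JSON next to the outputs it describes."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text(json.dumps(record, indent=2, sort_keys=True) + "\n")
    return destination
```

A workflow step could call `write_provenance(build_provenance(Path(".")), Path("publish/provenance.json"))` so the record is produced by the same run that produces the published files. The design choice that matters is that the evidence is generated, not typed by hand.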
File freshness is not enough
Weak assumption:
if the inputs are unchanged, the outputs are still trustworthy.
That assumption fails when:
- a script's transformation logic changes
- a library upgrade alters ordering, formatting, or statistical behavior
- a package helper fixes a bug that changes output meaning
The files may look fresh while the meaning is stale.
That is drift.
A stronger practice
Stronger shape:
- keep software surfaces explicit enough that a reviewer can name them
- treat code and runtime changes as reasons to re-evaluate output trust
- generate provenance artifacts for outputs that will be shared or published
- use rebuild-oriented review commands when software changes are suspected
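In practice, this is where a command such as snakemake --list-changes code, or a broader
provenance check, becomes valuable. The command lists output files whose rule code has changed
since they were created, which helps the team connect software edits to output risk.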
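For teams that script their review, the `--list-changes` flag of the `snakemake` command line can be wrapped in a small helper. This is a minimal sketch: the helper names are hypothetical, it assumes `snakemake` is on the PATH and is run from the workflow directory, and it assumes the listing arrives on standard output one path per line, which is worth confirming against the Snakemake version in use.

```python
import subprocess

# Change kinds accepted by `snakemake --list-changes`.
CHANGE_KINDS = ("code", "input", "params")


def list_changes_command(kind: str = "code", executable: str = "snakemake") -> list:
    """Build the argument list for a --list-changes query."""
    if kind not in CHANGE_KINDS:
        raise ValueError(f"kind must be one of {CHANGE_KINDS}, got {kind!r}")
    return [executable, "--list-changes", kind]


def parse_listing(stdout: str) -> list:
    """Split command output into non-empty, stripped lines."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def outputs_with_changes(kind: str = "code") -> list:
    """Run the query and return the reported output files."""
    result = subprocess.run(
        list_changes_command(kind), capture_output=True, text=True, check=True
    )
    return parse_listing(result.stdout)
```

A non-empty result from `outputs_with_changes("code")` is a prompt to rebuild or at least re-review those outputs before they are published again.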
A simple example
Imagine this sequence:
- src/capstone/reporting.py changes how summary statistics are rounded.
- No input files change.
- Published tables still exist from the earlier run.
If the team only checks input freshness, those tables may appear valid.
If the team treats software as part of workflow meaning, it asks a better question:
which outputs were produced with the old implementation, and where is the evidence for the new build?
That question is the difference between accidental and intentional reproducibility.
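To see how small such a change can be, consider two rounding rules applied to the same values. The functions below are hypothetical stand-ins, not the actual code in reporting.py; they only illustrate that identical inputs can yield different published numbers.

```python
from decimal import Decimal, ROUND_HALF_UP

# The same input values are used for both implementations.
VALUES = [0.5, 1.5, 2.5]


def summarize_old(values):
    """Earlier behavior: built-in round(), where ties go to the even integer."""
    return [round(v) for v in values]


def summarize_new(values):
    """Revised behavior: ties round half up."""
    return [
        int(Decimal(str(v)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
        for v in values
    ]


print(summarize_old(VALUES))  # [0, 2, 2]
print(summarize_new(VALUES))  # [1, 2, 3]
```

Nothing about the inputs distinguishes the two runs. Only a record of which implementation produced the table can.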
Common failure modes
| Failure mode | What goes wrong | Better repair |
|---|---|---|
| provenance only records input files | software drift stays invisible | record software context for important outputs |
| helper-code edits are treated as “internal only” | stale publications survive longer than they should | review software edits as output-affecting changes |
| environment updates are merged without rebuild thinking | runs become incomparable | connect runtime changes to rebuild and release review |
| publication artifacts omit software identity | external readers cannot trace the producing surface | emit provenance next to important deliverables |
| teams trust memory over recorded evidence | review becomes anecdotal | prefer generated evidence and versioned declarations |
The explanation a reviewer trusts
Strong explanation:
this output was rebuilt because the reporting helper changed under src/, and the new publication includes updated provenance that records the runtime and repository state.
Weak explanation:
the code changed a bit, but the files looked current.
The first explanation defends trust. The second confuses absence of file churn with absence of semantic drift.
End-of-page checkpoint
Before leaving this page, you should be able to:
- explain how software drift differs from input drift
- name at least three kinds of software evidence worth keeping
- describe why published outputs need provenance that includes software context
- explain why unchanged input files do not guarantee trustworthy outputs