Hidden State and Undeclared Inputs
Page Maps
graph LR
family["Reproducible Research"]
program["Deep Dive DVC"]
section["Reproducibility Failures Real Teams"]
page["Hidden State and Undeclared Inputs"]
capstone["Capstone evidence"]
family --> program --> section --> page
page -.applies in.-> capstone
flowchart LR
orient["Orient on the page map"] --> read["Read the main claim and examples"]
read --> inspect["Inspect the related code, proof, or capstone surface"]
inspect --> verify["Run or review the verification path"]
verify --> apply["Apply the idea back to the module and capstone"]
Most fragile workflows do not fail because people forgot the main script.
They fail because influential inputs were present, mattered, and never became part of the recorded story.
That is hidden state.
What counts as hidden state
Hidden state is any input, condition, or side effect that influences the result without being made explicit enough for others to inspect and recover.
Typical examples include:
- local data copies that differ from what the repository implies
- environment versions that changed quietly
- notebook cells run in a particular order
- parameters typed into commands but never saved
- helper files or preprocessing steps performed outside the visible workflow
- random seeds, filesystem order, or machine-specific behavior
The details vary. The pattern is stable.
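The last item in the list, seeds and filesystem order, is one of the easiest to turn into explicit code. The sketch below is illustrative only: the function names `list_inputs`, `make_rng`, and `sample_inputs` are hypothetical and are not part of any workflow described here. It assumes the inputs are plain files in one folder.

```python
import random
from pathlib import Path


def list_inputs(folder):
    """Return the files in `folder` in a stable, name-sorted order.

    Directory iteration order is arbitrary, so sorting moves the
    ordering decision out of the machine and into the code.
    """
    return sorted(p for p in Path(folder).iterdir() if p.is_file())


def make_rng(seed):
    """Build a random generator from an explicit seed (no global state)."""
    return random.Random(seed)


def sample_inputs(folder, k, seed):
    """Pick `k` input files reproducibly from `folder`."""
    files = list_inputs(folder)
    return make_rng(seed).sample(files, k)
```

Calling `sample_inputs(folder, 2, seed=42)` twice on the same folder contents returns the same files, because both the ordering and the seed are now visible arguments rather than machine behavior.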
Why hidden state hurts teams
Hidden state makes teams ask the wrong question:
"What changed?"
The harder and more useful question is:
"What was influencing the result all along without being recorded?"
That is why reproducibility failure often feels mysterious. The workflow is not always breaking in front of you. Sometimes it is only revealing that the original run was never fully described.
A helpful way to sort inputs
flowchart TD
obvious["obvious inputs<br/>data files<br/>config files<br/>script arguments"] --> run["workflow run"]
hidden["hidden inputs<br/>environment<br/>manual edits<br/>implicit defaults<br/>machine state"] --> run
run --> result["result and metrics"]
You do not need to fear this diagram. You need to use it.
Every real workflow has both categories at first. The engineering work is deciding which hidden inputs must become explicit.
A small example
Imagine this repository:
- `train.py`
- `config.yaml`
- `data/raw.csv`
- `README.md` with the command
That looks respectable.
But the real run may also depend on:
- `pandas` and `scikit-learn` versions
- a local preprocessing notebook that cleaned the CSV once
- a remembered seed the author passed via CLI
- an untracked feature-selection file under `/tmp`
- the order in which a folder of inputs was enumerated
The repo did not exactly lie. It simply did not tell the whole story.
Common hidden inputs to look for first
Start with these categories:
| Category | Common example |
|---|---|
| data identity | a file path exists, but the exact bytes are not controlled |
| parameters | command-line flags were used but not captured |
| preprocessing | a one-off step happened outside the recorded pipeline |
| environment | library versions or system tools changed |
| execution state | notebook order, temp files, caches, or machine-local defaults |
This list is intentionally ordinary. Hidden state usually lives in mundane places.
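The "data identity" row can be made concrete with a content hash: two files count as the same input only if their bytes match, whatever their paths say. The following is a minimal sketch under stated assumptions. The helper names `file_sha256` and `same_bytes` are hypothetical, and SHA-256 is an illustrative choice, not a claim about which hash any particular tool uses.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True only if both files have identical content, regardless of name."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Recording the digest next to a result turns "the file at this path" into "these exact bytes", which is the distinction the table row describes.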
Why notebooks amplify the problem
Notebooks are valuable, but they make hidden state easier to accumulate:
- cells can run out of order
- intermediate values can live in memory
- edits can happen without a clear input-output contract
- saved outputs can look authoritative even when the path to them is unclear
This is not an anti-notebook argument.
It is an argument for being more honest about the state notebooks often carry.
What "undeclared input" means in practice
An undeclared input is not only a missing file in a manifest.
It can also be:
- a parameter that exists only in shell history
- a default threshold buried inside code
- a path assumed by convention but never tracked
- a cloud or local dataset that multiple people call "the same" without checking that the bytes are identical
Once you start seeing this pattern, later DVC concepts have a place to land.
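Parameters that live only in shell history, and defaults buried in code, can both be surfaced by writing the fully resolved settings to a file at the start of each run. The sketch below assumes a script that uses `argparse`. The flags `--seed` and `--threshold`, their default values, and the function name `parse_and_record` are hypothetical, chosen only for illustration.

```python
import argparse
import json
from pathlib import Path


def parse_and_record(argv, record_path):
    """Parse CLI arguments and save every resolved value, defaults included."""
    parser = argparse.ArgumentParser()
    # Hypothetical flags, used only for illustration.
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--threshold", type=float, default=0.5)
    args = parser.parse_args(argv)

    # vars(args) holds both the values the user typed and the defaults
    # they never saw, so the record covers both kinds of parameter.
    resolved = vars(args)
    Path(record_path).write_text(json.dumps(resolved, indent=2, sort_keys=True))
    return resolved
```

A call such as `parse_and_record(["--seed", "7"], "params.json")` records the default threshold alongside the typed seed, so neither depends on anyone's memory.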
A good first inventory question
Ask of any result you care about:
- what files influenced this?
- what settings influenced this?
- what environment influenced this?
- what manual steps influenced this?
If any answer starts with "well, usually..." or "we just know that...", the workflow still contains hidden state worth surfacing.
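Of the four inventory questions, the environment one is the easiest to answer mechanically. The sketch below uses the standard library's `importlib.metadata` to record installed versions. The function name `environment_snapshot` and the shape of the returned dictionary are assumptions made for this example.

```python
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Record interpreter, platform, and installed versions of `packages`.

    A package that is not installed is recorded as None rather than
    skipped, so the gap itself is visible in the snapshot.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }
```

For the small example earlier on this page, `environment_snapshot(["pandas", "scikit-learn"])` would capture the two library versions the repository never mentioned. The files, settings, and manual-steps questions still need their own answers.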
Keep this standard
Do not define a workflow only by the script you can see.
Define it by the full set of things that had the power to change the result.
That shift is what lets DVC later act on something real instead of on a simplified story that never matched the work.