Recurring Workflow Antipatterns and Recovery¶
Page Maps¶
graph LR
family["Reproducible Research"]
program["Deep Dive Snakemake"]
section["Governance Migration Tool Boundaries"]
page["Recurring Workflow Antipatterns and Recovery"]
capstone["Capstone evidence"]
family --> program --> section --> page
page -.applies in.-> capstone
flowchart LR
orient["Orient on the page map"] --> read["Read the main claim and examples"]
read --> inspect["Inspect the related code, proof, or capstone surface"]
inspect --> verify["Run or review the verification path"]
verify --> apply["Apply the idea back to the module and capstone"]
One sign that a workflow team is maturing is that it stops treating every painful problem as a one-off exception.
Many of the same failures return again and again. Module 10 calls those what they are: recurring anti-patterns.
Why anti-pattern language matters¶
If a repository problem is described only as a local annoyance, it is easy to "fix" the symptom and keep the underlying habit.
Anti-pattern language helps you say:
- this kind of shortcut keeps producing the same trust problem
- this is bigger than one file or one pull request
- recovery means changing the review habit, not only the implementation detail
That is a governance skill, not only a debugging skill.
Five anti-pattern families worth stopping early¶
| Anti-pattern | What it usually looks like | Why it keeps hurting |
|---|---|---|
| hidden inputs | helper code or shell state influences outputs without declared inputs | reruns, trust, and migration reasoning become unreliable |
| policy leaks | profiles or deployment-specific settings alter workflow meaning | context changes become semantic changes in disguise |
| contract drift | downstream users depend on internal results or unstable published paths | migration and compatibility review become guesswork |
| invisible complexity | helper packages or wrappers own more logic than the visible workflow suggests | review moves from repository evidence to insider knowledge |
| evidence suppression | logs, benchmarks, or verification routes disappear when the team feels pressure | the workflow gets quieter and less trustworthy at the same time |
These are worth memorizing because they recur across repositories.
Hidden inputs¶
This anti-pattern appears when real dependencies live outside declared rule inputs:
- helper scripts read undeclared config files
- wrappers inspect environment variables nobody documented
- shell commands depend on working-directory state or side files
The immediate damage is rerun confusion.
The bigger damage is migration confusion, because nobody can confidently say what behavior must be preserved.
Recovery:
- declare the input if it is real
- move the hidden state into visible config or file contracts
- add a proof route that exposes the dependency in review
Policy leaks¶
This happens when profiles or operating context begin to own semantics:
- one profile filters samples differently
- one environment changes published path meaning
- one scheduler-specific setting quietly changes analytical behavior
The repository may still run, but its meaning now depends on context in a way reviewers cannot safely ignore.
Recovery:
- move semantic choices back into workflow or config boundaries
- keep profiles focused on execution policy
- use profile-audit surfaces to compare contexts honestly
Contract drift¶
This anti-pattern appears when the public contract is not enforced strongly enough:
- notebooks read
results/instead ofpublish/v1/ - reports are scraped because structured publish artifacts are unclear
- files get added to published outputs without documentation or verification updates
Recovery:
- strengthen the file API
- repair verification and manifests
- move consumers back onto the public contract
This is often a slower kind of breakage, which makes it easy to ignore until migration time.
Invisible complexity¶
Repositories sometimes stay tidy on the surface by pushing meaning into places reviewers do not naturally inspect:
- a helper package owns discovery logic nobody can see from the rule files
- wrappers become mini-frameworks
- checkpoints and helper layers hide rather than explain workflow shape
Recovery:
- surface the ownership boundary
- document or expose the critical artifacts the helper creates
- simplify the visible route from
Snakefileto public outputs
Invisible complexity is dangerous because the repository can still look clean.
Evidence suppression¶
This is the anti-pattern many teams call "cleanup":
- benchmark files are removed because they look noisy
- verify routes are skipped because the migration is in progress
- logs are shortened until they stop answering real questions
- the team stops generating comparison artifacts because they feel temporary
Recovery:
- restore the smallest honest evidence route
- decide which review question each artifact answers
- remove noise only after a stronger replacement exists
Quieter is not the same thing as clearer.
A small example¶
Imagine a repository where:
- downstream users read
results/ - one cluster profile changes sample filtering
- report generation moved into a package nobody reviews directly
- verification is run only before releases
That is not four unrelated nuisances.
It is one repository with contract drift, policy leaks, invisible complexity, and weak evidence discipline.
The right response is not one patch. It is a recovery plan that tackles the anti-pattern family by family.
Recovery should change the habit, not only the file¶
For each anti-pattern, ask two questions:
- what technical repair is needed now
- what review rule will stop this from quietly coming back
That second question is what turns a fix into stewardship.
Keep this standard¶
When a workflow problem repeats, stop describing it as bad luck.
Name the anti-pattern.
Once you can name the pattern, you can repair both the repository and the review habit that allowed it to persist.