Resource Boundaries and Executor-Proof Workflow Design
As repositories grow, teams often start talking about resources as if they were only a scheduler problem.
That framing is too narrow for a course about workflow boundaries.
Module 04 treats resources as part of the design question:
can the workflow explain its own resource assumptions without being silently tied to a single executor?
That is what executor-proof workflow design means here.
Resources are a boundary between workflow intent and runtime policy
Resource declarations matter because they describe what a rule expects:
- threads
- memory
- disk or temporary space
- expected runtime, stated so that another execution context can still interpret it
The exact scheduler mapping may vary by executor. The workflow-side meaning should remain coherent.
That is why resource thinking belongs in this module, not only in operations.
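As a minimal sketch of what a workflow-side declaration carries, the Python model below records the four kinds of expectation listed above for two rules. The class name `ResourceClaim`, the rule names `align_sample` and `summarize_qc`, and every number are hypothetical and chosen only for illustration; they are not taken from the course repository. In Snakemake itself, such expectations are usually written with a rule's `threads` and `resources` directives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceClaim:
    """What one rule expects, stated independently of any executor."""
    threads: int       # CPU threads the rule can use
    mem_mb: int        # peak memory, in megabytes
    disk_mb: int       # temporary or scratch space, in megabytes
    runtime_min: int   # expected wall-clock time, in minutes


# Hypothetical rules and numbers, for illustration only.
CLAIMS = {
    "align_sample": ResourceClaim(threads=8, mem_mb=16000, disk_mb=20000, runtime_min=120),
    "summarize_qc": ResourceClaim(threads=1, mem_mb=1000, disk_mb=100, runtime_min=5),
}


def describe(rule: str) -> str:
    """Render a claim as a one-line statement a reviewer can read."""
    c = CLAIMS[rule]
    return (f"{rule}: {c.threads} threads, {c.mem_mb} MB memory, "
            f"{c.disk_mb} MB temp, ~{c.runtime_min} min")


for name in CLAIMS:
    print(describe(name))
```

Nothing in the claim mentions a queue, a partition, or a CI runner. That omission is the point: the claim describes the rule, not where it runs.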
Executor-proof does not mean executor-agnostic in every detail
The repository is allowed to run in:
- local mode
- CI
- a scheduler-backed context
Those contexts will not look identical.
Executor-proof means something narrower and more useful:
- the workflow still explains what each rule needs
- the interface does not collapse when the executor changes
- policy layers adapt the context without rewriting workflow meaning
That is a much more realistic standard than pretending executors do not matter.
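One way to picture this standard is a single workflow-side claim passed through three context policies. The sketch below is an assumption-laden illustration, not the course's implementation: the function names, the core and memory caps, and the choice of Slurm-style option names (`--cpus-per-task`, `--mem`, `--time`) as the scheduler target are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceClaim:
    """Workflow-side statement of what a rule needs (hypothetical model)."""
    threads: int
    mem_mb: int
    runtime_min: int


def local_policy(claim: ResourceClaim, cores: int = 4) -> dict:
    """Local mode: cap threads at the cores assumed to be available."""
    return {"threads": min(claim.threads, cores)}


def ci_policy(claim: ResourceClaim) -> dict:
    """CI: keep jobs small; assumes CI runs on reduced test data."""
    return {"threads": min(claim.threads, 2), "mem_mb": min(claim.mem_mb, 4000)}


def scheduler_policy(claim: ResourceClaim) -> dict:
    """Scheduler-backed context: translate the claim into submission options.

    Slurm-style option names are used here only as one possible target.
    """
    return {
        "--cpus-per-task": claim.threads,
        "--mem": f"{claim.mem_mb}M",
        "--time": claim.runtime_min,  # minutes
    }


claim = ResourceClaim(threads=8, mem_mb=16000, runtime_min=120)
for policy in (local_policy, ci_policy, scheduler_policy):
    print(policy.__name__, policy(claim))
```

The three outputs differ, yet `claim` is never modified. Each policy reads the same statement and adapts it, which is the narrower sense of executor-proof used on this page.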
Resource declarations should help humans review the graph
A good resource story lets a reviewer say:
- this rule is heavier than that one
- this module boundary aggregates many per-sample jobs
- this executor-facing policy is adapting a known workflow-side claim
If resources live only in one scheduler template or one maintainer's shell habits, the workflow becomes harder to scale safely.
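A reviewer's questions can be approximated with a small helper. The sketch below assumes a hypothetical table of rule names and declared resources, where `None` marks a rule that declares nothing; it orders the declared rules by weight and lists the rules whose cost is invisible from the workflow.

```python
# Hypothetical rule table: rule name -> declared resources (None = nothing declared).
DECLARED = {
    "download_reference": {"threads": 1, "mem_mb": 500},
    "align_sample": {"threads": 8, "mem_mb": 16000},
    "call_variants": {"threads": 4, "mem_mb": 8000},
    "merge_samples": None,
}


def heaviest_first(declared: dict) -> list:
    """Rules that declare resources, ordered by memory then threads, heaviest first."""
    known = {rule: res for rule, res in declared.items() if res is not None}
    return sorted(
        known,
        key=lambda rule: (known[rule]["mem_mb"], known[rule]["threads"]),
        reverse=True,
    )


def undeclared(declared: dict) -> list:
    """Rules whose cost a reviewer cannot see from the workflow side."""
    return sorted(rule for rule, res in declared.items() if res is None)


print("heaviest first:", heaviest_first(DECLARED))
print("no visible claim:", undeclared(DECLARED))
```

In this made-up table, `merge_samples` is the rule a reviewer should ask about: it aggregates across samples, yet the workflow says nothing about what it needs.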
One healthy model
flowchart LR
rule["workflow rule"] --> need["threads / memory / temp needs"]
need --> policy["executor or profile policy"]
policy --> run["context-specific execution"]
This picture matters because the workflow should still explain the left side (the rule and its needs) even when the right side (the policy and the execution context) changes.
Weak resource design
Weak shape:
- one rule has hidden heavy behavior
- no resource story is visible at the workflow boundary
- the scheduler layer compensates through settings that only a few maintainers know about
That may run for a while. It does not scale in a way reviewers can follow.
Stronger resource design
Stronger shape:
- rule families expose which work is light versus heavy
- the workflow names resource-relevant distinctions where they matter
- executor or profile policy adapts those distinctions for context
This keeps the workflow-side contract readable while still allowing operational variation.
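One possible way to realize the stronger shape is to let the workflow name a weight class per rule and let each context decide what that class means in numbers. The sketch below is a hypothetical illustration: the tier names `light` and `heavy`, the rule names, the context names, and all values are assumptions.

```python
# Workflow side: each rule names its weight class (hypothetical rules and tiers).
RULE_TIERS = {
    "collect_inputs": "light",
    "align_sample": "heavy",
    "render_report": "light",
}

# Policy side: each context decides what a tier means in concrete numbers.
CONTEXT_POLICY = {
    "local": {
        "light": {"threads": 1, "mem_mb": 1000},
        "heavy": {"threads": 4, "mem_mb": 8000},
    },
    "cluster": {
        "light": {"threads": 1, "mem_mb": 2000},
        "heavy": {"threads": 16, "mem_mb": 64000},
    },
}


def resolve(rule: str, context: str) -> dict:
    """Combine the workflow-side tier with the context-side numbers."""
    tier = RULE_TIERS[rule]
    return {"tier": tier, **CONTEXT_POLICY[context][tier]}


for context in CONTEXT_POLICY:
    print(context, resolve("align_sample", context))
```

The trade-off in this design is precision: a two-tier vocabulary is easy to review but coarse, so a repository with very uneven rules may need explicit per-rule numbers instead.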
Why this belongs with scaling
As repositories grow, resource misunderstandings create the same symptoms as bad modular boundaries:
- surprising failures
- hidden coupling
- context-specific breakage
- fear of changing the repository
That is why resource declarations should be reviewable alongside rule splits, file APIs, and gates.
Common failure modes
| Failure mode | What it looks like | Better repair |
|---|---|---|
| rule resource needs live only in scheduler policy | the workflow graph hides which work is expensive | surface workflow-side resource distinctions more clearly |
| resource settings are copied blindly across unrelated rule families | one module boundary stops matching real work shape | name resource-relevant differences per concern |
| executor changes require workflow rewrites | the repository is too tied to one runtime story | keep executor adaptation in policy layers where possible |
| reviewers cannot tell which jobs are heavy | scaling discussions become anecdotal | make heavy boundaries visible in rule organization or contract docs |
| resource tweaks bypass review surfaces | operational drift becomes harder to explain | treat resource changes like other scaling-boundary changes |
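Two of the failure modes in the table can be checked mechanically. The sketch below assumes three hypothetical inputs: the resources declared in the workflow, the overrides set in an executor-facing policy, and a mapping from rule to rule family. It flags rules that exist only in policy and rules from different families that share identical settings.

```python
from collections import defaultdict

# Hypothetical inputs, for illustration only.
WORKFLOW_DECLARED = {
    "align_sample": {"threads": 8, "mem_mb": 16000},
    "render_report": {"threads": 8, "mem_mb": 16000},
    "call_variants": {"threads": 4, "mem_mb": 8000},
}
POLICY_OVERRIDES = {
    "align_sample": {"mem_mb": 32000},
    "merge_samples": {"mem_mb": 48000},
}
RULE_FAMILY = {
    "align_sample": "mapping",
    "render_report": "reporting",
    "call_variants": "calling",
}


def policy_only_rules(declared: dict, overrides: dict) -> list:
    """Rules whose resource needs appear only in executor-facing policy."""
    return sorted(set(overrides) - set(declared))


def identical_across_families(declared: dict, family: dict) -> list:
    """Groups of rules from different families that share identical settings."""
    groups = defaultdict(set)
    for rule, res in declared.items():
        groups[tuple(sorted(res.items()))].add(rule)
    return [
        sorted(rules)
        for rules in groups.values()
        if len({family[rule] for rule in rules}) > 1
    ]


print("policy-only:", policy_only_rules(WORKFLOW_DECLARED, POLICY_OVERRIDES))
print("possibly copied:", identical_across_families(WORKFLOW_DECLARED, RULE_FAMILY))
```

A match from the second check is a prompt for review, not proof of a mistake: two unrelated rules can legitimately need the same resources.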
The explanation a reviewer trusts
Strong explanation:
the workflow distinguishes lightweight orchestration from heavier per-sample processing, and executor-facing policy adapts those resource needs by context; the repository can therefore scale across contexts without hiding which rule families are actually expensive.
Weak explanation:
resources are handled somewhere in the cluster setup, so the workflow does not need to say much.
The strong version keeps the contract visible. The weak version pushes the boundary out of review.
End-of-page checkpoint
Before leaving this page, you should be able to:
- explain what executor-proof means in this course
- describe why resource declarations are part of workflow design rather than only scheduler configuration
- name one sign that a repository is hiding resource assumptions badly
- explain how resource review fits beside modularity and interface review