
Module 03: Execution Environments as Reproducible Inputs


By the time you reach Module 03, you already know that stable data identity matters.

The next surprise is harder:

even when code and data look stable, results can still drift because the execution environment is part of the input surface.

This module is about making that invisible boundary visible:

  • environment is not background weather
  • honest runs can diverge without anyone being careless
  • DVC can record some environment-adjacent evidence, but it does not own environment management by itself
  • lockfiles, containers, and CI all solve different parts of the problem
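To make "environment as input" concrete, here is a minimal Python sketch that collects a few runtime facts and hashes them into one fingerprint. It is an illustration, not part of the capstone: the function names and the default package list are assumptions chosen for the example. Writing such a snapshot to a file and declaring that file as a dependency of a DVC stage is one possible way to surface environment changes, because DVC tracks dependencies by content. It still would not manage the environment itself.

```python
import hashlib
import json
import platform
from importlib import metadata


def environment_snapshot(packages=("numpy", "pandas", "dvc")) -> dict:
    """Collect a small, illustrative set of runtime facts.

    The package names are placeholders; list whatever your pipeline imports.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # record absence explicitly
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "packages": versions,
    }


def fingerprint(snapshot: dict) -> str:
    """Hash the snapshot with sorted keys so equal snapshots hash equally."""
    payload = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    snap = environment_snapshot()
    print(json.dumps(snap, indent=2, sort_keys=True))
    print("fingerprint:", fingerprint(snap))
```

Two machines that print different fingerprints are running with different inputs, even if code, data, and parameters match. The snapshot is deliberately small; it does not cover system libraries, hardware, or environment variables.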

The capstone corroboration surface for this module is the set of files and commands that make the runtime boundary legible: make -C capstone platform-report, capstone/params.yaml, capstone/docs/experiment-guide.md, capstone/docs/architecture.md, and the install/runtime surface in capstone/Makefile.

Why this module exists

Many workflow teams say:

  • the code did not change
  • the data did not change
  • the parameters did not change

and still get different outcomes.

That is where reproducibility gets more honest. The environment is often the missing input surface hiding in plain sight.

This module exists so you stop treating runtime as accidental background and start treating it as part of the system you need to reason about.
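A small Python sketch shows how identical values can produce different results when only the order of floating-point operations changes. Accumulation order is the kind of detail that a different library build, thread count, or runtime version can alter without any change to code, data, or parameters. The helper name is an assumption for illustration. The sketch uses an explicit loop rather than the built-in sum, because the built-in's float summation algorithm itself changed in Python 3.12, which is another instance of the same effect.

```python
def accumulate(values):
    """Add floats strictly left to right, as a plain loop would."""
    total = 0.0
    for value in values:
        total += value
    return total


forward = accumulate([0.1, 0.2, 0.3])
backward = accumulate([0.3, 0.2, 0.1])

print(forward)              # 0.6000000000000001
print(backward)             # 0.6
print(forward == backward)  # False
```

Neither run is wrong. Both follow IEEE 754 double-precision arithmetic; they differ only in the order of additions.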

Study route

flowchart LR
  overview["Overview"] --> core1["Core 1: environment as input"]
  core1 --> core2["Core 2: determinism is a spectrum"]
  core2 --> core3["Core 3: what DVC records and what it does not"]
  core3 --> core4["Core 4: lockfiles, containers, and CI"]
  core4 --> core5["Core 5: diagnosing environment drift"]
  core5 --> example["Worked example"]
  example --> practice["Exercises and answers"]
  practice --> glossary["Glossary"]

Read the module in that order the first time.

If the problem is already partly clear, use this shortcut:

  • open Core 1 when the main confusion is "why does runtime count as input?"
  • open Core 2 when the main confusion is "how can honest runs still diverge?"
  • open Core 3 when the main confusion is "what part of this is DVC actually helping with?"
  • open Core 4 when the main confusion is "which environment strategy fits which pressure?"
  • open Core 5 when the main confusion is "how do I review or diagnose drift sanely?"

Module map

| Page | Purpose |
| --- | --- |
| Overview | explains the module promise and study route |
| Execution Environment as Part of the Input Surface | teaches why runtime belongs in the workflow story |
| Determinism Is a Spectrum, Not a Switch | teaches why divergence can be honest rather than mysterious |
| What DVC Records Indirectly and What It Does Not Manage | teaches DVC's boundary around environments |
| Lockfiles, Containers, and CI as Environment Strategies | teaches the main environment-control approaches and their tradeoffs |
| Reviewing Environment Drift and Runtime Evidence | teaches how to inspect and diagnose runtime differences without guesswork |
| Worked Example: Explaining a Local versus CI Drift | walks through one realistic environment drift story |
| Exercises | gives five mastery exercises |
| Exercise Answers | explains model answers and review logic |
| Glossary | keeps the module vocabulary stable |

What should be clear by the end

By the end of this module, you should be able to explain:

  • why the environment is part of the input surface
  • why determinism is often conditional rather than absolute
  • which runtime facts DVC helps surface and which it does not manage directly
  • how lockfiles, containers, and CI differ as environment strategies
  • how to review and diagnose environment drift without superstition
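As a rough illustration of reviewing drift "without superstition", the sketch below compares two recorded sets of runtime facts and reports only the keys that differ. The function name and both example snapshots are hypothetical, invented for this example; real values would come from whatever your environment report records.

```python
def diff_snapshots(local: dict, remote: dict) -> dict:
    """Return {key: (local_value, remote_value)} for every key that differs."""
    differences = {}
    for key in sorted(set(local) | set(remote)):
        left, right = local.get(key), remote.get(key)
        if left != right:
            differences[key] = (left, right)
    return differences


# Hypothetical snapshots, invented for illustration only.
local = {"python": "3.11.6", "numpy": "1.26.4", "os": "Darwin"}
ci = {"python": "3.11.6", "numpy": "2.0.1", "os": "Linux"}

for key, (local_value, ci_value) in diff_snapshots(local, ci).items():
    print(f"{key}: local={local_value!r} ci={ci_value!r}")
```

A diff like this narrows the search to named differences. It does not prove which difference caused the drift; that still takes a controlled rerun that changes one factor at a time.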

Commands to keep close

These commands form the evidence loop for Module 03:

make -C capstone platform-report
make -C capstone verify
make -C capstone walkthrough

The point is not to collect more output. The point is to tie environment discussion to concrete evidence instead of vague intuition.

Capstone route

Use the capstone only after the runtime boundary already feels concrete.

Best corroboration surfaces for this module:

  • capstone/Makefile
  • capstone/params.yaml
  • capstone/docs/experiment-guide.md
  • capstone/docs/architecture.md
  • make -C capstone platform-report

Useful proof route:

make -C capstone platform-report
make -C capstone walkthrough
make -C capstone verify

The point of that route is to make the capstone's declared runtime surface inspectable, not to pretend the capstone eliminates every environment question by itself.
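One way to keep that route inspectable is to store each command's output next to the run, so a later discussion can point at a file instead of a memory. The sketch below is an assumption-laden illustration: the `capture` and `run_proof_route` helpers and the `evidence` directory name are hypothetical and not part of the capstone. Only the three make invocations come from this page.

```python
import subprocess
from pathlib import Path

# The proof route from this page, as argument lists (no shell involved).
PROOF_ROUTE = [
    ["make", "-C", "capstone", "platform-report"],
    ["make", "-C", "capstone", "walkthrough"],
    ["make", "-C", "capstone", "verify"],
]


def capture(command: list, out_file: Path) -> int:
    """Run a command, write its combined output to out_file, return the exit code."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stderr in the same log
        text=True,
    )
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text("$ " + " ".join(command) + "\n" + result.stdout)
    return result.returncode


def run_proof_route(out_dir: Path = Path("evidence")) -> dict:
    """Run each step in order and map its target name to its exit code."""
    return {
        command[-1]: capture(command, out_dir / f"{command[-1]}.log")
        for command in PROOF_ROUTE
    }
```

Calling `run_proof_route()` from the repository root would write one log per target. A nonzero exit code is recorded rather than raised, so a failing step still leaves evidence behind.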