Module 06: Generated Files, Multi-Output Rules, and Pipeline Boundaries¶

Module Position¶

flowchart TD
  family["Reproducible Research"] --> program["Deep Dive Make"]
  program --> module["Module 06: Generated Files, Multi-Output Rules, and Pipeline Boundaries"]
  module --> lessons["Lesson pages and worked examples"]
  module --> checkpoints["Exercises and closing criteria"]
  module --> capstone["Related capstone evidence"]

flowchart TD
  purpose["Start with the module purpose and main questions"] --> lesson_map["Use the lesson map to choose reading order"]
  lesson_map --> study["Read the lessons and examples with one review question in mind"]
  study --> proof["Test the idea with exercises and capstone checkpoints"]
  proof --> close["Move on only when the closing criteria feel concrete"]

Read the first diagram as a placement map: this page sits between the course promise, the lesson pages listed below, and the capstone surfaces that pressure-test the module. Read the second diagram as the study route for this page, so the diagrams point you toward the Lesson map, Exercises, and Closing criteria instead of acting like decoration.

Modules 01–05 teach you to write correct builds. Module 06 turns that correctness into a safe content pipeline: generators, manifests, generated headers, bundled outputs, and stage boundaries that keep rebuild behavior truthful even when one command produces many files.

The core question is simple: how do you keep code generation from becoming an excuse for lying to Make?

Capstone exists here as corroboration. The generator playgrounds in this module should make the stale-file story understandable before you inspect the reference build.

Before You Begin¶

This module works best after Modules 01-05, especially the parts on convergence, determinism, and multi-output semantics.

Use this module if you need to learn how to:

introduce generators without hiding truth from the graph
model one command that publishes several coupled outputs
decide when a manifest or stamp is a real boundary instead of a shortcut

At a glance¶

Focus	Learner question	Capstone timing
generators and generated headers	"What exactly makes a generated file stale?"	start after you can already explain single-output rules
multi-output producers	"How do I prevent one command from running twice?"	inspect after the local playground is working
manifests and stamps	"When is a boundary honest instead of a shortcut?"	use capstone only to confirm, not discover, the idea

Proof loop for this module:

make --trace all
make -j2 all
make -q all

Capstone corroboration:

inspect generated-header flow in capstone/Makefile
inspect boundary modeling in capstone/mk/stamps.mk
run make PROGRAM=reproducible-research/deep-dive-make test

If the capstone feels too large here, return to the small generator playground and make the stale-file story legible there first.

1) Table of Contents¶

Table of Contents
Learning Outcomes
How to Use This Module
Core 1 — Generated Files as First-Class Graph Nodes
Core 2 — Multi-Output Producers Without Duplicate Execution
Core 3 — Manifests, Stamps, and Change Boundaries
Core 4 — Code Generation Pipelines and Publication Contracts
Core 5 — Repairing Real Generator Failure Modes
Capstone Sidebar
Exercises
Closing Criteria

2) Learning Outcomes¶

By the end of this module, you can:

model generated files as ordinary graph nodes instead of magical side effects
choose grouped targets, manifests, or principled stamp fallbacks for multi-output work
define a clear publication boundary for code generators and report builders
diagnose stale generated files as graph defects rather than “generator weirdness”
prove that a generator runs exactly when its declared inputs change

Back to top

3) How to Use This Module¶

Build a local generator playground with three surfaces:

project/
  Makefile
  data/
    schema.json
  scripts/
    gen_header.py
    gen_manifest.py
  src/
    main.c
  build/

Use it to practice three cases:

a single generated header
one command that emits multiple files
a manifest or stamp that captures a hidden generator input

The objective is not “it generated files.” The objective is that the graph tells the truth about when those files become stale.

Back to top

4) Core 1 — Generated Files as First-Class Graph Nodes¶

Generated files are not special to Make. They are just targets with inputs, recipes, and publication rules. The most common beginner-to-intermediate mistake is treating generator execution as background behavior instead of graph behavior.

Your contract:

every generated file has a declared producer
every producer declares its semantic inputs
every consumer depends on generated outputs, not on “the generator happened to run”

If dynamic.h depends on schema.json and gen_header.py, declare both. If it also depends on an environment flag, model that with a manifest or convergent stamp.

Back to top

5) Core 2 — Multi-Output Producers Without Duplicate Execution¶

One invocation that creates multiple files is the moment many builds become dishonest. Naive multi-target rules look fine in serial runs and then double-run the producer under pressure.

Use this decision table:

Situation	Correct tool	Why
One invocation produces several coupled outputs	grouped targets `&:`	preserves single-invocation semantics
GNU Make feature not available or outputs are awkward to consume directly	explicit manifest or stamp	models completion as one published fact
Outputs are independent in truth	separate rules	avoids coupled invalidation

The question is never “how do I make this concise?” It is “what is the truthful unit of publication?”

Back to top

6) Core 3 — Manifests, Stamps, and Change Boundaries¶

Not every generator input maps neatly to a file you want downstream targets to consume. Flags, tool versions, schema fingerprints, and selected environment knobs often belong in a modeled boundary file.

Use a manifest or stamp when:

the input affects generator meaning
the downstream consumer should rebuild when that meaning changes
the boundary is easier to reason about as “generator contract changed”

Do not use stamps as an escape hatch for missing edges. A good stamp narrows truth. A bad stamp hides it.

Back to top

7) Core 4 — Code Generation Pipelines and Publication Contracts¶

Generated outputs should appear only after the pipeline succeeds. That rule matters just as much for codegen as it does for binaries.

Publication checklist:

write into a temporary location
validate or complete the full generation step
atomically publish the final output or manifest
clean partial outputs on failure

Once a generator becomes multi-stage, you should be able to point at the exact boundary where downstream targets are allowed to trust the result.

Back to top

8) Core 5 — Repairing Real Generator Failure Modes¶

Failure signatures to practice on purpose:

generated header exists but does not rebuild after a schema change
one command produces two files and runs twice under -j
manifest rebuilds every run because it records unstable data
partial generated output survives a failed invocation
a consumer depends on the generating script instead of the published output

The repair loop is always the same:

reproduce with --trace
identify the missing or dishonest edge
move the truth to a target, manifest, or grouped rule
re-run convergence and serial/parallel checks

Back to top

Use the capstone to inspect:

generated headers under scripts/ and build/include
multi-output modeling decisions in the main Makefile
manifest and contract helpers under mk/stamps.mk and mk/contract.mk
repros that demonstrate generated-header and grouped-output failure modes

Back to top

10) Exercises¶

Build a generator that emits a header plus a manifest, then prove it converges.
Break a multi-output rule so it double-runs under -j, then repair it with grouped targets or a principled stamp.
Add one modeled non-file input, such as a version flag, without forcing unrelated rebuilds.
Prove partial generated files are removed after an induced failure.

Back to top

11) Closing Criteria¶

You pass this module only if you can demonstrate all of the following:

one generator with fully declared semantic inputs
one multi-output producer that runs exactly once per logical change
one manifest or stamp that narrows a real change boundary instead of hiding a dependency
one failure run that leaves no trusted partial outputs behind

Back to top

Directory glossary¶

Use Glossary when you want the recurring language in this module kept stable while you move between lessons, exercises, and capstone checkpoints.

Module 06: Generated Files, Multi-Output Rules, and Pipeline Boundaries¶

Module Position¶

Before You Begin¶

At a glance¶

1) Table of Contents¶

2) Learning Outcomes¶

3) How to Use This Module¶

4) Core 1 — Generated Files as First-Class Graph Nodes¶

5) Core 2 — Multi-Output Producers Without Duplicate Execution¶

6) Core 3 — Manifests, Stamps, and Change Boundaries¶

7) Core 4 — Code Generation Pipelines and Publication Contracts¶

8) Core 5 — Repairing Real Generator Failure Modes¶

9) Capstone Sidebar¶

10) Exercises¶

11) Closing Criteria¶

Directory glossary¶