Discovery, Namespacing, and Repository Growth

Small Make builds can survive with informal naming and casual discovery.

Larger builds usually cannot.

As a repository grows, the first architectural failures often look like this:

source discovery order changes unexpectedly
target names collide across subsystems
output paths stop communicating ownership
every component invents its own naming conventions
the build slowly drifts toward disconnected local truths

This page is about the rules that keep a growing repository from becoming structurally ambiguous.

The sentence to keep

When a build starts growing, ask:

if another component adds files tomorrow, will the current discovery and naming rules keep the graph stable and the ownership legible?

That question is what turns growth into architecture instead of improvisation.

Discovery is an architectural choice

Discovery often begins innocently:

SRCS := $(wildcard src/*.c)

That may be enough for a tiny project. As soon as the tree grows, discovery rules become a policy decision:

which directories count as source roots
which file patterns are included
whether the result order is stable
how multiple components avoid stepping on each other

This is why Module 07 treats discovery as architecture rather than convenience.
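
One way to make those policy decisions visible is to name them as variables. The sketch below is a minimal illustration, assuming GNU Make and a layout with src/app and src/lib. The names SRC_ROOTS, SRC_PATTERNS, and show-discovery are hypothetical and chosen only for this example.

```make
# Directories that count as source roots (assumed layout).
SRC_ROOTS := src/app src/lib
# File patterns that count as sources.
SRC_PATTERNS := *.c

# One sorted list, built only from the declared roots and patterns.
SRCS := $(sort $(foreach root,$(SRC_ROOTS),$(wildcard $(addprefix $(root)/,$(SRC_PATTERNS)))))

# Hypothetical helper target that prints the policy and its result.
.PHONY: show-discovery
show-discovery:
	@echo "roots:    $(SRC_ROOTS)"
	@echo "patterns: $(SRC_PATTERNS)"
	@echo "sources:  $(SRCS)"
```

Running make show-discovery prints the roots, the patterns, and the resulting list, which gives reviewers one place to check the policy.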

Rooted discovery beats wandering discovery

One of the healthiest habits for larger repositories is rooted discovery:

APP_SRCS := $(sort $(wildcard src/app/*.c))
LIB_SRCS := $(sort $(wildcard src/lib/*.c))

This is better than broad, wandering discovery because it says:

these directories are part of the build contract
files outside them do not accidentally become build inputs
order is stabilized intentionally

That makes the build easier to review and easier to extend.

Sorting is part of graph stability

Discovery order matters more than many people expect.

If source lists affect:

object ordering
link ordering
generated manifests
grouped publication sets

then unstable ordering can change the graph or the artifact shape even when the repository meaning has not changed.

That is why this module keeps reinforcing the same habit:

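SRCS := $(sort $(wildcard src/*/*.c))

The sort is not cosmetic. It is a small determinism fence. The pattern spells out one directory level because GNU Make's wildcard function has no recursive ** glob: src/**/*.c matches the same files as src/*/*.c.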
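
A small sketch of what the sort buys, assuming GNU Make. $(sort) orders words lexically and also drops duplicates, so overlapping patterns cannot list a file twice. The variable RAW_SRCS and the build/sources.manifest target are hypothetical, used here as an example of an artifact whose contents follow list order.

```make
# Two overlapping patterns: src/app/main.c can appear twice in RAW_SRCS.
RAW_SRCS := $(wildcard src/app/*.c) $(wildcard src/app/main.c)

# $(sort) orders the words lexically and removes the duplicate.
SRCS := $(sort $(RAW_SRCS))

# Hypothetical manifest: one source path per line, in SRCS order.
build/sources.manifest: $(SRCS)
	mkdir -p $(@D)
	printf '%s\n' $(SRCS) > $@
```

Because the manifest is written from SRCS, its line order is the sorted list order rather than whatever order the filesystem returned.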

Namespacing prevents target collisions

As repositories grow, collisions become a real risk:

two components both want build/main.o
two generated reports both want summary.txt
two packages both want dist.tar.gz

This is a naming problem before it becomes a runtime problem.

Healthy namespacing often looks like:

APP_OBJS := $(patsubst src/app/%.c,build/app/%.o,$(APP_SRCS))
LIB_OBJS := $(patsubst src/lib/%.c,build/lib/%.o,$(LIB_SRCS))

Now the path communicates ownership:

build/app/... belongs to the app component
build/lib/... belongs to the library component

That makes both debugging and code review easier.
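
The same ownership can be carried into the compile rules. The following is a sketch, assuming GNU Make and the default CC and CFLAGS variables. It shows one pattern rule per component, so a rule for build/app/ can never produce an object under build/lib/.

```make
APP_SRCS := $(sort $(wildcard src/app/*.c))
LIB_SRCS := $(sort $(wildcard src/lib/*.c))

APP_OBJS := $(patsubst src/app/%.c,build/app/%.o,$(APP_SRCS))
LIB_OBJS := $(patsubst src/lib/%.c,build/lib/%.o,$(LIB_SRCS))

.PHONY: objects
objects: $(APP_OBJS) $(LIB_OBJS)

# Each rule is scoped to one component directory on both sides.
build/app/%.o: src/app/%.c | build/app
	$(CC) $(CFLAGS) -c $< -o $@

build/lib/%.o: src/lib/%.c | build/lib
	$(CC) $(CFLAGS) -c $< -o $@

# Order-only prerequisites: the directories must exist, but their
# timestamps do not trigger rebuilds.
build/app build/lib:
	mkdir -p $@
```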

Output paths should tell a story

A good output path often answers one or more of these questions:

which component owns it
which stage produced it
whether it is generated, packaged, or final

For example:

build/app/main.o
build/generated/api.h
dist/app-linux.tar.gz

These names are useful because they communicate structure without extra commentary.

That is a big architectural advantage over flat output directories where every target must be disambiguated mentally.
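
As a rough sketch of how stage identity can be encoded, the fragment below names one directory per stage. It assumes GNU Make, a hypothetical VERSION file at the repository root, and a rule elsewhere that links build/app/app. None of these come from the module itself.

```make
# Hypothetical stage directories, mirroring the example paths above.
GEN_DIR  := build/generated
DIST_DIR := dist

# Generated stage: a header produced by the build, kept apart from sources.
# VERSION is an assumed one-line text file.
$(GEN_DIR)/api.h: VERSION | $(GEN_DIR)
	printf '#define API_VERSION "%s"\n' "$$(cat $<)" > $@

# Final stage: a package that lives only under dist/.
# build/app/app is assumed to be linked by another rule.
$(DIST_DIR)/app-linux.tar.gz: build/app/app | $(DIST_DIR)
	tar -czf $@ -C build/app app

$(GEN_DIR) $(DIST_DIR):
	mkdir -p $@
```

A reader who sees build/generated/api.h in an error message knows it came from a rule, not from a hand-written file.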

One repository does not require many disconnected DAGs

When growth becomes painful, teams often reach for recursive separation too early:

make -C app
make -C lib
make -C tools

Sometimes that boundary is real. Often it is a way of avoiding the harder work of naming and discovery discipline inside one coherent DAG.

This page is not saying recursion is always wrong. It is saying:

if the repository still wants one coherent build truth, then namespacing and rooted discovery are usually safer first moves than splitting reality into disconnected local graphs.
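
A minimal sketch of the single-graph alternative, assuming GNU Make and the src/app and src/lib layout used on this page. The archive name libcore.a and the fragment names mk/lib.mk and mk/app.mk are hypothetical. The point is that the dependency between components is an ordinary prerequisite, which a pair of make -C invocations cannot express.

```make
.PHONY: all
all: build/app/app

# Library component (could live in a fragment such as mk/lib.mk).
LIB_SRCS := $(sort $(wildcard src/lib/*.c))
LIB_OBJS := $(patsubst src/lib/%.c,build/lib/%.o,$(LIB_SRCS))

build/lib/libcore.a: $(LIB_OBJS)
	$(AR) rcs $@ $^

# App component (could live in a fragment such as mk/app.mk).
APP_SRCS := $(sort $(wildcard src/app/*.c))
APP_OBJS := $(patsubst src/app/%.c,build/app/%.o,$(APP_SRCS))

# The cross-component edge is a normal prerequisite in the same graph.
build/app/app: $(APP_OBJS) build/lib/libcore.a
	$(CC) $(LDFLAGS) $^ -o $@

# One compile rule serves every namespaced component directory.
build/%.o: src/%.c
	mkdir -p $(@D)
	$(CC) $(CFLAGS) -c $< -o $@
```

With this shape, touching a file under src/lib is enough for Make to relink the app, because both components share one set of timestamps and rules.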

A small growth example

Suppose the repository grows from:

src/
  main.c
  util.c

to:

src/
  app/main.c
  app/util.c
  lib/main.c
  lib/io.c

The old object mapping:

OBJS := $(patsubst src/%.c,build/%.o,$(SRCS))

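may now create collisions or confusing paths. If SRCS still comes from $(wildcard src/*.c), it no longer matches anything under the new layout, and any mapping that flattens sources to their basenames would send both src/app/main.c and src/lib/main.c to build/main.o.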

A healthier growth-aware mapping is:

APP_SRCS := $(sort $(wildcard src/app/*.c))
LIB_SRCS := $(sort $(wildcard src/lib/*.c))

APP_OBJS := $(patsubst src/app/%.c,build/app/%.o,$(APP_SRCS))
LIB_OBJS := $(patsubst src/lib/%.c,build/lib/%.o,$(LIB_SRCS))

The mapping grew with the repository instead of pretending the old flat model still explains ownership.
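
The collision risk can also be checked mechanically. The sketch below assumes GNU Make and relies on $(sort) removing duplicate words. FLAT_OBJS, ALL_OBJS, and check-names are hypothetical names introduced for this example.

```make
APP_SRCS := $(sort $(wildcard src/app/*.c))
LIB_SRCS := $(sort $(wildcard src/lib/*.c))

APP_OBJS := $(patsubst src/app/%.c,build/app/%.o,$(APP_SRCS))
LIB_OBJS := $(patsubst src/lib/%.c,build/lib/%.o,$(LIB_SRCS))
ALL_OBJS := $(APP_OBJS) $(LIB_OBJS)

# Flattened mapping, shown only for contrast: src/app/main.c and
# src/lib/main.c both become build/main.o.
FLAT_OBJS := $(addprefix build/,$(notdir $(APP_SRCS:.c=.o) $(LIB_SRCS:.c=.o)))

# $(sort) drops duplicates, so a shorter sorted list means a collision.
ifneq ($(words $(FLAT_OBJS)),$(words $(sort $(FLAT_OBJS))))
$(warning flattened mapping would collide: $(FLAT_OBJS))
endif
ifneq ($(words $(ALL_OBJS)),$(words $(sort $(ALL_OBJS))))
$(error namespaced object paths collide: $(ALL_OBJS))
endif

.PHONY: check-names
check-names:
	@echo "object paths: $(ALL_OBJS)"
```

With the four files from the example tree, the flattened list triggers the warning while the namespaced list passes the check.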

Discovery policy should be visible, not accidental

The build should make it easy to answer:

where do sources come from
why are these files included
how would a new component join the system

If the answer is "search around until you notice the right wildcard pattern," the architecture is already too implicit.

This is why some repositories keep discovery policy in a dedicated layer such as mk/objects.mk or mk/discovery.mk. It gives the rule a home and a review boundary.
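
A sketch of what such a layer might contain, assuming GNU Make. The COMPONENTS list and the discover_component template are hypothetical. Note that the generated variable names follow the component name exactly, so this version produces app_SRCS and lib_OBJS rather than the upper-case names used elsewhere on this page.

```make
# mk/discovery.mk (sketch): the one place that says where sources come from.
COMPONENTS := app lib

# $(1) = component name. Defines <name>_SRCS and <name>_OBJS.
define discover_component
$(1)_SRCS := $$(sort $$(wildcard src/$(1)/*.c))
$(1)_OBJS := $$(patsubst src/$(1)/%.c,build/$(1)/%.o,$$($(1)_SRCS))
endef

# Instantiate the template once per component.
$(foreach c,$(COMPONENTS),$(eval $(call discover_component,$(c))))

# Union of all component objects, in component order.
ALL_OBJS := $(foreach c,$(COMPONENTS),$($(c)_OBJS))
```

Under this scheme a new component joins by creating src/<name>/ and adding its name to COMPONENTS, which keeps the change small and easy to review.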

Naming conventions are part of the public architecture

Developers often think namespacing is only a low-level path concern. It also affects the human-facing architecture:

target names
output directories
bundle names
generated artifact names

Good conventions reduce explanation cost. Bad conventions create translation work in every conversation.

For example, a repository is easier to extend if components follow one target naming model instead of each subsystem inventing its own build vocabulary.
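
One possible target naming model, sketched under the assumption of GNU Make and the app and lib components from earlier sections. The <component>-<verb> scheme is an illustration, not a convention the module prescribes.

```make
APP_SRCS := $(sort $(wildcard src/app/*.c))
LIB_SRCS := $(sort $(wildcard src/lib/*.c))
APP_OBJS := $(patsubst src/app/%.c,build/app/%.o,$(APP_SRCS))
LIB_OBJS := $(patsubst src/lib/%.c,build/lib/%.o,$(LIB_SRCS))

# Naming model (hypothetical): <component>-<verb>, plus one aggregate per verb.
.PHONY: build clean app-build lib-build app-clean lib-clean

build: app-build lib-build
clean: app-clean lib-clean

app-build: $(APP_OBJS)
lib-build: $(LIB_OBJS)

# Each clean target removes only the directory its component owns.
app-clean:
	rm -rf build/app
lib-clean:
	rm -rf build/lib

build/%.o: src/%.c
	mkdir -p $(@D)
	$(CC) $(CFLAGS) -c $< -o $@
```

Someone who has learned app-build can guess lib-build and lib-clean without reading the Makefile.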

Failure signatures worth recognizing

"The same target name means different things in different places"

That usually means the naming system is too flat or too informal.

"Adding a new component changed the order of unrelated outputs"

That often points to unstable discovery or unsorted lists.

"We had to split into sub-builds because naming collisions got out of hand"

That may indicate architectural pressure that namespacing could have relieved earlier.

"A newcomer cannot tell where to add files for a new subsystem"

That means discovery policy is not visible enough.

A review question that improves growth discipline

Take one growing repository and ask:

where discovery is rooted
whether discovered lists are sorted
how output paths communicate ownership
which target names might collide as more components are added
whether one coherent DAG is still possible with better naming discipline

If those answers are weak, the repository is likely to become harder to reason about with every component it adds.
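
A small helper can make this review faster. The print-% rule below is a common GNU Make idiom rather than part of this module. It assumes values that contain no single quotes, and APP_SRCS is defined here only to give the helper something to print.

```make
APP_SRCS := $(sort $(wildcard src/app/*.c))

# Print any variable by name, for example: make print-APP_SRCS
# $* is the stem matched by %, so $($*) is that variable's value.
print-%:
	@echo '$* = $($*)'
	@echo '  origin: $(origin $*)'
```

Printing the discovered lists and object mappings shows quickly whether they are rooted, sorted, and namespaced as intended.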

What to practice from this page

Choose one Make-based repository that has more than one subsystem and write down:

the discovery roots
the naming pattern for outputs
one likely future collision
one namespacing improvement
one place where a broad wildcard should become a rooted, sorted rule

If you can do that cleanly, you are thinking about growth as an architectural problem instead of a cleanup chore.

End-of-page checkpoint

Before leaving this lesson, make sure you can explain:

why discovery rules are part of architecture
why rooted and sorted discovery protects stability
how namespacing keeps ownership visible
why output paths should communicate component or stage identity
how repository growth can often be handled inside one coherent DAG before jumping to disconnected builds