Closures & Partials¶
Page Maps¶
graph LR
family["Python Programming"]
program["Python Functional Programming"]
section["Data First Apis Expression Style"]
page["Closures & Partials"]
capstone["Capstone evidence"]
family --> program --> section --> page
page -.applies in.-> capstone
flowchart LR
orient["Orient on the page map"] --> read["Read the main claim and examples"]
read --> inspect["Inspect the related code, proof, or capstone surface"]
inspect --> verify["Run or review the verification path"]
verify --> apply["Apply the idea back to the module and capstone"]
This lesson matters when a team wants several variants of the same pipeline and the first instinct is to hide the choice in globals, environment reads, or mutable defaults. The safer move is to build the variant once, keep the captured configuration visible, and review the resulting function as a normal value.
Start With the Design Smell¶
The hard part is usually not the syntax of partial. The hard part is knowing when a configurator is cleaner than "just set a variable somewhere and call the function later."
- If changing chunk size means editing module-level state, the design is already harder to test.
- If two pipeline variants share code but not configuration, the missing concept is usually a configurator.
- If a reviewer cannot tell what behavior is fixed at construction time, the capture is too implicit.
Keep This Question In View¶
Core question:
How do closures and partial application create pure configurators that capture immutable config to produce reusable, deterministic variants of RAG pipelines without globals or mutable defaults?
This lesson introduces pure configurators in the practical sense you need:
- treat configuration as explicit immutable data that can be captured once and inspected later
- build variants by returning new callables instead of mutating shared state
- keep the core deterministic so the configured function still behaves like a normal pure function
The running FuncPipe examples make one idea concrete: configuration should become a value that shapes behavior, not a hidden force that changes behavior from a distance.
Use this when you have pure pipelines that need configurable variants such as different chunk sizes or keep rules, but globals or mutable defaults still feel tempting.
Outcome:
1. Spot globals or mutable defaults in config and explain why they break determinism.
2. Refactor a configurable impure function to pure using closures/partials.
3. Write a Hypothesis property providing strong evidence of equivalence to Module 1, including a shrinking example.
Runnability Note (Module 01 Snapshot vs Module 02 End-State)¶
This core includes two kinds of snippets:
1) Runnable against the end-of-Module-02 codebase (this checkout)
These use the real APIs in capstone/src/funcpipe_rag/ (e.g., RagConfig, make_rag_fn, full_rag_api_docs, iter_rag_core).
2) Hypothetical pre-refactor snippets (illustration only)
These are intentionally “bad” or “in-between” states used to teach refactoring. They are not meant to match a real snapshot 1:1. They are labeled as Hypothetical pre-refactor and are refactored into the real Module 02 API across this module.
If you want a real, runnable Module 01 codebase, refresh the generated history route first:
make PROGRAM=python-programming/python-functional-programming history-refresh
- Module 01 path: capstone/_history/worktrees/module-01/
- Import path for Module 01: capstone/_history/worktrees/module-01/src/ (use PYTHONPATH when running examples there)
Module 01 uses the same import name (import funcpipe_rag), so run it from the Module 01 worktree (or set PYTHONPATH) to avoid mixing versions.
We refactor the hypothetical pre-refactor shapes into the real Module 02 API across this module.
1. Conceptual Foundation¶
1.1 The One-Sentence Rule¶
Use closures/partials for pure configurators with immutable capture only; avoid globals, env vars, or mutable defaults to preserve pipeline determinism and equivalence.
1.2 Closures & Partial Application in One Precise Sentence¶
Closures capture immutable config to produce pure customized functions; partial application fixes arguments for reusable variants—ensuring deterministic, composable behavior from explicit data only.
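The two mechanisms can be compared side by side. The sketch below is a minimal illustration, not the FuncPipe API: `split_text` and `make_splitter` are hypothetical stand-ins for a pure chunking step and its configurator.

```python
from collections.abc import Callable
from functools import partial


def split_text(text: str, size: int) -> list[str]:
    """Hypothetical pure core: cut text into pieces of at most `size` chars."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def make_splitter(size: int) -> Callable[[str], list[str]]:
    """Closure: `size` is captured once and never reassigned."""
    def split(text: str) -> list[str]:
        return split_text(text, size)
    return split


# Partial application: fix the `size` argument of the same pure core.
split_closure = make_splitter(4)
split_partial = partial(split_text, size=4)

sample = "abcdefghij"
assert split_closure(sample) == split_partial(sample) == ["abcd", "efgh", "ij"]
```

Both variants are ordinary values built from explicit data, so either can be passed to another function or stored under a name.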
1.3 Why This Matters Now¶
Module 01 showed how to protect purity inside one function. This lesson shows how to preserve that discipline when the same behavior needs several named variants. Without configurators, teams often add a global setting, a mutable default, or an environment lookup. The code still looks small, but reasoning gets worse because behavior now depends on hidden history instead of just inputs.
1.4 Configurators as Values in 5 Lines¶
The key move is simple: build the configured function once, then pass that function around like data.
from functools import partial
from collections.abc import Callable

from funcpipe_rag import CleanDoc, ChunkWithoutEmbedding, RagEnv


def chunk_doc(doc: CleanDoc, env: RagEnv) -> list[ChunkWithoutEmbedding]:
    text = doc.abstract
    step = env.chunk_size
    return [
        ChunkWithoutEmbedding(
            doc.doc_id,
            text[i: i + step],
            i,
            i + len(text[i: i + step]),
        )
        for i in range(0, len(text), step)
    ]


def make_env(chunk_size: int) -> RagEnv:
    return RagEnv(chunk_size)


# Each variant is a configured function stored as a plain value.
variants: dict[str, Callable[[CleanDoc], list[ChunkWithoutEmbedding]]] = {
    "small": partial(chunk_doc, env=make_env(256)),
    "medium": partial(chunk_doc, env=make_env(512)),
    "large": partial(chunk_doc, env=make_env(1024)),
}


def run_variant(key: str, doc: CleanDoc) -> list[ChunkWithoutEmbedding]:
    return variants[key](doc)
Because the partial is pure (immutable env, no globals), we can safely store it in a dict, pass it around, and test it in isolation—just like data.
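One practical consequence is that a `functools.partial` object keeps its captured arguments inspectable through its `func`, `args`, and `keywords` attributes. The following sketch assumes a hypothetical frozen `Env` dataclass in place of `RagEnv`, and a simplified `chunk_text` in place of `chunk_doc`.

```python
from dataclasses import dataclass
from functools import partial


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for RagEnv."""
    chunk_size: int


def chunk_text(text: str, env: Env) -> list[str]:
    """Hypothetical stand-in for chunk_doc, operating on plain strings."""
    step = env.chunk_size
    return [text[i:i + step] for i in range(0, len(text), step)]


small = partial(chunk_text, env=Env(256))

# The captured configuration stays visible on the partial object.
assert small.func is chunk_text
assert small.args == ()
assert small.keywords == {"env": Env(256)}
```

A closure holds its captures in `__closure__` cells, which are less convenient to read in a review. When reviewers need to see what was fixed at construction time, a partial over a frozen config value is often the easier shape to inspect.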
2. Mental Model: Globals vs Closures/Partials¶
2.1 One Picture¶
Globals (Hidden Deps)             Closures/Partials (Explicit Capture)
+-----------------------+         +------------------------------+
| global ENV / CFG      |         | make_rag_fn(env, cfg)        |
|          ↓            |         |          ↓                   |
| rag(docs) → chunks    |         | rag_fn(docs) → chunks        |
| (hidden config)       |         | (fixed config)               |
+-----------------------+         +------------------------------+
↑ Flaky / Non-deterministic       ↑ Deterministic / Testable
2.2 Contract Table¶
| Aspect | Globals / Mutable Defaults | Closures / Partials |
|---|---|---|
| Dependencies | Hidden globals, env vars | Explicit, immutable capture |
| Determinism | Breaks (outputs vary) | Safe (same config = same) |
| Testing | Flaky (depends on history) | Local reasoning (properties) |
| Composability | Races / scattered config | Freedom from shared state |
| Mutable Defaults in Partials | Breaks Determinism | Use frozen dataclasses or immutable types for configs |
Note on the alternative: for a fixed script with no variants, a configurator is rarely needed; pass parameters explicitly. Prefer configurators once the configuration is reused or varied.
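The last table row can be made concrete with the standard library. This is a minimal sketch: `ChunkConfig` is a hypothetical config type, not part of funcpipe_rag, and it uses a tuple field because `frozen=True` blocks attribute rebinding but would not stop in-place mutation of a list.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ChunkConfig:
    """Hypothetical config type; not part of funcpipe_rag."""
    chunk_size: int = 512
    rule_names: tuple[str, ...] = ("strip", "lower")


base = ChunkConfig()

# Rebinding a field on a frozen dataclass raises FrozenInstanceError.
try:
    base.chunk_size = 1024  # type: ignore[misc]
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

# Variants are new values; the base config is untouched.
large = replace(base, chunk_size=1024)

assert mutation_blocked
assert (base.chunk_size, large.chunk_size) == (512, 1024)
assert large.rule_names == base.rule_names
```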
3. Running Project: FuncPipe RAG Builder¶
Our running project (from m02-rag.md) is extending the pure RAG pipeline from Module 1 to add configurability.
- Dataset: 10k arXiv CS abstracts (arxiv_cs_abstracts_10k.csv).
- Goal: Configurable cleaning, chunking, etc., into Chunk list (dedup added later).
- Start: Hypothetical pre-refactor with globals (core1_start.py, illustration only).
- End (this core): Pure configurator with explicit capture, preserving equivalence.
3.1 Types (End-of-Module-02, Used by Configurators)¶
In the end-of-Module-02 codebase, cleaner configuration is represented as frozen data:
from funcpipe_rag import CleanConfig, make_cleaner
cfg = CleanConfig(rule_names=("strip", "lower", "collapse_ws"))
cleaner = make_cleaner(cfg)
3.2 Impure Start (Anti-Pattern)¶
This is a hypothetical pre-refactor example used for contrast. It is intentionally not meant to match a real snapshot 1:1, and it is not intended to be run in the end-of-Module-02 checkout.
# core1_start.py (hypothetical pre-refactor; illustration only)
from funcpipe_rag import RawDoc, CleanDoc, Chunk, RagEnv
from funcpipe_rag import chunk_doc, embed_chunk, structural_dedup_chunks

GLOBAL_ENV = RagEnv(512)  # BAD: hidden dependency (breaks determinism contract)
MUTABLE_CFG = {"rules": [str.strip, str.lower]}  # BAD: shared mutable default (breaks determinism contract)


def impure_full_rag(docs: list[RawDoc], cfg: dict = MUTABLE_CFG) -> list[Chunk]:
    global GLOBAL_ENV

    def impure_cleaner(d: RawDoc) -> CleanDoc:
        abstract = d.abstract
        for r in cfg["rules"]:
            abstract = r(abstract)
        return CleanDoc(d.doc_id, d.title, abstract, d.categories)

    cleaned = [impure_cleaner(d) for d in docs]
    chunked = [c for doc in cleaned for c in chunk_doc(doc, GLOBAL_ENV)]
    embedded = [embed_chunk(c) for c in chunked]
    return structural_dedup_chunks(embedded)


# Usage: Non-deterministic due to globals
docs: list[RawDoc] = [RawDoc("cs-123", "Title", "Abstract text...", "cs.AI")]
chunks1 = impure_full_rag(docs)
chunks2 = impure_full_rag(docs)
# May differ if GLOBAL_ENV mutated externally
Smells: Globals (GLOBAL_ENV), mutable defaults (MUTABLE_CFG as shared dict).
Problem: impure_full_rag(docs) depends on hidden state; can't substitute without replaying globals.
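The mutable-default smell can be reproduced in a few lines without the capstone code. The sketch below uses hypothetical stand-ins (`impure_clean`, `pure_clean`) that work on plain strings; only the shape of the problem is taken from the example above.

```python
# Hypothetical stand-ins; none of these names come from funcpipe_rag.
MUTABLE_CFG = {"rules": [str.strip]}  # shared mutable default (anti-pattern)


def impure_clean(text: str, cfg: dict = MUTABLE_CFG) -> str:
    """Reads rules from a dict that any other code can mutate."""
    for rule in cfg["rules"]:
        text = rule(text)
    return text


def pure_clean(text: str, rules: tuple = (str.strip,)) -> str:
    """Tuple default: it cannot be appended to, so it cannot drift."""
    for rule in rules:
        text = rule(text)
    return text


before = impure_clean("  Hello ")
MUTABLE_CFG["rules"].append(str.lower)  # unrelated code mutates the shared dict
after = impure_clean("  Hello ")

assert before == "Hello"
assert after == "hello"  # same call, different result
assert pure_clean("  Hello ") == "Hello"
```

The two `impure_clean` calls have identical arguments, yet their results differ. That is why the call cannot be replaced by its value without also replaying every earlier mutation.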
4. Refactor to Pure: Explicit Capture¶
4.1 Pure Configurator¶
Pass all config explicitly; capture in closures/partials.
# Pure refactor: Explicit capture
from funcpipe_rag import CleanConfig, make_rag_fn
# `make_rag_fn` is the canonical configurator in this repo:
# it captures frozen config in a closure (no globals).
cfg = CleanConfig()
rag_fn = make_rag_fn(chunk_size=512, clean_cfg=cfg)
chunks1, obs1 = rag_fn(docs)
chunks2, obs2 = rag_fn(docs)
assert chunks1 == chunks2 and obs1 == obs2
Wins: No globals/mutables; explicit capture. Matches Module 1 when defaults used.
Note: Defaults are expressed as data (DEFAULT_RULES, DEFAULT_CLEAN_CONFIG) and captured immutably into RagConfig.
4.2 Before-and-After Refactoring Snippet¶
To cement the transition from globals to closures, here's an explicit mini-example showing the "ugly before" with a global and the "clean after" using a closure:
# Before: Ugly global config
from collections.abc import Callable

from funcpipe_rag import CleanDoc, ChunkWithoutEmbedding, RagEnv

GLOBAL_ENV = RagEnv(chunk_size=512)  # BAD: hidden dependency (breaks determinism contract)


def chunk_doc(doc: CleanDoc) -> list[ChunkWithoutEmbedding]:
    # Implicitly uses global
    text = doc.abstract
    step = GLOBAL_ENV.chunk_size
    chunks: list[ChunkWithoutEmbedding] = []
    for start in range(0, len(text), step):
        segment = text[start:start + step]
        chunks.append(ChunkWithoutEmbedding(doc.doc_id, segment, start, start + len(segment)))
    return chunks


# After: Pure closure with explicit config
def make_chunk_doc(env: RagEnv) -> Callable[[CleanDoc], list[ChunkWithoutEmbedding]]:
    def chunk_doc(doc: CleanDoc) -> list[ChunkWithoutEmbedding]:
        text = doc.abstract
        step = env.chunk_size  # captured at construction time, never reassigned
        chunks: list[ChunkWithoutEmbedding] = []
        for start in range(0, len(text), step):
            segment = text[start:start + step]
            chunks.append(ChunkWithoutEmbedding(doc.doc_id, segment, start, start + len(segment)))
        return chunks

    return chunk_doc


# Usage: Deterministic and testable
chunk_fn = make_chunk_doc(RagEnv(chunk_size=512))
This refactor eliminates hidden dependencies, making the function pure and easier to test—same inputs always yield the same outputs.
4.3 Pure Partial for Rules¶
# RulesConfig configurator (end-of-Module-02)
from funcpipe_rag import All, LenGt, RulesConfig, StartsWith, make_rag_fn
def keep_categories(prefix: str) -> RulesConfig:
    """Pure configurator: capture a prefix into an immutable RulesConfig."""
    return RulesConfig(keep_pred=StartsWith("categories", prefix))

cs_keep = keep_categories("cs.")
cs_long_keep = RulesConfig(keep_pred=All((StartsWith("categories", "cs."), LenGt("abstract", 500))))
# Usage: Variant with filtering (RulesConfig is the canonical keep type in Module 02)
rag_filtered = make_rag_fn(chunk_size=512, clean_cfg=cfg, keep=cs_keep)
filtered_chunks, _ = rag_filtered(docs)
Wins: Fixes prefix explicitly; composable. Enables configurable filtering without globals.
5. Equational Reasoning: Substitution Exercise¶
Hand Exercise: Replace expressions in make_rag_fn.
1. Inline env = RagEnv(512) → fixed value.
2. Substitute into partial → fixed call.
3. Result: rag_fn(docs) = fixed value for fixed inputs.
Bug Hunt: In impure version, substitution fails (depends on globals).
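The substitution steps can be checked mechanically on a small stand-in. This sketch assumes a hypothetical `chunk_text` core and `make_chunker` configurator rather than the real `make_rag_fn`; each step replaces an expression with its definition and the value stays the same.

```python
from collections.abc import Callable


def chunk_text(text: str, size: int) -> list[str]:
    """Hypothetical pure core, standing in for the real chunker."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def make_chunker(size: int) -> Callable[[str], list[str]]:
    """Hypothetical configurator: captures `size` in a closure."""
    return lambda text: chunk_text(text, size)


text = "functional"

# Step 1: the configured call.
lhs = make_chunker(4)(text)
# Step 2: inline the closure body with size = 4.
mid = chunk_text(text, 4)
# Step 3: inline the core with both arguments fixed.
rhs = [text[i:i + 4] for i in range(0, len(text), 4)]

assert lhs == mid == rhs == ["func", "tion", "al"]
```

With a global in place of the captured `size`, step 2 is no longer valid, because the value to inline depends on when the call happens.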
6. Property-Based Testing: Providing Strong Evidence of Equivalence (Advanced, Optional)¶
Use Hypothesis to provide strong evidence that the refactor preserved behavior. Verify all patterns with Hypothesis—examples provided show how to detect impurities like globals or non-determinism.
6.1 Custom Strategy (RAG Domain)¶
From capstone/tests/conftest.py (as in Module 1).
6.2 Equivalence Property¶
# capstone/tests/test_rag_api.py
from hypothesis import given
import hypothesis.strategies as st
from tests.conftest import doc_list_strategy
from funcpipe_rag import (
    RagEnv,
    clean_doc,
    embed_chunk,
    iter_chunk_doc,
    structural_dedup_chunks,
    CleanConfig,
    make_rag_fn,
)


def baseline_full_rag(docs, env):
    embedded = [embed_chunk(c) for d in docs for c in iter_chunk_doc(clean_doc(d), env)]
    return structural_dedup_chunks(embedded)


@given(docs=doc_list_strategy(), chunk_size=st.integers(128, 1024))
def test_configurator_parity(docs, chunk_size):
    rag_fn = make_rag_fn(chunk_size=chunk_size, clean_cfg=CleanConfig())
    new_chunks, _ = rag_fn(docs)
    old_chunks = baseline_full_rag(docs, RagEnv(chunk_size))
    assert new_chunks == old_chunks
Note: Property focuses on equivalence (same chunks); assumes no rules/taps.
6.3 Determinism Property¶
@given(docs=doc_list_strategy(), chunk_size=st.integers(128, 1024))
def test_configurator_deterministic(docs, chunk_size):
    rag_fn = make_rag_fn(chunk_size=chunk_size, clean_cfg=CleanConfig())
    assert rag_fn(docs) == rag_fn(docs)
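Where Hypothesis is not installed, a rough, dependency-free approximation of the parity and determinism properties can be written with a seeded `random.Random`. This is a weaker substitute, with no shrinking and no strategy composition, and `chunk_text` and `make_chunker` are hypothetical stand-ins for the real pipeline.

```python
import random
import string
from collections.abc import Callable


def chunk_text(text: str, size: int) -> list[str]:
    """Hypothetical baseline: the unconfigured pure core."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def make_chunker(size: int) -> Callable[[str], list[str]]:
    """Hypothetical configurator under test."""
    def run(text: str) -> list[str]:
        return chunk_text(text, size)
    return run


def check_properties(trials: int = 200, seed: int = 0) -> int:
    """Run seeded random trials; return how many were checked."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "
    for _ in range(trials):
        size = rng.randint(1, 64)
        text = "".join(rng.choices(alphabet, k=rng.randint(0, 300)))
        configured = make_chunker(size)
        assert configured(text) == chunk_text(text, size)  # parity with baseline
        assert configured(text) == configured(text)  # determinism
        assert "".join(configured(text)) == text  # chunks reassemble the input
    return trials


checked = check_properties()
```

Fixing the seed keeps the check itself deterministic, so a failure can be reproduced by rerunning with the same seed.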
6.4 Shrinking Demo: Catching a Bug¶
Bad refactor (uses global):
from collections.abc import Callable

from funcpipe_rag import (
    CleanConfig,
    RagEnv,
    clean_doc,
    embed_chunk,
    iter_chunk_doc,
    make_cleaner,
    structural_dedup_chunks,
)


def bad_make_rag_fn(
    chunk_size: int, clean_cfg: CleanConfig
) -> Callable[[list], list]:
    global GLOBAL_ENV
    GLOBAL_ENV = RagEnv(chunk_size)
    cleaner = make_cleaner(clean_cfg)

    def run(docs):
        # BAD: reads the global at call time instead of a captured value
        embedded = [embed_chunk(c) for d in docs for c in iter_chunk_doc(cleaner(d), GLOBAL_ENV)]
        return structural_dedup_chunks(embedded)

    return run
Property (swapped to bad_make_rag_fn):
@given(
    docs=doc_list_strategy(),
    chunk_size1=st.integers(128, 1024),
    chunk_size2=st.integers(128, 1024),
)
def test_bad_configurator_env_sensitive(docs, chunk_size1, chunk_size2):
    global GLOBAL_ENV
    rag_fn = bad_make_rag_fn(chunk_size2, CleanConfig())
    # Simulate a later external mutation. Assigning before construction would be
    # overwritten, because bad_make_rag_fn itself resets GLOBAL_ENV.
    GLOBAL_ENV = RagEnv(chunk_size1)
    pure = make_rag_fn(chunk_size=chunk_size2, clean_cfg=CleanConfig())
    pure_chunks, _ = pure(docs)
    # Fails when the sizes differ and some abstract is longer than the smaller size
    assert rag_fn(docs) == pure_chunks
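Illustrative shape of the Hypothesis failure (run to verify; the exact shrunk values depend on doc_list_strategy):
Falsifying example: test_bad_configurator_env_sensitive(
    docs=[RawDoc(doc_id='a', title='', abstract='<more than 128 characters after cleaning>', categories='')],
    chunk_size1=128,
    chunk_size2=129,
)
AssertionError
- Hypothesis shrinks toward a minimal document list and a minimal pair of differing sizes. The abstract cannot shrink to a single character, because an abstract no longer than the smaller chunk size produces the same single chunk under both sizes. The test catches the reliance on the global because it models an external state change: it deliberately reintroduces GLOBAL_ENV and compares against the pure make_rag_fn, which exposes the hidden dependency.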
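The same failure mode shows up without any test framework once two configured functions exist at the same time. In this minimal sketch, `_CURRENT_SIZE`, `bad_make_chunker`, and `make_chunker` are hypothetical names that mirror GLOBAL_ENV and the two configurators on plain strings.

```python
from collections.abc import Callable

_CURRENT_SIZE = 0  # hypothetical module-level state, mirroring GLOBAL_ENV


def bad_make_chunker(size: int) -> Callable[[str], list[str]]:
    """Anti-pattern: stores the config in a global shared by every variant."""
    global _CURRENT_SIZE
    _CURRENT_SIZE = size

    def run(text: str) -> list[str]:
        # Reads the global at call time, not the value passed at construction.
        step = _CURRENT_SIZE
        return [text[i:i + step] for i in range(0, len(text), step)]

    return run


def make_chunker(size: int) -> Callable[[str], list[str]]:
    """Pure configurator: each returned function owns its captured size."""
    def run(text: str) -> list[str]:
        return [text[i:i + size] for i in range(0, len(text), size)]

    return run


bad_small = bad_make_chunker(2)
bad_large = bad_make_chunker(4)  # silently reconfigures bad_small too
good_small = make_chunker(2)
good_large = make_chunker(4)

assert bad_small("abcdef") == ["abcd", "ef"]  # no longer size 2
assert good_small("abcdef") == ["ab", "cd", "ef"]  # capture is per function
assert good_large("abcdef") == ["abcd", "ef"]
```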
7. When Configurators Aren't Worth It¶
For one-off calls with fixed configuration (for example, a short script), passing parameters explicitly is enough. Reach for configurators when you need variants or reuse.
8. Pre-Core Quiz¶
- f = partial(g, x=1); f() == f()? → Yes, if pure.
- Substitute closure call? → Safe (fixed capture).
- Global in configurator? → Hidden dep → impure.
- Mutable default? → Breaks determinism.
- Cache configured fn? → Safe if pure.
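The caching answer can be sketched with `functools.lru_cache`. The example assumes a hypothetical frozen `ChunkConfig`; a frozen dataclass with the default `eq=True` gets a generated `__hash__`, so equal configs map to the same cache entry as long as their fields are hashable.

```python
from collections.abc import Callable
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class ChunkConfig:
    """Hypothetical frozen config; hashable because it is frozen with eq=True."""
    chunk_size: int


@lru_cache(maxsize=None)
def make_chunker(cfg: ChunkConfig) -> Callable[[str], list[str]]:
    """Hypothetical configurator; caching is safe because the result is pure."""
    def run(text: str) -> list[str]:
        step = cfg.chunk_size
        return [text[i:i + step] for i in range(0, len(text), step)]

    return run


first = make_chunker(ChunkConfig(3))
second = make_chunker(ChunkConfig(3))  # equal config, so this is a cache hit

assert first is second
assert make_chunker.cache_info().hits == 1
assert first("abcdefg") == ["abc", "def", "g"]
```

If the configurator read a global instead, the cached function could silently go stale, which is why the answer is conditional on purity.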
9. Post-Core Reflection & Exercise¶
Reflect: In your code, find one configurable function that relies on a global or a mutable default. Refactor it into a pure configurator and add a Hypothesis equivalence property.
Project Exercise: Apply to RAG; run properties on sample data.
Continue with: Expression-Oriented Python
Further Reading: For more on closures in Python, see 'Fluent Python' by Luciano Ramalho. Explore toolz for advanced partials once comfortable.