Closures & Partials

Page Maps

graph LR
  family["Python Programming"]
  program["Python Functional Programming"]
  section["Data First Apis Expression Style"]
  page["Closures & Partials"]
  capstone["Capstone evidence"]

  family --> program --> section --> page
  page -.applies in.-> capstone

flowchart LR
  orient["Orient on the page map"] --> read["Read the main claim and examples"]
  read --> inspect["Inspect the related code, proof, or capstone surface"]
  inspect --> verify["Run or review the verification path"]
  verify --> apply["Apply the idea back to the module and capstone"]

This lesson matters when a team wants several variants of the same pipeline and the first instinct is to hide the choice in globals, environment reads, or mutable defaults. The diagram shows the safer move: build the variant once, keep the captured configuration visible, and review the resulting function as a normal value.

Start With the Design Smell

The hard part is usually not the syntax of partial. The hard part is knowing when a configurator is cleaner than "just set a variable somewhere and call the function later."

  • If changing chunk size means editing module-level state, the design is already harder to test.
  • If two pipeline variants share code but not configuration, the missing concept is usually a configurator.
  • If a reviewer cannot tell what behavior is fixed at construction time, the capture is too implicit.

Keep This Question In View

Core question:
How do closures and partial application create pure configurators that capture immutable config to produce reusable, deterministic variants of RAG pipelines without globals or mutable defaults?

This lesson introduces pure configurators in the practical sense you need:

  • treat configuration as explicit immutable data that can be captured once and inspected later
  • build variants by returning new callables instead of mutating shared state
  • keep the core deterministic so the configured function still behaves like a normal pure function

The running FuncPipe examples make one idea concrete: configuration should become a value that shapes behavior, not a hidden force that changes behavior from a distance.

Use this when you have pure pipelines that need configurable variants, such as different chunk sizes or keep rules, but globals or mutable defaults still feel tempting.

Outcome:
1. Spot globals or mutable defaults in config and explain why they break determinism.
2. Refactor a configurable impure function into a pure one using closures/partials.
3. Write a Hypothesis property that provides strong evidence of equivalence to Module 1, including a shrinking example.


Runnability Note (Module 01 Snapshot vs Module 02 End-State)

This core includes two kinds of snippets:

1) Runnable against the end-of-Module-02 codebase (this checkout)
These use the real APIs in capstone/src/funcpipe_rag/ (e.g., RagConfig, make_rag_fn, full_rag_api_docs, iter_rag_core).

2) Hypothetical pre-refactor snippets (illustration only)
These are intentionally “bad” or “in-between” states used to teach refactoring. They are not meant to match a real snapshot 1:1. They are labeled as Hypothetical pre-refactor and are refactored into the real Module 02 API across this module.

If you want a real, runnable Module 01 codebase, refresh the generated history route first:

  • make PROGRAM=python-programming/python-functional-programming history-refresh
  • Module 01 path: capstone/_history/worktrees/module-01/
  • Import path for Module 01: capstone/_history/worktrees/module-01/src/ (use PYTHONPATH when running examples there)

Module 01 uses the same import name (import funcpipe_rag), so run it from the Module 01 worktree (or set PYTHONPATH) to avoid mixing versions.

We refactor the hypothetical pre-refactor shapes into the real Module 02 API across this module.

1. Conceptual Foundation

1.1 The One-Sentence Rule

Use closures/partials for pure configurators with immutable capture only; avoid globals, env vars, or mutable defaults to preserve pipeline determinism and equivalence.

1.2 Closures & Partial Application in One Precise Sentence

Closures capture immutable config to produce pure customized functions; partial application fixes arguments for reusable variants—ensuring deterministic, composable behavior from explicit data only.
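
As a minimal sketch of the two mechanisms named above, the block below builds the same variant twice, once with a closure and once with functools.partial. ChunkConfig, split_text, and make_splitter are hypothetical stand-ins for illustration; they are not part of funcpipe_rag.

```python
from collections.abc import Callable
from dataclasses import dataclass
from functools import partial


@dataclass(frozen=True)
class ChunkConfig:
    """Hypothetical stand-in for an immutable config value."""
    chunk_size: int


def split_text(text: str, cfg: ChunkConfig) -> list[str]:
    """Pure core: the output depends only on the arguments."""
    step = cfg.chunk_size
    return [text[i:i + step] for i in range(0, len(text), step)]


def make_splitter(cfg: ChunkConfig) -> Callable[[str], list[str]]:
    """Closure: cfg is captured once, when the splitter is built."""
    def splitter(text: str) -> list[str]:
        return split_text(text, cfg)
    return splitter


via_closure = make_splitter(ChunkConfig(chunk_size=4))
via_partial = partial(split_text, cfg=ChunkConfig(chunk_size=4))

# Both variants fix the same config, so they agree on every input.
assert via_closure("abcdefghij") == via_partial("abcdefghij") == ["abcd", "efgh", "ij"]
```

Either form works here. A partial is shorter when the core already takes the config as an argument; a closure is handy when construction needs a little setup work first.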

1.3 Why This Matters Now

Module 01 showed how to protect purity inside one function. This lesson shows how to preserve that discipline when the same behavior needs several named variants. Without configurators, teams often add a global setting, a mutable default, or an environment lookup. The code still looks small, but reasoning gets worse because behavior now depends on hidden history instead of just inputs.

1.4 Configurators as Values in 5 Lines

The key move is simple: build the configured function once, then pass that function around like data.

from functools import partial
from collections.abc import Callable
from funcpipe_rag import CleanDoc, ChunkWithoutEmbedding, RagEnv


def chunk_doc(doc: CleanDoc, env: RagEnv) -> list[ChunkWithoutEmbedding]:
    text = doc.abstract
    step = env.chunk_size
    return [
        ChunkWithoutEmbedding(
            doc.doc_id,
            text[i: i + step],
            i,
            i + len(text[i: i + step]),
        )
        for i in range(0, len(text), step)
    ]


def make_env(chunk_size: int) -> RagEnv:
    return RagEnv(chunk_size)


variants: dict[str, Callable[[CleanDoc], list[ChunkWithoutEmbedding]]] = {
    "small": partial(chunk_doc, env=make_env(256)),
    "medium": partial(chunk_doc, env=make_env(512)),
    "large": partial(chunk_doc, env=make_env(1024)),
}


def run_variant(key: str, doc: CleanDoc) -> list[ChunkWithoutEmbedding]:
    return variants[key](doc)

Because the partial is pure (immutable env, no globals), we can safely store it in a dict, pass it around, and test it in isolation—just like data.
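
One practical consequence is that the captured configuration stays inspectable. The sketch below relies on the func and keywords attributes that functools.partial objects expose; Env and chunk_text are hypothetical stand-ins for RagEnv and chunk_doc so that the example runs without the capstone package.

```python
from dataclasses import dataclass
from functools import partial


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for RagEnv."""
    chunk_size: int


def chunk_text(text: str, env: Env) -> list[str]:
    step = env.chunk_size
    return [text[i:i + step] for i in range(0, len(text), step)]


variants = {
    "small": partial(chunk_text, env=Env(2)),
    "large": partial(chunk_text, env=Env(4)),
}

# A partial exposes what it fixed: the wrapped function and the keyword arguments.
small = variants["small"]
assert small.func is chunk_text
assert small.keywords == {"env": Env(2)}

# The captured config is data, so a review can list every variant's settings.
sizes = {name: fn.keywords["env"].chunk_size for name, fn in variants.items()}
assert sizes == {"small": 2, "large": 4}
assert variants["large"]("abcdef") == ["abcd", "ef"]
```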


2. Mental Model: Globals vs Closures/Partials

2.1 One Picture

Globals (Hidden Deps)                   Closures/Partials (Explicit Capture)
+-----------------------+               +------------------------------+
| global ENV / CFG      |               |   make_rag_fn(env, cfg)      |
|        ↓              |               |        ↓                     |
| rag(docs) → chunks    |               |   rag_fn(docs) → chunks      |
| (hidden config)       |               |   (fixed config)             |
+-----------------------+               +------------------------------+
   ↑ Flaky / Non-deterministic             ↑ Deterministic / Testable

2.2 Contract Table

| Aspect | Globals / Mutable Defaults | Closures / Partials |
|---|---|---|
| Dependencies | Hidden globals, env vars | Explicit, immutable capture |
| Determinism | Breaks (outputs vary) | Safe (same config = same) |
| Testing | Flaky (depends on history) | Local reasoning (properties) |
| Composability | Races / scattered config | Freedom from shared state |
| Mutable Defaults in Partials | Breaks Determinism | Use frozen dataclasses or immutable types for configs |

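Note on Globals Choice: Even in a fixed script with no variants, pass parameters explicitly rather than reading a global. A configurator is optional there, but it becomes the better choice as soon as the behavior needs variants or reuse.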
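
The "Mutable Defaults in Partials" row is easy to underestimate, so here is a small sketch of it. A partial that captures a mutable dict changes behavior when anyone mutates that dict; a frozen dataclass rejects the mutation. The names clean_with_dict, CleanSettings, and clean_with_settings are hypothetical and are not the repo's CleanConfig API.

```python
from dataclasses import FrozenInstanceError, dataclass
from functools import partial


def clean_with_dict(text: str, cfg: dict) -> str:
    return text.lower() if cfg["lower"] else text


shared_cfg = {"lower": True}          # mutable: anyone holding it can change it
clean_a = partial(clean_with_dict, cfg=shared_cfg)
before = clean_a("ABC")
shared_cfg["lower"] = False           # a distant mutation changes clean_a's behavior
after = clean_a("ABC")
assert (before, after) == ("abc", "ABC")


@dataclass(frozen=True)
class CleanSettings:
    """Hypothetical frozen config used only for this sketch."""
    lower: bool = True


def clean_with_settings(text: str, cfg: CleanSettings) -> str:
    return text.lower() if cfg.lower else text


settings = CleanSettings()
clean_b = partial(clean_with_settings, cfg=settings)

try:
    settings.lower = False            # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False

assert not mutated
assert clean_b("ABC") == "abc"
```

Freezing is shallow: a frozen dataclass that holds a list can still be changed through that list, so tuples are the safer field type for captured config.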


3. Running Project: FuncPipe RAG Builder

Our running project (from m02-rag.md) extends the pure RAG pipeline from Module 1 with configurability.
- Dataset: 10k arXiv CS abstracts (arxiv_cs_abstracts_10k.csv).
- Goal: Configurable cleaning, chunking, etc., into Chunk list (dedup added later).
- Start: Hypothetical pre-refactor with globals (core1_start.py, illustration only).
- End (this core): Pure configurator with explicit capture, preserving equivalence.

3.1 Types (End-of-Module-02, Used by Configurators)

In the end-of-Module-02 codebase, cleaner configuration is represented as frozen data:

from funcpipe_rag import CleanConfig, make_cleaner

cfg = CleanConfig(rule_names=("strip", "lower", "collapse_ws"))
cleaner = make_cleaner(cfg)
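
To make the "frozen data in, callable out" shape concrete, here is a self-contained sketch of how a cleaner configurator of this kind could be written. It is not the repo's implementation: RULES, CleanerConfig, and build_cleaner are hypothetical names, and only the rule names come from the snippet above.

```python
import re
from collections.abc import Callable
from dataclasses import dataclass

# Hypothetical rule registry: names map to pure str -> str functions.
RULES: dict[str, Callable[[str], str]] = {
    "strip": str.strip,
    "lower": str.lower,
    "collapse_ws": lambda s: re.sub(r"\s+", " ", s),
}


@dataclass(frozen=True)
class CleanerConfig:
    """Stand-in config: a tuple of rule names, so it is immutable and hashable."""
    rule_names: tuple[str, ...] = ("strip", "lower", "collapse_ws")


def build_cleaner(cfg: CleanerConfig) -> Callable[[str], str]:
    # Resolve names once at construction time; an unknown name fails here,
    # not in the middle of a pipeline run.
    rules = tuple(RULES[name] for name in cfg.rule_names)

    def clean(text: str) -> str:
        for rule in rules:
            text = rule(text)
        return text

    return clean


default_cleaner = build_cleaner(CleanerConfig())
strip_only = build_cleaner(CleanerConfig(rule_names=("strip",)))

assert default_cleaner("  Hello   WORLD ") == "hello world"
assert strip_only("  Hello   WORLD ") == "Hello   WORLD"
```

Storing rule names instead of function objects keeps the config easy to compare, print, and serialize, at the cost of one lookup when the cleaner is built.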

3.2 Impure Start (Anti-Pattern)

This is a hypothetical pre-refactor example used for contrast. It is intentionally not meant to match a real snapshot 1:1, and it is not intended to be run in the end-of-Module-02 checkout.

# core1_start.py (hypothetical pre-refactor; illustration only)
from funcpipe_rag import RawDoc, CleanDoc, Chunk, RagEnv
from funcpipe_rag import chunk_doc, embed_chunk, structural_dedup_chunks

GLOBAL_ENV = RagEnv(512)  # BAD: hidden dependency (breaks determinism contract)
MUTABLE_CFG = {"rules": [str.strip, str.lower]}  # BAD: shared mutable default (breaks determinism contract)


def impure_full_rag(docs: list[RawDoc], cfg: dict = MUTABLE_CFG) -> list[Chunk]:
    global GLOBAL_ENV

    def impure_cleaner(d: RawDoc) -> CleanDoc:
        abstract = d.abstract
        for r in cfg["rules"]:
            abstract = r(abstract)
        return CleanDoc(d.doc_id, d.title, abstract, d.categories)

    cleaned = [impure_cleaner(d) for d in docs]
    chunked = [c for doc in cleaned for c in chunk_doc(doc, GLOBAL_ENV)]
    embedded = [embed_chunk(c) for c in chunked]
    return structural_dedup_chunks(embedded)


# Usage: Non-deterministic due to globals
docs: list[RawDoc] = [RawDoc("cs-123", "Title", "Abstract text...", "cs.AI")]
chunks1 = impure_full_rag(docs)
chunks2 = impure_full_rag(docs)
# May differ if GLOBAL_ENV mutated externally

Smells: Globals (GLOBAL_ENV), mutable defaults (MUTABLE_CFG as shared dict).
Problem: impure_full_rag(docs) depends on hidden state; can't substitute without replaying globals.


4. Refactor to Pure: Explicit Capture

4.1 Pure Configurator

Pass all config explicitly; capture in closures/partials.

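# Pure refactor: Explicit capture
from funcpipe_rag import CleanConfig, RawDoc, make_rag_fn

# Sample input, defined here so the snippet does not depend on the anti-pattern file.
docs = [RawDoc("cs-123", "Title", "Abstract text...", "cs.AI")]

# `make_rag_fn` is the canonical configurator in this repo:
# it captures frozen config in a closure (no globals).
cfg = CleanConfig()
rag_fn = make_rag_fn(chunk_size=512, clean_cfg=cfg)

# Same configured function, same input: both the chunks and the observations match.
chunks1, obs1 = rag_fn(docs)
chunks2, obs2 = rag_fn(docs)
assert chunks1 == chunks2 and obs1 == obs2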

Wins: No globals/mutables; explicit capture. Matches Module 1 when defaults used.
Note: Defaults are expressed as data (DEFAULT_RULES, DEFAULT_CLEAN_CONFIG) and captured immutably into RagConfig.

4.2 Before-and-After Refactoring Snippet

To cement the transition from globals to closures, here's an explicit mini-example showing the "ugly before" with a global and the "clean after" using a closure:

# Before: Ugly global config
from collections.abc import Callable  # used by make_chunk_doc in the "After" half

from funcpipe_rag import CleanDoc, ChunkWithoutEmbedding, RagEnv

GLOBAL_ENV = RagEnv(chunk_size=512)  # BAD: hidden dependency (breaks determinism contract)


def chunk_doc(doc: CleanDoc) -> list[ChunkWithoutEmbedding]:
    # Implicitly uses global
    text = doc.abstract
    step = GLOBAL_ENV.chunk_size
    chunks: list[ChunkWithoutEmbedding] = []
    for start in range(0, len(text), step):
        segment = text[start:start + step]
        chunks.append(ChunkWithoutEmbedding(doc.doc_id, segment, start, start + len(segment)))
    return chunks


# After: Pure closure with explicit config
def make_chunk_doc(env: RagEnv) -> Callable[[CleanDoc], list[ChunkWithoutEmbedding]]:
    def chunk_doc(doc: CleanDoc) -> list[ChunkWithoutEmbedding]:
        text = doc.abstract
        step = env.chunk_size
        chunks: list[ChunkWithoutEmbedding] = []
        for start in range(0, len(text), step):
            segment = text[start:start + step]
            chunks.append(ChunkWithoutEmbedding(doc.doc_id, segment, start, start + len(segment)))
        return chunks

    return chunk_doc


# Usage: Deterministic and testable
chunk_fn = make_chunk_doc(RagEnv(chunk_size=512))

This refactor eliminates hidden dependencies, making the function pure and easier to test—same inputs always yield the same outputs.
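
The "easier to test" claim can be checked with a stand-in version that runs without the capstone package. In this sketch, Env and Piece are hypothetical substitutes for RagEnv and ChunkWithoutEmbedding, and make_chunker mirrors the shape of make_chunk_doc. Two variants live side by side, which a single global setting cannot offer.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for RagEnv."""
    chunk_size: int


@dataclass(frozen=True)
class Piece:
    """Hypothetical stand-in for ChunkWithoutEmbedding."""
    doc_id: str
    text: str
    start: int
    end: int


def make_chunker(env: Env) -> Callable[[str, str], list[Piece]]:
    def chunk(doc_id: str, text: str) -> list[Piece]:
        step = env.chunk_size
        return [
            Piece(doc_id, text[i:i + step], i, min(i + step, len(text)))
            for i in range(0, len(text), step)
        ]
    return chunk


small, large = make_chunker(Env(3)), make_chunker(Env(5))

# Two variants coexist in one process; neither can disturb the other.
assert [p.text for p in small("d1", "abcdefg")] == ["abc", "def", "g"]
assert [p.text for p in large("d1", "abcdefg")] == ["abcde", "fg"]

# Offsets tile the text with no gaps or overlaps.
pieces = small("d1", "abcdefg")
assert [(p.start, p.end) for p in pieces] == [(0, 3), (3, 6), (6, 7)]
```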

4.3 Pure Partial for Rules

# RulesConfig configurator (end-of-Module-02)
from funcpipe_rag import All, LenGt, RulesConfig, StartsWith, make_rag_fn


def keep_categories(prefix: str) -> RulesConfig:
    """Pure configurator: capture a prefix into an immutable RulesConfig."""
    return RulesConfig(keep_pred=StartsWith("categories", prefix))


cs_keep = keep_categories("cs.")
cs_long_keep = RulesConfig(keep_pred=All((StartsWith("categories", "cs."), LenGt("abstract", 500))))

# Usage: Variant with filtering (RulesConfig is the canonical keep type in Module 02)
# `cfg` and `docs` are the values defined in section 4.1.
rag_filtered = make_rag_fn(chunk_size=512, clean_cfg=cfg, keep=cs_keep)
filtered_chunks, _ = rag_filtered(docs)

Wins: Fixes prefix explicitly; composable. Enables configurable filtering without globals.
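
For readers who want to see why predicates expressed as data compose so easily, here is a self-contained sketch in the same spirit. FieldStartsWith, FieldLenGt, and AllOf are hypothetical stand-ins written for this example; they are not the repo's StartsWith, LenGt, or All, and records are plain dicts to keep it short.

```python
from dataclasses import dataclass
from typing import Protocol


class Pred(Protocol):
    def __call__(self, record: dict[str, str]) -> bool: ...


@dataclass(frozen=True)
class FieldStartsWith:
    key: str
    prefix: str

    def __call__(self, record: dict[str, str]) -> bool:
        return record[self.key].startswith(self.prefix)


@dataclass(frozen=True)
class FieldLenGt:
    key: str
    n: int

    def __call__(self, record: dict[str, str]) -> bool:
        return len(record[self.key]) > self.n


@dataclass(frozen=True)
class AllOf:
    preds: tuple[Pred, ...]

    def __call__(self, record: dict[str, str]) -> bool:
        return all(p(record) for p in self.preds)


cs_long = AllOf((FieldStartsWith("categories", "cs."), FieldLenGt("abstract", 5)))

records = [
    {"categories": "cs.AI", "abstract": "long enough"},
    {"categories": "cs.AI", "abstract": "tiny"},
    {"categories": "math.CO", "abstract": "long enough"},
]
kept = [r for r in records if cs_long(r)]
assert kept == [records[0]]

# Predicates are values: equal configuration means equal predicate.
assert FieldStartsWith("categories", "cs.") == FieldStartsWith("categories", "cs.")
```

Because each predicate is a frozen value, a rule set can be compared, printed in a review, or used as a dictionary key, none of which works reliably for an anonymous lambda.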


5. Equational Reasoning: Substitution Exercise

Hand Exercise: Replace expressions in make_rag_fn.
1. Inline env = RagEnv(512) → fixed value.
2. Substitute into partial → fixed call.
3. Result: rag_fn(docs) = fixed value for fixed inputs.
Bug Hunt: In impure version, substitution fails (depends on globals).
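
The hand exercise can be written out on a toy configurator. Each step below replaces an expression with something equal to it, and the assertion confirms that no step changed the value. Env, chunk, and make_fn are hypothetical stand-ins, not the repo's make_rag_fn.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for RagEnv."""
    chunk_size: int


def chunk(text: str, env: Env) -> list[str]:
    step = env.chunk_size
    return [text[i:i + step] for i in range(0, len(text), step)]


def make_fn(env: Env) -> Callable[[str], list[str]]:
    return lambda text: chunk(text, env)


text = "abcdef"

step0 = make_fn(Env(4))(text)                              # original expression
step1 = chunk(text, Env(4))                                # inline the closure body
step2 = [text[i:i + 4] for i in range(0, len(text), 4)]    # inline env.chunk_size
step3 = ["abcd", "ef"]                                     # evaluate

# Every substitution preserved the value, which is what purity guarantees.
assert step0 == step1 == step2 == step3
```

With a global in place of env, the move from step0 to step1 would not be justified, because the value of the global at call time is not visible in the expression.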


6. Property-Based Testing: Providing Strong Evidence of Equivalence (Advanced, Optional)

Use Hypothesis to provide strong evidence that the refactor preserved behavior. The examples in this section show how properties expose impurities such as globals or non-determinism.

6.1 Custom Strategy (RAG Domain)

The doc_list_strategy used below comes from capstone/tests/conftest.py (as in Module 1).

6.2 Equivalence Property

# capstone/tests/test_rag_api.py
from hypothesis import given
import hypothesis.strategies as st
from tests.conftest import doc_list_strategy
from funcpipe_rag import (
    RagEnv,
    clean_doc,
    embed_chunk,
    iter_chunk_doc,
    structural_dedup_chunks,
    CleanConfig,
    make_rag_fn,
)

def baseline_full_rag(docs, env):
    embedded = [embed_chunk(c) for d in docs for c in iter_chunk_doc(clean_doc(d), env)]
    return structural_dedup_chunks(embedded)


@given(docs=doc_list_strategy(), chunk_size=st.integers(128, 1024))
def test_configurator_parity(docs, chunk_size):
    rag_fn = make_rag_fn(chunk_size=chunk_size, clean_cfg=CleanConfig())
    new_chunks, _ = rag_fn(docs)
    old_chunks = baseline_full_rag(docs, RagEnv(chunk_size))
    assert new_chunks == old_chunks

Note: Property focuses on equivalence (same chunks); assumes no rules/taps.

6.3 Determinism Property

@given(docs=doc_list_strategy(), chunk_size=st.integers(128, 1024))
def test_configurator_deterministic(docs, chunk_size):
    rag_fn = make_rag_fn(chunk_size=chunk_size, clean_cfg=CleanConfig())
    assert rag_fn(docs) == rag_fn(docs)

6.4 Shrinking Demo: Catching a Bug

Bad refactor (uses global):

from collections.abc import Callable
from funcpipe_rag import (
    CleanConfig,
    RagEnv,
    clean_doc,
    embed_chunk,
    iter_chunk_doc,
    make_cleaner,  # needed by bad_make_rag_fn below
    structural_dedup_chunks,
)

def bad_make_rag_fn(
    chunk_size: int, clean_cfg: CleanConfig
) -> Callable[[list], list]:
    global GLOBAL_ENV
    GLOBAL_ENV = RagEnv(chunk_size)
    cleaner = make_cleaner(clean_cfg)
    def run(docs):
        embedded = [embed_chunk(c) for d in docs for c in iter_chunk_doc(cleaner(d), GLOBAL_ENV)]
        return structural_dedup_chunks(embedded)
    return run

Property (swapped to bad_make_rag_fn):

@given(
    docs=doc_list_strategy(),
    chunk_size1=st.integers(128, 1024),
    chunk_size2=st.integers(128, 1024),
)
def test_bad_configurator_env_sensitive(docs, chunk_size1, chunk_size2):
    # Assumes bad_make_rag_fn is defined in this test module, so both share GLOBAL_ENV.
    global GLOBAL_ENV
    rag_fn = bad_make_rag_fn(chunk_size2, CleanConfig())
    # Simulate an external mutation AFTER construction. Setting the global before
    # construction would be overwritten by bad_make_rag_fn and hide the bug.
    GLOBAL_ENV = RagEnv(chunk_size1)
    pure = make_rag_fn(chunk_size=chunk_size2, clean_cfg=CleanConfig())
    pure_chunks, _ = pure(docs)
    assert rag_fn(docs) == pure_chunks  # Can fail when chunk_size1 != chunk_size2 (global sensitivity)

Hypothesis failure trace (illustrative shape only; run the test to see the exact shrunk values):

Falsifying example: test_bad_configurator_env_sensitive(
    docs=[RawDoc(doc_id='a', title='', abstract=<longer than the smaller chunk size>, categories='')],
    chunk_size1=128,
    chunk_size2=129,
)
AssertionError
  • Shrinking reduces the example to a single small doc, but the abstract cannot shrink to one character: the two chunkings only differ once the cleaned abstract is longer than the smaller chunk size. The test catches reliance on the global because it models an external state change after construction. bad_make_rag_fn deliberately reintroduces GLOBAL_ENV, and comparing against the pure make_rag_fn exposes the hidden dependency.
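
The underlying failure mode can be reproduced without Hypothesis or the capstone package. In this sketch, the bad configurator writes a module-level setting at construction and reads it again at call time, so building a second variant silently changes the first. CHUNK_SIZE, split, bad_make_splitter, and good_make_splitter are hypothetical names used only here.

```python
from collections.abc import Callable

CHUNK_SIZE = 4  # hypothetical module-level setting


def split(text: str, size: int) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]


def bad_make_splitter(size: int) -> Callable[[str], list[str]]:
    global CHUNK_SIZE
    CHUNK_SIZE = size                              # writes shared state at construction
    return lambda text: split(text, CHUNK_SIZE)    # and reads it again at call time


def good_make_splitter(size: int) -> Callable[[str], list[str]]:
    return lambda text: split(text, size)          # size lives in the closure


bad_two = bad_make_splitter(2)
good_two = good_make_splitter(2)
assert bad_two("abcdef") == good_two("abcdef")     # they agree right after construction

bad_make_splitter(3)                               # building another variant rewrites the global
assert good_two("abcdef") == ["ab", "cd", "ef"]    # the closure is unaffected
assert bad_two("abcdef") == ["abc", "def"]         # the earlier variant silently changed
```

A property that only calls the configurator and then the configured function in one breath will not see this; the state change has to happen between construction and call.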

7. When Configurators Aren't Worth It

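For a one-off call with fixed config (for example, a short script), a configurator adds little: pass the parameters explicitly and call the function directly. Reach for a configurator once the same behavior needs named variants or reuse.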


8. Pre-Core Quiz

  1. f = partial(g, x=1); does f() == f() hold? → Yes, if g is pure.
  2. Can a closure call be substituted by its result? → Yes; the capture is fixed.
  3. Global read inside a configurator? → Hidden dependency, so impure.
  4. Mutable default argument? → Breaks determinism.
  5. Can a configured function be cached? → Yes, if it is pure.
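
Question 5 can be tried directly. The sketch below caches the configurator itself with functools.lru_cache, which works because a frozen dataclass is hashable and equal configs hash the same. Env and make_chunker are hypothetical stand-ins.

```python
from collections.abc import Callable
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for RagEnv; frozen, so it is hashable."""
    chunk_size: int


@lru_cache(maxsize=None)
def make_chunker(env: Env) -> Callable[[str], tuple[str, ...]]:
    def chunk(text: str) -> tuple[str, ...]:
        step = env.chunk_size
        return tuple(text[i:i + step] for i in range(0, len(text), step))
    return chunk


a = make_chunker(Env(4))
b = make_chunker(Env(4))

assert a is b                           # equal configs reuse the cached callable
assert make_chunker(Env(8)) is not a    # a different config builds a new one
assert a("abcdef") == a("abcdef") == ("abcd", "ef")
```

The cache is only safe because the configured function is pure; with a global or a mutable capture, a cached callable could return results for a configuration that no longer matches its key.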

9. Post-Core Reflection & Exercise

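Reflect: In your own code, find one function whose behavior is configured through a global or a mutable default. Refactor it into a pure configurator, then add a Hypothesis property that checks equivalence with the original.
Project Exercise: Apply the same refactor to the RAG pipeline and run the properties on sample data.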

Continue with: Expression-Oriented Python

Further Reading: For more on closures in Python, see 'Fluent Python' by Luciano Ramalho. Explore toolz for advanced partials once comfortable.