Reader Pattern

Reader should feel like a dependency-visibility tool, not a clever container. The real gain is simple: configuration and services stop hiding in globals and closures and become explicit in the compositional shape of the pipeline.

Start With the Hidden Dependency Problem

By this point you can chain fallible steps well, but you may still be smuggling models, tokenizers, or config through closure capture. Reader matters when that hidden context starts making tests and reviews harder.

If dependencies are captured invisibly, the call site no longer tells the truth about what the pipeline needs.
If swapping one config or service requires rebuilding a whole chain manually, the dependency story is too implicit.
If you cannot tell which parts of the environment a step actually reads, the abstraction is still hiding too much.
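
To make the problem concrete, here is a minimal sketch of a closure-captured dependency next to an explicit one. The names `Config`, `make_truncator`, and `truncate_explicit` are hypothetical illustrations and are not part of the capstone code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Config:  # hypothetical stand-in for the real pipeline config
    chunk_size: int


def make_truncator(cfg: Config) -> Callable[[str], str]:
    # cfg is captured here; the returned function's signature no longer mentions it.
    def truncate(text: str) -> str:
        return text[: cfg.chunk_size]

    return truncate


def truncate_explicit(text: str, cfg: Config) -> str:
    # Same logic, but the dependency is visible at every call site.
    return text[: cfg.chunk_size]


truncate = make_truncator(Config(chunk_size=4))

# The call site says nothing about chunk_size; changing it means rebuilding `truncate`.
hidden = truncate("abcdefgh")

# Here the dependency is visible and can be swapped per call.
explicit_small = truncate_explicit("abcdefgh", Config(chunk_size=4))
explicit_large = truncate_explicit("abcdefgh", Config(chunk_size=6))
```

Both versions compute the same thing. The difference is only in what a reviewer can see at the call site, which is the gap Reader is meant to close.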

Core question
How do you completely eliminate closure-captured variables and globals from monadic pipelines by making configuration an explicit, typed, injectable dependency — giving you pure, testable, refactor-safe code that scales from 3 lines to 300 without ever hiding a dependency again?

This lesson introduces Reader as the explicit-context version of patterns you have already seen:

keep the pipeline pure while still depending on shared configuration
expose the environment in the type and combinator structure
make swapping environments a call-site concern instead of a hidden construction trick

The earlier closure examples matter because Reader is easiest to understand as a disciplined replacement for patterns you already use.

Use this when you have tasted the power of .and_then chains but are still fighting hidden dependencies that break tests and refactors.

Outcome
1. You can write config-dependent pipelines as a pure Reader[Config, T].
2. You can swap entire configurations (dev/prod/debug) at the call site with .run(new_config).
3. You will have property-based tests showing that your Reader compositions satisfy the monad laws, so regrouping steps or extracting sub-pipelines does not change behaviour.

Why Reader Is the Final Piece – Three Patterns Compared

Pattern	Visibility	Testability	Reconfigurability	Refactor Safety	Verdict
Manual threading	Explicit	Good	Poor	Poor	Verbose, error-prone
Closure capture	Hidden	Bad	Bad	Bad	Works until it doesn't
Reader (this core)	Explicit	Strong	Strong	Strong	Best fit for this core

Reader is best understood here as “explicit, lawful closure capture” rather than as a mysterious new runtime mechanism.

A Quick Warrant Check

Reader is worth the extra structure when the same environment is shared across several steps and you want that dependency to stay visible at the call site. If one ordinary function argument is enough, pass the argument directly. The goal is clearer dependency surfaces, not container ceremony.

1. Laws & Invariants (machine-checked in CI)

Law	Formal Statement	Why it matters
Left Identity	`pure(x).and_then(f) == f(x)`	Safe to lift plain values
Right Identity	`r.and_then(pure) == r`	Safe to extract sub-pipelines
Associativity	`r.and_then(f).and_then(g) == r.and_then(lambda x: f(x).and_then(g))`	Grouping never changes meaning
Ask Identity	`ask().map(lambda c: c) == ask()`	Reading config is a no-op
Local Composition	`local(f, local(g, r)) == local(lambda c: g(f(c)), r)`	Local modifications compose predictably (the outer modifier runs first)

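All laws are checked with Hypothesis on generated inputs. Because a Reader wraps a function, `==` in the table means extensional equality: both sides return the same value from `.run(cfg)` for every config. A single counterexample breaks CI.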
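
As a quick sanity check that does not need Hypothesis, the laws can be spot-checked over a small grid of configs. This is only a sketch: `Cfg`, `same`, `bump`, and `double` are hypothetical helpers, and `Reader`, `pure`, `ask`, and `local` are copied in their lesson form so the block runs on its own.

```python
from dataclasses import dataclass, replace
from typing import Callable, Generic, TypeVar

C = TypeVar("C")
T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class Reader(Generic[C, T]):
    run: Callable[[C], T]

    def map(self, f: Callable[[T], U]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)))

    def and_then(self, f: Callable[[T], "Reader[C, U]"]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)).run(cfg))


def pure(x):
    return Reader(lambda _: x)


def ask():
    return Reader(lambda cfg: cfg)


def local(modify, r):
    return Reader(lambda cfg: r.run(modify(cfg)))


@dataclass(frozen=True)
class Cfg:  # hypothetical config used only for this check
    inc: int
    mul: int


def same(a, b, cfgs) -> bool:
    # Readers wrap functions, so "equal" means equal results on every config tried.
    return all(a.run(c) == b.run(c) for c in cfgs)


cfgs = [Cfg(inc=i, mul=m) for i in (-1, 0, 2) for m in (0, 1, 3)]
f = lambda n: Reader(lambda c: n + c.inc)
g = lambda n: Reader(lambda c: n * c.mul)
r = Reader(lambda c: c.inc * 10)

left_identity = same(pure(5).and_then(f), f(5), cfgs)
right_identity = same(r.and_then(pure), r, cfgs)
associativity = same(
    r.and_then(f).and_then(g), r.and_then(lambda x: f(x).and_then(g)), cfgs
)
ask_identity = same(ask().map(lambda c: c), ask(), cfgs)

bump = lambda c: replace(c, inc=c.inc + 1)
double = lambda c: replace(c, inc=c.inc * 2)
# The outer modifier (bump) is applied to the config first, then the inner one (double).
local_composition = same(
    local(bump, local(double, r)), local(lambda c: double(bump(c)), r), cfgs
)
# Composing in the opposite order is not equivalent for these modifiers.
swapped_order = same(
    local(bump, local(double, r)), local(lambda c: bump(double(c)), r), cfgs
)
```

With `local` defined as `r.run(modify(cfg))`, nesting applies the outer modifier before the inner one, which is why `swapped_order` comes out false while `local_composition` holds.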

2. Public API – Reader is a one-field dataclass (mypy --strict clean)

# capstone/src/funcpipe_rag/fp/effects/reader.py – end-of-Module-06 (mypy --strict clean target)

from __future__ import annotations
from dataclasses import dataclass
from typing import Generic, Callable, TypeVar

C = TypeVar("C")   # Config / Environment
T = TypeVar("T")
U = TypeVar("U")

@dataclass(frozen=True)
class Reader(Generic[C, T]):
    run: Callable[[C], T]

    def map(self, f: Callable[[T], U]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)))

    def and_then(self, f: Callable[[T], "Reader[C, U]"]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)).run(cfg))

# Core primitives
def pure(x: T) -> Reader[C, T]:
    return Reader(lambda _: x)

def ask() -> Reader[C, C]:
    return Reader(lambda cfg: cfg)

def asks(selector: Callable[[C], T]) -> Reader[C, T]:
    return Reader(lambda cfg: selector(cfg))

def local(modify: Callable[[C], C], r: Reader[C, T]) -> Reader[C, T]:
    return Reader(lambda cfg: r.run(modify(cfg)))

That's it. No more primitives needed.
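
A short usage sketch of the primitives may help before the larger examples. `Reader` and the helpers are repeated in their lesson form so the block is self-contained; the two-field `Config` and the `"dev-model"` value are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Generic, TypeVar

C = TypeVar("C")
T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class Reader(Generic[C, T]):
    run: Callable[[C], T]

    def map(self, f: Callable[[T], U]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)))

    def and_then(self, f: Callable[[T], "Reader[C, U]"]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)).run(cfg))


def pure(x):
    return Reader(lambda _: x)


def ask():
    return Reader(lambda cfg: cfg)


def asks(selector):
    return Reader(lambda cfg: selector(cfg))


def local(modify, r):
    return Reader(lambda cfg: r.run(modify(cfg)))


@dataclass(frozen=True)
class Config:  # trimmed-down, hypothetical version of the lesson's Config
    model_name: str
    chunk_size: int


# Read two fields from the same environment and combine them.
describe = asks(lambda cfg: cfg.model_name).and_then(
    lambda name: asks(lambda cfg: f"{name} @ {cfg.chunk_size}")
)

dev = Config(model_name="dev-model", chunk_size=128)

label = describe.run(dev)
# local runs the same Reader against a modified copy of the config.
overridden = local(lambda cfg: replace(cfg, chunk_size=64), describe).run(dev)
# The original config is untouched, so a later run sees the old value again.
label_again = describe.run(dev)

constant = pure(42).run(dev)   # ignores the config entirely
whole_cfg = ask().run(dev)     # returns the config itself
```

Nothing happens until `.run(dev)` is called: `describe` is just a description of which parts of the environment will be read.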

3. Canonical Style – The Way You Will Actually Write 99% of Reader Pipelines

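from dataclasses import dataclass, replace  # replace() copies the chunk with its new embedding

# Chunk, EmbeddedChunk, Embedding, Result, Ok, and ErrInfo are assumed to come from the earlier cores.

@dataclass(frozen=True)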
class Config:
    model_name: str
    chunk_size: int
    temperature: float = 0.0

def embed_chunk(chunk: Chunk) -> Reader[Config, Result[EmbeddedChunk, ErrInfo]]:
    def run(cfg: Config) -> Result[EmbeddedChunk, ErrInfo]:
        # NOTE: get_tokenizer / load_model are impure boundaries.
        # They will be pushed behind ports in Module 7.
        tokenizer = get_tokenizer(cfg.model_name)
        model     = load_model(cfg.model_name)

        tokens = tokenizer(chunk.text.content)[:cfg.chunk_size]
        vec    = model.encode(tokens, temperature=cfg.temperature)

        # Real failures (e.g. OOM, network) will be added later.
        # We use Result now so the type is stable when we do.
        return Ok(replace(chunk, embedding=Embedding(vec, cfg.model_name)))
    return Reader(run)

# Usage – swap entire behaviour with one line
dev_result  = embed_chunk(chunk).run(dev_config)
prod_result = embed_chunk(chunk).run(prod_config)
test_result = embed_chunk(chunk).run(mock_config)  # perfect for unit tests

This is the style you will use every day.
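Linear, no captured config, no global config, and testable by passing a different Config. The only impurity left is the get_tokenizer / load_model boundary flagged in the comments above.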
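
The same style can be exercised end to end with a stand-in step. In this sketch, `tokenize_chunk` and the whitespace tokenizer are hypothetical substitutes for the real model calls, and `Reader` is reduced to its single `run` field.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

C = TypeVar("C")
T = TypeVar("T")


@dataclass(frozen=True)
class Reader(Generic[C, T]):
    run: Callable[[C], T]


@dataclass(frozen=True)
class Config:
    model_name: str
    chunk_size: int


def tokenize_chunk(text: str) -> Reader[Config, List[str]]:
    def run(cfg: Config) -> List[str]:
        tokens = text.split()            # stand-in for a real tokenizer (assumption)
        return tokens[: cfg.chunk_size]  # the config is read only here, explicitly

    return Reader(run)


dev_config = Config(model_name="dev-model", chunk_size=2)
prod_config = Config(model_name="prod-model", chunk_size=4)

step = tokenize_chunk("a b c d e")

# One Reader, two environments: the difference is decided at the call site.
dev_tokens = step.run(dev_config)
prod_tokens = step.run(prod_config)
```

A unit test then needs no patching: it builds a small `Config`, calls `.run`, and asserts on the result.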

4. Composition When You Need It (optional, for reusable steps)

def get_tokenizer_r() -> Reader[Config, Tokenizer]:
    # Wrap the existing impure get_tokenizer(model_name) in a Reader
    return asks(lambda cfg: get_tokenizer(cfg.model_name))

def get_model() -> Reader[Config, Model]:
    return asks(lambda cfg: load_model(cfg.model_name))

def embed_chunk_composed(chunk: Chunk) -> Reader[Config, EmbeddedChunk]:
    return (
        pure(chunk.text.content)
        .and_then(lambda text: get_tokenizer_r().map(lambda tok: tok(text)))
        .and_then(lambda tokens: get_model().map(lambda model: model.encode(tokens)))
        .and_then(
            lambda vec: ask().map(
                lambda cfg: replace(chunk, embedding=Embedding(vec, cfg.model_name))
            )
        )
    )

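Both styles are valid. The def run(cfg): version is the daily driver; the composed version is for when the sub-steps are reusable enough to justify their own Reader helpers. Note that the composed sketch is simplified: as written it skips the chunk_size truncation and the temperature argument, and it returns a bare EmbeddedChunk rather than a Result.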
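
For a composed pipeline that can be run as-is, the following sketch chains two small reusable steps with `and_then`. The step names `tokens_r`, `label_r`, and `summarize` are hypothetical, and a whitespace split stands in for a real tokenizer.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

C = TypeVar("C")
T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class Reader(Generic[C, T]):
    run: Callable[[C], T]

    def and_then(self, f: Callable[[T], "Reader[C, U]"]) -> "Reader[C, U]":
        return Reader(lambda cfg: f(self.run(cfg)).run(cfg))


def asks(selector):
    return Reader(lambda cfg: selector(cfg))


@dataclass(frozen=True)
class Config:
    model_name: str
    chunk_size: int


def tokens_r(text: str) -> Reader[Config, List[str]]:
    # Reusable step: split and truncate according to the config.
    return asks(lambda cfg: text.split()[: cfg.chunk_size])


def label_r(tokens: List[str]) -> Reader[Config, str]:
    # Reusable step: tag the token count with the configured model name.
    return asks(lambda cfg: f"{cfg.model_name}:{len(tokens)}")


def summarize(text: str) -> Reader[Config, str]:
    # and_then passes the same config to both steps without naming it here.
    return tokens_r(text).and_then(label_r)


summary = summarize("a b c d e").run(Config(model_name="dev-model", chunk_size=3))
```

Each step declares `Reader[Config, ...]` in its return type, so the shared dependency stays visible even though `summarize` never mentions `cfg`.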

5. Before → After – The Same Pipeline

# BEFORE – closure soup (from earlier cores)
def embed_chunk(chunk: Chunk) -> Result[EmbeddedChunk, ErrInfo]:
    text = chunk.text.content[:config.chunk_size]      # config from where?
    tokens = tokenizer(text)                           # tokenizer from where?
    vec = model.encode(tokens, temperature=config.temperature)
    return Ok(replace(chunk, embedding=Embedding(vec, config.model_name)))

# AFTER – pure, explicit, testable
def embed_chunk(chunk: Chunk) -> Reader[Config, Result[EmbeddedChunk, ErrInfo]]:
    def run(cfg: Config) -> Result[EmbeddedChunk, ErrInfo]:
        tokenizer = get_tokenizer(cfg.model_name)
        model     = load_model(cfg.model_name)
        tokens    = tokenizer(chunk.text.content)[:cfg.chunk_size]
        vec       = model.encode(tokens, temperature=cfg.temperature)
        return Ok(replace(chunk, embedding=Embedding(vec, cfg.model_name)))
    return Reader(run)

No captured config. No global config. The dependency is in the type, and tests swap it by passing a different Config.

6. Property-Based Proofs (capstone/tests/test_reader_laws.py)

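from dataclasses import dataclass
from hypothesis import given, strategies as st

# Import path assumed from the file location shown in section 2.
from funcpipe_rag.fp.effects.reader import Reader, pure

@dataclass(frozen=True)
class LawConfig:  # minimal config used only by these tests
    inc: int
    mul: int

configs = st.builds(LawConfig, inc=st.integers(), mul=st.integers())

def readers():
    # Readers wrap functions, so they are compared by running them on a config.
    return st.integers().map(lambda n: Reader(lambda cfg: n + cfg.inc))

@given(x=st.integers(), cfg=configs)
def test_reader_left_identity(x, cfg):
    f = lambda n: Reader(lambda c: n + c.inc)
    assert pure(x).and_then(f).run(cfg) == f(x).run(cfg)

@given(r=readers(), cfg=configs)
def test_reader_associativity(r, cfg):
    f = lambda a: Reader(lambda c: a + c.inc)
    g = lambda b: Reader(lambda c: b * c.mul)
    lhs = r.and_then(f).and_then(g)
    rhs = r.and_then(lambda x: f(x).and_then(g))
    assert lhs.run(cfg) == rhs.run(cfg)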

7. Anti-Patterns & Immediate Fixes

Anti-Pattern	Symptom	Fix
Closure-captured config	Hidden dependencies, untestable	Use `def run(cfg):` + Reader
Global config	Impossible to mock/swap	Inject via `.run(cfg)`
Manual config threading	Signatures explode	Reader composes automatically

8. Pre-Core Quiz

Reader replaces…? → Closure-captured dependencies
You read config with…? → ask() or asks(selector)
You temporarily override config with…? → local
You run a Reader with…? → .run(config)
The golden rule? → Never capture config in a closure again

9. Post-Core Exercise

Take your largest closure-heavy pipeline and rewrite it using the def run(cfg): style inside a Reader.
Add a debug flag that enables extra validation — implement with local.
Write a test that runs the same pipeline with two different configs and asserts different behaviour.

Continue with: Explicit State Threading

You have now removed closure-captured configuration from your monadic pipelines. Configuration is a first-class, typed, injectable dependency, and your Reader compositions are checked against the monad laws by Hypothesis. The final core removes the last remaining effect: local mutable state.