Python Data Model as Design Surface – Iteration, Containers, Context, Numeric¶
Concept Position¶
flowchart TD
family["Python Programming"] --> program["Python Object-Oriented Programming"]
program --> module["Module 01: Object Semantics and the Python Data Model"]
module --> concept["Python Data Model as Design Surface – Iteration, Containers, Context, Numeric"]
concept --> capstone["Capstone pressure point"]
flowchart TD
problem["Start with the design or failure question"] --> example["Study the worked example and trade-offs"]
example --> boundary["Name the boundary this page is trying to protect"]
boundary --> proof["Carry that question into code review or the capstone"]
Read the first diagram as a placement map: this page is one concept inside its parent module, not a detached essay, and the capstone is the pressure test for whether the idea holds. Read the second diagram as the working rhythm for the page: name the problem, study the example, identify the boundary, then carry one review question forward.
Introduction¶
This core surveys the Python data model as a deliberate design interface, focusing on intentional implementation of dunder methods for iteration (__iter__), container traits (__len__, __getitem__, __contains__), context management (__enter__, __exit__), and numeric operations (__add__ et al.). Extending encapsulation from M01C04 and equality protocols from M01C05, we delineate when to render types "container-like" for ecosystem interoperability versus maintaining opacity to preserve domain semantics. Selective adoption enhances usability without bloating interfaces, aligning objects with expectations while guarding invariants.
The layered structure persists: language-level semantics outline guarantees, CPython notes detail optimizations, design semantics guide modeling choices, and practical guidelines furnish prescriptive rules. This framework yields a portable model for protocol extension, resilient across implementations.
Cross-references link to prerequisites: attribute interception in M01C02; value/entity opacity in M01C01. Proficiency here enables objects that integrate gracefully, choosing protocols that amplify intent without overexposure.
1. Language-Level Model¶
Python's data model exposes protocols as optional hooks for operator and builtin overloads, enabling polymorphic behavior without inheritance. Implementation is explicit: undefined methods raise TypeError when invoked.
Iteration Protocol¶
Guarantees:
- __iter__(self) returns an iterator (object with __next__); invoked by for obj in iterable or iter(obj).
- __next__(self) yields next item or raises StopIteration.
- Fallback: If __iter__ missing, iter(obj) attempts the old sequence protocol: repeated __getitem__(i) from 0 until IndexError.
Container Protocols: Sequence and Mapping Traits¶
Guarantees:
- __len__(self) returns non-negative integer for len(obj).
- __getitem__(self, key) supports indexing/sllicing for obj[key]; raises IndexError/KeyError.
- __contains__(self, item) returns boolean for item in obj; falls back to __iter__ + __next__ for iterables.
- Sequence-like: __getitem__ with integer keys in a contiguous range starting at 0;
optional __setitem__, __delitem__, __reversed__.
- Mapping-like: __getitem__ with arbitrary keys; optional __setitem__, __delitem__.
- Implementing sequence/mapping traits does not give you element-wise == automatically:
if you want list-/dict-like equality semantics for your own type, you must define __eq__.
Context Management Protocol¶
Guarantees:
- __enter__(self) returns resource (often self) for with obj as alias:.
- __exit__(self, exc_type, exc_val, exc_tb) handles cleanup; returns True to suppress exceptions.
- Supports nesting; invoked explicitly via with.
Numeric Protocols¶
Guarantees:
- Binary operators (+, *) invoke __add__ etc.; unary (-) via __neg__.
- Return NotImplemented for delegation: Left operand first, then reflected (e.g., b.__radd__(a)). No algebraic assumptions enforced.
- Optional: __bool__ for truthiness (if obj:; defaults to bool(len(obj)) if __len__ defined); __index__ for integer coercion (e.g., slices).
These protocols integrate with builtins/operators; omission raises TypeError.
2. Implementation Notes (CPython, non-normative)¶
CPython dispatches protocols via C slots (e.g., tp_iter, mp_length).
- Iteration:
__iter__yieldsPyObject*; fallback sequence uses incremental__getitem__untilIndexError. - Containers:
__getitem__viasq_item/mp_subscript;__contains__iterates if no direct slot. - Context:
withusesPyContextManager;__exit__receives tuple(exc_type, exc_val, exc_tb). - Numeric/Truthiness: Operator slots dispatch;
__bool__prefers explicit, else__len__;__index__coerces toPyLong. - Performance Nuances: O(1) dispatch;
__getitem__enables fast slicing;__bool__short-circuits.
These optimize ecosystem hooks but expose no privacy—protocols probe state.
3. Design Semantics¶
Protocols extend the value/entity lens (M01C01): value-like objects adopt container/numeric traits for interoperability (e.g., iterable tuples); entity-like ones remain opaque to encapsulate lifecycle (e.g., no __len__ for sessions). Intentional choice avoids "protocol bloat," where over-implementation dilutes focus.
- Iteration/Container Choice: Implement
__iter__+__getitem__for sequence-like values (e.g., bounded history); use mapping traits (__getitem__with keys) for dict-like. If reusable,__iter__returns fresh iterator—returningselfdefines an iterator, not a collection. Omit if iteration ambiguous/expensive—prefer explicit methods (e.g.,.pages()). - When Not to Implement:
__len__if unbounded/meaningless (e.g., streams);__bool__if truthiness ambiguous (__len__default misleads for connections);__index__only for numeric coercion. - Context for Resources: Use for RAII entities (e.g.,
__enter__acquires lock); abstain for stateless—withimplies ownership. Anti-pattern: cosmeticwithwithout cleanup. - Numeric Restraint: Implement only well-defined ops (e.g.,
__add__for vectors); explicit with reflection for asymmetry. No avoidance for non-commutative—commit to semantics.
Do/Don't Examples (Monitoring Domain):
- Do: MetricWindow as sequence (__getitem__(i) for timestamp access, __iter__ yields metrics)—fluent for for m in window:.
- Don't: DBSession as iterable (__iter__ over queries)—hides I/O; use .query() instead.
- Do: ThresholdRule numeric (__lt__(other) for comparison)—domain-natural.
- Don't: Alert with __bool__ defaulting to __len__(violations)—empty alerts aren't "false."
Choosing Protocols: Query: Does fluency aid (container-like) or mislead (opaque)? For MetricSeries: sequence by time? Mapping by label? Explicit .values()?
Interaction with Equality: Container __eq__ explicit; iterable equality element-wise if implemented.
4. Practical Guidelines¶
- Intentional Adoption: Implement only needed (e.g.,
__iter__+__getitem__for sequences); test (e.g.,list(obj),obj[key]). - Container Discipline:
__len__+__getitem__for O(1);__contains__fast.__reversed__for bidirectional. - Context Safety:
__exit__all paths (KeyboardInterrupt); returnFalseunless suppressing. Nesting-safe. - Truthiness/Numeric Hygiene:
__bool__explicit over__len__;__index__for slice/range. Reflect numeric; document precedence. - Opacity Principle: Omit if misleading (e.g., no
__iter__for remote streams); explicit methods over protocols.
Impacts on Design and Interoperability:
- Design: Selective enhances fluency; over-adoption invites misuse (e.g., len(session) implies fixed).
- Interoperability: Traits integrate; opacity preserves abstraction.
Exercises for Mastery¶
- Implement
__iter__+__getitem__forMetricWindow(bounded sequence); testlist(window)and slicing. - Add mapping traits to
LabelSet(__getitem__by key,__contains__); verifylabelset['cpu']and'cpu' in labelset. - Design
DBSessioncontext manager (__enter__/__exit__acquire/release); override__bool__for "active".
This core leverages the data model for elegant extension. Next, M01C09 critiques OOP applicability.