Data Contracts¶
The package's data contracts are filesystem contracts.
Contracted Shapes¶
- each supported source owns a stable subtree under
data/<source>/ - raw and normalized outputs are separated so source fidelity and repository friendliness are both inspectable
- collection summaries and source-specific normalized outputs must remain reproducible from one repository state
Key Contract Modules¶
data_downloader/contracts.pydata_downloader/data_layout.pydata_downloader/pipeline/summary_writer.py
Migration Warning¶
Renaming source directories or normalized filenames is a high-friction change. It ripples into docs, report publishing, tests, and reviewer expectations.
Purpose¶
This page explains the package's stable contracts for tracked data outputs.