Capstone File Guide
Guide Fit
flowchart TD
family["Reproducible Research"] --> program["Deep Dive Snakemake"]
program --> pressure["A concrete learner or reviewer question"]
pressure --> guide["Capstone File Guide"]
guide --> next["Modules, capstone, and reference surfaces"]
flowchart TD
question["Name the exact question you need answered"] --> skim["Skim only the sections that match that pressure"]
skim --> crosscheck["Open the linked module, proof surface, or capstone route"]
crosscheck --> next_move["Leave with one next decision, page, or command"]
Read the first diagram as a timing map: this guide is for a named pressure, not for wandering the whole course-book. Read the second diagram as the guide loop: arrive with a concrete question, use only the matching sections, then leave with one smaller and more honest next move.
This page explains which capstone files matter first and what responsibility each one holds.
Use it when the repository feels understandable at a directory level but not yet at a file level.
Start With These Files
| File | Why it matters |
|---|---|
| Capstone Guide | defines the repository contract and the teaching route through the workflow |
| `capstone/Snakefile` | shows the orchestration entrypoint and how rule families are assembled |
| `capstone/workflow/rules/common.smk` | establishes shared functions, path logic, and workflow-wide conventions |
| `capstone/workflow/rules/preprocess.smk` | contains discovery and per-sample processing contracts |
| `capstone/workflow/rules/publish.smk` | defines the stable publish boundary and integrity evidence |
| Publish Review Guide | documents which files are public contracts and which remain internal |
| `capstone/Makefile` | exposes the learner-facing proof and verification commands |
| Capstone Walkthrough | explains the repository as a guided review surface rather than only a runnable workflow |
Directory Responsibilities
| Path | Responsibility |
|---|---|
| `capstone/data/raw/` | committed toy inputs that begin the discovery story |
| `capstone/data/reference/` and `capstone/data/panel/` | small reference assets used by the screening rules |
| `capstone/config/` | workflow configuration and schema validation surfaces |
| `capstone/workflow/rules/` | Snakemake rule families and orchestration boundaries |
| `capstone/workflow/modules/` | modular rule bundles used to keep repository growth legible |
| `capstone/workflow/scripts/` | helper scripts that belong beside workflow orchestration rather than the Python package |
| `capstone/src/capstone/` | reusable Python implementation for data-processing steps |
| `capstone/profiles/` | operating-context policy for local, CI, and SLURM execution |
| `capstone/tests/` | unit and workflow-level checks that defend repository truth |
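
When you land on an unfamiliar file, the table above can be applied mechanically: find the nearest listed ancestor directory and read its responsibility. The sketch below is one way to do that. It is illustrative only: `RESPONSIBILITIES` and `responsibility_for` are hypothetical names, not part of the capstone repository, and the mapping is copied by hand from the table.

```python
from pathlib import PurePosixPath
from typing import Optional

# Directory prefixes and responsibilities copied by hand from the table above.
RESPONSIBILITIES = {
    "capstone/data/raw": "committed toy inputs that begin the discovery story",
    "capstone/data/reference": "small reference assets used by the screening rules",
    "capstone/data/panel": "small reference assets used by the screening rules",
    "capstone/config": "workflow configuration and schema validation surfaces",
    "capstone/workflow/rules": "Snakemake rule families and orchestration boundaries",
    "capstone/workflow/modules": "modular rule bundles used to keep repository growth legible",
    "capstone/workflow/scripts": "helper scripts that belong beside workflow orchestration rather than the Python package",
    "capstone/src/capstone": "reusable Python implementation for data-processing steps",
    "capstone/profiles": "operating-context policy for local, CI, and SLURM execution",
    "capstone/tests": "unit and workflow-level checks that defend repository truth",
}


def responsibility_for(path: str) -> Optional[str]:
    """Return the responsibility of the nearest listed ancestor directory.

    Hypothetical helper: returns None for paths the table does not cover.
    """
    p = PurePosixPath(path)
    # Check the path itself, then each parent from most to least specific.
    for candidate in (p, *p.parents):
        hit = RESPONSIBILITIES.get(candidate.as_posix())
        if hit is not None:
            return hit
    return None


if __name__ == "__main__":
    print(responsibility_for("capstone/workflow/rules/publish.smk"))
    print(responsibility_for("capstone/Makefile"))  # not under a listed directory
```

A path such as `capstone/Makefile` returns `None` because it sits at the repository level rather than inside one of the listed directories; its role is described per file, not per directory.
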
Best Reading Order
- Capstone Guide
- `capstone/Snakefile`
- `capstone/workflow/rules/common.smk`
- `capstone/workflow/rules/preprocess.smk`
- `capstone/workflow/rules/summarize_report.smk`
- `capstone/workflow/rules/publish.smk`
- Publish Review Guide
- `capstone/Makefile`
- `capstone/profiles/`
- `capstone/tests/`
That order keeps the learner anchored in contract first, workflow meaning second, the publish boundary third, and operational proof last.
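
If you already have a handful of files open, the reading order above can be used as a sort key. The following is a minimal sketch, assuming only the repository paths named in the list; the two guide pages are omitted because their file paths are not given here. `READING_ORDER`, `reading_rank`, and `sort_for_reading` are hypothetical names, and `test_example.py` and `helpers.py` in the usage lines are made-up file names for illustration.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Repository paths from the reading order above, in the order they should be read.
# The guide pages are left out because their file paths are not stated.
READING_ORDER = [
    "capstone/Snakefile",
    "capstone/workflow/rules/common.smk",
    "capstone/workflow/rules/preprocess.smk",
    "capstone/workflow/rules/summarize_report.smk",
    "capstone/workflow/rules/publish.smk",
    "capstone/Makefile",
    "capstone/profiles",
    "capstone/tests",
]


def reading_rank(path: str) -> int:
    """Return the position of a path in the reading order.

    A file inside a listed directory inherits that directory's position.
    Paths the order does not mention get the largest rank and sort last.
    """
    p = PurePosixPath(path)
    candidates = {p.as_posix(), *(parent.as_posix() for parent in p.parents)}
    for rank, entry in enumerate(READING_ORDER):
        if entry in candidates:
            return rank
    return len(READING_ORDER)


def sort_for_reading(paths: Iterable[str]) -> List[str]:
    """Sort paths into the recommended reading order (stable for ties)."""
    return sorted(paths, key=reading_rank)


if __name__ == "__main__":
    open_files = [
        "capstone/tests/test_example.py",   # hypothetical file name
        "capstone/src/capstone/helpers.py",  # hypothetical file name
        "capstone/workflow/rules/publish.smk",
        "capstone/Snakefile",
    ]
    for path in sort_for_reading(open_files):
        print(path)
```

Paths that the reading order does not mention, such as files under `capstone/src/capstone/`, sort after everything listed, so the workflow contract is always read before helper code.
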
Common Wrong Reading Order
Avoid starting with:
- helper Python files before reading the workflow contract
- published artifacts before understanding how discovery and preprocessing work
- profile files before you know which workflow behavior must remain stable
- tests before you know what the repository is promising to downstream users
That route teaches fragments without the boundary story that makes them useful.