Data System Overview

The data system in bijux-pollenomics is designed to keep different kinds of evidence visible instead of merging everything into one vague export. You should be able to tell whether you are looking at pollen context, archaeological context, boundary framing, fieldwork documentation, public review surfaces, or ancient DNA sample evidence.

The governing stance is pollenomics-first: the repository is built to explain the broader pollen and environmental evidence system clearly, while keeping animal ancient-DNA recovery visible as one important but still partial slice of that larger product.

The Basic Shape

flowchart TB
    sources["source datasets and papers"]
    tracking["tracked source intake"]
    normalization["normalized evidence files"]
    publication["reports and atlas views"]

    sources --> tracking
    tracking --> normalization
    normalization --> publication

That structure matters because the repository has to support two kinds of use. Sometimes you want the public answer first. Sometimes you want to inspect the governing evidence directly. The system needs to support both without forcing you to decode an internal file tree before understanding what the repository is doing.
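
One way to read the diagram is as a strict ordering of stages, where each published view depends on everything upstream of it. The sketch below is a minimal Python illustration, not repository code: the names `Stage` and `upstream_of` are assumptions introduced here, and only the four stage labels come from the diagram.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical encoding of the four diagram stages, in flow order."""

    SOURCES = 1        # source datasets and papers
    TRACKING = 2       # tracked source intake
    NORMALIZATION = 3  # normalized evidence files
    PUBLICATION = 4    # reports and atlas views


def upstream_of(stage: Stage) -> list:
    """Return every stage that feeds into `stage`, earliest first."""
    return [s for s in Stage if s < stage]


if __name__ == "__main__":
    for s in upstream_of(Stage.PUBLICATION):
        print(s.name)
```

Under this reading, `upstream_of(Stage.PUBLICATION)` lists the three stages you would inspect when you want to check a report or atlas view against its governing evidence.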

What This Overview Should Clarify

  • which evidence families exist and why they are kept separate
  • what kind of question each family can answer
  • which surfaces are evidence, which are framing, and which are publication
  • why a map, report, or country bundle should never outrank its narrower governing files

Main Data Families

| Family | Role in the repository | Main location | Current publication posture |
| --- | --- | --- | --- |
| Pollen context | environmental and paleoecological context | data/landclim/, data/neotoma/ | first-class pollenomics context |
| Archaeology context | broader settlement and environmental archaeology layers | data/sead/, data/raa/ | contextual support layers |
| Boundary framing | country filtering and regional map framing | data/boundaries/ | framing layer, not scientific evidence |
| Animal ancient DNA | sample-backed contextual evidence from papers and supplements | data/adna/ | partial recovery program |
| Fieldwork | direct visit and observation records | docs/public/fieldwork/ | narrow but explicit record surface |
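
The table can also be read as a lookup from a file path to its evidence family. The following Python sketch is illustrative only: `FAMILY_LOCATIONS` and `family_for` are hypothetical names, the directory prefixes are copied from the table, and the example file names are made up for the demonstration.

```python
from pathlib import PurePosixPath
from typing import Optional

# Directory prefixes copied from the table; the mapping itself is a
# hypothetical illustration, not a structure defined by the repository.
FAMILY_LOCATIONS = {
    "Pollen context": ("data/landclim", "data/neotoma"),
    "Archaeology context": ("data/sead", "data/raa"),
    "Boundary framing": ("data/boundaries",),
    "Animal ancient DNA": ("data/adna",),
    "Fieldwork": ("docs/public/fieldwork",),
}


def family_for(path: str) -> Optional[str]:
    """Return the family whose location contains `path`, or None."""
    parts = PurePosixPath(path).parts
    for family, roots in FAMILY_LOCATIONS.items():
        for root in roots:
            root_parts = PurePosixPath(root).parts
            # Compare whole path components so "data/adna_extra" does not
            # match the "data/adna" prefix by accident.
            if parts[: len(root_parts)] == root_parts:
                return family
    return None


if __name__ == "__main__":
    # Example file names are invented for illustration.
    print(family_for("data/neotoma/example.csv"))
    print(family_for("docs/report/example.html"))
```

A path that falls outside every listed location returns `None`, which matches the idea that publication surfaces such as docs/report/ are not themselves an evidence family.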

What Each Family Contributes

  • pollen context helps explain environmental setting, vegetation history, and broader landscape change
  • archaeology context helps explain settlement and material activity around the same geographies
  • boundary layers make filtering and regional framing readable, but they are not themselves scientific evidence
  • human ancient DNA gives release-based historical population context
  • animal ancient DNA provides sample-level domestication and movement clues when source recovery is strong enough
  • fieldwork gives a narrow, explicit ground-level record rather than a claim of regional completeness

Main Repository Surfaces

  • data/ keeps repository-owned source material, normalized records, and review artifacts.
  • docs/report/ keeps the generated country bundles, atlas assets, and public review surfaces.
  • docs/public/pollenomics-data/ explains how those tracked files fit together.
  • data/source_family_contracts.json and data/source_family_evidence_stage_matrix.json keep the stage model explicit instead of forcing you to infer it from directory names.
  • data/source_fact_ownership_registry.json names the governing surface for recurring concepts such as project inventory, sample identity, and atlas candidates.
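
If you want to inspect the three governing JSON files directly, a small script can report their top-level shape before you read them in full. This is a hedged sketch: the internal schema of these files is not described on this page, so the code assumes nothing beyond valid JSON, and `summarize` and `summarize_all` are hypothetical helper names.

```python
import json
from pathlib import Path

# File paths as named in the list above.
GOVERNING_FILES = (
    "data/source_family_contracts.json",
    "data/source_family_evidence_stage_matrix.json",
    "data/source_fact_ownership_registry.json",
)


def summarize(path) -> str:
    """Describe the top-level shape of a JSON file without assuming a schema."""
    with Path(path).open(encoding="utf-8") as fh:
        doc = json.load(fh)
    if isinstance(doc, dict):
        return "object with keys: " + ", ".join(sorted(doc))
    if isinstance(doc, list):
        return f"array with {len(doc)} items"
    return f"scalar of type {type(doc).__name__}"


def summarize_all(repo_root=".") -> dict:
    """Summarize each governing file, marking any that are absent."""
    result = {}
    for rel in GOVERNING_FILES:
        candidate = Path(repo_root) / rel
        result[rel] = summarize(candidate) if candidate.is_file() else "missing"
    return result


if __name__ == "__main__":
    for name, summary in summarize_all().items():
        print(f"{name}: {summary}")
```

Run from a checkout of the repository, this prints one line per file, which is enough to confirm the files exist and to see which top-level keys to read next.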

Why The Separation Matters

  • a source page tells you what entered the repository and why
  • an evidence page tells you what claim is currently being governed
  • a review surface tells you what is blocked, thin, or refused
  • a publication surface tells you what the repository is prepared to show publicly

If those roles blur together, the site becomes easy to browse but hard to trust.

Where To Go Next