Flagship Challenge Corpus Catalog

bijux-proteomics-core now publishes one product-owned challenge root alongside the public benchmark packages:

packages/bijux-proteomics-core/benchmark-assets/flagship-challenge-corpora/

The point of this catalog is to keep the punishing evidence visible, not merely to celebrate that benchmark packages exist. Every challenge root records what was frozen, what was perturbed, what measured deltas followed, and whether the workflow, comparator, or review posture survived, weakened, or collapsed.

Shared Machine-Readable Surface

Every challenge root is governed through:

  • challenge_registry.json
  • one challenge_manifest.json
  • one revealed report: blinded_holdout_report.json or perturbation_report.json
  • one local README.md

The registry is the first machine-readable proof check:

packages/bijux-proteomics-core/benchmark-assets/flagship-challenge-corpora/challenge_registry.json
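
The file list above lends itself to a quick structural check. The following is only a minimal sketch, not the project's own validator: it assumes that each challenge root is a direct subdirectory of the corpus directory and that "one revealed report" means exactly one of the two report files is present. The function names check_challenge_root and check_corpus are hypothetical, and the registry is only tested for being parseable JSON because its schema is not described here.

```python
import json
from pathlib import Path

# The two revealed-report file names named in this catalog.
REPORT_NAMES = ("blinded_holdout_report.json", "perturbation_report.json")


def check_challenge_root(root: Path) -> list[str]:
    """Return the problems found in one challenge root (hypothetical helper)."""
    problems = []
    for required in ("challenge_manifest.json", "README.md"):
        if not (root / required).is_file():
            problems.append(f"missing {required}")
    # Assumption: a root carries exactly one of the two revealed reports.
    reports = [name for name in REPORT_NAMES if (root / name).is_file()]
    if len(reports) != 1:
        problems.append(f"expected one revealed report, found {len(reports)}")
    return problems


def check_corpus(corpus: Path) -> dict[str, list[str]]:
    """Map each failing entry to its problems; an empty dict means all clear."""
    problems = {}
    registry = corpus / "challenge_registry.json"
    try:
        # Only checks that the registry parses; no schema is assumed.
        json.loads(registry.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        problems[registry.name] = [f"unreadable: {exc}"]
    # Assumption: every subdirectory of the corpus is a challenge root.
    for child in sorted(p for p in corpus.iterdir() if p.is_dir()):
        found = check_challenge_root(child)
        if found:
            problems[child.name] = found
    return problems
```

Calling check_corpus(Path("packages/bijux-proteomics-core/benchmark-assets/flagship-challenge-corpora")) from the repository root would then return an empty dict if the assumed layout holds.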

Blinded Holdouts

The current blinded holdout roots are:

  • dda_blinded_holdout
  • dia_blinded_holdout
  • lfq_blinded_holdout
  • ptm_blinded_holdout

These roots freeze the main reviewer-facing surfaces first and reveal the hidden family-transfer truth only after the workflow, comparator, and recommendation posture is already fixed.

Current holdout limit:

  • multiplex and targeted do not yet have blinded holdout roots here, so the challenge corpus is still incomplete even though their perturbation roots now exist.

Perturbation Roots

The current perturbation roots are:

  • dda_calibration_decoy_perturbation
  • dia_library_dropout_perturbation
  • lfq_missingness_drift_perturbation
  • multiplex_reference_bleed_perturbation
  • ptm_ambiguity_occupancy_perturbation
  • targeted_interference_carryover_perturbation

Each perturbation root publishes measured reaction deltas instead of generic severity prose. The current catalog explicitly covers:

  • dda: accepted-target loss, accepted-decoy intrusion, contaminant promotion
  • dia: accepted-precursor dropout, shared-peptide loss, library-conditioned comparator shrinkage
  • lfq: missing-value inflation, batch-drift pressure, differential narrative reshuffle
  • multiplex: reference dropout, channel bleed, carrier-conditioned ratio compression
  • ptm: localization ambiguity, occupancy instability, targetability collapse
  • targeted: calibrant drift, transition interference, carryover-blocked follow-up
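
The holdout gap noted under Blinded Holdouts can be derived from the root names alone. The sketch below assumes that the family is the part of each root name before the first underscore; that naming convention is inferred from the names listed in this catalog, not from a documented rule, and the helper names are hypothetical.

```python
# Root names copied from the lists in this catalog.
HOLDOUT_ROOTS = [
    "dda_blinded_holdout",
    "dia_blinded_holdout",
    "lfq_blinded_holdout",
    "ptm_blinded_holdout",
]
PERTURBATION_ROOTS = [
    "dda_calibration_decoy_perturbation",
    "dia_library_dropout_perturbation",
    "lfq_missingness_drift_perturbation",
    "multiplex_reference_bleed_perturbation",
    "ptm_ambiguity_occupancy_perturbation",
    "targeted_interference_carryover_perturbation",
]


def family_of(root_name: str) -> str:
    """Assumption: the family is the prefix before the first underscore."""
    return root_name.split("_", 1)[0]


def families_missing_holdouts(holdouts: list[str], perturbations: list[str]) -> list[str]:
    """Families that have a perturbation root but no blinded holdout root."""
    covered = {family_of(name) for name in holdouts}
    perturbed = {family_of(name) for name in perturbations}
    return sorted(perturbed - covered)


print(families_missing_holdouts(HOLDOUT_ROOTS, PERTURBATION_ROOTS))
# → ['multiplex', 'targeted']
```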

What This Catalog Settles

  • whether a workflow family only looks strong on its easiest package
  • whether family-transfer claims survive blinded reveal instead of convenient hindsight
  • whether review posture collapses quickly once calibration, missingness, bleed, ambiguity, or carryover pressure is made explicit

What This Catalog Does Not Settle

  • broad outsider-grade authority across every flagship family
  • more than one blinded holdout root for the currently covered families
  • multiplex or targeted holdout discipline
  • long-run acceptance bars or release gating by challenge-failure history

First Proof Check

  • packages/bijux-proteomics-core/src/bijux_proteomics/benchmarks/flagship_challenge_corpora.py
  • packages/bijux-proteomics-core/src/bijux_proteomics/benchmarks/flagship_challenge_assets.py
  • packages/bijux-proteomics-core/tests/benchmarks/test_flagship_challenge_corpora_surface.py
  • packages/bijux-proteomics-core/benchmark-assets/flagship-challenge-corpora

Refresh the copied challenge roots with:

uv run --group dev python -m bijux_proteomics.benchmarks.flagship_challenge_assets refresh

Neighbor Surfaces

Open Flagship Public Benchmark Catalog when you need the paired public package roots that these challenges are trying to break.

Open Flagship Benchmark Assets when you need the copied-source contract, citation discipline, rebuild command, freshness report, and obsolescence audit behind the public benchmark roots.