"A clinical AI system that cannot quantify its own uncertainty has no business informing treatment decisions."
| Attribute | Value |
|---|---|
| Status | Incubating |
| Maturity | Active Development |
| License | Apache-2.0 |
| Part of | Evidence Commons |
| Mission Pillar | Pillar 2 (Uncertainty Quantification & Conformal Prediction) |
Clinical prediction models that report point estimates without uncertainty intervals are incomplete at best and dangerous at worst. Conformal prediction (Vovk et al. 2005) provides distribution-free coverage guarantees -- prediction intervals that remain valid regardless of the underlying data distribution, unlike Bayesian uncertainty estimation, which requires distributional assumptions. UncertaintyOS is designed to provide a focused toolkit for conformal prediction, multiplicity correction, and calibration validation in clinical AI, with explicit support for abstention (models refusing to predict when uncertainty exceeds clinical thresholds).
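UncertaintyOS has not yet published an API, so as orientation only, here is a minimal, self-contained sketch of the split conformal technique the toolkit builds on, including the abstention pattern described above. Function names and the threshold value are illustrative, not part of any UncertaintyOS module.

```python
import numpy as np

def split_conformal_interval(cal_residuals, y_pred, alpha=0.1):
    """Split conformal prediction: turn calibration-set residuals into
    prediction intervals with >= 1 - alpha marginal coverage, valid for
    any data distribution as long as calibration and test points are
    exchangeable (Vovk et al. 2005)."""
    n = len(cal_residuals)
    # Finite-sample correction: the ceil((n+1)(1-alpha))/n empirical quantile.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q_hat = np.quantile(np.abs(cal_residuals), level)
    return y_pred - q_hat, y_pred + q_hat

# Synthetic demo only -- no real patient data.
rng = np.random.default_rng(0)
y_cal = rng.normal(size=500)          # calibration outcomes
cal_residuals = y_cal - 0.0           # residuals of a trivial model predicting 0
y_test = rng.normal(size=2000)
lo, hi = split_conformal_interval(cal_residuals, np.zeros(2000))
coverage = np.mean((y_test >= lo) & (y_test <= hi))

# Abstention: refuse to predict when the interval is wider than a
# (hypothetical) clinically acceptable threshold.
abstain = (hi - lo) > 4.0
```

The guarantee is marginal: roughly 90% of intervals cover the truth on average, which is exactly the limitation that motivates the group-conditional (Mondrian) variant in this toolkit.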
The parent codebase (evidenceos-research/evidenceos-conformal) contains three production-grade modules: MondrianCQR, PIMA, and CVoE. These are integrated into the EvidenceOS research pipeline and have been validated with passing test suites. This repository is intended to hold extracted, standalone versions of these modules for use outside the EvidenceOS ecosystem. No code has been extracted to this repo yet.
| Component | Description | Parent Code Exists | LOC | Tests |
|---|---|---|---|---|
| mondrian/ | Mondrian Conformalized Quantile Regression -- distribution-free prediction intervals with group-conditional coverage | Yes (production) | Part of conformal package | Yes |
| pima/ | Post-selection Inference for Multiple Analysis -- Bonferroni TDP bounds for multiverse multiplicity correction | Yes (production) | 491 | 23 (all passing) |
| cvoe/ | Conformal Validation of Evidence -- integrated into main pipeline | Yes (production) | 271 | Yes |
| abstention/ | Selective prediction with clinically defined uncertainty thresholds | Partial | -- | -- |
| calibration/ | Calibration diagnostics (calibration-in-the-large, slope, plots) | Partial | -- | -- |
What exists in the parent codebase:
- MondrianCQR: Mondrian Conformalized Quantile Regression providing distribution-free prediction intervals with group-conditional coverage guarantees. Production-grade, integrated into the conformal prediction module
- PIMA: 491 LOC, 23 tests all passing. Implements Bonferroni TDP bounds for multiverse multiplicity correction. This is a PIMA-inspired approximation using Bonferroni bounds, not the full Girardi closed testing procedure. The sign-flipping test is exact. Integrated into the manuscript assembler
- CVoE: 271 LOC, production, integrated into the main research pipeline for conformal validation of evidence artifacts
- Five conformal prediction modules operational in evidenceos-research/evidenceos-conformal
- Abstention workflow: models can refuse to predict when uncertainty exceeds clinical thresholds
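MondrianCQR itself has not been extracted into this repo, but its core idea -- group-conditional calibration -- can be sketched in a few lines. This is illustrative code under the standard Mondrian conformal construction, not the parent module's API:

```python
import numpy as np

def mondrian_thresholds(scores, groups, alpha=0.1):
    """Mondrian conformal: compute one conformity-score quantile per
    group, so the >= 1 - alpha coverage guarantee holds *within each
    group* (e.g. per sex or age band), not just marginally."""
    thresholds = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        n = len(s)
        level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        thresholds[g] = np.quantile(s, level)
    return thresholds

# Two synthetic groups with very different noise scales.
rng = np.random.default_rng(1)
groups_cal = np.repeat(["low_noise", "high_noise"], 1000)
scale = np.where(groups_cal == "high_noise", 3.0, 1.0)
scores_cal = np.abs(rng.normal(size=2000)) * scale
q = mondrian_thresholds(scores_cal, groups_cal)

# Per-group coverage on fresh draws; a single pooled threshold would
# over-cover the low-noise group and under-cover the high-noise one.
scores_test = np.abs(rng.normal(size=2000)) * scale
coverage_high = np.mean(scores_test[groups_cal == "high_noise"] <= q["high_noise"])
coverage_low = np.mean(scores_test[groups_cal == "low_noise"] <= q["low_noise"])
```

The production MondrianCQR module combines this partitioning with conformalized quantile regression, so interval widths also adapt to heteroscedasticity within each group.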
What does not exist yet:
- Standalone packaging of any module for use outside the research pipeline
- Unified API surface across MondrianCQR, PIMA, and CVoE
- Calibration diagnostic utilities as a reusable module
- Documentation and examples for clinical deployment contexts
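PIMA's interface is not documented in this repo. To make the "Bonferroni TDP bounds" claim concrete, here is a sketch of the crude Bonferroni route to a true discovery proportion lower bound -- hypothetical function names, and explicitly not Girardi et al.'s full closed testing procedure that PIMA approximates:

```python
import numpy as np

def bonferroni_tdp_lower_bound(p_values, selected_idx, alpha=0.05):
    """(1 - alpha)-confidence lower bound on the true discovery
    proportion (TDP) within a selected set of analyses.

    With probability >= 1 - alpha, *every* hypothesis with
    p <= alpha / m is simultaneously a true discovery (Bonferroni
    FWER control), so counting those inside the selection bounds
    its TDP from below -- conservatively."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    hits = np.sum(p[np.asarray(selected_idx)] <= alpha / m)
    return hits / len(selected_idx)

# Multiverse of 100 analyses: 10 very strong, 10 borderline, 80 null.
p_vals = np.array([1e-6] * 10 + [0.01] * 10 + [0.5] * 80)
selected = np.arange(20)                      # the analyst's 20 "findings"
tdp_lb = bonferroni_tdp_lower_bound(p_vals, selected)   # -> 0.5
```

At least half of the 20 selected findings are guaranteed true discoveries at 95% confidence; the borderline p = 0.01 results contribute nothing under Bonferroni, which is precisely the conservatism closed testing relaxes.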
- Extract MondrianCQR as a standalone module with clear API for generating prediction intervals on arbitrary sklearn-compatible models
- Package PIMA as an independent multiplicity correction library, documenting the Bonferroni approximation and its relationship to full closed testing
- Extract CVoE with its pipeline integration points abstracted behind a clean interface
- Build unified calibration diagnostics module (calibration-in-the-large, calibration slope, calibration plots, Hosmer-Lemeshow)
- Create example notebooks demonstrating each module on synthetic clinical data (no real patient data)
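The calibration module does not exist yet; as a sketch of what "calibration-in-the-large" and "calibration slope" mean (dependency-free illustrative code, not the planned API):

```python
import numpy as np

def calibration_in_the_large(y_true, p_pred):
    """Observed event rate minus mean predicted risk (ideal: 0).
    Positive values mean the model under-predicts risk on average."""
    return float(np.mean(y_true) - np.mean(p_pred))

def calibration_slope(y_true, p_pred, n_iter=200, lr=0.5):
    """Slope from a logistic regression of the outcome on the logit of
    the predicted risk (ideal: 1; < 1 means predictions are too
    extreme, a typical overfitting signature). Fitted by plain
    gradient descent here to avoid dependencies."""
    eps = 1e-12
    p = np.clip(p_pred, eps, 1 - eps)
    x = np.log(p / (1 - p))                 # logit of predictions
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        fitted = 1.0 / (1.0 + np.exp(-(a + b * x)))
        a -= lr * np.mean(fitted - y_true)
        b -= lr * np.mean((fitted - y_true) * x)
    return b

# Synthetic, perfectly calibrated predictions (no real patient data).
rng = np.random.default_rng(3)
x_lin = rng.normal(size=5000)
p_true = 1.0 / (1.0 + np.exp(-x_lin))
y = (rng.uniform(size=5000) < p_true).astype(float)
citl = calibration_in_the_large(y, p_true)   # ~ 0 for calibrated predictions
slope = calibration_slope(y, p_true)         # ~ 1 for calibrated predictions
```

In a miscalibrated model these diagnostics diverge from 0 and 1 respectively, which is the signal the Lab-in-a-Box calibration-enforcement integration is meant to check.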
```mermaid
graph LR
  A[MondrianCQR] --> B[UncertaintyOS]
  C[PIMA<br/>491 LOC] --> B
  D[CVoE<br/>271 LOC] --> B
  B --> E[BRIDGE-TBI<br/>prediction intervals]
  B --> F[Lab-in-a-Box<br/>calibration enforcement]
  style B fill:#2A9D8F,stroke:#1E3A8A,color:#fff
```
UncertaintyOS is designed to provide uncertainty-aware predictions for BRIDGE-TBI (conformal intervals on outcome predictions), calibration enforcement for Lab-in-a-Box (every generated manuscript must report calibration metrics), and uncertainty benchmarking for Clinical Arena. Canonical source: evidenceos-research/evidenceos-conformal.
This project is incubating. The core modules are production-grade in the parent codebase but have not been extracted for standalone use. Contributions to API design, documentation, and calibration utilities are the most immediately useful. See CONTRIBUTING.md.
Apache-2.0 -- see LICENSE for details.