Conformal & Uncertainty-Aware Routing
UncertaintyRouter turns a calibrated hallucination interval into a downstream
action. Where RiskRouter routes inputs to a scoring backend, this router
acts on the output: it consumes the conformal PredictionInterval over
hallucination probability and applies documented risk bounds.
| Condition | Action |
|---|---|
| interval upper ≤ allow_upper | allow (confidently low-risk) |
| interval lower ≥ reject_lower | reject (confidently high-risk) |
| uncertain and width ≥ escalate_human_width (or calibration unreliable) | escalate_human |
| uncertain and narrower | escalate_model (LLM judge / ensemble) |
The router is side-effect free and deterministic; each UncertaintyDecision
records the bounds it used, so the routing rationale is auditable. Dispatching
the action — to a review queue for escalate_human, to a stronger model for
escalate_model — is the caller's job.
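The decision table can be read as a short chain of comparisons. The following is a minimal sketch of that rule, not the library's implementation: sketch_route and SketchDecision are hypothetical names, and checking reliability before the allow/reject thresholds is an assumption based on the statement that an unreliable calibration yields escalate_human.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchDecision:
    """Hypothetical stand-in for UncertaintyDecision (not the library class)."""

    action: str
    lower: float
    upper: float
    width: float
    is_reliable: bool
    reason: str


def sketch_route(
    lower: float,
    upper: float,
    is_reliable: bool = True,
    *,
    allow_upper: float = 0.2,
    reject_lower: float = 0.8,
    escalate_human_width: float = 0.5,
) -> SketchDecision:
    """Map an interval over hallucination probability to one of four actions."""
    width = upper - lower
    if not is_reliable:
        # Assumption: an unreliable calibration always defers to a person.
        action, reason = "escalate_human", "calibration unreliable"
    elif upper <= allow_upper:
        action, reason = "allow", "interval upper <= allow_upper"
    elif lower >= reject_lower:
        action, reason = "reject", "interval lower >= reject_lower"
    elif width >= escalate_human_width:
        action, reason = "escalate_human", "uncertain and wide interval"
    else:
        action, reason = "escalate_model", "uncertain and narrow interval"
    # The decision carries the bounds it used, so the rationale stays auditable.
    return SketchDecision(action, lower, upper, width, is_reliable, reason)
```

The function has no side effects and returns the same decision for the same inputs, which mirrors the determinism described above.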
Online calibration
ConformalPredictor.add_observation(score, correct_label) folds one human
verdict into the calibration set and refreshes the conformal quantile, so the
intervals tighten as feedback accumulates. correct_label=True marks a correct
(non-hallucinated) response.
from director_ai.core.calibration.conformal import ConformalPredictor
from director_ai.core.routing import UncertaintyRouter

# feedback_history, coherence_score, review_queue and llm_judge are supplied by the caller.
predictor = ConformalPredictor(coverage=0.9, min_samples=30)
for score, correct in feedback_history:
    # Fold each human verdict into the calibration set.
    predictor.add_observation(score, correct_label=correct)

router = UncertaintyRouter(allow_upper=0.2, reject_lower=0.8, escalate_human_width=0.5)
decision = router.route(predictor.predict(coherence_score))
if decision.action == "escalate_human":
    review_queue.submit(...)
elif decision.action == "escalate_model":
    llm_judge.adjudicate(...)
ProductionGuard integration
ProductionGuard wires both together. After enable_calibration(), call
enable_uncertainty_routing(); every check() then populates
GuardResult.uncertainty_action from the conformal interval. Until calibration
is reliable, the action is escalate_human — uncertainty defers to a person.
guard.enable_calibration(alpha=0.1)
guard.enable_uncertainty_routing()
result = guard.check(prompt, response)
result.uncertainty_action # "allow" | "reject" | "escalate_human" | "escalate_model"
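Because dispatching is left to the caller, one way to act on uncertainty_action is a lookup table of handlers. This is a sketch only: dispatch and the four handlers are hypothetical placeholders, and only the four action strings come from the documentation above.

```python
from typing import Callable

# Hypothetical handlers; real ones would serve, block, enqueue, or re-score.
HANDLERS: dict[str, Callable[[], str]] = {
    "allow": lambda: "served",
    "reject": lambda: "blocked",
    "escalate_human": lambda: "queued for review",
    "escalate_model": lambda: "sent to judge",
}


def dispatch(action: str, handlers: dict[str, Callable[[], str]] = HANDLERS) -> str:
    """Run the handler registered for an uncertainty action."""
    if action not in handlers:
        # Fail loudly rather than silently allowing an unrecognised action.
        raise ValueError(f"unknown uncertainty action: {action!r}")
    return handlers[action]()
```

Raising on an unknown action is a design choice for this sketch: a guardrail that falls through silently would behave like an implicit allow.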
Full API
director_ai.core.routing.uncertainty_router.UncertaintyDecision (dataclass)
UncertaintyDecision(action: UncertaintyAction, point_estimate: float, lower: float, upper: float, width: float, is_reliable: bool, reason: str)
One uncertainty-routing outcome with the bounds that produced it.
director_ai.core.routing.uncertainty_router.UncertaintyRouter
UncertaintyRouter(*, allow_upper: float = 0.2, reject_lower: float = 0.8, escalate_human_width: float = 0.5)
Map a conformal interval to an allow/reject/escalate action.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| allow_upper | float | Interval upper bound at or below which the response is allowed. Default 0.2. | 0.2 |
| reject_lower | float | Interval lower bound at or above which the response is rejected. Must be strictly greater than | 0.8 |
| escalate_human_width | float | In the uncertain band, intervals at least this wide go to human review; narrower ones go to a stronger model. Default 0.5. | 0.5 |
route
Return the routing decision for one conformal interval.
director_ai.core.calibration.conformal.ConformalPredictor
Split conformal prediction for hallucination probability.
Uses nonconformity scores derived from (guardrail_score, human_label) pairs to construct prediction intervals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| coverage | float | Target coverage probability (e.g., 0.95 for 95% intervals). | 0.95 |
| min_samples | int | Minimum calibration samples for reliable intervals. Below this, intervals are returned but marked unreliable. | 30 |
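For readers unfamiliar with split conformal prediction, the sketch below shows one textbook way to turn (score, label) pairs into an interval. It is not ConformalPredictor's code. The point estimate 1 - score, the absolute-residual nonconformity score, and the names conformal_quantile and sketch_interval are all assumptions made for illustration; the library's actual nonconformity definition is not stated here.

```python
import math


def conformal_quantile(residuals: list[float], coverage: float) -> float:
    """Finite-sample conformal quantile of residuals that lie in [0, 1]."""
    n = len(residuals)
    # Standard split-conformal rank: ceil((n + 1) * coverage).
    rank = math.ceil((n + 1) * coverage)
    if n == 0 or rank > n:
        # Too few samples for the requested coverage: fall back to the
        # largest possible residual, which gives the widest interval.
        return 1.0
    return sorted(residuals)[rank - 1]


def sketch_interval(
    score: float,
    calibration: list[tuple[float, bool]],
    coverage: float = 0.9,
) -> tuple[float, float]:
    """Interval over hallucination probability for one coherence score.

    calibration holds (coherence_score, is_hallucination) pairs.
    Assumption: hallucination probability is estimated as 1 - score.
    """
    residuals = [abs(float(is_h) - (1.0 - s)) for s, is_h in calibration]
    q = conformal_quantile(residuals, coverage)
    point = 1.0 - score
    # Clip to [0, 1] because the target is a probability.
    return max(0.0, point - q), min(1.0, point + q)
```

As more labelled pairs arrive, the rank moves away from the extreme residuals, which is one way intervals can tighten with feedback.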
calibrate
Calibrate from (score, label) pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scores | list[float] | Guardrail coherence scores (higher = more coherent). | required |
| labels | list[bool] | True if the response was actually a hallucination (human-verified). | required |
add_observation
Add one human-labelled observation and refresh calibration.
correct_label=True means the checked response was correct, while
the conformal label stores whether it was actually a hallucination.
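The two label conventions are easy to mix up, so here is a small sketch of the inversion. OnlineCalibrationSketch is a hypothetical stand-in, not the library class, and treating exactly min_samples observations as reliable is an assumption.

```python
class OnlineCalibrationSketch:
    """Hypothetical illustration of the label convention; not ConformalPredictor."""

    def __init__(self, min_samples: int = 30) -> None:
        self.min_samples = min_samples
        self.scores: list[float] = []
        self.hallucinated: list[bool] = []

    def add_observation(self, score: float, correct_label: bool) -> None:
        """Store one human verdict, flipping it into a hallucination label."""
        self.scores.append(score)
        # correct_label=True means "not a hallucination", so invert it.
        self.hallucinated.append(not correct_label)

    @property
    def is_reliable(self) -> bool:
        # Assumption: reaching min_samples (inclusive) counts as reliable.
        return len(self.scores) >= self.min_samples


sketch = OnlineCalibrationSketch(min_samples=2)
sketch.add_observation(0.92, correct_label=True)   # correct response
sketch.add_observation(0.15, correct_label=False)  # hallucinated response
```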
calibrate_from_feedback
Calibrate from a FeedbackStore instance.
Reads all entries where human_label is not None and uses (score, human_label) as calibration data.
predict
Predict hallucination probability interval for a new score.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| score | float | Guardrail coherence score for the new response. | required |
Returns:
| Type | Description |
|---|---|
| PredictionInterval | Calibrated interval with coverage guarantee. |
predict_interval
Return the interval tuple expected by ProductionGuard.
route
Route a score using calibrated uncertainty bounds.