Adaptive Threshold Learning

director_ai.core.calibration.adaptive_threshold.AdaptiveThresholdLearner

AdaptiveThresholdLearner(*, candidate_thresholds: list[float] | tuple[float, ...], current_threshold: float, min_samples: int = 20, min_expected_lift: float = 0.01, max_false_positive_rate: float = 1.0, max_false_negative_rate: float = 1.0, alpha_prior: float = 1.0, beta_prior: float = 1.0, random_seed: int | None = None)

Offline Thompson-sampling threshold recommender.

The learner is intentionally side-effect free with respect to runtime scorer configuration. Production deployments should route the returned recommendation through human review and change management, and should record the rollback threshold alongside the approved overlay.

arm

arm(threshold: float) -> AdaptiveThresholdArm

Return the candidate arm for a validated threshold.

observe

observe(score: float, human_approved: bool) -> AdaptiveThresholdReport

Replay one labelled score across all candidate thresholds.
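As a rough illustration of what "replay across all candidate thresholds" can mean, the sketch below scores one labelled sample against every arm. `ArmCounts` and `replay` are hypothetical stand-ins, not the library's classes, and the rule that a score at or above the threshold counts as an approval is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ArmCounts:
    """Hypothetical stand-in for one candidate threshold's replay counters."""

    threshold: float
    successes: int = 0
    failures: int = 0


def replay(arms: list[ArmCounts], score: float, human_approved: bool) -> None:
    """Replay one labelled score against every candidate threshold."""
    for arm in arms:
        # Assumption: a score at or above the threshold means "approve".
        predicted_approved = score >= arm.threshold
        if predicted_approved == human_approved:
            arm.successes += 1  # the arm agreed with the human label
        else:
            arm.failures += 1  # the arm disagreed with the human label


arms = [ArmCounts(t) for t in (0.3, 0.4, 0.5, 0.6)]
replay(arms, score=0.45, human_approved=True)
```

In this sketch the 0.3 and 0.4 arms each record a success, while the 0.5 and 0.6 arms each record a failure, because the approved sample scored below their thresholds.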

observe_batch

observe_batch(feedback: list[ThresholdFeedback] | tuple[ThresholdFeedback, ...]) -> AdaptiveThresholdReport

Replay a batch of labelled feedback across all candidate thresholds.

report

report() -> AdaptiveThresholdReport

Return the current replay summary without making a recommendation.

recommend

recommend() -> AdaptiveThresholdRecommendation

Return a human-gated threshold recommendation from replayed evidence.
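The constructor parameters `min_samples`, `min_expected_lift`, `max_false_positive_rate`, and `max_false_negative_rate` suggest a gating step before any threshold is proposed. The sketch below shows one plausible gate under stated assumptions: the function name, the `stats` dictionary layout, and the use of posterior means to rank arms are all hypothetical and may differ from the library's internals.

```python
from typing import Optional


def gated_recommendation(
    stats: dict[float, dict[str, float]],
    current_threshold: float,
    *,
    min_samples: int = 20,
    min_expected_lift: float = 0.01,
    max_false_positive_rate: float = 1.0,
    max_false_negative_rate: float = 1.0,
) -> Optional[float]:
    """Return a threshold worth sending to review, or None to keep the current one."""
    current = stats[current_threshold]
    if current["pulls"] < min_samples:
        return None  # not enough replayed evidence yet

    # Keep only arms that satisfy both safety constraints.
    eligible = {
        t: s
        for t, s in stats.items()
        if s["false_positive_rate"] <= max_false_positive_rate
        and s["false_negative_rate"] <= max_false_negative_rate
    }
    if not eligible:
        return None

    # Assumption: arms are ranked by posterior mean success probability.
    best = max(eligible, key=lambda t: eligible[t]["posterior_mean"])
    lift = eligible[best]["posterior_mean"] - current["posterior_mean"]
    if best == current_threshold or lift < min_expected_lift:
        return None  # no candidate clears the minimum expected lift
    return best


# Illustrative numbers only, not measured results.
stats = {
    0.4: {"pulls": 50, "posterior_mean": 0.70, "false_positive_rate": 0.10, "false_negative_rate": 0.02},
    0.5: {"pulls": 50, "posterior_mean": 0.80, "false_positive_rate": 0.05, "false_negative_rate": 0.04},
    0.6: {"pulls": 50, "posterior_mean": 0.85, "false_positive_rate": 0.02, "false_negative_rate": 0.20},
}
choice = gated_recommendation(stats, 0.4, max_false_negative_rate=0.05)
```

With the false-negative cap at 0.05, the 0.6 arm is excluded despite its higher posterior mean, so the sketch proposes 0.5. Returning `None` mirrors the `recommended_threshold: float | None` field on the recommendation object.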

director_ai.core.calibration.adaptive_threshold.ThresholdFeedback dataclass

ThresholdFeedback(score: float, human_approved: bool, weight: float = 1.0, metadata: dict[str, Any] = dict())

One human-labelled score used for threshold replay.

__post_init__

__post_init__() -> None

Validate feedback score and replay weight.
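The signature above renders the `metadata` default as `dict()`, which in a dataclass is normally expressed with `field(default_factory=dict)`. The sketch below shows that pattern together with `__post_init__` validation. `FeedbackSketch` is a hypothetical class, and the accepted ranges (score in [0, 1], strictly positive weight) are assumptions rather than documented behaviour.

```python
import math
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FeedbackSketch:
    """Hypothetical stand-in for a validated, human-labelled score."""

    score: float
    human_approved: bool
    weight: float = 1.0
    # A mutable default must go through default_factory in a dataclass.
    metadata: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumption: scores are probabilities-like values in [0, 1].
        if not math.isfinite(self.score) or not 0.0 <= self.score <= 1.0:
            raise ValueError("score must be a finite number in [0, 1]")
        # Assumption: replay weights must be finite and strictly positive.
        if not math.isfinite(self.weight) or self.weight <= 0.0:
            raise ValueError("weight must be finite and positive")


ok = FeedbackSketch(score=0.82, human_approved=True)
```

Validating at construction time means every later replay step can rely on well-formed feedback without rechecking it.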

director_ai.core.calibration.adaptive_threshold.AdaptiveThresholdArm dataclass

AdaptiveThresholdArm(threshold: float, alpha_prior: float = 1.0, beta_prior: float = 1.0, pulls: int = 0, successes: int = 0, true_positives: int = 0, false_positives: int = 0, true_negatives: int = 0, false_negatives: int = 0)

Posterior state and replay metrics for one candidate threshold.

failures property

failures: int

Return the number of replayed samples this arm misclassified.

alpha property

alpha: float

Return posterior alpha after successful classifications.

beta property

beta: float

Return posterior beta after failed classifications.

posterior_mean property

posterior_mean: float

Return the posterior expected success probability.
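For a Beta-Bernoulli model, the `alpha`, `beta`, and `posterior_mean` properties above have a standard closed form. The sketch below assumes the conventional update, where successes are added to the alpha prior and failures to the beta prior; the free function is hypothetical and only mirrors the property names.

```python
def posterior_mean(
    alpha_prior: float, beta_prior: float, successes: int, failures: int
) -> float:
    """Mean of Beta(alpha_prior + successes, beta_prior + failures)."""
    alpha = alpha_prior + successes  # posterior alpha
    beta = beta_prior + failures  # posterior beta
    return alpha / (alpha + beta)


# Uniform Beta(1, 1) prior, 8 correct and 2 incorrect classifications.
mean = posterior_mean(1.0, 1.0, successes=8, failures=2)  # 9 / 12 = 0.75
```

The prior pulls the estimate towards 0.5: the empirical accuracy here is 0.8, but the posterior mean is 0.75.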

accuracy property

accuracy: float

Return empirical accuracy across replayed feedback.

false_positive_rate property

false_positive_rate: float

Return the false-positive rate for human-rejected samples.

false_negative_rate property

false_negative_rate: float

Return the false-negative rate for human-approved samples.
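The three empirical metrics above follow from the four confusion counters on the arm. The sketch below uses the standard definitions and assumes that "positive" means the threshold would approve a sample. Returning 0.0 when a denominator is empty is also an assumption; the library may handle that case differently.

```python
def confusion_rates(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute accuracy, false-positive rate, and false-negative rate."""
    total = tp + fp + tn + fn
    rejected = fp + tn  # samples the human rejected
    approved = tp + fn  # samples the human approved
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of human-rejected samples the threshold would have approved.
        "false_positive_rate": fp / rejected if rejected else 0.0,
        # Share of human-approved samples the threshold would have rejected.
        "false_negative_rate": fn / approved if approved else 0.0,
    }


rates = confusion_rates(tp=6, fp=1, tn=3, fn=2)
```

With these counts the accuracy is 9/12, the false-positive rate is 1/4, and the false-negative rate is 2/8.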

__post_init__

__post_init__() -> None

Validate arm threshold and Beta prior parameters.

observe

observe(feedback: ThresholdFeedback) -> None

Replay one labelled score against this threshold arm.

sample_success_probability

sample_success_probability(rng: Random) -> float

Sample a Thompson posterior success probability for this arm.
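In Thompson sampling, each arm draws one value from its Beta posterior and the arm with the largest draw is selected. The sketch below shows that idea with the standard library's `random.Random.betavariate`; the function names and the `posteriors` mapping are hypothetical, and how the learner actually combines the draws is not stated here.

```python
import random


def sample_success_probability(
    alpha: float, beta: float, rng: random.Random
) -> float:
    """Draw one success probability from a Beta(alpha, beta) posterior."""
    return rng.betavariate(alpha, beta)


def thompson_pick(
    posteriors: dict[float, tuple[float, float]], rng: random.Random
) -> float:
    """Return the threshold whose sampled success probability is largest."""
    draws = {
        threshold: sample_success_probability(alpha, beta, rng)
        for threshold, (alpha, beta) in posteriors.items()
    }
    return max(draws, key=draws.get)


# Hypothetical posteriors: threshold -> (alpha, beta).
posteriors = {0.3: (5.0, 9.0), 0.4: (9.0, 5.0), 0.5: (12.0, 2.0)}
picked = thompson_pick(posteriors, random.Random(7))
```

Passing a seeded `random.Random` keeps the draws reproducible, which is presumably the purpose of the `random_seed` constructor argument.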

to_dict

to_dict() -> dict[str, Any]

Serialise this arm's posterior and confusion metrics.

director_ai.core.calibration.adaptive_threshold.AdaptiveThresholdReport dataclass

AdaptiveThresholdReport(total_feedback: int, current_threshold: float, best_observed_threshold: float | None, arms: tuple[AdaptiveThresholdArm, ...])

Snapshot after replaying feedback across threshold arms.

to_dict

to_dict() -> dict[str, Any]

Serialise the threshold replay report.

director_ai.core.calibration.adaptive_threshold.AdaptiveThresholdRecommendation dataclass

AdaptiveThresholdRecommendation(current_threshold: float, recommended_threshold: float | None, expected_success_probability: float, current_success_probability: float, expected_lift: float, reason: str, requires_human_approval: bool = True, rollback_threshold: float | None = None, safety_constraints: dict[str, float] = dict())

Human-review-gated threshold recommendation.

to_profile_overlay

to_profile_overlay(*, profile: str = 'adaptive') -> dict[str, object]

Return a profile overlay that can be reviewed before promotion.

Safety Boundary

AdaptiveThresholdLearner is an offline recommender. It replays human-labelled score feedback across fixed candidate thresholds, estimates each candidate with Beta-Bernoulli posteriors, applies false-positive and false-negative safety constraints, and returns a recommendation object.

It does not mutate CoherenceScorer, DirectorConfig, profile files, or live runtime thresholds. Apply the returned profile overlay only after operator approval and keep the rollback threshold in change-management records.

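from director_ai.core import AdaptiveThresholdLearner, ThresholdFeedback

# Configure the offline learner with fixed candidates and a false-negative cap.
learner = AdaptiveThresholdLearner(
    candidate_thresholds=[0.3, 0.4, 0.5, 0.6],
    current_threshold=0.4,
    max_false_negative_rate=0.05,
)
# Replay human-labelled scores. Two samples are shown for brevity;
# min_samples defaults to 20.
learner.observe_batch(
    [
        ThresholdFeedback(score=0.82, human_approved=True),
        ThresholdFeedback(score=0.28, human_approved=False),
    ]
)
recommendation = learner.recommend()

# recommended_threshold is None when no candidate is proposed.
if recommendation.recommended_threshold is not None:
    # Build an overlay for operator review; nothing is applied at runtime.
    overlay = recommendation.to_profile_overlay(profile="candidate")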

For regulated deployments, use the learner together with HumanReviewQueue, OnlineCalibrator, and drift reports. Treat each recommended threshold as a controlled change, not as autonomous production policy.