Guardrail Forensics
The KPI layer says whether the guardrail is healthy. The forensics layer explains reviewed misses without exposing raw prompt, response, or evidence text.
build_forensics_report() consumes tenant-safe eval records, usually from
eval_trace, joined with reviewer labels:
```python
from director_ai.core.observability import build_forensics_report

records = [
    {
        "director.eval.answer_id": "case-1",
        "director.eval.approved": True,
        "director.eval.score": 0.82,
        "director.eval.threshold": 0.60,
        "director.eval.scorer": "nli",
        "director.eval.model": "customer-model",
        "director.eval.evidence_count": 0,
        "label": "hallucination",
    }
]
report = build_forensics_report(records)
```
The report classifies each reviewed case as:
| Outcome | Meaning |
|---|---|
| false_negative | Reviewer labelled a hallucination that the guard allowed. |
| false_positive | Reviewer labelled a grounded answer that the guard halted. |
| correct_halt | Reviewer confirmed a halted hallucination. |
| correct_allow | Reviewer confirmed an allowed grounded answer. |
| unlabelled_allow / unlabelled_halt | Eval record has no reviewer label yet. |
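The outcome table can be read as a small decision function over the guard's approval flag and the reviewer label. The sketch below is illustrative only and is not the library's implementation: `classify_outcome` is a hypothetical helper, and `"grounded"` is an assumed name for the non-hallucination label (only `"hallucination"` appears in the example above).

```python
from typing import Optional

# Assumed label vocabulary: "hallucination" comes from the example record;
# "grounded" is a hypothetical name for the opposite reviewer label.
HALLUCINATION = "hallucination"
GROUNDED = "grounded"


def classify_outcome(approved: bool, label: Optional[str]) -> str:
    """Map a guard decision plus an optional reviewer label to an outcome name."""
    if label is None:
        # No reviewer verdict yet: only the guard decision is known.
        return "unlabelled_allow" if approved else "unlabelled_halt"
    if label == HALLUCINATION:
        # An allowed hallucination is a miss; a halted one is a correct halt.
        return "false_negative" if approved else "correct_halt"
    if label == GROUNDED:
        # A halted grounded answer is a miss; an allowed one is correct.
        return "correct_allow" if approved else "false_positive"
    raise ValueError(f"unknown reviewer label: {label!r}")


record = {"director.eval.approved": True, "label": "hallucination"}
outcome = classify_outcome(record["director.eval.approved"], record.get("label"))
```

Under these assumptions, the example record from the previous section (approved, labelled as a hallucination) maps to `false_negative`.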
For every case the report records the scorer, model, model revision when supplied,
domain, threshold margin, knowledge-state summary, reason, and recommended
operator action. Examples include refresh_or_add_governed_facts,
add_counterexample_and_recalibrate_scorer, and
review_retrieval_source_mapping.
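To make the threshold margin concrete, here is a minimal sketch that assumes the margin is the signed difference `score - threshold`. The sign convention and the `threshold_margin` helper are assumptions for illustration; the source does not state how the library computes the field.

```python
def threshold_margin(score: float, threshold: float) -> float:
    """Signed distance of the score from the threshold (assumed: score - threshold)."""
    return score - threshold


# Values taken from the example record: score 0.82 against threshold 0.60.
margin = threshold_margin(0.82, 0.60)

# Under the assumed convention, a positive margin means the score cleared the
# threshold, which is consistent with the example record being approved.
cleared_threshold = margin > 0
```

For the example record the margin is roughly 0.22, so the reviewed hallucination was allowed with some room to spare rather than sitting on the decision boundary.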
CLI
director-ai forensics reads either a JSON array of records or an object with a
records array.
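The two accepted input shapes can be illustrated with a small normaliser. This is a sketch of the idea only, not the CLI's actual parsing code; `load_records` is a hypothetical helper name.

```python
import json
from typing import Any


def load_records(text: str) -> list[dict[str, Any]]:
    """Return the record list from either a bare array or a {"records": [...]} object."""
    payload = json.loads(text)
    if isinstance(payload, list):
        # Shape 1: the document is itself the array of records.
        return payload
    if isinstance(payload, dict) and isinstance(payload.get("records"), list):
        # Shape 2: the records sit under a top-level "records" key.
        return payload["records"]
    raise ValueError("expected a JSON array or an object with a 'records' array")


as_array = '[{"director.eval.answer_id": "case-1"}]'
as_object = '{"records": [{"director.eval.answer_id": "case-1"}]}'

# Both shapes normalise to the same list of records.
records_from_array = load_records(as_array)
records_from_object = load_records(as_object)
```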
The JSON output includes:
- top-level miss counts;
- misses grouped by scorer, model, and domain;
- per-case action recommendations;
- a privacy block confirming that raw prompt, response, and evidence text are not included.
This is the core file/export surface. The richer safety dashboard remains the UI/operations packet around halt rates, drift alerts, controls, and compliance exports.
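The grouped miss counts can be approximated with `collections.Counter`. The sketch below assumes that a miss is any case whose outcome is `false_negative` or `false_positive`, and it uses plain dicts with made-up scorer, model, and domain values in place of real report cases; `count_misses` is a hypothetical helper.

```python
from collections import Counter

# Assumption: only these two outcomes count as misses.
MISS_OUTCOMES = {"false_negative", "false_positive"}

# Illustrative, made-up cases; no prompt, response, or evidence text is carried.
cases = [
    {"outcome": "false_negative", "scorer": "nli", "model": "customer-model", "domain": "billing"},
    {"outcome": "false_positive", "scorer": "nli", "model": "customer-model", "domain": "support"},
    {"outcome": "correct_halt", "scorer": "nli", "model": "customer-model", "domain": "billing"},
]


def count_misses(cases: list[dict[str, str]], key: str) -> dict[str, int]:
    """Count missed cases grouped by the given metadata key."""
    return dict(Counter(case[key] for case in cases if case["outcome"] in MISS_OUTCOMES))


missed_by_scorer = count_misses(cases, "scorer")
missed_by_model = count_misses(cases, "model")
missed_by_domain = count_misses(cases, "domain")
misses_total = sum(missed_by_scorer.values())
```

Grouping on metadata alone keeps the aggregate tenant-safe, since nothing beyond outcome labels and scorer, model, and domain names is needed to build it.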
API
director_ai.core.observability.forensics.ForensicsCase (dataclass)
ForensicsCase(case_id: str, outcome: str, approved: bool, expected_label: str, score: float, threshold: float, margin: float, scorer: str, model: str, model_revision: str, domain: str, knowledge_state: str, evidence_count: int, unsupported_claims: int, reason: str, recommended_action: str)
One tenant-safe reviewed guard decision for operator forensics.
to_dict()
Return a JSON-compatible tenant-safe case payload.
director_ai.core.observability.forensics.ForensicsReport (dataclass)
ForensicsReport(total_records: int, labelled_records: int, misses_total: int, false_negatives: int, false_positives: int, missed_by_scorer: dict[str, int], missed_by_model: dict[str, int], missed_by_domain: dict[str, int], cases: tuple[ForensicsCase, ...])
Tenant-safe scorer-miss report for a reviewed decision window.
director_ai.core.observability.forensics.build_forensics_report
Build a scorer-miss report from tenant-safe eval/reviewer records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| records | Sequence[Mapping[str, object]] | Eval-trace records or JSON objects containing at least approval, score, threshold, scorer/model metadata, and optionally a reviewer label. The function accepts both | required |
director_ai.core.observability.forensics.render_forensics_markdown
Render a Markdown scorer-miss report for operator review.
director_ai.core.observability.forensics.render_forensics_text
Render a plain-text scorer-miss report for CLI output.