Guardrail Forensics¶

The KPI layer says whether the guardrail is healthy. The forensics layer explains reviewed misses without exposing raw prompt, response, or evidence text.

build_forensics_report() consumes tenant-safe eval records, usually from eval_trace, joined with reviewer labels:

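from director_ai.core.observability import build_forensics_report

# One eval record joined with its reviewer label. It carries metadata only:
# no raw prompt, response, or evidence text.
records = [
    {
        "director.eval.answer_id": "case-1",
        "director.eval.approved": True,  # the guard allowed the answer
        "director.eval.score": 0.82,
        "director.eval.threshold": 0.60,
        "director.eval.scorer": "nli",
        "director.eval.model": "customer-model",
        "director.eval.evidence_count": 0,
        "label": "hallucination",  # reviewer label, joined onto the eval record
    }
]

report = build_forensics_report(records)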

The report classifies each reviewed case as:

| Outcome | Meaning |
| --- | --- |
| `false_negative` | Reviewer labelled a hallucination that the guard allowed. |
| `false_positive` | Reviewer labelled a grounded answer that the guard halted. |
| `correct_halt` | Reviewer confirmed a halted hallucination. |
| `correct_allow` | Reviewer confirmed an allowed grounded answer. |
| `unlabelled_allow` / `unlabelled_halt` | Eval record has no reviewer label yet. |
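
The table above amounts to a decision over two inputs: the guard's approval flag and the reviewer label. The following is a minimal sketch of that mapping, not the library's implementation. `classify_outcome` is a hypothetical helper, and the `"grounded"` label string is an assumption, since the example above only shows `"hallucination"`.

```python
from typing import Optional


def classify_outcome(approved: bool, label: Optional[str]) -> str:
    """Map a guard decision plus reviewer label to an outcome name.

    Hypothetical helper for illustration. The "grounded" label value is
    an assumption; only "hallucination" appears in the documented example.
    """
    if label is None:
        # No reviewer label yet: record only what the guard did.
        return "unlabelled_allow" if approved else "unlabelled_halt"
    if label == "hallucination":
        return "false_negative" if approved else "correct_halt"
    if label == "grounded":
        return "correct_allow" if approved else "false_positive"
    raise ValueError(f"unknown reviewer label: {label!r}")


# The record from the example above: allowed, but labelled a hallucination.
outcome = classify_outcome(approved=True, label="hallucination")
print(outcome)
```

Under these assumptions, the example record is the kind of case the report would count as a miss.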

For every case the report records the scorer, model, model revision (when supplied), domain, threshold margin, knowledge-state summary, reason, and recommended operator action. Recommended actions include refresh_or_add_governed_facts, add_counterexample_and_recalibrate_scorer, and review_retrieval_source_mapping.
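
To illustrate the threshold margin, here is a rough sketch that assumes the margin is the signed difference `score - threshold`. The source does not state the sign convention or any rounding, so both are assumptions, and `threshold_margin` is a hypothetical helper.

```python
def threshold_margin(score: float, threshold: float) -> float:
    """Signed distance of the score from the threshold.

    Assumption: margin = score - threshold, rounded to 4 decimal places
    to hide floating-point noise. The library may define it differently.
    """
    return round(score - threshold, 4)


# Values from the example record above.
margin = threshold_margin(score=0.82, threshold=0.60)
print(margin)
```

Read this way, a small margin marks a borderline decision and a large margin on a miss marks a confident error by the scorer.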

CLI¶

director-ai forensics reads either a JSON array of records or an object with a records array:

director-ai forensics --input eval_records.json --format markdown
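
The two accepted input shapes can be normalised before any processing. A small sketch of such a loader follows; `load_records` is a hypothetical name and this is not the CLI's actual parsing code.

```python
import json
from typing import Any


def load_records(text: str) -> list[dict[str, Any]]:
    """Accept a JSON array of records, or an object with a "records" array."""
    payload = json.loads(text)
    if isinstance(payload, dict):
        # Object form: the records live under the "records" key.
        payload = payload.get("records")
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array or an object with a 'records' array")
    return payload


as_array = load_records('[{"approved": true, "label": "hallucination"}]')
as_object = load_records('{"records": [{"approved": true, "label": "hallucination"}]}')
print(as_array == as_object)
```

Both forms yield the same list, so downstream code only has to handle one shape.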

The JSON output includes:

top-level miss counts;
misses grouped by scorer, model, and domain;
per-case action recommendations;
a privacy block confirming that raw prompt, response, and evidence text are not included.
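
The grouped miss counts listed above can be pictured as simple tallies over classified cases. The sketch below assumes a miss is a `false_negative` or `false_positive`; the case dictionaries, the `"rules"` scorer, and the domain values are made up for illustration and do not reflect the real output schema.

```python
from collections import Counter

# Hypothetical, already-classified cases: metadata only, no raw text.
cases = [
    {"outcome": "false_negative", "scorer": "nli", "model": "customer-model", "domain": "billing"},
    {"outcome": "false_positive", "scorer": "nli", "model": "customer-model", "domain": "support"},
    {"outcome": "correct_halt", "scorer": "rules", "model": "customer-model", "domain": "billing"},
]

# Assumption: only these two outcomes count as misses.
MISS_OUTCOMES = {"false_negative", "false_positive"}
misses = [case for case in cases if case["outcome"] in MISS_OUTCOMES]

misses_total = len(misses)
missed_by_scorer = dict(Counter(case["scorer"] for case in misses))
missed_by_model = dict(Counter(case["model"] for case in misses))
missed_by_domain = dict(Counter(case["domain"] for case in misses))

print(misses_total, missed_by_scorer, missed_by_model, missed_by_domain)
```

Grouping by scorer, model, and domain separately makes it easier to see whether misses cluster around one component or one subject area.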

The forensics report is the core file/export surface. The richer safety dashboard remains the UI/operations packet covering halt rates, drift alerts, controls, and compliance exports.

API¶

director_ai.core.observability.forensics.ForensicsCase `dataclass` ¶

ForensicsCase(case_id: str, outcome: str, approved: bool, expected_label: str, score: float, threshold: float, margin: float, scorer: str, model: str, model_revision: str, domain: str, knowledge_state: str, evidence_count: int, unsupported_claims: int, reason: str, recommended_action: str)

One tenant-safe reviewed guard decision for operator forensics.

to_dict ¶

to_dict() -> dict[str, str | int | float | bool]

Return a JSON-compatible tenant-safe case payload.

director_ai.core.observability.forensics.ForensicsReport `dataclass` ¶

ForensicsReport(total_records: int, labelled_records: int, misses_total: int, false_negatives: int, false_positives: int, missed_by_scorer: dict[str, int], missed_by_model: dict[str, int], missed_by_domain: dict[str, int], cases: tuple[ForensicsCase, ...])

Tenant-safe scorer-miss report for a reviewed decision window.

to_dict ¶

to_dict() -> dict[str, Any]

Return a JSON-compatible report payload.

director_ai.core.observability.forensics.build_forensics_report ¶

build_forensics_report(records: Sequence[Mapping[str, object]]) -> ForensicsReport

Build a scorer-miss report from tenant-safe eval/reviewer records.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `records` | `Sequence[Mapping[str, object]]` | Eval-trace records or JSON objects containing at least approval, score, threshold, scorer/model metadata, and optionally a reviewer label. The function accepts both `director.eval.*` keys and plain aliases such as `approved` or `label` so exports can be joined without rewriting. | required |
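
The key-alias behaviour described for `records` could be sketched as a lookup that tries the namespaced key and then the plain alias. `get_field` is a hypothetical helper, and the precedence order (namespaced key first) is an assumption; the source only says that both forms are accepted.

```python
from typing import Any, Mapping

_MISSING = object()  # sentinel, so stored False/None values are not mistaken for "absent"


def get_field(record: Mapping[str, Any], name: str, default: Any = None) -> Any:
    """Look up `name` under `director.eval.<name>` first, then under the plain alias."""
    for key in (f"director.eval.{name}", name):
        value = record.get(key, _MISSING)
        if value is not _MISSING:
            return value
    return default


namespaced = {"director.eval.approved": True, "label": "hallucination"}
plain = {"approved": False}

print(get_field(namespaced, "approved"), get_field(namespaced, "label"))
print(get_field(plain, "approved"), get_field(plain, "label"))
```

A lookup of this kind lets eval-trace exports and reviewer spreadsheets be merged into one record without renaming columns.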

director_ai.core.observability.forensics.render_forensics_markdown ¶

render_forensics_markdown(report: ForensicsReport) -> str

Render a Markdown scorer-miss report for operator review.

director_ai.core.observability.forensics.render_forensics_text ¶

render_forensics_text(report: ForensicsReport) -> str

Render a plain-text scorer-miss report for CLI output.
