Safety Operations Dashboard
The safety operations dashboard turns tenant-safe SafetyEvent JSONL and
calibration feedback into an immediate halt-rate view.
It is intended for the first response to drift, stale knowledge, and repeated false-positive halts:
- per-tenant event count, halt count, halt rate, false-positive count, and alert state
- top contradiction sources across recent halts
- recent halt evidence with score, reason, source, and suggested operator action
- a ready-to-run retune command for recent labelled feedback
- a one-click retune action in the wizard that turns labelled feedback into a tuned profile overlay with threshold guidance and confidence notes
Launch The UI
The same panel is also available inside:
Open the Safety Ops tab, paste SafetyEvent JSONL, paste optional feedback
JSONL, and render the tables. Use Retune from Feedback when the feedback
rows include prompt, response, and a human verdict.
Text Mode
Use text mode when running over SSH or in CI:
director-ai safety-dashboard \
  --text \
  --events safety_events.jsonl \
  --feedback recent_feedback.jsonl
The command prints tenant halt rates, top contradiction sources, recent evidence, and the retune command.
Trust Console Report
The same tenant-safe parser can build a customer-facing Trust Console report for security reviews, pilots, and procurement questionnaires. The report includes aggregate halt metrics, tenant alert state, recent evidence references, and operator-supplied readiness controls. It never serialises raw prompt text, response text, customer identifiers, media payloads, or feedback payloads.
from pathlib import Path

from director_ai.ui import TrustControl, build_trust_console_report

# Read the tenant-safe SafetyEvent JSONL export.
events_jsonl = Path("safety_events.jsonl").read_text(encoding="utf-8")

report = build_trust_console_report(
    events_jsonl,
    controls=[
        TrustControl(
            control="PII redaction",
            status="passed",
            evidence_ref="docs/BENCHMARKS.md#pii-redaction",
            owner="security",
            updated_at="2026-05-17",
        ),
        TrustControl(
            control="Article 15 report template",
            status="passed",
            evidence_ref="docs-site/guide/compliance-reporting.md",
        ),
    ],
    generated_at="2026-05-17T12:05:00Z",
)

payload = report.to_dict()
markdown = report.to_markdown()

# The report must never carry raw event text.
assert payload["privacy"]["raw_event_text_included"] is False
Control statuses are passed, warning, failing, or not_applicable.
Any failing control marks the report risk level as critical; tenant alert
states or warning controls mark it as attention_required.
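The precedence described above can be sketched in a few lines. This is an illustration only, not the library's implementation: `report_risk_level` is a hypothetical helper, and the baseline level name `"ok"` is a placeholder because this guide does not name the level used when nothing is failing or alerting.

```python
from typing import Iterable


def report_risk_level(control_statuses: Iterable[str], tenant_alerting: bool) -> str:
    """Hypothetical helper: derive a risk level from control statuses."""
    statuses = set(control_statuses)
    # Any failing control dominates everything else.
    if "failing" in statuses:
        return "critical"
    # Tenant alert states or warning controls raise attention.
    if tenant_alerting or "warning" in statuses:
        return "attention_required"
    # Assumption: the baseline name is not given in this guide.
    return "ok"


print(report_risk_level(["passed", "warning"], tenant_alerting=False))
# attention_required
```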
Observability Operations Report
Use the operations report when the deployment gate needs one tenant-safe packet that combines halt forensics, drift alerts, readiness controls, and compliance export references:
from pathlib import Path

from director_ai.ui import (
    ComplianceExportRef,
    TrustControl,
    build_observability_operations_report,
)

# Read the tenant-safe SafetyEvent JSONL export.
events_jsonl = Path("safety_events.jsonl").read_text(encoding="utf-8")

report = build_observability_operations_report(
    events_jsonl,
    controls=[
        TrustControl(
            control="Trace retention",
            status="passed",
            evidence_ref="runbooks/trace-retention.md",
        ),
    ],
    compliance_exports=[
        ComplianceExportRef(
            standard="EU AI Act Article 15",
            name="30-day operations export",
            status="available",
            evidence_ref="reports/article15-current.md",
        ),
    ],
    drift_alert_threshold=0.10,
)

payload = report.to_dict()
markdown = report.to_markdown()

# The report must never carry raw event text.
assert payload["privacy"]["raw_event_text_included"] is False
The drift alert uses the first and second halves of each tenant's recent event stream as baseline and current windows. It requires enough samples in both windows, ignores feedback rows for halt-rate drift, and reports mild, moderate, or severe drift based on rate change.
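The split-half comparison can be illustrated with a small standalone sketch. It is not the library's implementation: `split_half_drift` is a hypothetical name, the minimum of five samples per window is an assumption, and so are the severity bands (one, two, and three times the threshold) and the use of the absolute rate change. Only the 0.10 threshold comes from the example above.

```python
from typing import Optional, Sequence


def split_half_drift(
    halted: Sequence[bool],
    threshold: float = 0.10,
    min_samples: int = 5,  # assumption: the real minimum is not documented here
) -> Optional[str]:
    """Classify halt-rate drift for one tenant.

    `halted` holds chronological halt flags for the tenant's events,
    with feedback rows already excluded.
    """
    mid = len(halted) // 2
    baseline, current = halted[:mid], halted[mid:]
    if len(baseline) < min_samples or len(current) < min_samples:
        return None  # not enough samples in both windows
    change = abs(sum(current) / len(current) - sum(baseline) / len(baseline))
    # Assumed severity bands, expressed as multiples of the threshold.
    if change < threshold:
        return "none"
    if change < 2 * threshold:
        return "mild"
    if change < 3 * threshold:
        return "moderate"
    return "severe"


# Baseline: 0 halts in 20 events. Current: 5 halts in 20 events (rate 0.25).
print(split_half_drift([False] * 20 + [True] * 5 + [False] * 15))
# moderate
```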
Alert Thresholds
The defaults are intentionally conservative:
- halt-rate alert: 0.15
- false-positive alert: 0.05
Override them when a deployment has a known review cadence:
director-ai safety-dashboard \
  --text \
  --events safety_events.jsonl \
  --halt-alert-threshold 0.25 \
  --false-positive-alert-threshold 0.10
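To see how the two thresholds interact, the following sketch evaluates them against precomputed rates. `fired_alerts` is a hypothetical helper, and the strict `>` comparison is an assumption. The guide does not say whether the dashboard alerts at or above the threshold, nor how it normalises the false-positive rate, so the caller supplies both rates.

```python
from typing import List


def fired_alerts(
    halt_rate: float,
    false_positive_rate: float,
    halt_threshold: float = 0.15,
    false_positive_threshold: float = 0.05,
) -> List[str]:
    """Hypothetical helper: list which alerts fire for one tenant."""
    alerts = []
    # Assumption: an alert fires only when the rate exceeds the threshold.
    if halt_rate > halt_threshold:
        alerts.append("halt_rate")
    if false_positive_rate > false_positive_threshold:
        alerts.append("false_positive")
    return alerts


# Defaults flag a 20% halt rate; the relaxed overrides do not.
print(fired_alerts(0.20, 0.02))
print(fired_alerts(0.20, 0.02, halt_threshold=0.25, false_positive_threshold=0.10))
```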
Prometheus And Grafana Mixin
The deployable safety-operations mixin lives under deploy/observability/:
- safety-ops-prometheus-rules.yml
- safety-ops-grafana-dashboard.json
Load the rules from Prometheus:
Import safety-ops-grafana-dashboard.json into Grafana and select the same
Prometheus data source that scrapes /v1/metrics/prometheus.
The mixin uses these metric families:
| Metric | Purpose |
|---|---|
| director_ai_halts_total | Halt-rate numerator |
| director_ai_reviews_total | Halt-rate denominator |
| director_ai_feedback_total{outcome="false_positive"} | False-positive feedback |
| director_ai_kb_stale_sources | Stale source count |
| director_ai_retune_recommended | Retune recommendation state |
| director_ai_retune_recommendations_total | Retune recommendation count |
The alert rules guard the denominator for low-traffic services, so an empty or near-empty review count during startup does not cause a division by zero.
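The mixin's rules are Prometheus rules, but the idea behind a denominator guard is easy to show in Python. The sketch below is an analogy, not a transcription of the shipped rules: `guarded_halt_rate` is a hypothetical helper and the minimum of 20 reviews is an assumed value.

```python
from typing import Optional


def guarded_halt_rate(
    halts: float,
    reviews: float,
    min_reviews: float = 20,  # assumption: the shipped guard value may differ
) -> Optional[float]:
    """Return halts / reviews, or None when there is too little traffic."""
    # Skip the calculation when the denominator is zero or too small to
    # give a meaningful rate.
    if reviews <= 0 or reviews < min_reviews:
        return None
    return halts / reviews


print(guarded_halt_rate(1, 0))   # None: no reviews yet
print(guarded_halt_rate(1, 1))   # None: one halt in one review is not a trend
print(guarded_halt_rate(5, 50))  # 0.1
```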
Input Shape
Each event should be one tenant-safe JSON object per line. Native
SafetyEvent.to_dict() records work directly:
{"event_id":"e1","tenant_id":"tenant-a","policy_decision":"halt","halt_reason":"contradiction","observed_score":0.22,"trace_attribution":{"fact_source":"kb://physics"},"tenant_safe_explanation":"Refresh the cited fact."}
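Records of this shape can be aggregated with the standard library alone. The sketch below computes per-tenant halt rates and the most frequent contradiction sources. It is not the dashboard's parser: `summarise_events` is a hypothetical helper, and the `"allow"` decision in the sample data is an assumed value, since only `"halt"` appears in this guide.

```python
import json
from collections import Counter


def summarise_events(events_jsonl: str, top_n: int = 3):
    """Hypothetical helper: per-tenant halt rates and top halt sources."""
    totals, halts, sources = Counter(), Counter(), Counter()
    for line in events_jsonl.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        tenant = event.get("tenant_id", "unknown")
        totals[tenant] += 1
        if event.get("policy_decision") == "halt":
            halts[tenant] += 1
            source = (event.get("trace_attribution") or {}).get("fact_source")
            if source:
                sources[source] += 1
    rates = {tenant: halts[tenant] / totals[tenant] for tenant in totals}
    return rates, sources.most_common(top_n)


sample = "\n".join([
    '{"event_id":"e1","tenant_id":"tenant-a","policy_decision":"halt","trace_attribution":{"fact_source":"kb://physics"}}',
    '{"event_id":"e2","tenant_id":"tenant-a","policy_decision":"allow"}',  # "allow" is assumed
    '{"event_id":"e3","tenant_id":"tenant-b","policy_decision":"halt","trace_attribution":{"fact_source":"kb://physics"}}',
])
rates, top_sources = summarise_events(sample)
print(rates)        # {'tenant-a': 0.5, 'tenant-b': 1.0}
print(top_sources)  # [('kb://physics', 2)]
```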
Feedback rows can use the calibration format:
{"event_id":"e1","tenant_id":"tenant-a","guardrail_approved":false,"human_approved":true,"source":"kb://physics"}
Rows where the guard rejected an answer but the human accepted it count as false positives for the tenant.
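The false-positive rule can be written as a short filter over the feedback rows. `count_false_positives` is a hypothetical helper that uses only the field names shown in the calibration format above. Falling back to `"unknown"` for a missing `tenant_id` is an assumption.

```python
import json
from collections import Counter


def count_false_positives(feedback_jsonl: str) -> Counter:
    """Hypothetical helper: count false positives per tenant."""
    counts = Counter()
    for line in feedback_jsonl.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        # False positive: the guard rejected the answer, the human accepted it.
        if row.get("guardrail_approved") is False and row.get("human_approved") is True:
            counts[row.get("tenant_id", "unknown")] += 1
    return counts


feedback = "\n".join([
    '{"event_id":"e1","tenant_id":"tenant-a","guardrail_approved":false,"human_approved":true}',
    '{"event_id":"e2","tenant_id":"tenant-a","guardrail_approved":false,"human_approved":false}',
    '{"event_id":"e3","tenant_id":"tenant-b","guardrail_approved":true,"human_approved":true}',
])
print(count_false_positives(feedback))  # Counter({'tenant-a': 1})
```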
Rows used for one-click retuning also need the reviewed prompt and response:
{"prompt":"What is the refund window?","response":"Refunds are available for 30 days.","guardrail_approved":true,"human_approved":true,"domain":"support"}
The UI requires at least four labelled rows before it emits an overlay. Mixed approved and rejected examples produce more reliable threshold guidance; single-class feedback is allowed but marked provisional.
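A pre-flight check along these lines can tell an operator whether pasted feedback is likely to yield an overlay. This is a sketch under stated assumptions: `retune_readiness` and its return values are hypothetical, and "mixed" is taken to mean that the human verdicts include both approvals and rejections.

```python
from typing import Dict, Iterable


def retune_readiness(rows: Iterable[Dict], min_rows: int = 4) -> str:
    """Hypothetical helper: classify feedback as insufficient, provisional, or ready."""
    # A row is usable when it has a prompt, a response, and a boolean verdict.
    labelled = [
        row for row in rows
        if row.get("prompt") and row.get("response")
        and isinstance(row.get("human_approved"), bool)
    ]
    if len(labelled) < min_rows:
        return "insufficient"
    verdicts = {row["human_approved"] for row in labelled}
    # Single-class feedback is allowed but should be treated as provisional.
    return "ready" if len(verdicts) == 2 else "provisional"


approved = {"prompt": "What is the refund window?",
            "response": "Refunds are available for 30 days.",
            "guardrail_approved": True, "human_approved": True}
rejected = dict(approved, human_approved=False)

print(retune_readiness([approved] * 3))              # insufficient
print(retune_readiness([approved] * 4))              # provisional
print(retune_readiness([approved] * 3 + [rejected])) # ready
```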