Types & Dataclasses¶
Shared data types returned by the scorer, agent, and streaming APIs. All are frozen or standard dataclasses.
CoherenceScore¶
The primary return type from CoherenceScorer.review() and score().
| Field | Type | Description |
|---|---|---|
| score | float | Composite coherence score (0.0–1.0) |
| approved | bool | Whether score >= threshold |
| h_logical | float | Logical divergence (NLI contradiction probability) |
| h_factual | float | Factual divergence (RAG deviation) |
| warning | bool | True if score is between threshold and soft_limit |
| evidence | ScoringEvidence \| None | Retrieved evidence and scoring details |
| strict_mode_rejected | bool | True if rejected because NLI was unavailable in strict mode |
| cross_turn_divergence | float \| None | Cross-turn NLI score (set when session context exists) |
| verdict_confidence | float \| None | Guardrail confidence in its own verdict [0, 1] (v3.10.0) |
| nli_model_confidence | float \| None | NLI softmax entropy-based confidence (v3.10.0) |
| signal_agreement | float \| None | Agreement between h_logical and h_factual [0, 1] (v3.10.0) |
| contradiction_index | float \| None | Cross-turn self-contradiction severity [0, 1] (v3.10.0) |
| injection_risk | float \| None | Intent-grounded injection risk [0, 1] (when detection enabled) |
```python
approved, score = scorer.review(query, response)
print(f"Score: {score.score:.3f}")
print(f"H_logical: {score.h_logical:.3f}")
print(f"H_factual: {score.h_factual:.3f}")
print(f"Warning: {score.warning}")
if score.evidence:
    for chunk in score.evidence.chunks:
        print(f"  Source: {chunk.text[:80]}")
```
ReviewResult¶
Return type from CoherenceAgent.process().
| Field | Type | Description |
|---|---|---|
| output | str | Best approved response (or fallback) |
| coherence | CoherenceScore \| None | Coherence score of the response |
| halted | bool | Whether the safety kernel halted |
| candidates_evaluated | int | Number of candidates generated |
| fallback_used | bool | Whether a fallback was activated |
| halt_evidence | HaltEvidence \| None | Structured halt reason |
| safety_events | tuple[SafetyEvent, ...] | Tenant-safe hook decisions emitted during review |
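A minimal usage sketch, assuming a configured CoherenceAgent named agent; the exact argument shape of process() is an assumption here:

```python
result = agent.process(prompt)

if result.halted:
    # The safety kernel stopped generation; inspect the structured reason.
    if result.halt_evidence:
        print(f"Halted: {result.halt_evidence.reason}")
        print(f"Suggested action: {result.halt_evidence.suggested_action}")
else:
    print(result.output)
    if result.coherence:
        print(f"Score: {result.coherence.score:.3f}")

print(f"Candidates evaluated: {result.candidates_evaluated}")
print(f"Fallback used: {result.fallback_used}")
```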
ScoringEvidence¶
Evidence collected during scoring — retrieved KB chunks, NLI details, and attribution.
| Field | Type | Description |
|---|---|---|
| chunks | list[EvidenceChunk] | Top-K retrieved chunks |
| nli_premise | str | NLI premise text used |
| nli_hypothesis | str | NLI hypothesis text used |
| nli_score | float | Raw NLI divergence score |
| chunk_scores | list[float] \| None | Per-chunk NLI scores |
| premise_chunk_count | int | Number of premise chunks |
| hypothesis_chunk_count | int | Number of hypothesis chunks |
| attributions | list[ClaimAttribution] \| None | Per-claim source attribution |
| token_count | int \| None | NLI token consumption |
| estimated_cost_usd | float \| None | Estimated NLI inference cost |
| claim_coverage | float \| None | Fraction of claims supported by source |
| per_claim_divergences | list[float] \| None | Per-claim divergence scores |
| claims | list[str] \| None | Decomposed atomic claims |
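A sketch of inspecting this evidence on a score, reusing the scorer.review() call from the CoherenceScore example above:

```python
approved, score = scorer.review(query, response)

if score.evidence:
    ev = score.evidence
    print(f"NLI divergence: {ev.nli_score:.3f}")
    print(f"Chunks retrieved: {len(ev.chunks)}")
    if ev.claims and ev.per_claim_divergences:
        # One divergence per decomposed claim, in claim order.
        for claim, div in zip(ev.claims, ev.per_claim_divergences):
            print(f"  {div:.3f}  {claim}")
    if ev.estimated_cost_usd is not None:
        print(f"Estimated NLI cost: ${ev.estimated_cost_usd:.6f}")
```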
EvidenceChunk¶
A single chunk of retrieved evidence.
| Field | Type | Description |
|---|---|---|
| text | str | Chunk text content |
| distance | float | Similarity distance (lower = more relevant) |
| source | str | Source identifier (default "") |
ClaimAttribution¶
Maps a summary claim to its source sentence.
| Field | Type | Description |
|---|---|---|
| claim | str | The atomic claim |
| claim_index | int | Index of the claim in the decomposed list |
| source_sentence | str | Best-matching source sentence |
| source_index | int | Index of the source sentence |
| divergence | float | NLI divergence score (lower = better support) |
| supported | bool | Whether the claim is supported |
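A short sketch that filters the attribution list down to unsupported claims, with score as returned by scorer.review() above:

```python
if score.evidence and score.evidence.attributions:
    for attr in score.evidence.attributions:
        if not attr.supported:
            print(f"Unsupported claim #{attr.claim_index}: {attr.claim}")
            print(f"  Closest source: {attr.source_sentence}")
            print(f"  Divergence: {attr.divergence:.3f}")
```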
HaltEvidence¶
Structured halt reason with evidence chunks.
| Field | Type | Description |
|---|---|---|
| reason | str | Halt mechanism that triggered |
| last_score | float | Coherence score at halt point |
| evidence_chunks | list[EvidenceChunk] | Contradicting chunks |
| nli_scores | list[float] \| None | NLI scores at halt point |
| suggested_action | str | Recommended action (e.g., "retry with KB context") |
| trace_attribution | HaltTraceAttribution \| None | Fact source, retrieval path, scorer path, token offset, threshold, and halt-margin data |
| counterfactual_diagnostic | CounterfactualHaltDiagnostic \| None | Single-fact change diagnostic for the halt |
```python
if session.halt_evidence_structured:
    ev = session.halt_evidence_structured
    print(f"Reason: {ev.reason}")
    print(f"Score: {ev.last_score:.3f}")
    if ev.trace_attribution:
        print(f"Token: {ev.trace_attribution.token_offset}")
    if ev.counterfactual_diagnostic and ev.counterfactual_diagnostic.best_change:
        print(ev.counterfactual_diagnostic.best_change.fact_source)
    for chunk in ev.evidence_chunks:
        print(f"  {chunk.text[:80]} (distance={chunk.distance:.3f})")
```
HaltTraceAttribution¶
Trace coordinates for the halt decision.
| Field | Type | Description |
|---|---|---|
| fact_source | str | Source label from the contradicting evidence chunks |
| retrieval_path | str | Retrieval path used to gather evidence for the halt |
| scorer_path | str | Scorer method that produced the structured review |
| token_offset | int | Token index where the halt condition triggered |
| threshold | float \| None | Limit that was crossed |
| causal_contribution | float | Absolute margin beyond the limit |
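A sketch that reads all six coordinates, with ev being the HaltEvidence from the example above:

```python
trace = ev.trace_attribution
if trace:
    print(f"Fact source:  {trace.fact_source}")
    print(f"Retrieval:    {trace.retrieval_path}")
    print(f"Scorer:       {trace.scorer_path}")
    print(f"Token offset: {trace.token_offset}")
    if trace.threshold is not None:
        # causal_contribution is the absolute margin beyond the crossed limit.
        print(f"Crossed {trace.threshold:.3f} by {trace.causal_contribution:.3f}")
```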
SafetyEvent¶
Tenant-safe halt or policy decision emitted by a runtime hook. The schema is shared by streaming, containment, attestation, ontology, trajectory, and cyber-physical paths so audit logs and trace tooling can read one record shape.
| Field | Type | Description |
|---|---|---|
| schema_version | str | Frozen schema id, currently director.safety_event.v1 |
| event_id | str | Opaque event id |
| timestamp | str | RFC-3339 UTC timestamp |
| request_id | str | Request correlation id, if available |
| tenant_id | str | Tenant id, if available |
| hook_id | str | Stable hook id such as streaming.kernel |
| hook_scope | str | One of streaming, containment, attestation, ontology, trajectory, cyber_physical, swarm, or agent |
| policy_decision | str | One of allow, warn, halt, or block |
| halt_reason | str | Machine-readable reason |
| threshold | float \| None | Decision threshold, when relevant |
| observed_score | float \| None | Score that drove the decision, when relevant |
| latency_ms | float \| None | Hook latency |
| evidence_refs | tuple[str, ...] | Evidence identifiers only; no raw fact text |
| tenant_safe_explanation | str | Short operator-facing explanation safe for tenant logs |
| trace_attribution | HaltTraceAttribution \| None | Trace coordinates for halt decisions |
| attributes | dict[str, str] | Additive string metadata for hook-specific fields |
```python
from director_ai.core.safety_event import SafetyEvent

event = SafetyEvent.from_halt_evidence(
    evidence,
    hook_id="streaming.kernel",
    request_id="req-123",
    tenant_id="tenant-1",
)
payload = event.to_dict()
```
Runtime hook verdicts now carry this schema directly:
| Hook | Event field |
|---|---|
| Streaming | TokenEvent.safety_event, StreamSession.safety_events |
| Containment | ContainmentVerdict.safety_event, plus ReviewResult.safety_events when attached to CoherenceAgent |
| Cyber-physical | GroundingVerdict.safety_event |
| Attestation | PassportVerdict.safety_event |
| Ontology | OntologyViolation.safety_event |
| Trajectory | PreflightVerdict.safety_event |
| Swarm | HandoffResult.safety_event |
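Because every hook emits the same record shape, audit code can stay generic. A sketch that appends the events from a ReviewResult to a JSON-lines log; the file name is illustrative, and to_dict() is the serializer shown above:

```python
import json

# result is a ReviewResult from CoherenceAgent.process().
with open("safety_events.jsonl", "a") as log:
    for event in result.safety_events:
        log.write(json.dumps(event.to_dict()) + "\n")
        if event.policy_decision in ("halt", "block"):
            print(f"[{event.hook_scope}] {event.halt_reason}: "
                  f"{event.tenant_safe_explanation}")
```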
CounterfactualHaltDiagnostic¶
Answer to "what single fact change would have prevented this halt?"
| Field | Type | Description |
|---|---|---|
| question | str | Diagnostic question that was answered |
| observed_score | float | Stream score at the halt point |
| threshold | float | Halt threshold used by the counterfactual graph |
| best_change | CounterfactualFactChange \| None | First single-fact branch that prevents the halt |
| candidates | list[CounterfactualFactChange] | Candidate single-fact branches |
CounterfactualFactChange¶
Candidate single-fact intervention.
| Field | Type | Description |
|---|---|---|
| fact_source | str | Source label for the fact branch |
| original_fact | str | Retrieved fact text at halt time |
| proposed_fact | str | Proposed replacement fact from the generated claim |
| required_score_delta | float | Minimum score increase needed to prevent the halt |
| prevented_halt | bool | Whether this branch satisfies the halt invariant |
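A sketch walking all candidate branches, with ev being the HaltEvidence from the example above:

```python
diag = ev.counterfactual_diagnostic
if diag:
    print(diag.question)
    print(f"Observed {diag.observed_score:.3f} vs threshold {diag.threshold:.3f}")
    for change in diag.candidates:
        status = "prevents halt" if change.prevented_halt else "insufficient"
        print(f"  [{status}] {change.fact_source} "
              f"(needs +{change.required_score_delta:.3f})")
        print(f"    original: {change.original_fact[:60]}")
        print(f"    proposed: {change.proposed_fact[:60]}")
```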
InjectionResult¶
Return type from InjectionDetector.detect() and ProductionGuard.check_injection().
| Field | Type | Description |
|---|---|---|
| injection_detected | bool | Whether combined score exceeds threshold |
| injection_risk | float | Aggregated injection risk [0, 1] |
| intent_coverage | float | Fraction of grounded claims |
| total_claims | int | Total decomposed claims |
| grounded_claims | int | Claims aligned with intent |
| drifted_claims | int | Claims deviating from intent |
| injected_claims | int | Claims with no traceability to intent |
| claims | list[InjectedClaim] | Per-claim breakdown |
| input_sanitizer_score | float | Stage 1 pattern-match score |
| combined_score | float | Weighted Stage 1 + Stage 2 |
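A minimal sketch, assuming a configured InjectionDetector named detector; the positional arguments to detect() are an assumption here:

```python
# Hypothetical call shape: compare the user's intent against the response.
result = detector.detect(user_intent, model_response)

if result.injection_detected:
    print(f"Injection risk: {result.injection_risk:.3f} "
          f"(combined: {result.combined_score:.3f}, "
          f"sanitizer: {result.input_sanitizer_score:.3f})")
    print(f"Coverage: {result.intent_coverage:.0%} of {result.total_claims} claims "
          f"(grounded={result.grounded_claims}, drifted={result.drifted_claims}, "
          f"injected={result.injected_claims})")
```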
InjectedClaim¶
Per-claim injection attribution from InjectionDetector.
| Field | Type | Description |
|---|---|---|
| claim | str | The atomic claim text |
| claim_index | int | Position in decomposed list |
| intent_divergence | float | Forward NLI: intent → claim |
| reverse_divergence | float | Reverse NLI: claim → intent |
| bidirectional_divergence | float | min(forward, reverse) |
| traceability | float | Content-word overlap with intent |
| entity_match | float | Named-entity overlap with intent |
| verdict | str | "grounded", "drifted", or "injected" |
| confidence | float | Verdict confidence [0, 1] |
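Continuing the detect() sketch above, a per-claim triage loop:

```python
for claim in result.claims:
    if claim.verdict != "grounded":
        print(f"[{claim.verdict} @ {claim.confidence:.2f}] {claim.claim}")
        print(f"  bidirectional divergence: {claim.bidirectional_divergence:.3f}")
        print(f"  traceability: {claim.traceability:.2f}, "
              f"entity match: {claim.entity_match:.2f}")
```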
Full API¶
director_ai.core.types.CoherenceScore dataclass¶
```python
CoherenceScore(
    score: float,
    approved: bool,
    h_logical: float,
    h_factual: float,
    evidence: ScoringEvidence | None = None,
    warning: bool = False,
    cross_turn_divergence: float | None = None,
    strict_mode_rejected: bool = False,
    verdict_confidence: float | None = None,
    nli_model_confidence: float | None = None,
    signal_agreement: float | None = None,
    contradiction_index: float | None = None,
    detected_task_type: str | None = None,
    escalated_to_judge: bool | None = None,
    nli_probs: dict[str, float] | None = None,
    retrieval_confidence: float | None = None,
    injection_risk: float | None = None,
)
```
Result of a coherence check on generated output.
attributions property¶
Per-claim source attribution with support/divergence.
claim_coverage property¶
Fraction of claims supported by source material (0-1).
unsupported_claims property¶
Claims not supported by any source — the hallucinated ones.
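A short sketch combining these properties into a hallucination check; the None guard on claim_coverage is an assumption for the case where attribution was not computed:

```python
approved, score = scorer.review(query, response)

if score.claim_coverage is not None and score.claim_coverage < 1.0:
    for claim in score.unsupported_claims:
        print(f"Hallucinated: {claim}")
```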
claim_provenance¶
Structured provenance for each claim.
Returns a list of dicts, one per claim:
```python
[
    {
        "claim": "Paris is the capital of France.",
        "supported": True,
        "source": "France is a country whose capital is Paris.",
        "divergence": 0.12,
        "source_index": 3,
    },
    ...
]
```
director_ai.core.types.ReviewResult dataclass¶
```python
ReviewResult(
    output: str,
    coherence: CoherenceScore | None,
    halted: bool,
    candidates_evaluated: int,
    fallback_used: bool = False,
    halt_evidence: HaltEvidence | None = None,
    safety_events: tuple[SafetyEvent, ...] = (),
)
```
Full review outcome from the CoherenceAgent pipeline.
director_ai.core.types.ScoringEvidence dataclass¶
```python
ScoringEvidence(
    chunks: list[EvidenceChunk],
    nli_premise: str,
    nli_hypothesis: str,
    nli_score: float,
    chunk_scores: list[float] | None = None,
    premise_chunk_count: int = 1,
    hypothesis_chunk_count: int = 1,
    claim_coverage: float | None = None,
    per_claim_divergences: list[float] | None = None,
    claims: list[str] | None = None,
    attributions: list[ClaimAttribution] | None = None,
    token_count: int | None = None,
    estimated_cost_usd: float | None = None,
)
```
Evidence collected during coherence scoring.
director_ai.core.types.EvidenceChunk dataclass¶
A single RAG retrieval result with relevance distance.
director_ai.core.types.ClaimAttribution dataclass¶
```python
ClaimAttribution(
    claim: str,
    claim_index: int,
    source_sentence: str,
    source_index: int,
    divergence: float,
    supported: bool,
)
```
Maps a summary claim to the source sentence that best supports/contradicts it.
director_ai.core.types.HaltEvidence dataclass¶
```python
HaltEvidence(
    reason: str,
    last_score: float,
    evidence_chunks: list[EvidenceChunk],
    nli_scores: list[float] | None = None,
    suggested_action: str = '',
    trace_attribution: HaltTraceAttribution | None = None,
    counterfactual_diagnostic: CounterfactualHaltDiagnostic | None = None,
)
```
Structured evidence returned when the agent halts.
director_ai.core.types.HaltTraceAttribution dataclass¶
```python
HaltTraceAttribution(
    fact_source: str = '',
    retrieval_path: str = '',
    scorer_path: str = '',
    token_offset: int = -1,
    threshold: float | None = None,
    causal_contribution: float = 0.0,
)
```
Trace coordinates for a halt decision.
director_ai.core.types.CounterfactualHaltDiagnostic dataclass¶
```python
CounterfactualHaltDiagnostic(
    question: str,
    observed_score: float,
    threshold: float,
    best_change: CounterfactualFactChange | None,
    candidates: list[CounterfactualFactChange],
)
```
Counterfactual answer for a halted stream.
director_ai.core.types.CounterfactualFactChange dataclass¶
```python
CounterfactualFactChange(
    fact_source: str,
    original_fact: str,
    proposed_fact: str,
    required_score_delta: float,
    prevented_halt: bool,
)
```
Single fact change that could prevent a halt.