Director-Lite — streaming halt in 3 lines¶
Director-AI's one differentiator is the token-level streaming halt: it stops a model's output before a hallucination finishes generating — a seatbelt that deploys before the crash, not an incident report after it.
director_ai.lite is the 30-second front door to exactly that. No model
download, no knowledge-base wiring, no config.
Priority
Director-AI publicly shipped token-level streaming halt in early 2026 and deposited it on Zenodo (March 2026) — ahead of the streaming-guardrail research literature.
from director_ai import StreamGuard
guard = StreamGuard(facts={"capital": "Paris is the capital of France."})
result = guard.guard(token_stream, prompt="What is the capital of France?")
print(result.output) # surviving text (halted tokens removed)
print(result.halted) # True if the stream was stopped
print(result.halt_reason) # e.g. "hard_limit (0.30 < 0.50)"
token_stream is any iterable of string tokens — wire it straight to your LLM's
streaming response.
One call¶
from director_ai import streaming_guard
result = streaming_guard(
token_stream,
facts={"capital": "Paris is the capital of France."},
prompt="What is the capital of France?",
)
How it relates to the full product¶
StreamGuard wraps the same Apache-2.0 StreamingKernel and CoherenceScorer
the full Director-AI uses. By default it runs in heuristic mode (no NLI
model) so it works anywhere with zero install — great for a first look, but
approximate.
To upgrade to model-backed accuracy, install the NLI extra and pass a configured scorer; the call site does not change:
from director_ai import CoherenceScorer, GroundTruthStore, StreamGuard
store = GroundTruthStore()
store.add("capital", "Paris is the capital of France.")
scorer = CoherenceScorer(ground_truth_store=store) # loads the NLI model
guard = StreamGuard(scorer=scorer, threshold=0.6)
From there, the rest of Director-AI — RAG grounding, the sealed evidence packet, the tamper-evident audit chain, multi-tenant isolation, the REST/gRPC server, and the CI quality gate — is one import away when you need it.
Parameters¶
facts |
mapping of key → grounded statement for factual scoring |
threshold |
coherence floor; the stream halts below it (default 0.5) |
window_size |
sliding-window length for trend-based halting (default 10) |
scorer |
optional pre-built scorer (e.g. model-backed NLI) — overrides the heuristic default |