Quickstart¶
Recommended Path¶
Start with the Python service and the local Chroma path; the scaffold and launch commands are walked through under CLI Quickstart below.
This path starts the default proxy on port 8080, the FastAPI service on port 8000,
and local Chroma persistence under ./director_guard/chroma.
Other Entry Points¶
| Method | Command | Use When |
|---|---|---|
| CLI scaffold | director-ai quickstart --profile medical | You want editable local files before running services |
| Base package | pip install director-ai | You only need the in-process Python guard API |
| Colab notebook | Open the notebook in Colab | You want a notebook walkthrough |
| Docker image | docker build -t director-ai . && docker run -p 8080:8080 director-ai | You are validating the packaged container |
| HF Spaces | Try demo (may be sleeping) | You only want to inspect the demo |
Installation¶
Pick an install footprint, starting from the smallest in-process option:
pip install director-ai # rules engine + heuristic (zero ML, <1ms)
pip install director-ai[embed] # + embedding scorer (~65% BA, 3ms CPU)
pip install director-ai[nli] # + FactCG NLI (75.6% BA, 14.6ms GPU) — recommended
pip install director-ai[nli,server] # + REST API server for production
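Once the [nli] extra is installed, the NLI backend is enabled per scorer with the same constructor flag used in the Batch Scoring example below:

from director_ai import CoherenceScorer

scorer = CoherenceScorer(threshold=0.6, use_nli=True)  # requires the [nli] extra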
For backend choices beyond the default, see the advanced backend matrix.
CLI Quickstart¶
Scaffold a working project in one command:
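This is the same command listed in the entry-point table above; medical is one of the profiles covered under Domain presets, alongside finance, legal, and creative.

director-ai quickstart --profile medical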
This creates director_guard/ with config.yaml, facts.txt, guard.py,
README.md, a Docker Compose file, local Chroma persistence, and an opt-in FactCG
ONNX profile.
Run the default proxy and FastAPI services:
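The exact launch command is part of the scaffold output; as a rough, unverified sketch (assuming the generated Docker Compose file defines both services), something like the following should work:

cd director_guard
docker compose up  # assumption: the scaffolded Compose file starts the proxy (8080) and the FastAPI service (8000)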
Run the ONNX service after placing exported model files in
director_guard/models/factcg-onnx/. See ONNX Artefacts for export commands and CPU/GPU wheel targets.
Rust, Go, Julia, Lean, TensorRT, and WASM paths are optional advanced runtimes. Start with the Python-only quickstart unless one of those runtimes is explicitly needed. See Runtime Boundaries.
Score a Response¶
from director_ai import CoherenceScorer, GroundTruthStore
store = GroundTruthStore()
store.add("capital", "Paris is the capital of France.")
scorer = CoherenceScorer(threshold=0.3, ground_truth_store=store)
# Correct answer — approved
approved, cs = scorer.review(
    "What is the capital of France?",
    "The capital of France is Paris.",
)
print(f"Approved: {approved}") # True
print(f"Score: {cs.score:.3f}") # ~0.44
# Hallucinated answer — rejected
approved, cs = scorer.review(
    "What is the capital of France?",
    "The capital of France is Berlin.",
)
print(f"Approved: {approved}") # False
print(f"Score: {cs.score:.3f}") # ~0.02
Guard an SDK Client¶
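The SDK wrapper itself is documented on the Integrations page. The sketch below is illustrative only: the guard_openai helper name and the HallucinationError import path are assumptions, not the confirmed API, and are shown simply to indicate where the on_fail modes in the table below apply.

from openai import OpenAI
from director_ai import HallucinationError            # import path assumed for this sketch
from director_ai.integrations import guard_openai     # hypothetical wrapper name; see Integrations

client = guard_openai(OpenAI(), on_fail="raise")       # "log" and "metadata" are the other modes

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(response.choices[0].message.content)
except HallucinationError as exc:
    print(f"Blocked: {exc}")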
Failure Modes¶
| Mode | Behavior |
|---|---|
| on_fail="raise" | Raises HallucinationError (default) |
| on_fail="log" | Logs warning, returns response unchanged |
| on_fail="metadata" | Stores score in context var for later inspection |
Streaming Halt¶
from director_ai import StreamingKernel

kernel = StreamingKernel(hard_limit=0.4, window_size=8)

def score_fn(accumulated_text):
    return 0.85  # your coherence scoring logic on the text so far

# Any iterator or stream of tokens works; defined inline here so the snippet runs.
token_generator = iter(["The", " capital", " of", " France", " is", " Paris", "."])

session = kernel.stream_tokens(token_generator, score_fn)
if session.halted:
    print(f"Halted at token {session.halt_index}: {session.halt_reason}")
Fallback Modes¶
from director_ai import CoherenceAgent
# Retrieval: return KB context when all candidates fail
agent = CoherenceAgent(fallback="retrieval")
# Disclaimer: prepend warning to best-rejected candidate
agent = CoherenceAgent(fallback="disclaimer")
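To exercise a fallback end to end, reuse the async entry point shown under Async Usage below (the question text is purely illustrative):

import asyncio
from director_ai import CoherenceAgent

agent = CoherenceAgent(fallback="retrieval")
result = asyncio.run(agent.aprocess("What is the capital of France?"))
print(result)  # returns KB context when every candidate fails review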
Batch Scoring¶
from director_ai import CoherenceScorer
scorer = CoherenceScorer(threshold=0.6, use_nli=True)
items = [
    ("What is 2+2?", "The answer is 4."),
    ("Capital of France?", "Paris is in Germany."),
]
results = scorer.review_batch(items)
for approved, score in results:
    print(f"approved={approved} score={score.score:.3f}")
When NLI is available, review_batch() batches the NLI pairs into two GPU forward passes; dialogue items fall back to sequential scoring.
Async Usage¶
import asyncio
from director_ai import CoherenceAgent
agent = CoherenceAgent(use_nli=True)
async def main():
    result = await agent.aprocess("What is the capital of France?")
    print(result)
asyncio.run(main())
Next Steps¶
- Scoring guide — thresholds, weights, NLI backends
- Streaming halt — halt mechanisms, on_halt callbacks
- KB ingestion — populate your knowledge base
- Integrations — OpenAI, Anthropic, LangChain, and more
- Production deployment — scaling, caching, monitoring
- Domain presets — medical, finance, legal, creative profiles
- Tutorials — 16 Jupyter notebooks from basics to production