Quickstart
Use this page when you want to run Director-AI, score a response, and protect one application path quickly. If you are still deciding whether the product fits your use case, read Product Overview first. If you are planning a pilot, use Evaluation Onboarding as the checklist.
Choose Your First Path
| Goal | Start with | Then read |
|---|---|---|
| Understand the product | Applications and Market Map | Product Overview |
| Score one answer | score() example below | Scoring |
| Wrap an existing app | guard() example below | SDK Guard |
| Protect a RAG bot | director-ai[vector] | KB Ingestion |
| Evaluate a pilot | labelled examples | Evaluation Onboarding |
| Deploy a service | director-ai quickstart --run | Production Guide |
| Prepare a production stack | director-ai quickstart --profile production | Monitoring |
Recommended Path
Start with the Python service and local Chroma path:
This starts the default proxy on port 8080, the FastAPI service on port 8000,
and local Chroma persistence under ./director_guard/chroma.
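Once the services are up, a quick TCP check can confirm that both ports accept connections. The sketch below uses only the Python standard library. The `port_open` helper is hypothetical and not part of Director-AI, and it only tests reachability, not service health.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports taken from the text above: proxy on 8080, FastAPI service on 8000.
    for name, port in [("proxy", 8080), ("FastAPI service", 8000)]:
        state = "listening" if port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

If either port reports "not reachable", check that nothing else is bound to it before restarting the quickstart.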
What You Should See
After the first run you should have:
- a guarded call path that can score a prompt/response pair;
- at least one governed fact loaded inline or through local Chroma;
- a rejection or low score for a deliberately wrong answer;
- a path to inspect the score, evidence, and failure action;
- no secrets printed in logs or notebook output.
If you are evaluating Director-AI for a team, save those five observations in the pilot evidence packet from Evaluation Onboarding.
Other Entry Points
| Method | Command | Use When |
|---|---|---|
| CLI scaffold | director-ai quickstart --profile medical | You want editable local files before running services |
| Base package | pip install director-ai | You only need the in-process Python guard API |
| Colab notebook | | You want a notebook walkthrough |
| Docker image | docker build -t director-ai . && docker run -p 8080:8080 director-ai | You are validating the packaged container |
| HF Spaces | Try demo (may be sleeping) | You only want to inspect the demo |
Installation
For the smallest in-process install, use the base package. Each extra below adds a scorer or the server; the quotes keep shells such as zsh from expanding the square brackets:
pip install director-ai                # rules engine + heuristic (zero ML, <1ms)
pip install "director-ai[embed]"       # + embedding scorer (~65% BA, 3ms CPU)
pip install "director-ai[nli]"         # + FactCG NLI (75.6% BA, 14.6ms GPU), recommended
pip install "director-ai[nli,server]"  # + REST API server for production
For backend choices beyond the default, use the advanced backend matrix.
CLI Quickstart
Scaffold a working project in one command:
Creates director_guard/ with config.yaml, facts.txt, guard.py,
README.md, Docker Compose, local Chroma persistence, and an opt-in FactCG
ONNX profile.
Run the default proxy and FastAPI services:
director-ai quickstart --run
For an authenticated production scaffold:
director-ai quickstart --profile production
Fill .env with DIRECTOR_API_KEY_TENANT_MAP, DIRECTOR_PROXY_API_KEYS,
DIRECTOR_LLM_API_URL, DIRECTOR_UPSTREAM_URL, DIRECTOR_KB_HMAC_KEYS, and
DIRECTOR_CORS_ORIGINS, then start the service:
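Before starting, it can help to confirm that every variable listed above has a value in .env. The following is a minimal preflight sketch, assuming a plain KEY=VALUE file format. The `missing_keys` helper is hypothetical and not part of the Director-AI CLI. It reports key names only and never prints values.

```python
from pathlib import Path

# Variable names copied from the text above.
REQUIRED_KEYS = [
    "DIRECTOR_API_KEY_TENANT_MAP",
    "DIRECTOR_PROXY_API_KEYS",
    "DIRECTOR_LLM_API_URL",
    "DIRECTOR_UPSTREAM_URL",
    "DIRECTOR_KB_HMAC_KEYS",
    "DIRECTOR_CORS_ORIGINS",
]


def missing_keys(env_text, required=REQUIRED_KEYS):
    """Return required keys that are absent or empty in KEY=VALUE text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    for key in missing_keys(text):
        print(f"missing or empty: {key}")
```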
The production scaffold enables NLI, model-backed fail-closed checks, tenant
routing, signed knowledge writes, audit/compliance/feedback stores, JSON logs,
authenticated metrics, rate limiting, and the review queue. To add Prometheus,
write the matching API key to secrets/director-api-key and run:
Run the ONNX service after placing exported model files in
director_guard/models/factcg-onnx/:
See ONNX Artefacts for export commands and CPU/GPU wheel targets.
Rust, Go, Julia, Lean, TensorRT, and WASM paths are optional advanced runtimes. Start with the Python-only quickstart unless one of those runtimes is explicitly needed. See Runtime Boundaries.
Score a Response
from director_ai import CoherenceScorer, GroundTruthStore

store = GroundTruthStore()
store.add("capital", "Paris is the capital of France.")
scorer = CoherenceScorer(threshold=0.3, ground_truth_store=store)

# Correct answer: approved
approved, cs = scorer.review(
    "What is the capital of France?",
    "The capital of France is Paris.",
)
print(f"Approved: {approved}")  # True
print(f"Score: {cs.score:.3f}")  # ~0.44

# Hallucinated answer: rejected
approved, cs = scorer.review(
    "What is the capital of France?",
    "The capital of France is Berlin.",
)
print(f"Approved: {approved}")  # False
print(f"Score: {cs.score:.3f}")  # ~0.02
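To protect an application path, the approval flag can gate what the caller sees. The sketch below assumes only the call shape shown above, `review(prompt, response)` returning `(approved, result)`. `guarded_answer` and `fake_review` are hypothetical names; the stand-in reviewer exists so the sketch runs without the package, and its `score >= 0.3` rule is an assumption about how the threshold is applied.

```python
from types import SimpleNamespace

REFUSAL = "I can't verify that answer against the known facts."


def guarded_answer(review, prompt, response, refusal=REFUSAL):
    """Return the response if review approves it, else a refusal message."""
    approved, _result = review(prompt, response)
    return response if approved else refusal


def fake_review(prompt, response):
    # Stand-in for scorer.review so the sketch runs without the package.
    # Scores mirror the approximate values in the example above.
    score = 0.44 if "Paris" in response else 0.02
    return score >= 0.3, SimpleNamespace(score=score)


question = "What is the capital of France?"
print(guarded_answer(fake_review, question, "The capital of France is Paris."))
print(guarded_answer(fake_review, question, "The capital of France is Berlin."))
```

In a real application, `scorer.review` would be passed in place of `fake_review`.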
Guard an SDK Client
import os

from director_ai import guard
from mistralai import Mistral

client = guard(
    Mistral(api_key=os.environ["MISTRAL_API_KEY"]),
    facts={"refund": "within 30 days"},
)
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What is the refund policy?"}],
)
Failure Modes
| Mode | Behavior |
|---|---|
| on_fail="raise" | Raises HallucinationError (default) |
| on_fail="log" | Logs warning, returns response unchanged |
| on_fail="metadata" | Stores score in context var for later inspection |
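The three modes can be pictured with a small standalone sketch. This is not the library's implementation: `apply_on_fail`, `SketchHallucinationError`, and `last_score` are hypothetical names, the `score >= threshold` pass rule is an assumption, and so is the choice to record the score on every call in "metadata" mode.

```python
import contextvars
import logging

logger = logging.getLogger("guard_sketch")
# Context variable that holds the most recent score in "metadata" mode.
last_score = contextvars.ContextVar("last_score", default=None)


class SketchHallucinationError(Exception):
    """Stand-in for the library's HallucinationError."""


def apply_on_fail(response, score, threshold=0.3, on_fail="raise"):
    if on_fail not in {"raise", "log", "metadata"}:
        raise ValueError(f"unknown on_fail mode: {on_fail!r}")
    if on_fail == "metadata":
        # Assumption: the score is recorded on every call, pass or fail.
        last_score.set(score)
        return response
    if score >= threshold:
        return response
    if on_fail == "raise":
        raise SketchHallucinationError(f"score {score:.2f} is below {threshold}")
    # Remaining mode is "log": warn and hand the response back unchanged.
    logger.warning("score %.2f is below %.2f", score, threshold)
    return response
```

With "metadata", the caller decides what to do after reading `last_score.get()`, which suits pipelines that need to attach the score to a trace rather than interrupt the request.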
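Streaming Halt
from director_ai import StreamingKernel

kernel = StreamingKernel(hard_limit=0.4, window_size=8)

# Placeholder token source; replace with the token stream from your model.
token_generator = iter(["The", " capital", " of", " France", " is", " Paris", "."])

def score_fn(accumulated_text):
    # Replace with your coherence scoring logic on the text so far.
    return 0.85

session = kernel.stream_tokens(token_generator, score_fn)
if session.halted:
    print(f"Halted at token {session.halt_index}: {session.halt_reason}")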
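For intuition, the halting idea can be sketched without the package. The code below is not StreamingKernel's implementation. It assumes that `hard_limit` stops the stream as soon as one score falls below it and that `window_size` bounds a sliding average; `window_floor`, `SketchSession`, and `stream_with_halt` are hypothetical names introduced only for this sketch.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class SketchSession:
    text: str = ""
    halted: bool = False
    halt_index: Optional[int] = None
    halt_reason: str = ""


def stream_with_halt(
    tokens: Iterable[str],
    score_fn: Callable[[str], float],
    hard_limit: float = 0.4,
    window_size: int = 8,
    window_floor: float = 0.5,  # hypothetical parameter, not from the source
) -> SketchSession:
    session = SketchSession()
    recent = deque(maxlen=window_size)  # keeps only the latest scores
    for index, token in enumerate(tokens):
        candidate = session.text + token
        score = score_fn(candidate)
        recent.append(score)
        window_full = len(recent) == window_size
        if score < hard_limit:
            reason = "hard_limit"
        elif window_full and sum(recent) / window_size < window_floor:
            reason = "window_average"
        else:
            session.text = candidate  # token accepted
            continue
        session.halted, session.halt_index, session.halt_reason = True, index, reason
        break
    return session


def demo_score(text: str) -> float:
    # Toy scorer: the score collapses once the text mentions Berlin.
    return 0.1 if "Berlin" in text else 0.85


session = stream_with_halt(["The", " capital", " is", " Berlin", "."], demo_score)
print(session)
```

The offending token is never appended to `session.text`, so the caller can return the text accepted so far or discard it entirely.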
Fallback Modes
from director_ai import CoherenceAgent
# Retrieval: return KB context when all candidates fail
agent = CoherenceAgent(fallback="retrieval")
# Disclaimer: prepend warning to best-rejected candidate
agent = CoherenceAgent(fallback="disclaimer")
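The two fallbacks differ only in what is returned once every candidate has been rejected. The sketch below illustrates that selection logic under stated assumptions: candidates are `(text, score)` pairs, a candidate passes when `score >= threshold`, and the disclaimer wording is invented. `choose_output` is a hypothetical helper, not the agent's actual code.

```python
DISCLAIMER = "[Unverified] "  # illustrative wording, not the library's text


def choose_output(candidates, threshold, fallback=None, kb_context=""):
    """Pick the best passing candidate, or apply the configured fallback.

    candidates is a list of (text, score) pairs.
    """
    if not candidates:
        return None
    passing = [c for c in candidates if c[1] >= threshold]
    if passing:
        return max(passing, key=lambda c: c[1])[0]
    if fallback == "retrieval":
        # Every candidate failed: return the knowledge-base context instead.
        return kb_context
    if fallback == "disclaimer":
        # Every candidate failed: warn, then show the best rejected candidate.
        best_rejected = max(candidates, key=lambda c: c[1])[0]
        return DISCLAIMER + best_rejected
    return None


rejected = [("The capital is Berlin.", 0.02), ("The capital is Lyon.", 0.10)]
facts = "Paris is the capital of France."
print(choose_output(rejected, 0.3, fallback="retrieval", kb_context=facts))
print(choose_output(rejected, 0.3, fallback="disclaimer"))
```

Retrieval favors showing only governed facts; the disclaimer mode keeps the model's wording but flags it, which may suit lower-risk applications.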
Batch Scoring
from director_ai import CoherenceScorer

scorer = CoherenceScorer(threshold=0.6, use_nli=True)
items = [
    ("What is 2+2?", "The answer is 4."),
    ("Capital of France?", "Paris is in Germany."),
]
results = scorer.review_batch(items)
for approved, score in results:
    print(f"approved={approved} score={score.score:.3f}")
review_batch() batches NLI pairs into 2 GPU forward passes when NLI is available. Dialogue items fall back to sequential scoring.
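After a batch run, the rejected pairs are usually the ones worth inspecting. The sketch below pairs each input with its result, assuming results come back in input order (the example above suggests this but does not state it). `rejected_items` is a hypothetical helper, and the stand-in results with made-up scores exist only so the sketch runs without the package.

```python
from types import SimpleNamespace


def rejected_items(items, results):
    """Return (prompt, response, score) for every pair that was not approved."""
    return [
        (prompt, response, score.score)
        for (prompt, response), (approved, score) in zip(items, results)
        if not approved
    ]


items = [
    ("What is 2+2?", "The answer is 4."),
    ("Capital of France?", "Paris is in Germany."),
]
# Stand-in for scorer.review_batch(items); the scores are invented.
results = [
    (True, SimpleNamespace(score=0.9)),
    (False, SimpleNamespace(score=0.1)),
]
for prompt, response, score in rejected_items(items, results):
    print(f"rejected: {prompt!r} -> {response!r} (score={score:.3f})")
```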
Async Usage
import asyncio

from director_ai import CoherenceAgent

agent = CoherenceAgent(use_nli=True)

async def main():
    result = await agent.aprocess("What is the capital of France?")
    print(result)

asyncio.run(main())
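When several prompts need processing, the same coroutine can be fanned out with a concurrency cap. The sketch below uses only `asyncio`. `process_all` and `fake_aprocess` are hypothetical names, and whether a single agent instance is safe to call concurrently is an assumption to verify before passing `agent.aprocess` in.

```python
import asyncio


async def fake_aprocess(prompt: str) -> str:
    # Stand-in for agent.aprocess; sleeps briefly instead of calling a model.
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


async def process_all(aprocess, prompts, limit=4):
    """Run aprocess over all prompts with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def one(prompt):
        async with semaphore:
            return await aprocess(prompt)

    # gather returns results in the same order as the prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


if __name__ == "__main__":
    answers = asyncio.run(process_all(fake_aprocess, ["q1", "q2", "q3"]))
    print(answers)
```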
Next Steps
- Scoring guide — thresholds, weights, NLI backends
- Streaming halt — halt mechanisms, on_halt callbacks
- KB ingestion — populate your knowledge base
- Integrations — OpenAI, Anthropic, LangChain, and more
- Production deployment — scaling, caching, monitoring
- Domain presets — medical, finance, legal, creative profiles
- Tutorials — 18 Jupyter notebooks from basics to production
- Notebook Gallery — use-case index across every published notebook