Quickstart¶
Recommended Path¶
Start with the Python service and the local Chroma path; the scaffold and launch commands are walked through under CLI Quickstart below.
This path starts the default proxy on port 8080, the FastAPI service on port 8000,
and local Chroma persistence under ./director_guard/chroma.
Other Entry Points¶
| Method | Command | Use When |
|---|---|---|
| CLI scaffold | director-ai quickstart --profile medical | You want editable local files before running services |
| Base package | pip install director-ai | You only need the in-process Python guard API |
| Colab notebook | Open the notebook in Colab | You want a notebook walkthrough |
| Docker image | docker build -t director-ai . && docker run -p 8080:8080 director-ai | You are validating the packaged container |
| HF Spaces | Try demo (may be sleeping) | You only want to inspect the demo |
Installation¶
Pick an install footprint, starting from the smallest in-process option:
pip install director-ai # rules engine + heuristic (zero ML, <1ms)
pip install director-ai[embed] # + embedding scorer (~65% BA, 3ms CPU)
pip install director-ai[nli] # + FactCG NLI (75.6% BA, 14.6ms GPU) — recommended
pip install director-ai[nli,server] # + REST API server for production
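Once the [nli] extra is installed, the NLI backend is enabled per scorer with the same constructor flag used in the Batch Scoring example below:

from director_ai import CoherenceScorer

scorer = CoherenceScorer(threshold=0.6, use_nli=True)  # requires the [nli] extra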
For backend choices beyond the default, see the advanced backend matrix.
CLI Quickstart¶
Scaffold a working project in one command:
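This is the same command listed in the entry-point table above; medical is one of the profiles covered under Domain presets, alongside finance, legal, and creative.

director-ai quickstart --profile medical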
This creates director_guard/ with config.yaml, facts.txt, guard.py,
README.md, a Docker Compose file, local Chroma persistence, and an opt-in FactCG
ONNX profile.
Run the default proxy and FastAPI services:
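The exact launch command is part of the scaffold output; as a rough, unverified sketch (assuming the generated Docker Compose file defines both services), something like the following should work:

cd director_guard
docker compose up  # assumption: the scaffolded Compose file starts the proxy (8080) and the FastAPI service (8000)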
Run the ONNX service after placing exported model files in
director_guard/models/factcg-onnx/. See ONNX Artefacts for export commands and CPU/GPU wheel targets.
Rust, Go, Julia, Lean, TensorRT, and WASM paths are optional advanced runtimes. Start with the Python-only quickstart unless one of those runtimes is explicitly needed. See Runtime Boundaries.
Score a Response¶
from director_ai import CoherenceScorer, GroundTruthStore
store = GroundTruthStore()
store.add("capital", "Paris is the capital of France.")
scorer = CoherenceScorer(threshold=0.3, ground_truth_store=store)
# Correct answer — approved
approved, cs = scorer.review(
    "What is the capital of France?",
    "The capital of France is Paris.",
)
print(f"Approved: {approved}") # True
print(f"Score: {cs.score:.3f}") # ~0.44
# Hallucinated answer — rejected
approved, cs = scorer.review(
    "What is the capital of France?",
    "The capital of France is Berlin.",
)
print(f"Approved: {approved}") # False
print(f"Score: {cs.score:.3f}") # ~0.02
Guard an SDK Client¶
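The SDK wrapper itself is documented on the Integrations page. The sketch below is illustrative only: the guard_openai helper name and the HallucinationError import path are assumptions, not the confirmed API, and are shown simply to indicate where the on_fail modes in the table below apply.

from openai import OpenAI
from director_ai import HallucinationError            # import path assumed for this sketch
from director_ai.integrations import guard_openai     # hypothetical wrapper name; see Integrations

client = guard_openai(OpenAI(), on_fail="raise")       # "log" and "metadata" are the other modes

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(response.choices[0].message.content)
except HallucinationError as exc:
    print(f"Blocked: {exc}")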
Failure Modes¶
| Mode | Behavior |
|---|---|
| on_fail="raise" | Raises HallucinationError (default) |
| on_fail="log" | Logs warning, returns response unchanged |
| on_fail="metadata" | Stores score in context var for later inspection |
Streaming Halt¶
from director_ai import StreamingKernel

kernel = StreamingKernel(hard_limit=0.4, window_size=8)

def score_fn(accumulated_text):
    return 0.85  # your coherence scoring logic on the text so far

# Any iterator or stream of tokens works; defined inline here so the snippet runs.
token_generator = iter(["The", " capital", " of", " France", " is", " Paris", "."])

session = kernel.stream_tokens(token_generator, score_fn)
if session.halted:
    print(f"Halted at token {session.halt_index}: {session.halt_reason}")
Fallback Modes¶
from director_ai import CoherenceAgent
# Retrieval: return KB context when all candidates fail
agent = CoherenceAgent(fallback="retrieval")
# Disclaimer: prepend warning to best-rejected candidate
agent = CoherenceAgent(fallback="disclaimer")
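To exercise a fallback end to end, reuse the async entry point shown under Async Usage below (the question text is purely illustrative):

import asyncio
from director_ai import CoherenceAgent

agent = CoherenceAgent(fallback="retrieval")
result = asyncio.run(agent.aprocess("What is the capital of France?"))
print(result)  # returns KB context when every candidate fails review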
Batch Scoring¶
from director_ai import CoherenceScorer
scorer = CoherenceScorer(threshold=0.6, use_nli=True)
items = [
    ("What is 2+2?", "The answer is 4."),
    ("Capital of France?", "Paris is in Germany."),
]
results = scorer.review_batch(items)
for approved, score in results:
    print(f"approved={approved} score={score.score:.3f}")
When NLI is available, review_batch() batches the NLI pairs into two GPU forward passes; dialogue items fall back to sequential scoring.
Async Usage¶
import asyncio
from director_ai import CoherenceAgent
agent = CoherenceAgent(use_nli=True)
async def main():
    result = await agent.aprocess("What is the capital of France?")
    print(result)
asyncio.run(main())
Next Steps¶
- Scoring guide — thresholds, weights, NLI backends
- Streaming halt — halt mechanisms, on_halt callbacks
- KB ingestion — populate your knowledge base
- Integrations — OpenAI, Anthropic, LangChain, and more
- Production deployment — scaling, caching, monitoring
- Domain presets — medical, finance, legal, creative profiles
- Tutorials — 16 Jupyter notebooks from basics to production