Director-AI¶

Response-level LLM hallucination guardrail — NLI + RAG fact-checking with audit evidence.

v3.16.0 — factual-coherence guard, 5-tier scoring, RAG grounding, opt-in streaming contradiction checks


2-Line Integration: Wrap any LLM SDK client with `guard()`. Duck-type detection for OpenAI-compatible, Anthropic, Bedrock, Gemini, Cohere. Quickstart →
Opt-in Streaming Check: Halts completed streamed claims only when they contradict retrieved grounding facts; response-level scoring remains the production gate. Streaming →
Custom KB Grounding: Bring your own facts via RAG. ChromaDB, FAISS, Qdrant, or in-memory backends. KB Ingestion →
75.6% Balanced Accuracy: Measured on LLM-AggreFact (29K samples, 11 datasets, #6 on leaderboard; 77.76% with per-dataset tuning) with the FactCG-DeBERTa-v3-Large NLI model. 14.6 ms/pair on ONNX GPU. SBOM on every release. Scoring →
Injection Detection: Two-stage pipeline: regex pattern matching + bidirectional NLI intent-drift scoring. Catches injection effects in the output regardless of encoding. Per-claim attribution. Injection Detector →
ProductionGuard: Batteries-included entry point: calibrated scoring, human feedback loop, conformal CIs, tool-call verification, and injection detection. Guard →
5-Tier Scoring: From zero-dep rules engine (<1 ms) to embedding similarity (3 ms) to full NLI (14.6 ms). Choose your accuracy/latency trade-off. Scoring →
SaaS-Ready: API key auth + token-bucket rate limiting middleware. Cloud Run Dockerfile included. Self-host or let us host.

What It Is For¶

Director-AI is for teams that need model outputs to stay tied to governed facts before those outputs reach customers, operators, downstream agents, or audit records.

For business users: Director-AI is the control point that rejects hallucinated claims and records auditable, tenant-safe evidence before a wrong claim reaches users or automated workflows.

What This Software Is¶

Director-AI is a factual-coherence guardrail runtime. It gives application builders, platform teams, and evaluators a place to decide whether generated text is safe enough to show, stream, store, route, or hand to another agent.

It combines:

governed facts and retrieved evidence from a customer knowledge base;
contradiction and coherence scoring through configurable local scorer paths;
opt-in streaming contradiction checks for partial outputs;
structured checks for numbers, reasoning, freshness, consensus, injection, trajectories, and sector policy;
tenant-safe evidence, metrics, and compliance export surfaces.

It is not a replacement for access control, moderation, legal review, clinical review, or customer data governance. It is the factual-risk control layer that connects those controls to LLM output.
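The opt-in streaming check listed above acts on completed claims rather than on individual tokens. The sketch below illustrates that idea only and is not Director-AI's implementation: `guarded_stream`, `StreamHalted`, and the `contradicts` callback are hypothetical names, sentence boundaries stand in for claim boundaries, and the contradiction test is supplied by the caller.

```python
import re
from typing import Callable, Iterable, Iterator

# Assumption: a "completed claim" is approximated by a finished sentence.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


class StreamHalted(Exception):
    """Hypothetical exception raised when a completed claim is rejected."""

    def __init__(self, claim: str) -> None:
        super().__init__(f"halted on claim: {claim!r}")
        self.claim = claim


def guarded_stream(
    chunks: Iterable[str], contradicts: Callable[[str], bool]
) -> Iterator[str]:
    """Yield completed sentences; stop at the first one that contradicts."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything except the last piece is a finished sentence.
        *complete, buffer = SENTENCE_END.split(buffer)
        for claim in complete:
            if contradicts(claim):
                raise StreamHalted(claim)
            yield claim
    # The stream has ended, so whatever remains is also complete.
    tail = buffer.strip()
    if tail:
        if contradicts(tail):
            raise StreamHalted(tail)
        yield tail


if __name__ == "__main__":
    chunks = ["Refunds are ", "accepted within 30 days. ", "Shipping takes 90 ", "days. Thanks!"]
    try:
        for sentence in guarded_stream(chunks, lambda c: "90 days" in c):
            print("emit:", sentence)
    except StreamHalted as err:
        print("halted:", err.claim)
```

This sketch withholds text until a sentence is complete, which trades some latency for never emitting a rejected claim. The response-level score stays the final gate either way.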

First 30 Minutes¶

Goal	Path	Result
Understand the product	Applications and Market Map → Product Overview → Market Value	Clear map of applications, value, and evidence boundaries
Try it in code	Quickstart → guard API	One answer scored and one SDK call guarded
See notebooks	Notebook Gallery → Tutorials	Role-based notebook path for evaluator, RAG, streaming, or production work
Plan a pilot	Evaluation Onboarding → Production Guide	Pilot checklist, acceptance criteria, and operational evidence plan
Inspect APIs	API Reference → relevant module page	Supported public API surface and integration boundary

Evidence-First Deployment Surfaces¶

Surface	Why it matters	Start here
Evidence packet CLI	Run and verify a sealed local packet before asking a team to trust the guard	Evidence Packet
Voice Guard	Guard token streams before they become speech output or voice-agent actions	Voice AI
Inference-server hooks	Reject or mask unsafe tokens before sampling in vLLM, TGI, and llama.cpp deployments	Inference-server hooks
Supply-chain controls	Keep model, dependency, SBOM, and ML-BOM evidence visible to operators	Supply Chain
Guardrail forensics	Review missed cases without exposing raw prompt, response, or evidence text	Guardrail Forensics

Market Positioning¶

The repository is the public core: SDK/API/verification surfaces, benchmark methods, and deployment patterns. Commercially, teams use Director-AI to reduce factual incidents in high-consequence LLM flows:

customer support and regulated self-service,
enterprise knowledge assistants,
streaming assistants and agent workflows,
and compliance-heavy environments requiring auditable evidence.

See Market Value and Positioning for a decision-focused buyer view.

For the fastest plain-language explanation of what Director-AI is, who uses it, what ships publicly, and where it creates market value, start with Applications and Market Map.

Reader	Start Here	What You Get
Evaluator	Applications and Market Map	Problem, applications, value, evidence boundaries
New pilot team	Evaluation Onboarding	First pilot shape, install path, validation checklist
Builder	Quickstart	In-process guard and SDK wrapping
Platform operator	Production Guide	REST/proxy deployment, monitoring, runbooks
RAG engineer	KB Ingestion	Grounding with private facts and vector stores

Install¶

pip install director-ai

Quick Example¶

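from director_ai import guard
from openai import OpenAI

# Wrap the SDK client. `facts` is the grounding knowledge base and
# `threshold` is the minimum coherence score a response must reach.
client = guard(
    OpenAI(),  # reads OPENAI_API_KEY from the environment
    facts={"refund_policy": "Refunds within 30 days only"},
    threshold=0.3,
)

# Use the guarded client exactly like the unwrapped one.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the refund policy?"}],
)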

If the response's coherence score falls below the threshold, guard() raises HallucinationError, which carries the coherence score and the contradicting evidence.
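Application code usually wants to catch that error and fall back to a safe answer. The sketch below shows the pattern with stand-ins so that it runs without network access or the real package: `HallucinationError` here is a local placeholder whose `score` and `evidence` attributes are assumptions, and `toy_score`, `guarded`, and `answer_or_fallback` are hypothetical helpers, not part of the director_ai API.

```python
import re
from typing import Callable


class HallucinationError(Exception):
    """Local stand-in; the attribute names are assumptions for this sketch."""

    def __init__(self, score: float, evidence: list[str]) -> None:
        super().__init__(f"coherence score {score:.2f} is below the threshold")
        self.score = score
        self.evidence = evidence


def toy_score(answer: str, facts: dict[str, str]) -> float:
    """Toy scorer: 1.0 if every number in the answer appears in a fact, else 0.0."""
    fact_numbers = {n for fact in facts.values() for n in re.findall(r"\d+", fact)}
    answer_numbers = set(re.findall(r"\d+", answer))
    return 1.0 if answer_numbers <= fact_numbers else 0.0


def guarded(generate: Callable[[str], str], facts: dict[str, str], threshold: float):
    """Wrap a text generator so low-scoring answers raise instead of returning."""

    def wrapper(prompt: str) -> str:
        answer = generate(prompt)
        score = toy_score(answer, facts)
        if score < threshold:
            raise HallucinationError(score, evidence=list(facts.values()))
        return answer

    return wrapper


def answer_or_fallback(ask: Callable[[str], str], prompt: str) -> str:
    """Return the guarded answer, or a safe fallback when the guard rejects it."""
    try:
        return ask(prompt)
    except HallucinationError as err:
        return f"I can't confirm that. (score={err.score:.2f})"


if __name__ == "__main__":
    facts = {"refund_policy": "Refunds within 30 days only"}
    wrong = guarded(lambda p: "Refunds are accepted within 90 days.", facts, 0.3)
    print(answer_or_fallback(wrong, "What is the refund policy?"))
```

With the real library, the same try/except shape would wrap the `client.chat.completions.create(...)` call; check the API Reference for the actual import path and attributes of the exception.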

How It Works¶

graph LR
    LLM["LLM Response"]:::input --> SC["CoherenceScorer"]:::core
    SC --> NLI["NLI Model<br/>(H_logical)"]:::nli
    SC --> RAG["RAG Retrieval<br/>(H_factual)"]:::rag
    NLI --> SCORE["coherence = 1 - (0.6·H_L + 0.4·H_F)"]:::core
    RAG --> SCORE
    SCORE --> GATE{score ≥ threshold?}:::gate
    GATE -->|Yes| APPROVE["Approved"]:::approve
    GATE -->|No| HALT["Halt + Evidence"]:::halt
    classDef input fill:#7c4dff,stroke:#333,color:#fff
    classDef core fill:#512da8,stroke:#333,color:#fff
    classDef nli fill:#1565c0,stroke:#333,color:#fff
    classDef rag fill:#00695c,stroke:#333,color:#fff
    classDef gate fill:#ff8f00,stroke:#333,color:#fff
    classDef approve fill:#2e7d32,stroke:#333,color:#fff
    classDef halt fill:#c62828,stroke:#333,color:#fff
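The scoring and gating steps in the diagram can be worked through numerically. This is a minimal sketch of the formula shown in the diagram, not the library's scorer: the function names are hypothetical, and it assumes both H_logical and H_factual lie in [0, 1], which the diagram does not state.

```python
def coherence(h_logical: float, h_factual: float) -> float:
    """coherence = 1 - (0.6 * H_L + 0.4 * H_F), with weights taken from the diagram."""
    for name, value in (("h_logical", h_logical), ("h_factual", h_factual)):
        # Assumption: both inputs are normalised to [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 1.0 - (0.6 * h_logical + 0.4 * h_factual)


def gate(score: float, threshold: float) -> str:
    """Approve when score >= threshold, otherwise halt."""
    return "approved" if score >= threshold else "halt"


if __name__ == "__main__":
    threshold = 0.3  # same value as the Quick Example

    # Strong contradiction signals: 1 - (0.54 + 0.32) = 0.14, below 0.3.
    low = coherence(h_logical=0.9, h_factual=0.8)
    print(round(low, 2), gate(low, threshold))    # 0.14 halt

    # Weak contradiction signals: 1 - (0.06 + 0.08) = 0.86, above 0.3.
    high = coherence(h_logical=0.1, h_factual=0.2)
    print(round(high, 2), gate(high, threshold))  # 0.86 approved
```

Because the logical term carries the larger weight (0.6), a given rise in H_logical lowers the coherence score more than the same rise in H_factual.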

Competitive Positioning¶

Feature	Director-AI	NeMo Guardrails	Guardrails-AI	LLM-Guard
Mid-stream contradiction halt	Opt-in	No	No	No
Async voice AI pipeline	Yes	No	No	No
Custom KB RAG	Yes	Partial	No	No
Response-level NLI scoring	Yes	No	No	No
NLI contradiction detection	Yes	No	No	Partial
Evidence on rejection	Yes	No	No	No
Numeric verification	Yes	No	No	No
Agentic loop safety	Yes	No	No	No
Conformal prediction	Yes	No	No	No
EU AI Act Article 15	Yes	No	No	No
Adversarial self-test	Yes	No	No	No
SDK integrations	5	1	1	0
Framework integrations	6	1	1	0

Paths Forward¶

Path	Time	What You Get
Quickstart	2 min	Score a response, guard an SDK client
Why Director-AI	5 min	Problem statement, decision matrix, cost comparison
Product Overview	7 min	Applications, market value, evidence boundaries
Evaluation Onboarding	10 min	Pilot checklist from first install to deployment evidence
Tutorials	30 min	18 Jupyter notebooks from basics to production
Notebook Gallery	5 min	Buyer- and use-case-oriented notebook index
API Reference	—	Every public class and function
Production Guide	15 min	Scaling, caching, monitoring, Docker
Domain Cookbooks	10 min	Legal, medical, finance, support recipes
Voice AI	10 min	Async streaming guard + TTS adapters for voice pipelines
Glossary	—	35 terms defined and cross-linked

Obtain¶

pip install director-ai                          # base
pip install "director-ai[nli]"                   # + NLI model (recommended)
pip install "director-ai[server]"                # + REST API server
pip install "director-ai[nli,vector,server]"     # everything (quotes keep zsh from globbing the brackets)

PyPI: pypi.org/project/director-ai | Source: github.com/anulum/director-ai | Docs: anulum.github.io/director-ai

Feedback & Bugs¶

Bug reports: GitHub Issues
Feature requests: GitHub Issues
Security: SECURITY.md
Commercial inquiries: anulum.li

Used By¶

Early adopter logos coming soon. Get in touch to be featured.

Contributing¶

See CONTRIBUTING.md for code style, test requirements, and PR workflow.

License¶

Open core: the Apache-2.0 core is free for any use, including production; the BUSL-1.1 advanced and labs tier is source-available and free for non-production use. Commercial licensing for the advanced tier is available at anulum.li.

Contact: protoscience@anulum.li | GitHub Discussions | www.anulum.li

Maintained by Miroslav Šotek at Anulum. Current release: v3.16.0.

Developed by ANULUM / Fortis Studio