Director-AI¶
Response-level LLM hallucination guardrail — NLI + RAG fact-checking with audit evidence.
v3.16.0 — factual-coherence guard, 5-tier scoring, RAG grounding, opt-in streaming contradiction checks
| Feature | What it does | Learn more |
|---|---|---|
| 2-Line Integration | Wrap any LLM SDK client with guard(). Duck-type detection for OpenAI-compatible, Anthropic, Bedrock, Gemini, Cohere. | Quickstart → |
| Opt-in Streaming Check | Halts completed streamed claims only when they contradict retrieved grounding facts; response-level scoring remains the production gate. | Streaming → |
| Custom KB Grounding | Bring your own facts via RAG. ChromaDB, FAISS, Qdrant, or in-memory backends. | KB Ingestion → |
| 75.6% Balanced Accuracy | Measured on LLM-AggreFact (29K samples, 11 datasets, #6 on leaderboard; 77.76% with per-dataset tuning) with the FactCG-DeBERTa-v3-Large NLI model. 14.6 ms/pair ONNX GPU. SBOM on every release. | Scoring → |
| Injection Detection | Two-stage pipeline: regex pattern matching + bidirectional NLI intent-drift scoring. Catches injection effects in the output regardless of encoding. Per-claim attribution. | Injection Detector → |
| ProductionGuard | Batteries-included entry point: calibrated scoring, human feedback loop, conformal CIs, tool-call verification, and injection detection. | Guard → |
| 5-Tier Scoring | From zero-dep rules engine (<1 ms) to embedding similarity (3 ms) to full NLI (14.6 ms). Choose your accuracy/latency trade-off. | Scoring → |
| SaaS-Ready | API key auth + token-bucket rate limiting middleware. Cloud Run Dockerfile included. Self-host or let us host. | |
What It Is For¶
Director-AI is for teams that need model outputs to stay tied to governed facts before those outputs reach customers, operators, downstream agents, or audit records.
For business users: this is the factual-coherence control point where hallucination rejection, auditability, and tenant-safe evidence are introduced before a wrong claim reaches users or automated workflows.
What This Software Is¶
Director-AI is a factual-coherence guardrail runtime. It gives application builders, platform teams, and evaluators a place to decide whether generated text is safe enough to show, stream, store, route, or hand to another agent.
It combines:
- governed facts and retrieved evidence from a customer knowledge base;
- contradiction and coherence scoring through configurable local scorer paths;
- opt-in streaming contradiction checks for partial outputs;
- structured checks for numbers, reasoning, freshness, consensus, injection, trajectories, and sector policy;
- tenant-safe evidence, metrics, and compliance export surfaces.
It is not a replacement for access control, moderation, legal review, clinical review, or customer data governance. It is the factual-risk control layer that connects those controls to LLM output.
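The opt-in streaming contradiction check listed above can be pictured with a small, self-contained sketch. This is not Director-AI's implementation: `completed_claims`, `stream_with_halt`, and `toy_contradicts` are hypothetical names, the sentence splitting is deliberately naive, and the toy checker (which only compares day counts against a 30-day policy) stands in for a real contradiction scorer. It only illustrates the idea of checking each claim once it is complete and halting on the first contradiction.

```python
import re
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


def completed_claims(tokens: Iterable[str]) -> Iterator[str]:
    """Group streamed tokens into claims, yielding each once it ends in . ! or ?"""
    buffer = ""
    for token in tokens:
        buffer += token
        # Naive segmentation (assumption): end punctuation closes a claim.
        if re.search(r"[.!?]\s*$", buffer):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        yield buffer.strip()  # the stream ended, so the remainder is complete


def stream_with_halt(
    tokens: Iterable[str], contradicts: Callable[[str], bool]
) -> Tuple[List[str], Optional[str]]:
    """Return (accepted claims, halted claim); halted is None if nothing was flagged."""
    accepted: List[str] = []
    for claim in completed_claims(tokens):
        if contradicts(claim):
            return accepted, claim  # stop consuming the stream here
        accepted.append(claim)
    return accepted, None


def toy_contradicts(claim: str) -> bool:
    """Toy stand-in for a contradiction scorer: flags any day count other than 30."""
    return any(days != "30" for days in re.findall(r"(\d+) days", claim))


tokens = ["Refunds ", "are ", "available. ", "You ", "have ", "90 days ",
          "to ", "return ", "items. ", "Thanks!"]
accepted, halted = stream_with_halt(tokens, toy_contradicts)
print("accepted:", accepted)
print("halted on:", halted)
```

In this sketch the first claim passes, the second is halted because it contradicts the 30-day fact, and the final token is never consumed. A response-level score would still be computed on the full output as the production gate.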
First 30 Minutes¶
| Goal | Path | Result |
|---|---|---|
| Understand the product | Applications and Market Map → Product Overview → Market Value | Clear map of applications, value, and evidence boundaries |
| Try it in code | Quickstart → guard API | One answer scored and one SDK call guarded |
| See notebooks | Notebook Gallery → Tutorials | Role-based notebook path for evaluator, RAG, streaming, or production work |
| Plan a pilot | Evaluation Onboarding → Production Guide | Pilot checklist, acceptance criteria, and operational evidence plan |
| Inspect APIs | API Reference → relevant module page | Supported public API surface and integration boundary |
Evidence-first deployment surfaces¶
| Surface | Why it matters | Start here |
|---|---|---|
| Evidence packet CLI | Run and verify a sealed local packet before asking a team to trust the guard | Evidence Packet |
| Voice Guard | Guard token streams before they become speech output or voice-agent actions | Voice AI |
| Inference-server hooks | Reject or mask unsafe tokens before sampling in vLLM, TGI, and llama.cpp deployments | Inference-server hooks |
| Supply-chain controls | Keep model, dependency, SBOM, and ML-BOM evidence visible to operators | Supply Chain |
| Guardrail forensics | Review missed cases without exposing raw prompt, response, or evidence text | Guardrail Forensics |
Market Positioning¶
The repository is the public core: SDK/API/verification surfaces, benchmark methods, and deployment patterns. Commercially, teams use Director-AI to reduce factual incidents in high-consequence LLM flows:
- customer support and regulated self-service;
- enterprise knowledge assistants;
- streaming assistants and agent workflows;
- compliance-heavy environments requiring auditable evidence.
See Market Value and Positioning for a decision-focused buyer view.
For the fastest plain-language explanation of what Director-AI is, who uses it, what ships publicly, and where it creates market value, start with Applications and Market Map.
| Reader | Start Here | What You Get |
|---|---|---|
| Evaluator | Applications and Market Map | Problem, applications, value, evidence boundaries |
| New pilot team | Evaluation Onboarding | First pilot shape, install path, validation checklist |
| Builder | Quickstart | In-process guard and SDK wrapping |
| Platform operator | Production Guide | REST/proxy deployment, monitoring, runbooks |
| RAG engineer | KB Ingestion | Grounding with private facts and vector stores |
Install¶
The pip commands for the base package and its optional extras are listed under Obtain.
Quick Example¶
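from director_ai import guard
from openai import OpenAI

# Wrap the SDK client; the facts act as the grounding knowledge base.
client = guard(
    OpenAI(),
    facts={"refund_policy": "Refunds within 30 days only"},
    threshold=0.3,  # minimum coherence score for a response to pass
)

# Use the guarded client exactly like the unwrapped one.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the refund policy?"}],
)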
If the LLM hallucinates, guard() raises HallucinationError with the coherence score and contradicting evidence.
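One way an application might react to such a rejection is to catch the exception, record the score and evidence, and return a safe fallback. The sketch below is self-contained and does not import Director-AI: `StandInHallucinationError` is a local stand-in for the library's HallucinationError, and its `score` and `evidence` attribute names, as well as the `answer_or_fallback` helper, are assumptions made for illustration. Check the API Reference for the real import path and attributes.

```python
from typing import Callable, List


class StandInHallucinationError(Exception):
    """Local stand-in for the library's HallucinationError (attribute names assumed)."""

    def __init__(self, score: float, evidence: List[str]):
        super().__init__(f"coherence {score:.2f} below threshold")
        self.score = score
        self.evidence = evidence


def answer_or_fallback(ask: Callable[[], str], fallback: str) -> str:
    """Run a guarded call; return a safe fallback if the guard rejects the answer."""
    try:
        return ask()
    except StandInHallucinationError as err:
        # Keep the score and evidence so the rejection can be audited later.
        print(f"rejected: score={err.score:.2f}, evidence={err.evidence}")
        return fallback


def hallucinating_call() -> str:
    # Simulates a guarded client rejecting an answer that contradicts the KB.
    raise StandInHallucinationError(0.12, ["Refunds within 30 days only"])


reply = answer_or_fallback(hallucinating_call, "Let me connect you with an agent.")
print(reply)
```

Returning a fallback rather than re-raising keeps the user-facing flow alive while the rejected answer and its evidence stay available for review.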
How It Works¶
graph LR
LLM["LLM Response"]:::input --> SC["CoherenceScorer"]:::core
SC --> NLI["NLI Model<br/>(H_logical)"]:::nli
SC --> RAG["RAG Retrieval<br/>(H_factual)"]:::rag
NLI --> SCORE["coherence = 1 - (0.6·H_L + 0.4·H_F)"]:::core
RAG --> SCORE
SCORE --> GATE{score ≥ threshold?}:::gate
GATE -->|Yes| APPROVE["Approved"]:::approve
GATE -->|No| HALT["Halt + Evidence"]:::halt
classDef input fill:#7c4dff,stroke:#333,color:#fff
classDef core fill:#512da8,stroke:#333,color:#fff
classDef nli fill:#1565c0,stroke:#333,color:#fff
classDef rag fill:#00695c,stroke:#333,color:#fff
classDef gate fill:#ff8f00,stroke:#333,color:#fff
classDef approve fill:#2e7d32,stroke:#333,color:#fff
classDef halt fill:#c62828,stroke:#333,color:#fff
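The scoring formula and gate in the diagram can be worked through numerically. The sketch below is illustrative only: the function names `coherence` and `gate`, the sample divergence values, and the assumption that both divergences lie in [0, 1] are not taken from the Director-AI API. The 0.6/0.4 weights and the `score ≥ threshold` rule come from the diagram.

```python
def coherence(h_logical: float, h_factual: float,
              w_logical: float = 0.6, w_factual: float = 0.4) -> float:
    """Combine the two divergence terms as in the diagram: 1 - (0.6*H_L + 0.4*H_F)."""
    return 1.0 - (w_logical * h_logical + w_factual * h_factual)


def gate(score: float, threshold: float) -> str:
    """Apply the diagram's decision rule: approve when score >= threshold."""
    return "approved" if score >= threshold else "halt"


# Hypothetical inputs: a response that is mostly consistent with the evidence ...
consistent = coherence(h_logical=0.1, h_factual=0.2)
# ... and one that strongly contradicts it.
contradicting = coherence(h_logical=0.9, h_factual=0.95)

for name, score in [("consistent", consistent), ("contradicting", contradicting)]:
    print(f"{name}: coherence={score:.2f} -> {gate(score, threshold=0.3)}")
```

With these inputs the first response scores 0.86 and passes a 0.3 threshold, while the second scores 0.08 and is halted. Because the logical term carries the larger weight, a strong NLI contradiction lowers the score more than an equally strong factual mismatch.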
Competitive Positioning¶
| Feature | Director-AI | NeMo Guardrails | Guardrails-AI | LLM-Guard |
|---|---|---|---|---|
| Mid-stream contradiction halt | Opt-in | No | No | No |
| Async voice AI pipeline | Yes | No | No | No |
| Custom KB RAG | Yes | Partial | No | No |
| Response-level NLI scoring | Yes | No | No | No |
| NLI contradiction detection | Yes | No | No | Partial |
| Evidence on rejection | Yes | No | No | No |
| Numeric verification | Yes | No | No | No |
| Agentic loop safety | Yes | No | No | No |
| Conformal prediction | Yes | No | No | No |
| EU AI Act Article 15 | Yes | No | No | No |
| Adversarial self-test | Yes | No | No | No |
| SDK integrations (count) | 5 | 1 | 1 | 0 |
| Framework integrations (count) | 6 | 1 | 1 | 0 |
Paths Forward¶
| Path | Time | What You Get |
|---|---|---|
| Quickstart | 2 min | Score a response, guard an SDK client |
| Why Director-AI | 5 min | Problem statement, decision matrix, cost comparison |
| Product Overview | 7 min | Applications, market value, evidence boundaries |
| Evaluation Onboarding | 10 min | Pilot checklist from first install to deployment evidence |
| Tutorials | 30 min | 18 Jupyter notebooks from basics to production |
| Notebook Gallery | 5 min | Buyer- and use-case-oriented notebook index |
| API Reference | — | Every public class and function |
| Production Guide | 15 min | Scaling, caching, monitoring, Docker |
| Domain Cookbooks | 10 min | Legal, medical, finance, support recipes |
| Voice AI | 10 min | Async streaming guard + TTS adapters for voice pipelines |
| Glossary | — | 35 terms defined and cross-linked |
Obtain¶
pip install director-ai                        # base
pip install "director-ai[nli]"                 # + NLI model (recommended)
pip install "director-ai[server]"              # + REST API server
pip install "director-ai[nli,vector,server]"   # everything
# Quoting the extras keeps shells such as zsh from treating the brackets as a glob pattern.
PyPI: pypi.org/project/director-ai | Source: github.com/anulum/director-ai | Docs: anulum.github.io/director-ai
Feedback & Bugs¶
- Bug reports: GitHub Issues
- Feature requests: GitHub Issues
- Security: SECURITY.md
- Commercial inquiries: anulum.li
Used By¶
Early adopter logos coming soon. Get in touch to be featured.
Contributing¶
See CONTRIBUTING.md for code style, test requirements, and PR workflow.
License¶
Open core: the Apache-2.0 core is free for any use, including production; the BUSL-1.1 advanced and labs tier is source-available and free for non-production use. Commercial licensing for the advanced tier is available at anulum.li.
Contact: protoscience@anulum.li | GitHub Discussions | www.anulum.li
Maintained by Miroslav Šotek at Anulum. Current release: v3.16.0.
Developed by ANULUM / Fortis Studio