Director-AI

Response-level LLM hallucination guardrail — NLI + RAG fact-checking with audit evidence.

v3.16.0 — factual-coherence guard, 5-tier scoring, RAG grounding, opt-in streaming contradiction checks

  • 2-Line Integration: Wrap any LLM SDK client with guard(). Duck-type detection for OpenAI-compatible, Anthropic, Bedrock, Gemini, Cohere. Quickstart →
  • Opt-in Streaming Check: Halts completed streamed claims only when they contradict retrieved grounding facts; response-level scoring remains the production gate. Streaming →
  • Custom KB Grounding: Bring your own facts via RAG. ChromaDB, FAISS, Qdrant, or in-memory backends. KB Ingestion →
  • 75.6% Balanced Accuracy on LLM-AggreFact (29K samples, 11 datasets, #6 on leaderboard; 77.76% with per-dataset tuning): FactCG-DeBERTa-v3-Large NLI model. 14.6 ms/pair ONNX GPU. SBOM on every release. Scoring →
  • Injection Detection: Two-stage pipeline: regex pattern matching + bidirectional NLI intent-drift scoring. Catches injection effects in the output regardless of encoding. Per-claim attribution. Injection Detector →
  • ProductionGuard: Batteries-included entry point: calibrated scoring, human feedback loop, conformal CIs, tool-call verification, and injection detection. Guard →
  • 5-Tier Scoring: From zero-dep rules engine (<1 ms) to embedding similarity (3 ms) to full NLI (14.6 ms). Choose your accuracy/latency trade-off. Scoring →
  • SaaS-Ready: API key auth + token-bucket rate limiting middleware. Cloud Run Dockerfile included. Self-host or let us host.
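To make the accuracy/latency trade-off concrete, here is a minimal sketch that picks a scoring tier for a given latency budget. It uses only the three latencies quoted above. The tier labels, the `Tier` class, and `pick_tier` are illustrative names, not Director-AI identifiers, and the sketch assumes that slower tiers are the more accurate ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str          # illustrative label, not a Director-AI identifier
    latency_ms: float  # per-check latency quoted on this page


# Ordered from fastest to slowest, following the order given above.
# Assumption: accuracy rises with latency along this list.
TIERS = [
    Tier("rules", 1.0),      # quoted as "<1 ms"; 1.0 is used as an upper bound
    Tier("embedding", 3.0),
    Tier("nli", 14.6),
]


def pick_tier(budget_ms: float) -> Optional[Tier]:
    """Return the slowest (assumed most accurate) tier that fits the budget."""
    fitting = [tier for tier in TIERS if tier.latency_ms <= budget_ms]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for budget in (0.5, 2.0, 5.0, 20.0):
        tier = pick_tier(budget)
        print(budget, tier.name if tier else "no tier fits")
```

A budget below every quoted latency returns `None`, so the caller has to decide explicitly whether to skip scoring or relax the budget.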

What It Is For

Director-AI is for teams that need model outputs to stay tied to governed facts before those outputs reach customers, operators, downstream agents, or audit records.

For business users: this is the factual-coherence control point where hallucination rejection, auditability, and tenant-safe evidence are introduced before a wrong claim reaches users or automated workflows.

What This Software Is

Director-AI is a factual-coherence guardrail runtime. It gives application builders, platform teams, and evaluators a place to decide whether generated text is safe enough to show, stream, store, route, or hand to another agent.

It combines:

  • governed facts and retrieved evidence from a customer knowledge base;
  • contradiction and coherence scoring through configurable local scorer paths;
  • opt-in streaming contradiction checks for partial outputs;
  • structured checks for numbers, reasoning, freshness, consensus, injection, trajectories, and sector policy;
  • tenant-safe evidence, metrics, and compliance export surfaces.

It is not a replacement for access control, moderation, legal review, clinical review, or customer data governance. It is the factual-risk control layer that connects those controls to LLM output.
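The opt-in streaming check listed above can be pictured with a small sketch. It buffers streamed tokens, checks each claim once its sentence is complete, and halts only on a contradiction. Everything here is hypothetical: `guarded_stream`, `StreamHalted`, and the `contradicts` callback stand in for the real streaming API, and the toy checker replaces NLI scoring against retrieved facts.

```python
import re
from typing import Callable, Iterable, Iterator

# Treat ".", "!" or "?" at the end of the buffer as the end of a claim.
_SENTENCE_END = re.compile(r"[.!?]\s*$")


class StreamHalted(Exception):
    """Raised in this sketch when a completed claim contradicts a fact."""

    def __init__(self, claim: str) -> None:
        super().__init__(f"halted on claim: {claim!r}")
        self.claim = claim


def guarded_stream(
    tokens: Iterable[str], contradicts: Callable[[str], bool]
) -> Iterator[str]:
    """Yield verified sentences; raise StreamHalted on a contradicting one."""
    buffer = ""
    for token in tokens:
        buffer += token
        if _SENTENCE_END.search(buffer):
            claim = buffer.strip()
            if contradicts(claim):
                raise StreamHalted(claim)
            yield buffer
            buffer = ""
    if buffer:
        # An unfinished trailing claim is passed through unchecked here;
        # response-level scoring is expected to cover it.
        yield buffer


if __name__ == "__main__":
    # Toy checker: the stored fact says 30 days, so "60 days" contradicts it.
    def toy_checker(claim: str) -> bool:
        return "60 days" in claim

    stream = ["Refunds ", "are ", "available. ", "The ", "window ", "is ", "60 days. "]
    try:
        for sentence in guarded_stream(stream, toy_checker):
            print(sentence)
    except StreamHalted as err:
        print("halted:", err.claim)
```

Releasing text sentence by sentence is a design choice of this sketch: it trades some latency for the guarantee that no part of a contradicting claim reaches the consumer.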

First 30 Minutes

| Goal | Path | Result |
| --- | --- | --- |
| Understand the product | Applications and Market Map, Product Overview, Market Value | Clear map of applications, value, and evidence boundaries |
| Try it in code | Quickstart, guard API | One answer scored and one SDK call guarded |
| See notebooks | Notebook Gallery, Tutorials | Role-based notebook path for evaluator, RAG, streaming, or production work |
| Plan a pilot | Evaluation Onboarding, Production Guide | Pilot checklist, acceptance criteria, and operational evidence plan |
| Inspect APIs | API Reference → relevant module page | Supported public API surface and integration boundary |

Evidence-first deployment surfaces

| Surface | Why it matters | Start here |
| --- | --- | --- |
| Evidence packet CLI | Run and verify a sealed local packet before asking a team to trust the guard | Evidence Packet |
| Voice Guard | Guard token streams before they become speech output or voice-agent actions | Voice AI |
| Inference-server hooks | Reject or mask unsafe tokens before sampling in vLLM, TGI, and llama.cpp deployments | Inference-server hooks |
| Supply-chain controls | Keep model, dependency, SBOM, and ML-BOM evidence visible to operators | Supply Chain |
| Guardrail forensics | Review missed cases without exposing raw prompt, response, or evidence text | Guardrail Forensics |

Market Positioning

The repository is the public core: SDK/API/verification surfaces, benchmark methods, and deployment patterns. Commercially, teams use Director-AI to reduce factual incidents in high-consequence LLM flows:

  • customer support and regulated self-service,
  • enterprise knowledge assistants,
  • streaming assistants and agent workflows,
  • and compliance-heavy environments requiring auditable evidence.

See Market Value and Positioning for a decision-focused buyer view.

For the fastest plain-language explanation of what Director-AI is, who uses it, what ships publicly, and where it creates market value, start with Applications and Market Map.

| Reader | Start Here | What You Get |
| --- | --- | --- |
| Evaluator | Applications and Market Map | Problem, applications, value, evidence boundaries |
| New pilot team | Evaluation Onboarding | First pilot shape, install path, validation checklist |
| Builder | Quickstart | In-process guard and SDK wrapping |
| Platform operator | Production Guide | REST/proxy deployment, monitoring, runbooks |
| RAG engineer | KB Ingestion | Grounding with private facts and vector stores |

Install

pip install director-ai

Quick Example

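from director_ai import guard
from openai import OpenAI

# Wrap the SDK client; the guarded client is called like the original one.
client = guard(
    OpenAI(),
    facts={"refund_policy": "Refunds within 30 days only"},  # grounding facts
    threshold=0.3,  # minimum coherence score for a response to pass
)

# The response is scored against the facts before it is returned.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the refund policy?"}],
)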

If the LLM hallucinates, so that the response scores below the threshold, the guarded call raises HallucinationError, which carries the coherence score and the contradicting evidence.
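One way to handle a rejection is to catch the error and return a safe fallback while keeping the score and evidence for audit. The sketch below is self-contained and does not import director-ai: its `HallucinationError` is a local stand-in, the attribute names `score` and `evidence` are assumptions about how the error exposes its details, and `answer_or_fallback` and `rejected_call` are hypothetical helpers.

```python
from typing import Callable, List


class HallucinationError(Exception):
    """Local stand-in so the sketch runs without director-ai installed.

    The attribute names `score` and `evidence` are assumptions.
    """

    def __init__(self, score: float, evidence: List[str]) -> None:
        super().__init__(f"coherence score {score:.2f} is below the threshold")
        self.score = score
        self.evidence = evidence


def answer_or_fallback(call: Callable[[], str], fallback: str) -> dict:
    """Run a guarded call and return a fallback message if it is rejected."""
    try:
        return {"text": call(), "rejected": False, "score": None, "evidence": []}
    except HallucinationError as err:
        # Keep the score and evidence so the rejection can be audited later.
        return {
            "text": fallback,
            "rejected": True,
            "score": err.score,
            "evidence": err.evidence,
        }


def rejected_call() -> str:
    # Simulates a guarded SDK call whose response contradicted a stored fact.
    raise HallucinationError(0.12, ["Refunds within 30 days only"])


if __name__ == "__main__":
    print(answer_or_fallback(lambda: "Refunds are accepted within 30 days.", "n/a"))
    print(answer_or_fallback(rejected_call, "I can't confirm that right now."))
```

In a real integration the callable would wrap the guarded `client.chat.completions.create(...)` call and extract the message text.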

How It Works

graph LR
    LLM["LLM Response"]:::input --> SC["CoherenceScorer"]:::core
    SC --> NLI["NLI Model<br/>(H_logical)"]:::nli
    SC --> RAG["RAG Retrieval<br/>(H_factual)"]:::rag
    NLI --> SCORE["coherence = 1 - (0.6·H_L + 0.4·H_F)"]:::core
    RAG --> SCORE
    SCORE --> GATE{score ≥ threshold?}:::gate
    GATE -->|Yes| APPROVE["Approved"]:::approve
    GATE -->|No| HALT["Halt + Evidence"]:::halt
    classDef input fill:#7c4dff,stroke:#333,color:#fff
    classDef core fill:#512da8,stroke:#333,color:#fff
    classDef nli fill:#1565c0,stroke:#333,color:#fff
    classDef rag fill:#00695c,stroke:#333,color:#fff
    classDef gate fill:#ff8f00,stroke:#333,color:#fff
    classDef approve fill:#2e7d32,stroke:#333,color:#fff
    classDef halt fill:#c62828,stroke:#333,color:#fff
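The scoring and gate steps in the diagram can be worked through numerically. The sketch below uses the 0.6 and 0.4 weights and the `score ≥ threshold` rule from the graph. It assumes that H_logical and H_factual are divergence values in [0, 1] where higher means more contradiction; the function names `coherence` and `gate` are illustrative rather than part of the library API.

```python
def coherence(
    h_logical: float,
    h_factual: float,
    w_logical: float = 0.6,
    w_factual: float = 0.4,
) -> float:
    """Compute coherence = 1 - (w_logical * H_L + w_factual * H_F)."""
    for name, value in (("h_logical", h_logical), ("h_factual", h_factual)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 1.0 - (w_logical * h_logical + w_factual * h_factual)


def gate(score: float, threshold: float) -> str:
    """Approve when the score reaches the threshold, otherwise halt."""
    return "approved" if score >= threshold else "halted"


if __name__ == "__main__":
    consistent = coherence(h_logical=0.1, h_factual=0.2)   # 1 - (0.06 + 0.08)
    conflicting = coherence(h_logical=0.9, h_factual=0.8)  # 1 - (0.54 + 0.32)
    print(round(consistent, 2), gate(consistent, threshold=0.3))
    print(round(conflicting, 2), gate(conflicting, threshold=0.3))
```

With the threshold of 0.3 used in the quick example, the first response scores about 0.86 and is approved, while the second scores about 0.14 and is halted.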

Competitive Positioning

| Feature | Director-AI | NeMo Guardrails | Guardrails-AI | LLM-Guard |
| --- | --- | --- | --- | --- |
| Mid-stream contradiction halt | Opt-in | No | No | No |
| Async voice AI pipeline | Yes | No | No | No |
| Custom KB RAG | Yes | Partial | No | No |
| Response-level NLI scoring | Yes | No | No | No |
| NLI contradiction detection | Yes | No | No | Partial |
| Evidence on rejection | Yes | No | No | No |
| Numeric verification | Yes | No | No | No |
| Agentic loop safety | Yes | No | No | No |
| Conformal prediction | Yes | No | No | No |
| EU AI Act Article 15 | Yes | No | No | No |
| Adversarial self-test | Yes | No | No | No |
| 5 SDK integrations | Yes | 1 | 1 | 0 |
| 6 framework integrations | Yes | 1 | 1 | 0 |

Paths Forward

| Path | Time | What You Get |
| --- | --- | --- |
| Quickstart | 2 min | Score a response, guard an SDK client |
| Why Director-AI | 5 min | Problem statement, decision matrix, cost comparison |
| Product Overview | 7 min | Applications, market value, evidence boundaries |
| Evaluation Onboarding | 10 min | Pilot checklist from first install to deployment evidence |
| Tutorials | 30 min | 18 Jupyter notebooks from basics to production |
| Notebook Gallery | 5 min | Buyer- and use-case-oriented notebook index |
| API Reference | | Every public class and function |
| Production Guide | 15 min | Scaling, caching, monitoring, Docker |
| Domain Cookbooks | 10 min | Legal, medical, finance, support recipes |
| Voice AI | 10 min | Async streaming guard + TTS adapters for voice pipelines |
| Glossary | | 35 terms defined and cross-linked |

Obtain

# Quote extras so shells such as zsh do not expand the square brackets.
pip install director-ai                        # base
pip install "director-ai[nli]"                 # + NLI model (recommended)
pip install "director-ai[server]"              # + REST API server
pip install "director-ai[nli,vector,server]"   # everything

PyPI: pypi.org/project/director-ai | Source: github.com/anulum/director-ai | Docs: anulum.github.io/director-ai

Feedback & Bugs

Used By

Early adopter logos coming soon. Get in touch to be featured.

Contributing

See CONTRIBUTING.md for code style, test requirements, and PR workflow.

License

Open core: the Apache-2.0 core is free for any use, including production; the BUSL-1.1 advanced & labs tier is source-available and free for non-production. Commercial licensing for the advanced tier is available at anulum.li.


Contact: protoscience@anulum.li | GitHub Discussions | www.anulum.li

Maintained by Miroslav Šotek at Anulum. Current release: v3.16.0.

Developed by ANULUM / Fortis Studio