
Product Overview

Director-AI is a real-time factual-coherence guard for LLM applications. It checks generated answers against governed facts, retrieved evidence, and structured verification rules before those answers become user-visible decisions.

The core product question is simple:

Can this model answer be trusted enough to show, stream, store, route, or act on?

Director-AI answers that question with an auditable verdict, a score, and the evidence that drove the decision.

Product Thesis

LLM adoption is moving from chat experiments into workflows that affect customers, research, operations, and regulated decisions. The failure mode is not only toxic content; it is confident, unsupported content that looks useful until it becomes an incident.

Director-AI creates a control point between generation and consequence:

  • Application teams get a small API that can approve, reject, annotate, or halt output.
  • Platform teams get shared REST/gRPC infrastructure, auth, rate limits, metrics, and deployment runbooks.
  • Evaluation teams get repeatable scoring, batch processing, threshold tuning, and benchmark evidence.
  • Governance teams get tenant-safe audit trails, compliance reports, and human-review escalation paths.

The open repository contains the public core and evidence surface. Commercial work can add customer-specific sector packs, deployment recipes, tuning data, and acceptance evidence under a separate agreement.

What It Is For

Director-AI is built for teams that already use LLMs in workflows where a wrong claim has operational cost:

| Use case | What Director-AI protects | Primary value |
|---|---|---|
| Customer support | Policy, refund, warranty, and account answers | Fewer unsupported claims reaching customers |
| Regulated research | Scientific, medical, legal, or financial summaries | Evidence-linked rejection of unsupported claims |
| RAG assistants | Answers grounded in a private knowledge base | Traceable retrieval plus coherence scoring |
| Streaming chat | Token streams shown as they are generated | Mid-stream halt before a bad answer completes |
| Agent workflows | Tool outputs, handoffs, and multi-step traces | Per-step checks before downstream action |
| Evaluation pipelines | Batch scoring of prompt/response datasets | Regression gates and labelled feedback loops |
| Enterprise governance | Tenant-safe audit trails and compliance reports | Reviewable evidence for risk, quality, and controls |

It does not replace moderation, access control, data governance, or domain expert review. It adds a factual-coherence control plane that can be combined with those systems.
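
As an illustration of the streaming-chat use case, a mid-stream halt can be sketched as a generator that re-scores the accumulated text as each token arrives and stops once the score drops below a threshold. Everything in this sketch is hypothetical: `guarded_stream`, `toy_score`, and the 0.5 threshold are illustrative stand-ins rather than Director-AI's API, and the toy scorer only flags one hard-coded claim.

```python
from typing import Callable, Iterable, Iterator


def guarded_stream(
    tokens: Iterable[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> Iterator[str]:
    """Yield tokens until the running text scores below `threshold`.

    Hypothetical sketch: `score` stands in for whatever coherence scorer
    is configured and receives the text accumulated so far.
    """
    text = ""
    for token in tokens:
        candidate = text + token
        if score(candidate) < threshold:
            # Halt before the offending token is shown to the user.
            yield "[halted]"
            return
        text = candidate
        yield token


def toy_score(text: str) -> float:
    """Toy scorer (assumption): penalise one claim that contradicts a known fact."""
    return 0.0 if "90 days" in text else 1.0


stream = ["Refunds ", "are ", "available ", "for ", "90 days", "."]
shown = list(guarded_stream(stream, toy_score))
print(shown)
```

The design choice worth noting is that the check runs on the candidate text before the token is yielded, so the unsupported fragment never reaches the user. A real scorer would likely be too slow to call on every token, so scoring every few tokens or at sentence boundaries is a plausible variation.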

How It Works

Director-AI can run in-process, behind an HTTP proxy, as FastAPI middleware, as a REST/gRPC service, or inside integration adapters.

graph LR
    A["Application"] --> B["LLM provider or local model"]
    B --> C["Director-AI guard"]
    C --> D["Knowledge base"]
    C --> E["Scorers and verifiers"]
    E --> F{"Approve?"}
    F -->|yes| G["Return answer"]
    F -->|no| H["Halt, reject, or route to review"]
    H --> I["Evidence and audit event"]

The default production pattern has four layers:

  1. Grounding: load governed facts from a key-value store, vector store, document ingestion pipeline, or customer runtime package.
  2. Scoring: combine logical contradiction, retrieval evidence, rules, embeddings, NLI models, and optional structured verification.
  3. Control: choose what happens on failure: raise, log, attach metadata, halt a stream, route to a review queue, or reject an HTTP response.
  4. Evidence: emit scores, retrieved facts, halt reasons, audit records, and optional compliance reports.
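
The four layers above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: `Verdict`, `retrieve`, `score`, and `review` are hypothetical names, a plain dict stands in for the fact store, and word overlap stands in for the NLI, embedding, and rule-based scorers.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    """Hypothetical result record: decision, score, and supporting evidence."""
    approved: bool
    score: float
    evidence: list[str] = field(default_factory=list)
    reason: str = ""


# 1. Grounding: a plain dict stands in for a key-value fact store.
FACTS = {"refund window": "Refunds are available within 30 days."}


def retrieve(query: str) -> list[str]:
    """Toy retrieval: return facts whose key appears in the query."""
    return [fact for key, fact in FACTS.items() if key in query.lower()]


def score(answer: str, evidence: list[str]) -> float:
    """2. Scoring: toy word overlap between the answer and each fact."""
    if not evidence:
        return 0.0
    answer_words = set(answer.lower().split())
    best = 0.0
    for fact in evidence:
        fact_words = set(fact.lower().split())
        best = max(best, len(answer_words & fact_words) / len(fact_words))
    return best


def review(query: str, answer: str, threshold: float = 0.6) -> Verdict:
    evidence = retrieve(query)
    value = score(answer, evidence)
    # 3. Control: decide here; the caller chooses how to act on the verdict.
    approved = value >= threshold
    reason = "" if approved else "score below threshold"
    # 4. Evidence: return the score and retrieved facts with the decision.
    return Verdict(approved, value, evidence, reason)


good = review("What is the refund window?", "Refunds are available within 30 days.")
bad = review("What is the refund window?", "You can get a refund any time.")
print(good.approved, bad.approved, bad.reason)
```

Returning the evidence and reason alongside the decision keeps the verdict auditable: a reviewer can see which facts were retrieved and why the answer was rejected.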

Application Lanes

Pick one lane first. Mixing all lanes in the first week usually creates noisy results because each lane has a different success signal.

Builder Lane

Use this when adding protection to an existing app.

  • Start with Quickstart.
  • Wrap an SDK client with guard().
  • Add facts inline or through KB ingestion.
  • Pick a failure mode: raise, metadata, log, reject, or review.
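
The wrap-and-choose-a-failure-mode pattern in this lane can be sketched generically. The code below is not the signature of Director-AI's `guard()`; `with_guard`, `UnsupportedAnswer`, `fake_llm`, and `check` are hypothetical names, and only three of the listed failure modes (raise, log, metadata) are shown.

```python
import logging
from typing import Callable

logger = logging.getLogger("guard_sketch")


class UnsupportedAnswer(Exception):
    """Raised in 'raise' mode when an answer fails the check."""


def with_guard(
    generate: Callable[[str], str],
    check: Callable[[str], bool],
    on_fail: str = "raise",
) -> Callable[[str], dict]:
    """Wrap a text-generation callable with a check and a failure mode."""
    if on_fail not in {"raise", "log", "metadata"}:
        raise ValueError(f"unknown failure mode: {on_fail}")

    def guarded(prompt: str) -> dict:
        answer = generate(prompt)
        approved = check(answer)
        if not approved:
            if on_fail == "raise":
                raise UnsupportedAnswer(answer)
            if on_fail == "log":
                logger.warning("unsupported answer: %r", answer)
        # 'log' and 'metadata' both return the answer with its verdict attached.
        return {"answer": answer, "approved": approved}

    return guarded


def fake_llm(prompt: str) -> str:
    """Stand-in for an SDK client call (assumption)."""
    return "Refunds are available for 90 days."


def check(answer: str) -> bool:
    """Toy check (assumption): the governed fact says 30 days."""
    return "30 days" in answer


annotated = with_guard(fake_llm, check, on_fail="metadata")("Refund policy?")
print(annotated)
```

Starting with the metadata mode is a low-risk way to observe how often answers would fail before switching to a mode that blocks output.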

Platform Lane

Use this when exposing the guard as shared infrastructure.

  • Deploy the REST server or gRPC server.
  • Put the proxy in front of compatible clients.
  • Add API keys, rate limits, metrics, audit logs, and deployment runbooks.
  • Use Runtime Boundaries to decide which optional runtimes belong in production.

Evaluation Lane

Use this when proving that a configuration is good enough for a domain.

  • Build labelled prompt/response sets.
  • Run batch processing and threshold sweeps.
  • Use online calibration for human feedback loops.
  • Store benchmark evidence before making domain-specific performance claims.
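
A threshold sweep over a labelled set can be sketched as follows. This is an assumption-laden illustration, not the project's batch tooling: `sweep` is a hypothetical helper, the scores and labels are invented sample data, and F1 on the "approve" decision is only one possible selection criterion.

```python
def sweep(samples, thresholds):
    """Compute precision, recall, and F1 of the 'approve' decision per threshold.

    samples: list of (score, is_supported) pairs from a labelled set.
    """
    rows = []
    for t in thresholds:
        tp = sum(1 for s, ok in samples if s >= t and ok)      # approved, supported
        fp = sum(1 for s, ok in samples if s >= t and not ok)  # approved, unsupported
        fn = sum(1 for s, ok in samples if s < t and ok)       # rejected, supported
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((t, precision, recall, f1))
    return rows


# Invented sample data: (coherence score, human label "answer is supported").
labelled = [
    (0.90, True), (0.80, True), (0.60, False),
    (0.55, True), (0.30, False), (0.20, False),
]

rows = sweep(labelled, [0.25, 0.5, 0.7])
best_threshold = max(rows, key=lambda row: row[3])[0]
for t, p, r, f1 in rows:
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print("best:", best_threshold)
```

In a domain where an unsupported approval is costlier than an unnecessary rejection, selecting on precision at a minimum recall may be a better fit than F1.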

Enterprise Lane

Use this when selling, piloting, or operating in a governed organisation.

  • Use Enterprise, Compliance Reporting, and Production Checklist.
  • Keep customer-specific claims scoped to customer data and acceptance criteria.
  • Use the Customer Model Factory public core to package evidence, deployment manifests, rollback data, and runtime configuration.

Market Value

The practical value is not that Director-AI is another chatbot wrapper. The value is that it gives organisations a controllable guard layer between model output and business consequence.

Director-AI can reduce:

  • unsupported customer-facing claims;
  • manual review load for routine factual checks;
  • failed RAG evaluations caused by stale or missing evidence;
  • risk from streamed hallucinations shown before post-hoc moderation runs;
  • integration work required to reuse one guard policy across multiple LLM providers, frameworks, and deployment targets.

It can increase:

  • evidence quality in regulated AI workflows;
  • confidence that a new model, prompt, or KB version did not regress;
  • portability across local, cloud, proxy, REST, and SDK integration modes;
  • customer trust by making rejection reasons inspectable.

The repository is the open core and public evidence surface. Commercial deployments can add customer-specific data mappings, tuning packages, deployment recipes, sector playbooks, and acceptance evidence under a separate agreement.

If you need this framed for leadership review, start with Market Value and Positioning. It translates the same technical surface into commercial outcomes, pilot evidence, and budget justification language.

Evidence Boundaries

Director-AI documentation is intentionally conservative about performance and market claims:

  • public benchmark numbers must point to committed artefacts or published benchmark methodology;
  • customer-specific claims require customer-specific data and approval criteria;
  • optional GPU, Rust, ONNX, gRPC, Go, Julia, Lean, and WASM paths are additive, not required for the Python quickstart;
  • 100% line coverage is not treated as proof of quality on its own. The project prioritises high, honest coverage, meaningful module-specific tests, and evidence-bearing integration checks.

Start Here

| Reader | First page | Outcome |
|---|---|---|
| Evaluator | Quickstart | Score and guard a response in minutes |
| Buyer | Why Director-AI | Understand the business problem and alternatives |
| Market reviewer | Guardrail Landscape | Compare factuality, safety, streaming, and audit guardrail categories |
| Developer | API Reference | Choose the right API surface |
| RAG engineer | KB Ingestion | Ground responses in private facts |
| Operator | Production Guide | Deploy, monitor, and audit the service |
| Enterprise reviewer | Evaluation Onboarding | Plan a scoped pilot with evidence gates |