Domain Presets

DirectorConfig.from_profile(name) loads a preset parameter set for common use cases. Treat every profile as a starting point, not a substitute for calibration: keep the preset config only until director-ai tune has measured the expected false-halt and miss rates on your own evaluation set.

Profile Reference

Each built-in profile has runtime metadata available through DirectorConfig.profile_metadata(name): intended workload, validation status, expected false-halt risk, and required dependency extras.
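A quick way to inspect that metadata before committing to a profile is a sketch like the one below; the exact return type and key names are not documented on this page, so treat them as assumptions and check your installed version:

from director_ai import DirectorConfig

meta = DirectorConfig.profile_metadata("medical")
print(meta)  # expected to report intended workload, validation status, false-halt risk, required extras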

| Profile | Threshold | Hard Limit | Soft Limit | NLI | Reranker | W_Logic | W_Fact |
| --- | --- | --- | --- | --- | --- | --- | --- |
| fast | 0.50 | default | default | no | no | default | default |
| lite | 0.50 | default | default | no | no | default | default |
| rules | 0.50 | default | default | no | no | default | default |
| embed | 0.60 | default | default | no | no | default | default |
| thorough | 0.60 | default | default | yes | no | default | default |
| research | 0.70 | default | default | yes | no | default | default |
| medical | 0.30 | 0.20 | 0.35 | yes | yes | 0.5 | 0.5 |
| finance | 0.30 | 0.20 | 0.35 | yes | yes | 0.4 | 0.6 |
| legal | 0.30 | 0.20 | 0.35 | yes | no | 0.6 | 0.4 |
| creative | 0.40 | 0.30 | 0.45 | no | no | 0.7 | 0.3 |
| customer_support | 0.55 | 0.40 | 0.60 | no | no | 0.5 | 0.5 |
| summarization | 0.15 | 0.08 | 0.25 | yes | no | 0.0 | 1.0 |

"default" means the field inherits the DirectorConfig dataclass default (hard_limit=0.5, soft_limit=0.6, w_logic/w_fact=0.0 which defers to CoherenceScorer class defaults).

Profile Metadata

| Profile | Intended Workload | Validation Status | False-Halt Risk | Required Extras |
| --- | --- | --- | --- | --- |
| fast | Development loops and heuristic screening | smoke-tested heuristic baseline | low for obvious checks, unknown for factual QA | none |
| lite | Offline approximate local scoring | smoke-tested lite scorer baseline | medium without calibration | none |
| rules | Deterministic local checks | deterministic baseline | low for exact rules, high for semantic hallucinations | none |
| embed | Semantic similarity screening | benchmarked approximate scorer | medium; tune per corpus | embed |
| thorough | General production baseline | standard validated baseline | medium until tuned | nli |
| research | Academic precision-heavy review | experimental high-threshold baseline | high by design | nli |
| medical | Biomedical fact-heavy review with curated KB | limited PubMedQA validation; requires KB grounding | very high without KB and calibration | nli, vector |
| finance | Financial and regulatory KB review | limited FinanceBench validation; requires recalibration | very high without KB and calibration | nli, vector |
| legal | Legal reasoning over curated KBs | not independently validated | unknown; treat as high until tuned | nli |
| creative | Drafting, fiction, and non-factual generation | heuristic permissive preset | low for creative drift, high for factual safety | none |
| customer_support | Policy bots and troubleshooting assistants | latency-first starter preset | medium; depends on policy KB coverage | none |
| summarization | Source-grounded summaries | validated with summarization FPR diagnostics | low after claim coverage; tune per corpus | nli |

Starter YAML Presets

The built-in profiles above are compact runtime defaults. The repository also ships fuller starter YAML files in configs/starter-presets/ for teams that want a ready-to-edit deployment config:

| File | Workload | Starting stance |
| --- | --- | --- |
| customer_support.yaml | Policy and troubleshooting assistants | latency-first, injection checks on, retrieval disabled by default |
| summarization.yaml | Source-grounded summaries | fact-only NLI, prompt-as-premise, claim coverage |
| rag_qa.yaml | Retrieval-grounded QA | grounded mode, reranker, HyDE, decomposition, compression |
| finance.yaml | Numeric and regulatory claims | high-stakes grounded mode, audit path, PII redaction |
| legal.yaml | Legal drafting and review | logic-weighted grounded mode, audit path, PII redaction |
| medical.yaml | Biomedical or clinical fact review | high-stakes grounded mode, stricter claim support |
| creative_drafting.yaml | Fiction and exploratory drafting | permissive lite scoring with basic injection checks |
| edge_offline.yaml | Offline or constrained edge runtime | rules backend, no vector or heavyweight model path |
| stem_fact_heavy.yaml | Scientific and technical fact workflows | grounded mode, stronger claim support, parent-child retrieval |
| code_generation.yaml | Code and tool-output review | logic-weighted hybrid scoring, retrieval disabled by default |
| multi_agent_swarm.yaml | Multi-agent supervision | review queue batching, retrieval routing, trace-friendly logging |
| voice_agents.yaml | Real-time dialogue and voice agents | lite scoring, dialogue thresholds, low-latency defaults |
| high_stakes_medical_review.yaml | Clinical review workflows | strict grounded review, higher retrieval and claim-support gates |

Load a starter preset directly when you want the full YAML surface:

from director_ai import DirectorConfig

config = DirectorConfig.from_yaml("configs/starter-presets/rag_qa.yaml")

Grounded presets assume a populated vector store. They intentionally omit production enforcement, auth key lists, cloud endpoints, and sensitive values; add those in an ignored deployment overlay after local validation.
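A minimal sketch of that layering, assuming a hypothetical git-ignored deploy.overlay.yaml that holds the deployment-specific values:

from director_ai import DirectorConfig

base = DirectorConfig.from_yaml("configs/starter-presets/rag_qa.yaml")
overlay = DirectorConfig.from_yaml("deploy.overlay.yaml")  # hypothetical, git-ignored local file
# merge overlay onto base field by field, as shown under "Profile + YAML" below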

Profile Rationale

fast — Heuristic scoring only, no model loading. Sub-millisecond latency for dev loops and high-throughput pipelines where approximate filtering is acceptable.

lite — Lite scorer backend with no heavyweight NLI dependency. Use for offline trials and latency-sensitive routing where approximate scores are acceptable.

rules — Rules-only scorer. Use when deployments need deterministic local checks and no model downloads.

embed — Embedding scorer backend. Use when semantic similarity is the primary signal and a full NLI model is not available.

thorough — Adds NLI inference (FactCG-DeBERTa) to catch logical contradictions that heuristics miss. Standard production baseline.

research — Higher threshold (0.70) for academic and analytical workloads where factual precision matters more than recall.

medical — Equal logic/fact weighting reflects the need for both clinical reasoning and factual accuracy. Reranker enabled for precise KB retrieval. NLI-only eval on PubMedQA (1000 samples, 2026-03-20): F1=61.9% at t=0.30, but FPR=100% (all responses flagged). KB grounding or customer-specific calibration required for usable precision. Scores without KB cluster 0.25-0.35.

finance — Fact-weighted (0.6) because numerical claims and regulatory data dominate. Reranker sharpens retrieval against financial KB documents. NLI-only eval on FinanceBench (150 clean samples, 2026-03-20): FPR=100%, precision=0% — all clean responses were flagged. These thresholds need KB grounding or recalibration before production use.

legal — Logic-weighted (0.6) because legal reasoning chains (statute + precedent + application) matter more than isolated facts. No reranker; legal KBs tend to be smaller and well-structured. Not validated: the CUAD benchmark run hit an out-of-memory error on 6GB VRAM, and no domain-specific artefact exists.

creative — Permissive thresholds (0.40/0.30/0.45) allow divergent generation. NLI disabled to avoid penalising metaphor and fiction. Logic-weighted (0.7) because internal narrative consistency matters more than factual grounding.

customer_support — Moderate thresholds balance helpfulness with accuracy. NLI disabled for latency (support bots need fast responses). Equal weights suit mixed queries (policy facts + troubleshooting logic).

summarization — Fact-only weighting with prompt-as-premise scoring, trimmed-mean aggregation, and claim coverage enabled. Use for source-grounded summaries, then tune on your own clean and adversarial samples.

Usage

from director_ai import DirectorConfig

config = DirectorConfig.from_profile("medical")

Load via CLI:

director-ai quickstart --profile medical

Generate the Docker Compose quickstart and start it immediately:

director-ai quickstart --profile customer_support --run

Tune against a labelled evaluation set before production:

director-ai tune --dataset my_eval.jsonl --profile medical --output medical_tuned.yaml

Load via environment variable:

export DIRECTOR_PROFILE=finance
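A sketch of picking that up in code, assuming the from_env constructor referenced under "Customising a Profile" takes no required arguments:

from director_ai import DirectorConfig

config = DirectorConfig.from_env()  # assumed to read DIRECTOR_PROFILE plus per-field overrides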

Customising a Profile

from_profile returns a regular DirectorConfig dataclass. Override fields after loading:

from dataclasses import replace
from director_ai import DirectorConfig

base = DirectorConfig.from_profile("medical")
config = replace(base, hard_limit=0.60, nli_model="lytang/MiniCheck-DeBERTa-L")

You can also mutate fields on the returned config directly; note that environment variables take precedence when from_env is applied after a profile:

config = DirectorConfig.from_profile("finance")
config.coherence_threshold = 0.72
config.reranker_top_k_multiplier = 5

Profile + YAML

Combine a profile base with a YAML overlay:

# config.yaml
coherence_threshold: 0.72
chroma_persist_dir: /data/chroma
audit_log_path: /var/log/director/audit.jsonl

from director_ai import DirectorConfig

config = DirectorConfig.from_profile("finance")
yaml_overrides = DirectorConfig.from_yaml("config.yaml")

# Merge: YAML values that differ from the dataclass defaults override the profile values
for field_name in DirectorConfig.__dataclass_fields__:
    yaml_val = getattr(yaml_overrides, field_name)
    default_val = DirectorConfig.__dataclass_fields__[field_name].default
    if yaml_val != default_val:
        setattr(config, field_name, yaml_val)

Adding Custom Profiles

For organisation-specific profiles, subclass or wrap:

from director_ai import DirectorConfig

INTERNAL_PROFILES = {
    "compliance": {
        "coherence_threshold": 0.80,
        "hard_limit": 0.60,
        "soft_limit": 0.80,
        "use_nli": True,
        "reranker_enabled": True,
        "w_logic": 0.5,
        "w_fact": 0.5,
    },
}

def load_profile(name: str) -> DirectorConfig:
    if name in INTERNAL_PROFILES:
        return DirectorConfig(**INTERNAL_PROFILES[name], profile=name)
    return DirectorConfig.from_profile(name)
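
Usage then stays uniform across internal and built-in profile names:

config = load_profile("compliance")  # internal preset defined above
config = load_profile("thorough")    # falls through to the built-in profile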