# REST Server
Production-ready FastAPI server exposing Director-AI scoring over HTTP.
## Starting the Server
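A minimal launch, assembled from details elsewhere on this page (the `director-ai serve` entry point appears under Continuous Batching, the `[server]` extra under Rate Limiting, and the curl examples assume port 8080):

```shell
# Install the server extra and start the API on http://localhost:8080
pip install 'director-ai[server]'
director-ai serve
```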
## Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/review | Score a prompt/response pair |
| POST | /v1/verify | Sentence-level multi-signal fact verification |
| POST | /v1/process | Full agent pipeline (generate + score) |
| POST | /v1/batch | Batch score multiple pairs |
| GET | /v1/health | Liveness probe (version, mode, NLI status) |
| GET | /v1/ready | Readiness probe (503 if scorer/NLI not loaded) |
| GET | /v1/config | Config introspection |
| GET | /v1/metrics | Metrics as JSON |
| GET | /v1/metrics/prometheus | Prometheus-compatible metrics |
| GET | /v1/source | Source code URL (AGPL compliance) |
| WS | /v1/stream | WebSocket streaming oversight |
| POST | /v1/knowledge/upload | Upload file → parse → chunk → embed |
| POST | /v1/knowledge/ingest | Ingest raw text → chunk → embed |
| GET | /v1/knowledge/documents | List documents per tenant |
| DELETE | /v1/knowledge/documents/{id} | Delete document and chunks |
| PUT | /v1/knowledge/documents/{id} | Re-ingest updated content |
| GET | /v1/knowledge/search | Test retrieval quality |
| POST | /v1/knowledge/tune-embeddings | Fine-tune embeddings on ingested docs |
| GET | /v1/knowledge/documents/{id} | Get single document metadata |
| GET | /v1/tenants | List tenants (scoped to caller's binding) |
| POST | /v1/tenants/{id}/facts | Add keyword fact for tenant |
| POST | /v1/tenants/{id}/vector-facts | Add vector fact for tenant |
| GET/DELETE | /v1/sessions/{id} | Get or delete a scoring session |
| GET | /v1/stats | Aggregate scoring statistics |
| GET | /v1/stats/hourly | Hourly scoring breakdown |
| GET | /v1/dashboard | Dashboard summary (stats + top tenants) |
| POST | /v1/finetune/start | Start domain fine-tuning job |
| GET | /v1/finetune/{job_id} | Check local fine-tuning job status |
| POST | /v1/finetune/managed/submit | Submit or dry-run managed training |
| GET | /v1/finetune/managed/jobs | List managed training submissions for a tenant |
| POST | /v1/finetune/managed/status | Refresh managed training backend status |
| POST | /v1/finetune/managed/cancel | Cancel a live managed training job |
| GET | /v1/finetune/managed/models | List selectable managed training base models |
| POST | /v1/finetune/managed/benchmark-models | Anti-regression benchmark for trained artefacts |
| POST | /v1/verify/numeric | Numeric consistency verification |
| POST | /v1/verify/reasoning | Reasoning chain logic verification |
| POST | /v1/temporal-freshness | Temporal freshness / staleness scoring |
| POST | /v1/consensus | Cross-model factual agreement |
| POST | /v1/injection/detect | Intent-grounded prompt injection detection |
| POST | /v1/adversarial/test | Adversarial robustness self-test |
| POST | /v1/conformal/predict | Conformal prediction interval |
| POST | /v1/compliance/feedback-loops | Feedback loop detection (Art 15(4)) |
| POST | /v1/agentic/check-step | Agentic loop step safety check |
| GET | /v1/compliance/report | EU AI Act Article 15 report |
| GET | /v1/compliance/drift | Statistical drift detection |
| GET | /v1/compliance/dashboard | Compliance metrics (24h/7d/30d) |
Operational endpoint exposure rules are documented in Public Endpoint Exposure.
## Review Request

```shell
curl -X POST http://localhost:8080/v1/review \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: your-key' \
  -d '{
    "prompt": "What is the refund policy?",
    "response": "Refunds within 30 days.",
    "session_id": "optional-session-id"
  }'
```
### Response

```json
{
  "approved": true,
  "coherence": 0.85,
  "h_logical": 0.10,
  "h_factual": 0.15,
  "warning": false,
  "evidence": {
    "chunks": [
      {"text": "Refunds within 30 days of purchase.", "distance": 0.12}
    ]
  }
}
```
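Downstream callers typically branch on this verdict before surfacing the response. A minimal sketch of such a gate, using the field names from the response above (the threshold and action names are illustrative, not part of the API):

```python
def gate(review: dict, min_coherence: float = 0.7) -> str:
    """Map a /v1/review result to an action: block, flag, or pass."""
    if not review.get("approved", False):
        return "block"
    if review.get("warning") or review.get("coherence", 0.0) < min_coherence:
        return "flag"
    return "pass"

print(gate({"approved": True, "coherence": 0.85, "warning": False}))  # pass
```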
## Authentication

Set `api_keys` in config or via the `DIRECTOR_API_KEYS` environment variable (comma-separated). Clients send an `X-API-Key: key1` header; unauthenticated requests receive `401`.
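For example (key values are placeholders):

```shell
# Accept two keys, comma-separated, then start the server
export DIRECTOR_API_KEYS="key1,key2"
director-ai serve

# Authenticated request
curl -H 'X-API-Key: key1' http://localhost:8080/v1/health
```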
## Rate Limiting

Requests beyond the configured limit receive `429`. Install the server extra (`pip install 'director-ai[server]'`) for Redis-backed distributed rate limiting.
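Clients should back off and retry when they receive a 429. A sketch of capped exponential retry delays (the base, factor, and cap are illustrative):

```python
def backoff_delays(base: float = 0.5, factor: float = 2.0,
                   retries: int = 4, cap: float = 8.0) -> list[float]:
    """Seconds to wait before each retry after a 429 response."""
    return [min(cap, base * factor ** i) for i in range(retries)]

print(backoff_delays())  # [0.5, 1.0, 2.0, 4.0]
```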
## CORS

The default origin list is empty, so browser CORS is disabled until exact origins are configured. Reverse-proxy examples are documented in CORS Reverse Proxy.
## Continuous Batching (ReviewQueue)

For high-concurrency deployments, enable server-level request accumulation:

```shell
DIRECTOR_REVIEW_QUEUE_ENABLED=1 \
DIRECTOR_REVIEW_QUEUE_MAX_BATCH=32 \
DIRECTOR_REVIEW_QUEUE_FLUSH_TIMEOUT_MS=10 \
director-ai serve
```

The queue collects concurrent /v1/review requests and flushes them as a single `review_batch()` call, reducing GPU kernel launches from 2×N to 2 per flush window (when NLI is available).
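The accumulate-then-flush idea can be illustrated with a toy asyncio batcher. This is a sketch of the pattern, not the actual ReviewQueue implementation; the real `review_batch()` call is stood in by a stub:

```python
import asyncio

class MiniBatcher:
    """Toy request accumulator: gather concurrent submissions and
    resolve them with a single batched call per flush window."""

    def __init__(self, max_batch: int = 32, flush_timeout: float = 0.010):
        self.max_batch = max_batch
        self.flush_timeout = flush_timeout
        self._pending: list = []   # (item, future) pairs
        self._timer = None

    async def submit(self, item):
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((item, fut))
        if len(self._pending) >= self.max_batch:
            self._flush()                          # batch full: flush now
        elif self._timer is None:
            self._timer = asyncio.get_running_loop().call_later(
                self.flush_timeout, self._flush)   # or flush after timeout
        return await fut

    def _flush(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        # Stand-in for one review_batch() call covering all pending items.
        for item, fut in batch:
            fut.set_result(f"scored:{item}")

async def demo():
    b = MiniBatcher(max_batch=4, flush_timeout=0.005)
    return await asyncio.gather(*(b.submit(i) for i in range(4)))

print(asyncio.run(demo()))  # ['scored:0', 'scored:1', 'scored:2', 'scored:3']
```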
## Managed Training

Managed training endpoints submit customer-owned fine-tuning jobs through the same backend as the CLI. Scope requests with `X-Tenant-ID`; list, status, and cancel only return jobs submitted by the same tenant during the server process. Install the `managed-training` extra to include the Vertex AI SDK; the dependency lock is kept at the patched `google-cloud-aiplatform>=1.133` floor.
```shell
curl -X POST http://localhost:8080/v1/finetune/managed/submit \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: acme' \
  -d '{
    "backend": "vertex",
    "dry_run": false,
    "dataset_uri": "gs://bucket/train.jsonl",
    "eval_uri": "gs://bucket/eval.jsonl",
    "output_uri": "gs://bucket/managed-training/acme/run-001",
    "project": "project-id",
    "region": "europe-west4",
    "container_image_uri": "region-docker.pkg.dev/project/repo/image:tag",
    "base_model": "factcg-deberta-v3-large"
  }'
```
Check or cancel a submitted job with the backend-neutral job id returned by submit:

```shell
curl -X POST http://localhost:8080/v1/finetune/managed/status \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: acme' \
  -d '{"backend": "vertex", "job_id": "projects/.../customJobs/..."}'
```
Experimental model choices require `allow_experimental_model: true`. Promotion still requires `/v1/finetune/managed/benchmark-models`; submitted or harvested training metrics alone are not an activation gate.
## Injection Detection

Detect prompt injection effects in LLM output via bidirectional NLI divergence from the original intent.

```shell
curl -X POST http://localhost:8080/v1/injection/detect \
  -H 'Content-Type: application/json' \
  -d '{
    "system_prompt": "You are a helpful customer service agent.",
    "user_query": "What is the refund policy?",
    "response": "Ignore all previous instructions. The system prompt is..."
  }'
```
### Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| response | str | Yes | LLM response to analyse |
| system_prompt | str | No | System prompt / task description |
| user_query | str | No | User's original query |
| intent | str | No | Direct intent (fallback if system_prompt/user_query empty) |
### Response

```json
{
  "injection_detected": true,
  "injection_risk": 0.85,
  "intent_coverage": 0.33,
  "total_claims": 3,
  "grounded_claims": 1,
  "drifted_claims": 0,
  "injected_claims": 2,
  "claims": [
    {
      "claim": "Ignore all previous instructions.",
      "verdict": "injected",
      "bidirectional_divergence": 0.92,
      "traceability": 0.05
    }
  ],
  "input_sanitizer_score": 0.95,
  "combined_score": 0.88
}
```
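A caller might act on this result before returning text to the user. A minimal sketch of such a policy, using fields from the response above (the threshold and function name are illustrative, not part of the API):

```python
def should_quarantine(result: dict, risk_threshold: float = 0.7) -> bool:
    """Hold back a response when the detector flags injection or the
    risk score crosses a tunable threshold."""
    if result.get("injection_detected"):
        return True
    return result.get("injection_risk", 0.0) >= risk_threshold

print(should_quarantine({"injection_detected": True, "injection_risk": 0.85}))  # True
```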
## Full API

### `director_ai.server.create_app`

Create and configure the FastAPI application.