Troubleshooting
Import Errors
ImportError: No module named 'transformers'
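NLI models require PyTorch and Transformers. Install the torch and transformers packages.
ImportError: No module named 'onnxruntime'
The ONNX backend requires ONNX Runtime. Install the onnxruntime package.
ImportError: No module named 'chromadb'
The ChromaDB vector store requires the chromadb package.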
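To see at a glance which optional dependencies are missing from the current environment, a check along these lines can help. It is a minimal sketch using only the standard library; the helper name missing_modules is illustrative and not part of director-ai.

```python
import importlib.util

# Module names taken from the ImportError messages above.
OPTIONAL_MODULES = ["transformers", "onnxruntime", "chromadb"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(OPTIONAL_MODULES):
        print(f"missing: {name}")
```

Run it with the same interpreter that raises the ImportError; a package installed into a different virtual environment will still show up as missing.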
Validation Errors (v2.2.0+)
ValueError: threshold must be in [0, 1]
The threshold parameter is out of range. Pass a float between 0.0 and 1.0, for example threshold=0.5.
ValueError: w_logic + w_fact must equal 1.0
The divergence weights must sum to exactly 1.0, for example w_logic=0.6 and w_fact=0.4.
ValueError: score_every_n must be >= 1
The scoring cadence must be a positive integer, for example score_every_n=4.
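The three checks above can be reproduced before any values reach the library, which is handy when parameters come from user input or a config file. The function below is a hypothetical re-creation based only on the error messages; the default values and the small floating-point tolerance are assumptions, not the library's actual behavior.

```python
def validate_params(threshold=0.5, w_logic=0.6, w_fact=0.4, score_every_n=1):
    """Hypothetical pre-flight check mirroring the three ValueErrors above."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    # Assumption: a tiny tolerance absorbs floating-point rounding.
    if abs((w_logic + w_fact) - 1.0) > 1e-9:
        raise ValueError("w_logic + w_fact must equal 1.0")
    if not isinstance(score_every_n, int) or score_every_n < 1:
        raise ValueError("score_every_n must be >= 1")
    return True


validate_params(threshold=0.7, w_logic=0.6, w_fact=0.4, score_every_n=4)
```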
Scoring Issues
Score always 0.5
No NLI model is loaded, so the heuristic fallback is active. Install the NLI dependencies and enable NLI scoring.
Alternatively, set strict_mode=True to make the 0.5 explicit (neutral, with no heuristic guessing).
Score always 1.0
Empty or trivial prompt/response pair. The scorer returns perfect coherence when there is nothing to contradict.
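Both symptoms are easiest to spot across a batch of varied inputs rather than a single call. The helper below is an illustrative sketch, not part of director-ai: it takes a list of already-collected scores and maps a constant value to the likely cause described above.

```python
def diagnose_scores(scores):
    """Map a batch of coherence scores to a likely cause, per the notes above."""
    if not scores:
        return "no scores"
    if all(s == 0.5 for s in scores):
        return "constant 0.5: NLI model probably not loaded"
    if all(s == 1.0 for s in scores):
        return "constant 1.0: inputs may be empty or trivial"
    return "scores vary"


print(diagnose_scores([0.5, 0.5, 0.5]))
```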
Streaming halts too aggressively
Lower hard_limit or increase window_size to smooth out transient dips.
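The effect of the two settings can be seen in a small simulation. This sketch assumes that streaming halts when the mean of the last window_size scores falls below hard_limit; the real StreamingKernel logic may differ, and halt_index is a made-up helper for illustration only.

```python
from collections import deque


def halt_index(scores, hard_limit, window_size):
    """Return the index at which streaming would halt, or None.

    Assumption: halt when the mean of the last `window_size` scores
    falls below `hard_limit`.
    """
    window = deque(maxlen=window_size)
    for i, score in enumerate(scores):
        window.append(score)
        if sum(window) / len(window) < hard_limit:
            return i
    return None


scores = [0.9, 0.9, 0.3, 0.9, 0.9]  # one transient dip

print(halt_index(scores, hard_limit=0.5, window_size=1))  # 2: halts on the dip
print(halt_index(scores, hard_limit=0.5, window_size=3))  # None: dip is averaged out
print(halt_index(scores, hard_limit=0.2, window_size=1))  # None: lower limit tolerates the dip
```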
Streaming never halts
Raise hard_limit. Verify your coherence_callback returns a float in [0, 1]:
def cb(token):
    # my_scorer, prompt, and accumulated_text come from the enclosing scope
    score = my_scorer.review(prompt, accumulated_text)[1].score
    return score  # must be a float in [0, 1]
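A callback that returns a percentage, None, or some other out-of-range value may never drop below hard_limit, so it is worth failing fast during development. The wrapper below is one way to do that; checked is a hypothetical helper, not a director-ai function.

```python
def checked(callback):
    """Wrap a coherence callback and fail fast on invalid return values."""
    def wrapper(token):
        score = callback(token)
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            raise TypeError(
                f"callback returned {type(score).__name__}, expected float"
            )
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"callback returned {score}, expected a value in [0, 1]")
        return float(score)
    return wrapper


safe_cb = checked(lambda token: 0.8)  # stand-in callback for illustration
print(safe_cb("hello"))
```

Wrap your own callback the same way, for example checked(cb), and remove the wrapper once the return values are known to be in range.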
Performance
Slow first review
NLI model loading takes 2-5 seconds on first call. Mitigations:
- Use cache_size > 0 to cache repeated prompt/response pairs
- Use scorer_backend="onnx" for faster cold-start (~1s vs ~3s)
- Pre-warm the scorer at startup:
    scorer.review("warmup", "warmup")
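Pre-warming works because the model load is a one-time cost paid on the first call. The sketch below illustrates the idea with a stand-in class; LazyScorer is hypothetical, its review method returns a placeholder float rather than the real return value, and it only mimics the lazy-loading behavior described above.

```python
class LazyScorer:
    """Stand-in for a scorer that loads its model on first use (hypothetical)."""

    def __init__(self):
        self.model_loaded = False
        self.load_count = 0

    def _load_model(self):
        # A real scorer would load NLI weights here; that is the slow step.
        self.model_loaded = True
        self.load_count += 1

    def review(self, prompt, response):
        if not self.model_loaded:
            self._load_model()
        return 0.5  # placeholder score, not a real result


scorer = LazyScorer()
scorer.review("warmup", "warmup")  # pay the load cost at startup
assert scorer.model_loaded         # later calls skip the load
```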
High memory usage
DeBERTa-v3-Large uses ~1.5 GB GPU / ~2 GB CPU. Options:
- Switch to ONNX: scorer_backend="onnx" (~400 MB)
- Use 8-bit quantization: pip install director-ai[quantize]
Streaming throughput
Scoring every token is expensive. Use cadence control:
kernel = StreamingKernel(hard_limit=0.4, score_every_n=4)
# Or adaptive cadence:
kernel = StreamingKernel(hard_limit=0.4, adaptive=True, max_cadence=8)
See Streaming Overhead for tokens/sec benchmarks by cadence.
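To make the saving concrete, the following sketch counts how often a scoring function runs at different cadences. It assumes that score_every_n=N means the accumulated text is scored once every N tokens; stream_with_cadence is an illustrative helper, not the StreamingKernel implementation.

```python
def stream_with_cadence(tokens, score_fn, score_every_n):
    """Call score_fn once per `score_every_n` tokens; return the scores."""
    if score_every_n < 1:
        raise ValueError("score_every_n must be >= 1")
    scores = []
    text = ""
    for i, token in enumerate(tokens, start=1):
        text += token
        if i % score_every_n == 0:
            scores.append(score_fn(text))
    return scores


tokens = ["tok"] * 16
calls_every_token = len(stream_with_cadence(tokens, lambda t: 1.0, 1))
calls_every_fourth = len(stream_with_cadence(tokens, lambda t: 1.0, 4))
print(calls_every_token, calls_every_fourth)  # 16 4
```

The trade-off is detection latency: at a cadence of 4, up to three further tokens can be emitted before a drop in coherence is scored.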
Server Issues
429 Too Many Requests
The rate_limit_rpm limit was exceeded. Increase rate_limit_rpm in your config or disable rate limiting.
403 Forbidden
The X-API-Key header is missing or invalid. Check that the key appears under api_keys in your config, and pass it in the X-API-Key header of each request.
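As a sketch of how a client can attach the key, the following builds a request with the standard library. The URL and key are placeholders, and build_request is an illustrative helper; substitute your server address, endpoint path, and a key listed under api_keys.

```python
import urllib.request


def build_request(url, api_key):
    """Build a GET request that carries the key in the X-API-Key header."""
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


# Hypothetical URL and key, for illustration only.
req = build_request("http://localhost:8080/", "my-secret-key")
# To send it: urllib.request.urlopen(req)
```

HTTP header names are case-insensitive, so the server should accept the header regardless of how the client library capitalizes it.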
Metrics endpoint empty
Metrics collection is disabled. Enable it in your config, then access /metrics for Prometheus-format output.