Structured Output Verification

Verify JSON, tool calls, and code generated by LLMs, catching hallucinations that NLI cannot detect.

Why Structured Verification?

NLI scores natural language against natural language, but agentic LLMs also produce JSON, function calls, SQL, and code. NLI cannot catch a JSON field with the wrong type, a function call to a nonexistent API, or a Python import that doesn't exist. Structured verification instead uses deterministic checks (schema validation, AST parsing, manifest lookup). These checks never misreport a well-formed output as a format violation, add only sub-millisecond latency, and require no model.

JSON Verification

Validate JSON output against a schema and optionally ground string values against a knowledge base.

from director_ai import verify_json

# Parse check only
result = verify_json('{"status": "shipped", "tracking": "UPS1234"}')
assert result.valid_json is True

# Schema validation
schema = {
    "type": "object",
    "required": ["status", "tracking"],
    "properties": {
        "status": {"type": "string"},
        "tracking": {"type": "string"},
        "count": {"type": "integer"},
    },
}
result = verify_json('{"status": "shipped"}', schema=schema)
assert result.schema_valid is False  # missing required "tracking"

# Value grounding against a knowledge base
from director_ai import CoherenceScorer
scorer = CoherenceScorer(use_nli=True)

result = verify_json(
    '{"status": "shipped", "carrier": "FedEx"}',
    score_fn=lambda claim: scorer.calculate_logical_divergence(claim, claim),
)
for v in result.field_verdicts:
    print(f"{v.path}: {v.verdict} ({v.reason})")

What It Catches

Check	Example	Detection
Malformed JSON	`{"key": value}`	`valid_json=False`
Missing required field	Schema requires `name`, JSON has only `age`	`verdict="missing"`
Wrong type	Schema says `integer`, value is `"thirty"`	`verdict="invalid_type"`
Extra field	`additionalProperties: false` and unknown key present	`verdict="extra"`
Bool as integer	`{"count": true}` where schema says `integer`	`verdict="invalid_type"`
Ungrounded value	String value contradicts knowledge base	`verdict="invalid_value"`
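
The "Bool as integer" row reflects a Python detail: `bool` is a subclass of `int`, so a naive `isinstance(value, int)` check accepts `true` where the schema asks for an integer. The following is a minimal stdlib-only sketch of a type check that avoids this. `check_type` and `JSON_TYPES` are hypothetical names used for illustration; they are not part of the director_ai API and may differ from how the library does it.

```python
import json

# Hypothetical mapping from JSON Schema type names to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_type(value, schema_type):
    """Return True if value matches the JSON Schema type name."""
    # bool is a subclass of int in Python, so reject it explicitly
    # when the schema asks for "integer" or "number".
    if schema_type in ("integer", "number") and isinstance(value, bool):
        return False
    if schema_type == "null":
        return value is None
    return isinstance(value, JSON_TYPES[schema_type])


doc = json.loads('{"count": true, "total": 3, "name": "thirty"}')
print(check_type(doc["count"], "integer"))  # False
print(check_type(doc["total"], "integer"))  # True
print(check_type(doc["name"], "integer"))   # False
```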

Result Type

@dataclass
class StructuredVerificationResult:
    valid_json: bool
    schema_valid: bool | None  # None if no schema provided
    field_verdicts: list[FieldVerdict]
    error_count: int
    parse_error: str = ""
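
`FieldVerdict` is not defined on this page. Judging only from the loop in the JSON example, it exposes at least `path`, `verdict`, and `reason`. As a rough sketch under that assumption, the snippet below uses a hypothetical stand-in class (`FieldVerdictStub`) and a hypothetical helper (`summarize`) to tally verdicts by kind. The path strings and reasons are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FieldVerdictStub:
    """Hypothetical stand-in with only the attributes used on this page."""
    path: str
    verdict: str
    reason: str = ""


def summarize(verdicts):
    """Count verdicts by kind, preserving first-seen order."""
    return dict(Counter(v.verdict for v in verdicts))


verdicts = [
    FieldVerdictStub("tracking", "missing", "required field absent"),
    FieldVerdictStub("count", "invalid_type", "expected integer"),
    FieldVerdictStub("age", "invalid_type", "expected integer"),
]
print(summarize(verdicts))  # {'missing': 1, 'invalid_type': 2}
```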

Tool Call Verification

Verify that an agent's function call is legitimate: the function exists in the manifest, the arguments match its declared parameters, and the claimed result was not fabricated.

from director_ai import verify_tool_call

manifest = {
    "get_weather": {
        "description": "Get current weather for a city",
        "parameters": {"city": {"type": "string"}},
        "returns": "Weather data with temperature and conditions",
    },
    "search_database": {
        "description": "Search customer database",
        "parameters": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "required": False},
        },
    },
}

# Valid call
result = verify_tool_call(
    function_name="get_weather",
    arguments={"city": "Prague"},
    manifest=manifest,
)
assert result.function_exists is True
assert result.arguments_valid is True

# Nonexistent function
result = verify_tool_call(
    function_name="get_stock_price",
    arguments={"symbol": "AAPL"},
    manifest=manifest,
)
assert result.function_exists is False

# Fabrication detection via execution log
execution_log = [
    {"function": "get_weather", "arguments": {"city": "Berlin"}, "result": "cloudy 10C"}
]
result = verify_tool_call(
    function_name="get_weather",
    arguments={"city": "Prague"},
    claimed_result="sunny 22C",
    manifest=manifest,
    execution_log=execution_log,
)
# Agent claims Prague weather but only Berlin was actually called
assert "different arguments" in result.reason

What It Catches

Check	Example	Detection
Nonexistent function	Agent calls `get_stock_price` not in manifest	`function_exists=False`
Wrong argument type	`city=123` where string expected	`arguments_valid=False`
Missing required arg	No `query` for `search_database`	`verdict="missing"`
Fabricated result	No execution log entry for the call	`fabrication_suspected=True`
Mismatched result	Log says "rainy", agent claims "sunny"	`fabrication_suspected=True`
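
The checks in this table reduce to dictionary lookups and equality comparisons. The sketch below shows one way they could be expressed in plain Python. `check_call` is a hypothetical stand-in, not the `verify_tool_call` implementation, and it assumes a parameter is required unless the manifest marks it `"required": False`.

```python
def check_call(name, arguments, manifest, execution_log=None, claimed_result=None):
    """Hypothetical stand-in for a manifest and execution-log check."""
    report = {
        "function_exists": name in manifest,
        "missing": [],
        "fabrication_suspected": False,
        "reason": "",
    }
    if not report["function_exists"]:
        report["reason"] = f"function {name!r} is not in the manifest"
        return report

    # Assumption: a parameter is required unless marked "required": False.
    params = manifest[name].get("parameters", {})
    report["missing"] = [
        p for p, spec in params.items()
        if spec.get("required", True) and p not in arguments
    ]

    # Only compare against the log when the agent claims a result.
    if claimed_result is not None and execution_log is not None:
        same_function = [e for e in execution_log if e["function"] == name]
        same_call = [e for e in same_function if e["arguments"] == arguments]
        if not same_function:
            report["reason"] = "no execution log entry for this function"
        elif not same_call:
            report["reason"] = "function was called with different arguments"
        elif all(e["result"] != claimed_result for e in same_call):
            report["reason"] = "claimed result differs from the logged result"
        report["fabrication_suspected"] = bool(report["reason"])
    return report


manifest = {"get_weather": {"parameters": {"city": {"type": "string"}}}}
log = [{"function": "get_weather", "arguments": {"city": "Berlin"}, "result": "cloudy 10C"}]
report = check_call("get_weather", {"city": "Prague"}, manifest, log, "sunny 22C")
print(report["fabrication_suspected"], report["reason"])
```

Because every branch is a lookup or an equality test, the outcome is deterministic for a given manifest and log.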

Code Verification

Verify generated Python or JSON code for syntax errors, unknown imports, and hallucinated APIs.

from director_ai import verify_code

# Valid Python
result = verify_code("import os\nfiles = os.listdir('.')")
assert result.syntax_valid is True
assert result.unknown_imports == []

# Syntax error
result = verify_code("def broken(:\n  pass")
assert result.syntax_valid is False
assert "SyntaxError" in result.parse_error

# Unknown import
result = verify_code("import nonexistent_quantum_ml")
assert "nonexistent_quantum_ml" in result.unknown_imports

# Hallucinated API detection
# Allowed attributes on the `pd` alias (all real top-level pandas names)
manifest = {"pd": {"read_csv", "DataFrame", "merge", "concat"}}
result = verify_code(
    "import pandas as pd\ndf = pd.read_quantum_csv('data.csv')",
    api_manifest=manifest,
)
assert "pd.read_quantum_csv" in result.hallucinated_apis

What It Catches

Check	Example	Detection
Syntax error	`def foo(:`	`syntax_valid=False`
Unknown import	`import fake_lib`	In `unknown_imports`
Hallucinated API	`pd.read_quantum_csv()`	In `hallucinated_apis`
JSON syntax	`{key: value}`	`syntax_valid=False` (language="json")

Custom Module Registry

result = verify_code(
    "import my_company_sdk\nmy_company_sdk.query('x')",
    known_modules={"my_company_sdk"},
    api_manifest={"my_company_sdk": {"query", "ingest", "delete"}},
)
assert result.unknown_imports == []
assert result.hallucinated_apis == []

Zero Dependencies

All three verifiers use only the Python standard library (json, ast, re): no torch, no transformers, no model downloads. They work on the lite install path, and latency is sub-millisecond.
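
To get a feel for the cost of the underlying stdlib parsing on your own hardware, a rough timing sketch follows. It measures `json.loads` and `ast.parse` directly rather than the verifiers themselves, so treat the numbers as a lower bound. `per_call_seconds` is a hypothetical helper.

```python
import ast
import json
import timeit

PAYLOAD = '{"status": "shipped", "tracking": "UPS1234"}'
SNIPPET = "import os\nfiles = os.listdir('.')"


def per_call_seconds(fn, number=10_000):
    """Average wall-clock seconds per call over `number` runs."""
    return timeit.timeit(fn, number=number) / number


json_cost = per_call_seconds(lambda: json.loads(PAYLOAD))
ast_cost = per_call_seconds(lambda: ast.parse(SNIPPET))
print(f"json.loads: {json_cost * 1e6:.1f} us per call")
print(f"ast.parse:  {ast_cost * 1e6:.1f} us per call")
```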