Structured Output Verification

Verify JSON, tool calls, and code generated by LLMs — catching hallucinations that NLI cannot detect.

Why Structured Verification?

NLI scores natural language against natural language, but agentic LLMs also produce JSON, function calls, SQL, and code. NLI cannot catch a JSON field with the wrong type, a function call to a nonexistent API, or a Python import that doesn't exist. Structured verification uses deterministic checks (schema validation, AST parsing, manifest lookup): a format violation they report is always a real violation, they add sub-millisecond latency, and they require no model.

JSON Verification

Validate JSON output against a schema and optionally ground string values against a knowledge base.

from director_ai import verify_json

# Parse check only
result = verify_json('{"status": "shipped", "tracking": "UPS1234"}')
assert result.valid_json is True

# Schema validation
schema = {
    "type": "object",
    "required": ["status", "tracking"],
    "properties": {
        "status": {"type": "string"},
        "tracking": {"type": "string"},
        "count": {"type": "integer"},
    },
}
result = verify_json('{"status": "shipped"}', schema=schema)
assert result.schema_valid is False  # missing required "tracking"

# Value grounding against a knowledge base
from director_ai import CoherenceScorer
scorer = CoherenceScorer(use_nli=True)

result = verify_json(
    '{"status": "shipped", "carrier": "FedEx"}',
    score_fn=lambda claim: scorer.calculate_logical_divergence(claim, claim),
)
for v in result.field_verdicts:
    print(f"{v.path}: {v.verdict} ({v.reason})")

What It Catches

| Check | Example | Detection |
| --- | --- | --- |
| Malformed JSON | {"key": value} | valid_json=False |
| Missing required field | Schema requires name, JSON has only age | verdict="missing" |
| Wrong type | Schema says integer, value is "thirty" | verdict="invalid_type" |
| Extra field | additionalProperties: false and unknown key present | verdict="extra" |
| Bool as integer | {"count": true} where schema says integer | verdict="invalid_type" |
| Ungrounded value | String value contradicts knowledge base | verdict="invalid_value" |
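
The "Bool as integer" row needs an explicit check in Python, because bool is a subclass of int and isinstance(True, int) is True. The following is a minimal stdlib sketch of the kind of deterministic field check described above, not director_ai's actual implementation. The helper name check_fields, the flat-schema restriction, and the "malformed" verdict string are assumptions made for illustration.

```python
import json

# Map JSON Schema type names to the Python types json.loads produces.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_fields(raw, schema):
    """Return (field, verdict) pairs for a flat object schema.

    Hypothetical helper for illustration; not part of director_ai.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Stand-in for valid_json=False; "malformed" is an assumed label.
        return [("$", "malformed")]
    if not isinstance(data, dict):
        return [("$", "invalid_type")]

    verdicts = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in data:
            verdicts.append((name, "missing"))
    for name, value in data.items():
        spec = props.get(name)
        if spec is None:
            if schema.get("additionalProperties") is False:
                verdicts.append((name, "extra"))
            continue
        expected = spec.get("type")
        py_type = _JSON_TYPES.get(expected)
        if py_type is None:
            continue  # absent or unrecognized type keyword: nothing to check
        # bool is a subclass of int, so reject it explicitly wherever the
        # schema asks for integer or number.
        bool_leak = isinstance(value, bool) and expected in ("integer", "number")
        if bool_leak or not isinstance(value, py_type):
            verdicts.append((name, "invalid_type"))
    return verdicts


schema = {
    "type": "object",
    "required": ["status", "count"],
    "properties": {"status": {"type": "string"}, "count": {"type": "integer"}},
    "additionalProperties": False,
}
print(check_fields('{"status": "shipped", "count": true}', schema))
```

Without the bool_leak guard, {"count": true} would pass the integer check silently, which is why the table lists it as a separate case.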

Result Type

@dataclass
class StructuredVerificationResult:
    valid_json: bool
    schema_valid: bool | None  # None if no schema provided
    field_verdicts: list[FieldVerdict]
    error_count: int
    parse_error: str = ""

Tool Call Verification

Verify that an agent's function call is legitimate: the function exists in the manifest, its arguments match the declared parameters, and the claimed result matches what the execution log recorded.

from director_ai import verify_tool_call

manifest = {
    "get_weather": {
        "description": "Get current weather for a city",
        "parameters": {"city": {"type": "string"}},
        "returns": "Weather data with temperature and conditions",
    },
    "search_database": {
        "description": "Search customer database",
        "parameters": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "required": False},
        },
    },
}

# Valid call
result = verify_tool_call(
    function_name="get_weather",
    arguments={"city": "Prague"},
    manifest=manifest,
)
assert result.function_exists is True
assert result.arguments_valid is True

# Nonexistent function
result = verify_tool_call(
    function_name="get_stock_price",
    arguments={"symbol": "AAPL"},
    manifest=manifest,
)
assert result.function_exists is False

# Fabrication detection via execution log
execution_log = [
    {"function": "get_weather", "arguments": {"city": "Berlin"}, "result": "cloudy 10C"}
]
result = verify_tool_call(
    function_name="get_weather",
    arguments={"city": "Prague"},
    claimed_result="sunny 22C",
    manifest=manifest,
    execution_log=execution_log,
)
# Agent claims Prague weather but only Berlin was actually called
assert "different arguments" in result.reason

What It Catches

| Check | Example | Detection |
| --- | --- | --- |
| Nonexistent function | Agent calls get_stock_price not in manifest | function_exists=False |
| Wrong argument type | city=123 where string expected | arguments_valid=False |
| Missing required arg | No query for search_database | verdict="missing" |
| Fabricated result | No execution log entry for the call | fabrication_suspected=True |
| Mismatched result | Log says "rainy", agent claims "sunny" | fabrication_suspected=True |
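
The checks in this table come down to dictionary lookups and equality comparisons. The sketch below shows one way they could be combined, assuming the manifest and execution log shapes used in the examples above. The function check_tool_call, its dict return value, and its reason strings are hypothetical and do not describe how verify_tool_call is implemented. It treats a parameter as required unless the manifest says "required": False, and it compares results by exact string equality, which is cruder than a real verifier is likely to be.

```python
def check_tool_call(name, arguments, manifest, execution_log=None, claimed_result=None):
    """Hypothetical stand-in illustrating manifest and execution log checks."""
    report = {
        "function_exists": name in manifest,
        "arguments_valid": False,
        "fabrication_suspected": False,
        "reason": "",
    }
    if not report["function_exists"]:
        report["reason"] = f"unknown function {name!r}"
        return report

    params = manifest[name].get("parameters", {})
    py_types = {"string": str, "integer": int}
    problems = []
    for pname, spec in params.items():
        if pname not in arguments:
            if spec.get("required", True):
                problems.append(f"missing {pname}")
            continue
        expected = py_types.get(spec.get("type"))
        value = arguments[pname]
        # A bool is never a valid string or integer argument here.
        if expected and (isinstance(value, bool) or not isinstance(value, expected)):
            problems.append(f"wrong type for {pname}")
    for pname in arguments:
        if pname not in params:
            problems.append(f"unexpected {pname}")
    report["arguments_valid"] = not problems

    if execution_log is not None:
        same_fn = [e for e in execution_log if e.get("function") == name]
        same_call = [e for e in same_fn if e.get("arguments") == arguments]
        if not same_call:
            report["fabrication_suspected"] = True
            report["reason"] = (
                "called with different arguments" if same_fn
                else "no matching call in execution log"
            )
        elif claimed_result is not None and all(
            e.get("result") != claimed_result for e in same_call
        ):
            report["fabrication_suspected"] = True
            report["reason"] = "claimed result differs from logged result"
    if problems and not report["reason"]:
        report["reason"] = "; ".join(problems)
    return report


manifest = {
    "get_weather": {"parameters": {"city": {"type": "string"}}},
    "search_database": {
        "parameters": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "required": False},
        },
    },
}
log = [{"function": "get_weather", "arguments": {"city": "Berlin"}, "result": "cloudy 10C"}]
print(check_tool_call("get_weather", {"city": "Prague"}, manifest, log, "sunny 22C"))
```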

Code Verification

Verify generated Python or JSON code for syntax errors, unknown imports, and hallucinated APIs.

from director_ai import verify_code

# Valid Python
result = verify_code("import os\nfiles = os.listdir('.')")
assert result.syntax_valid is True
assert result.unknown_imports == []

# Syntax error
result = verify_code("def broken(:\n  pass")
assert result.syntax_valid is False
assert "SyntaxError" in result.parse_error

# Unknown import
result = verify_code("import nonexistent_quantum_ml")
assert "nonexistent_quantum_ml" in result.unknown_imports

# Hallucinated API detection
manifest = {"pd": {"read_csv", "DataFrame", "merge", "groupby"}}
result = verify_code(
    "import pandas as pd\ndf = pd.read_quantum_csv('data.csv')",
    api_manifest=manifest,
)
assert "pd.read_quantum_csv" in result.hallucinated_apis

What It Catches

| Check | Example | Detection |
| --- | --- | --- |
| Syntax error | def foo(: | syntax_valid=False |
| Unknown import | import fake_lib | In unknown_imports |
| Hallucinated API | pd.read_quantum_csv() | In hallucinated_apis |
| JSON syntax | {key: value} | syntax_valid=False (language="json") |
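
For readers curious how checks like these can be done without a model, here is a minimal sketch built on ast and importlib.util.find_spec. It is an illustration under stated assumptions, not verify_code itself: the function scan_python and its tuple return value are hypothetical, the manifest is keyed by the name used in the code (such as the alias pd), and a module counts as unknown when it is neither listed in known_modules nor importable in the current environment.

```python
import ast
import importlib.util


def _importable(root):
    """True if the top-level module can be located in this environment."""
    try:
        return importlib.util.find_spec(root) is not None
    except (ImportError, ValueError):
        return False


def scan_python(source, known_modules=frozenset(), api_manifest=None):
    """Hypothetical stand-in.

    Returns (syntax_valid, unknown_imports, hallucinated_apis).
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False, [], []

    api_manifest = api_manifest or {}
    unknown, hallucinated = [], []
    for node in ast.walk(tree):
        # Collect the top-level package name of every absolute import.
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            roots = [node.module.split(".")[0]]
        else:
            roots = []
        for root in roots:
            if root not in known_modules and not _importable(root) and root not in unknown:
                unknown.append(root)
        # Flag name.attr when name has a manifest entry that lacks attr.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in api_manifest
            and node.attr not in api_manifest[node.value.id]
        ):
            hallucinated.append(f"{node.value.id}.{node.attr}")
    return True, unknown, hallucinated


print(scan_python("import nonexistent_quantum_ml"))
```

Because find_spec consults the local environment, this sketch would also flag a real package that simply is not installed, which is one reason an explicit registry of known modules is useful.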

Custom Module Registry

result = verify_code(
    "import my_company_sdk\nmy_company_sdk.query('x')",
    known_modules={"my_company_sdk"},
    api_manifest={"my_company_sdk": {"query", "ingest", "delete"}},
)
assert result.unknown_imports == []
assert result.hallucinated_apis == []

Zero Dependencies

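All three verifiers use only the Python standard library (json, ast, re): no torch, no transformers, no model downloads. They work on the lite install path, and latency is sub-millisecond. The optional score_fn passed to verify_json is supplied by the caller, so it is the one step that may invoke a model.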
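
To get a feel for the cost of stdlib parsing on your own hardware, a rough timing sketch such as the one below can help. It measures only json.loads and ast.parse on the small inputs from the examples above, not the director_ai verifiers, so treat the number as a lower-bound estimate and not as a benchmark of the library.

```python
import ast
import json
import timeit

payload = '{"status": "shipped", "tracking": "UPS1234"}'
snippet = "import os\nfiles = os.listdir('.')"


def one_pass():
    # One JSON parse plus one Python AST parse, the two stdlib
    # operations that deterministic checks like these build on.
    json.loads(payload)
    ast.parse(snippet)


runs = 1000
per_call_ms = timeit.timeit(one_pass, number=runs) / runs * 1000
print(f"{per_call_ms:.4f} ms per parse pass")
```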