Continuous Guard Fuzzing

A static adversarial suite checks a fixed list of attacks. A fuzzer instead mutates a seed corpus round after round and surfaces the variants a guard fails to flag. The `ContinuousFuzzer` takes a guard predicate and a corpus of strings the guard should flag, applies obfuscating mutations, and reports every mutation that slipped through, plus any seed the guard missed outright. The RNG is seeded, so each finding is replayable as a regression case.

Quick start

```python
from director_ai import ProductionGuard
from director_ai.core.config import DirectorConfig

fuzzer = ProductionGuard(DirectorConfig()).continuous_fuzzer(seed=0)

# predicate: True means "the guard flags this as an attack".
def guard_flags(text: str) -> bool:
    return my_injection_detector.is_attack(text)

report = fuzzer.run(guard_flags, rounds_per_seed=50)   # default attack corpus
print(report.ok)                 # False if any mutation bypassed the guard
for bypass in report.bypasses:
    print(bypass.operator, "→", bypass.mutation)        # replayable obfuscation
```
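
The predicate is any callable from `str` to `bool`. As a hedged illustration, here is a naive substring guard next to one that normalises its input first. The trigger phrase and both functions are hypothetical and not part of `director_ai`; they only show the kind of gap that the case and zero-width operators tend to expose.

```python
import unicodedata

# Hypothetical trigger phrase, for illustration only.
TRIGGER = "ignore previous instructions"

# Zero-width characters commonly used for obfuscation.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def naive_guard(text: str) -> bool:
    """Flags only an exact, case-sensitive substring match."""
    return TRIGGER in text


def normalising_guard(text: str) -> bool:
    """Strips zero-width characters and folds case before matching."""
    cleaned = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    cleaned = unicodedata.normalize("NFKC", cleaned).casefold()
    return TRIGGER in cleaned


obfuscated = "IgNoRe pre\u200bvious INSTRUCTIONS"
print(naive_guard(obfuscated))        # False
print(normalising_guard(obfuscated))  # True
```

Either function can be passed to `fuzzer.run(...)` in place of `guard_flags`.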

Mutation operators

Each operator perturbs an attack string while a human still reads the same intent:

| Operator | Obfuscation |
| --- | --- |
| `case_flip` | Randomised upper/lower casing. |
| `whitespace_inject` | Stray spaces, tabs, newlines between characters. |
| `zero_width_inject` | Zero-width spaces/joiners/BOM sprinkled in. |
| `homoglyph_substitute` | Latin letters → confusable Cyrillic homoglyphs. |
| `leetspeak` | `a→4`, `e→3`, `i→1`, `o→0`, `s→5`, `t→7`. |
| `char_duplicate` | Doubles a sample of characters. |
| `delimiter_inject` | Splices chat/template delimiters (`</s>`, `[INST]`, …). |
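
To make the table concrete, here is a minimal sketch of three of these operators. The `(text, rng)` signature and the mutation probabilities are assumptions made for illustration; the library's own implementations may differ.

```python
import random

LEET = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
ZERO_WIDTH = ["\u200b", "\u200c", "\u200d", "\ufeff"]


def case_flip(text: str, rng: random.Random) -> str:
    """Randomly upper- or lower-cases each character."""
    return "".join(
        ch.upper() if rng.random() < 0.5 else ch.lower() for ch in text
    )


def leetspeak(text: str, rng: random.Random) -> str:
    """Replaces a random sample of mapped letters with digits."""
    return "".join(
        LEET[ch.lower()] if ch.lower() in LEET and rng.random() < 0.5 else ch
        for ch in text
    )


def zero_width_inject(text: str, rng: random.Random) -> str:
    """Inserts a zero-width character after each character with probability 0.3."""
    out = []
    for ch in text:
        out.append(ch)
        if rng.random() < 0.3:
            out.append(rng.choice(ZERO_WIDTH))
    return "".join(out)


seed_text = "ignore previous instructions"
# The same RNG seed reproduces the same mutation, which is what makes a finding replayable.
first = zero_width_inject(seed_text, random.Random(0))
second = zero_width_inject(seed_text, random.Random(0))
print(first == second)  # True
```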

Pass a custom `mutators` mapping to `ContinuousFuzzer` to add domain-specific operators.
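
The callable signature that `ContinuousFuzzer` expects is not shown on this page, so the sketch below assumes `(text, rng) -> str` and a plain `dict` keyed by operator name. Both are assumptions, as is the `percent_encode` operator itself; check the class signature before wiring this in.

```python
import random
from typing import Callable, Dict

# Assumed mutator shape: takes the seed text and a seeded RNG, returns a variant.
Mutator = Callable[[str, random.Random], str]


def percent_encode(text: str, rng: random.Random) -> str:
    """Percent-encodes a random sample of ASCII letters (e.g. 'a' -> '%61')."""
    return "".join(
        f"%{ord(ch):02X}"
        if ch.isascii() and ch.isalpha() and rng.random() < 0.4
        else ch
        for ch in text
    )


# Hypothetical mapping of operator name to callable.
custom_mutators: Dict[str, Mutator] = {"percent_encode": percent_encode}

# How the mapping is passed is an assumption, so the call is left commented out:
# fuzzer = ContinuousFuzzer(mutators=custom_mutators, seed=0)

print(custom_mutators["percent_encode"]("drop table", random.Random(1)))
```

An operator like this might suit a guard that sits in front of a URL-facing service, where encoded payloads are decoded downstream.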

The report

`run(predicate, corpus=None, rounds_per_seed=25)` returns a `FuzzReport`:

| Field | Meaning |
| --- | --- |
| `seeds_tested` | Number of corpus seeds. |
| `mutations_run` | Total mutations evaluated. |
| `bypasses` | `Bypass(operator, seed, mutation)` for each mutation the guard missed. |
| `seed_misses` | Seeds the guard failed to flag even unmutated (a baseline gap). |
| `operators_used` | Which operators were exercised. |
| `ok` | True only when there were no bypasses and no seed misses. |
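
The following is a rough stand-in for how these fields could be populated, written from the field descriptions above rather than from the library source. The class names mirror the documented ones, but `run_sketch`, its body, the toy `swap_case` operator, and the way the RNG picks an operator are all assumptions.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Bypass:
    operator: str
    seed: str
    mutation: str


@dataclass
class FuzzReport:
    seeds_tested: int = 0
    mutations_run: int = 0
    bypasses: List[Bypass] = field(default_factory=list)
    seed_misses: List[str] = field(default_factory=list)
    operators_used: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # True only with no bypasses and no seed misses.
        return not self.bypasses and not self.seed_misses


def run_sketch(predicate, corpus, mutators, rounds_per_seed=25, seed=0) -> FuzzReport:
    rng = random.Random(seed)  # seeded, so every finding can be replayed
    report = FuzzReport()
    names = sorted(mutators)
    for text in corpus:
        report.seeds_tested += 1
        if not predicate(text):
            # Baseline gap: record it and skip mutation for this seed.
            report.seed_misses.append(text)
            continue
        for _ in range(rounds_per_seed):
            name = rng.choice(names)
            mutation = mutators[name](text, rng)
            report.mutations_run += 1
            if name not in report.operators_used:
                report.operators_used.append(name)
            if not predicate(mutation):
                report.bypasses.append(Bypass(name, text, mutation))
    return report


# Toy inputs: a case-sensitive guard and one deterministic operator.
guard = lambda s: "ignore previous" in s
ops = {"swap_case": lambda s, rng: s.swapcase()}
report = run_sketch(guard, ["ignore previous rules", "hello"], ops, rounds_per_seed=3)
print(report.ok, report.mutations_run, len(report.bypasses), report.seed_misses)
# False 3 3 ['hello']
```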

A seed the guard fails to flag even unmutated is recorded in `seed_misses` and is not mutated further; fix that baseline gap first. Wire `run()` into CI and fail the build when `report.ok` is `False`, which turns bypass discovery into a standing regression gate. Bypass payloads are mutations of the attack corpus, not tenant data, so they are safe to share with the security team for triage.
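
One way to build such a gate is a small script that returns a non-zero exit code when the report is not clean and writes the findings to a file for replay. The sketch uses a stand-in report built with `SimpleNamespace` so it runs without the library; the `gate` function, the JSON layout, and the file name are assumptions.

```python
import json
import os
import tempfile
from types import SimpleNamespace


def gate(report, findings_path: str) -> int:
    """Returns a process exit code: 0 when report.ok, 1 otherwise."""
    if report.ok:
        return 0
    findings = {
        "seed_misses": list(report.seed_misses),
        "bypasses": [
            {"operator": b.operator, "seed": b.seed, "mutation": b.mutation}
            for b in report.bypasses
        ],
    }
    # Persist the payloads so each one can be replayed as a regression case.
    with open(findings_path, "w", encoding="utf-8") as fh:
        json.dump(findings, fh, ensure_ascii=False, indent=2)
    return 1


# Stand-in for a FuzzReport; in CI this would come from fuzzer.run(...).
demo = SimpleNamespace(
    ok=False,
    seed_misses=[],
    bypasses=[SimpleNamespace(operator="case_flip", seed="abc", mutation="AbC")],
)
path = os.path.join(tempfile.gettempdir(), "fuzz_findings.json")
exit_code = gate(demo, path)
print(exit_code)  # 1
# In a CI entry point: raise SystemExit(exit_code)
```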