Timing-Aware Formal Properties¶
SC-NeuroCore now carries a timing-aware formal framework for FPGA control loops. The framework targets bounded latency, hard deadlines, and bounded liveness for generated or hand-authored HDL surfaces where a missed response is a control-system fault, not a cosmetic regression.
Production contract¶
The framework has four surfaces:
SystemVerilog: reusable monitors inhdl/formal/timing/timing_wrapper_lib.svand assertion macros inhdl/formal/timing/timing_assertions.svh.SymbiYosys/cvc5: executable proof examples, currently anchored by the dense-layer latency contract inhdl/formal/timing/example_dense_layer_core_latency.sby.Python:TimingPropertyandTimingProofOrchestratorprovide deterministic proof orchestration and fail-closed dependency reporting.nuXmvandKind2: emitters generate model-checker forms for the same timing property specification. Local execution is recorded as unavailable when those external binaries are absent.
A timing property is valid only when it states a real event-response contract. The dense-layer example proves these contracts:
start_pulsemust lead torun_donewithin the bounded dense-layer stream latency.start_pulsemust lead tostep_validinside the first-step liveness window.run_donemust not remain concurrent withrunning.
Monitor semantics¶
sc_latency_monitor starts one obligation when start_event rises while no obligation is active. It clears the obligation on end_event. If the obligation remains active after the configured cycle bound, the sticky violation flag is asserted.
sc_deadline_monitor and sc_bounded_liveness_monitor are named wrappers over the same obligation semantics. This keeps deadline and liveness contracts auditable while avoiding divergent RTL behaviour between property classes.
Consumable surface for external integration¶
A downstream repository (for example SCPN-MIF-CORE) binds these properties against its own RTL in the open-source Yosys/SymbiYosys flow, which has no concurrent SVA. Every property here is therefore a counter/shift-register monitor plus a procedural immediate assert (!violation) inside always @(posedge CLK) — read with read_verilog -formal -sv alongside the consumer's RTL. The framework owns the reusable property; the consumer owns the instantiation against its own signals.
Bounded latency¶
`SC_ASSERT_LATENCY_LE(NAME, CLK, RST_N, START_EVENT, END_EVENT, BOUND_CYCLES) expresses "END_EVENT within BOUND_CYCLES of START_EVENT", e.g. an actuator asserted within N cycles of a sensor event. SC_ASSERT_DEADLINE_LE and SC_ASSERT_BOUNDED_LIVENESS share the shape. Cycle-accurate only — nanosecond timing is a post-route/silicon fact, not a framework claim.
Two-flop CDC synchroniser¶
`SC_ASSERT_CDC_TWO_FLOP(NAME, DST_CLK, RST_N, ASYNC_IN, META_Q, SYNC_OUT, SYNC_DEPTH_N) binds sc_cdc_two_flop_monitor over the destination-domain synchroniser flops the consumer owns (META_Q = first flop, SYNC_OUT = last flop). The monitor builds a SYNC_DEPTH_N-deep reference delay of ASYNC_IN and, once warmed up, asserts:
SYNC_OUTisASYNC_INdelayed by exactlySYNC_DEPTH_Nflops — this pins the synchroniser depth (a one-flop chain fails) and the absence of combinational contamination on the data path;SYNC_OUTis a pure registered copy ofMETA_Q— no combinational/glitch path escapes past the last flop, so the output is stable for the whole destination cycle.
A cover on the synchronised-output transition keeps the proof non-vacuous. This is RTL-level structural verification; physical metastability resolution time stays a silicon/STA fact, so the monitor never asserts that the first flop is metastability-free. example_cdc_two_flop_synchroniser.{sv,sby} is the worked proof; the test suite also proves a one-flop synchroniser is rejected, so the property has teeth. Multi-bit non-gray buses need a handshake/FIFO template rather than this single-bit/gray-coded one.
SymbiYosys task convention¶
Tasks register in a consumer's formal_manifest.py drift gate with the triple (mode, engine, depth) used by example_*.sby:
[options]
mode bmc # or `prove` for k-induction; the monitors reset cleanly and are induction-friendly
depth <BOUND + margin> # CDC: SYNC_DEPTH + 4; latency: BOUND + 2
[engines]
smtbmc cvc5 # boolector is interchangeable
[script]
read_verilog -formal -sv -I. timing_wrapper_lib.sv <wrapper>.sv <rtl...>.sv
prep -top <formal_wrapper>
Ownership¶
For a CDC crossing into a consumer's fabric, the consumer owns the synchroniser RTL instantiation (e.g. mif_aer_cdc_synchroniser.sv) and sc-neurocore owns the property template plus the upstream stream (the AER router/encoder up to the crossing boundary).
Dependency policy¶
SymbiYosys and cvc5 are executable proof dependencies for the dense-layer example. nuXmv and Kind 2 are optional external model-checker runtimes in this repository state: the emitters are tested and benchmarked, while runtime execution is reported as unavailable unless the binaries are installed on the host.
The orchestrator never treats a missing executable as a proof pass. It returns exit_code=127, records the missing dependency names, and marks the proof as failed.
Benchmarking and core isolation¶
Timing-framework benchmark evidence belongs in benchmarks/results/local_python_2026-06-04_timing_formal_framework.json. The benchmark records:
- SystemVerilog proof runtime through SymbiYosys/cvc5.
- Python property construction and proof orchestration.
- nuXmv model-emitter throughput.
- Kind 2 Lustre-emitter throughput.
- Runtime CPU affinity, cgroup cpuset, load average, and governor sample.
Because the workstation can be under load, timing benchmark comparisons are only valid when run under a runtime cpuset shield. The required local lane pins the process to CPUs 10-11 and keeps the host system slice off those CPUs during the run.
Replication¶
Run the proof directly:
sby -f hdl/formal/timing/example_dense_layer_core_latency.sby
Run the benchmark with isolated benchmark cores:
sudo systemctl set-property --runtime system.slice AllowedCPUs=0-9
sudo systemctl set-property --runtime user.slice AllowedCPUs=0-9
sudo systemctl set-property --runtime init.scope AllowedCPUs=0-9
sudo systemd-run --wait --collect --pipe --slice=benchmark.slice \
-p AllowedCPUs=10-11 -p CPUAffinity='10 11' \
--uid=anulum --gid=anulum --working-directory="$PWD" \
--setenv=PYTHONPATH="src:hdl/formal" \
/usr/bin/taskset -c 10-11 \
.venv/bin/python benchmarks/bench_timing_formal_framework.py \
--json benchmarks/results/local_python_2026-06-04_timing_formal_framework.json
sudo systemctl set-property --runtime system.slice AllowedCPUs=0-31
sudo systemctl set-property --runtime user.slice AllowedCPUs=0-31
sudo systemctl set-property --runtime init.scope AllowedCPUs=0-31
If a different workstation has a different CPU topology, replace 10-11 with dedicated benchmark cores and record the resulting host_context.cgroup_effective_cpuset in the benchmark JSON.