Kuramoto neural-operator advantage study

This page reports whether the DeepONet neural-operator surrogate for Kuramoto network dynamics (the forecasting.kuramoto_neural_operator module) does more than reduce its training loss — whether it forecasts unseen initial conditions accurately and whether it is cheaper than integrating the Kuramoto equations directly. The study separates the host-independent claims (forecast fidelity and the operation-count model) from the host-dependent ones (wall-clock milliseconds), and it states a claim boundary so nothing here is read as a portable performance number.

The committed artefact is docs/benchmarks/neural_operator_advantage.json, generated by scripts/bench_neural_operator_advantage.py. The surrogate's advantage is quantified by forecasting.neural_operator_advantage.evaluate_neural_operator_advantage; its host-independent arithmetic lives in the pure-NumPy forecasting.neural_operator_cost_model.

The network and configuration

The study uses a complete-graph Kuramoto network of N = 32 oscillators with Gaussian natural frequencies (σ = 0.5) and uniform mean-field coupling of strength 2.0 (that is, K_ij = 2.0 / N for i ≠ j). It is integrated with a fixed RK4 step dt = 0.05 over n_steps = 20, giving a horizon T = n_steps · dt = 1.0. The surrogate (latent width 32, hidden width 96) was trained on 256 RK4 rollouts for 300 full-batch Adam epochs; the training loss fell from 0.623 to 0.0053. The ground-truth trajectories were produced by the toolkit's dispatched RK4 integrator on the Rust tier.
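The configuration above can be illustrated with a minimal pure-NumPy sketch. It is not the toolkit's dispatched integrator: the function names, the zero-mean frequency distribution, the uniform initial phases and the seed are all assumptions made for illustration only.

```python
import numpy as np


def kuramoto_rhs(theta, omega, coupling):
    """dtheta_i/dt = omega_i + sum_j K_ij * sin(theta_j - theta_i)."""
    diff = theta[None, :] - theta[:, None]  # diff[i, j] = theta_j - theta_i
    return omega + np.sum(coupling * np.sin(diff), axis=1)


def rk4_rollout(theta0, omega, coupling, dt, n_steps):
    """Fixed-step classical RK4; returns an array of shape (n_steps + 1, N)."""
    theta = np.asarray(theta0, dtype=float).copy()
    traj = np.empty((n_steps + 1, theta.size))
    traj[0] = theta
    for step in range(n_steps):
        k1 = kuramoto_rhs(theta, omega, coupling)
        k2 = kuramoto_rhs(theta + 0.5 * dt * k1, omega, coupling)
        k3 = kuramoto_rhs(theta + 0.5 * dt * k2, omega, coupling)
        k4 = kuramoto_rhs(theta + dt * k3, omega, coupling)
        theta = theta + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[step + 1] = theta
    return traj


rng = np.random.default_rng(0)              # illustrative seed, not the recorded one
n = 32
omega = rng.normal(0.0, 0.5, size=n)        # assumption: zero-mean Gaussian, sigma = 0.5
coupling = np.full((n, n), 2.0 / n)         # K_ij = 2.0 / N off the diagonal
np.fill_diagonal(coupling, 0.0)
theta0 = rng.uniform(-np.pi, np.pi, size=n)  # assumption: uniform initial phases
traj = rk4_rollout(theta0, omega, coupling, dt=0.05, n_steps=20)
```

Each step calls the right-hand side four times, and each call forms an N × N matrix of phase differences, which is the O(N²) dense-force evaluation the operation count refers to.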

Fidelity against the persistence baseline (host-independent)

On 40 held-out initial conditions — none seen during training — the surrogate's forecast is compared against the naive persistence baseline, which holds the initial phase constant. The error metric is the mean wrapped angular error (radians).

Quantity	Surrogate (rad)	Persistence (rad)
Mean error over the horizon	0.105	0.225
Error at the horizon (`t = 1.0`)	0.193	0.467
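The metric and baseline behind the table above can be sketched as follows. This is a minimal illustration, assuming the error is the absolute phase difference wrapped into (−π, π] and averaged over all entries; the function names are hypothetical and the toolkit's own averaging order may differ.

```python
import numpy as np


def mean_wrapped_error(predicted, reference):
    """Mean absolute angular difference in radians, wrapped into (-pi, pi]."""
    wrapped = np.angle(np.exp(1j * (np.asarray(predicted) - np.asarray(reference))))
    return float(np.mean(np.abs(wrapped)))


def persistence_forecast(theta0, n_times):
    """Naive baseline: repeat the initial phases at every query time."""
    return np.tile(np.asarray(theta0, dtype=float), (n_times, 1))


theta0 = np.array([0.1, -2.0, 3.0])
baseline = persistence_forecast(theta0, n_times=4)   # shape (4, 3)
error_at_start = mean_wrapped_error(baseline[0], theta0)
```

Wrapping matters because phases that differ by a multiple of 2π describe the same oscillator state, so an unwrapped difference would overstate the error.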

The surrogate beats persistence by roughly a factor of two, both in the mean over the horizon (0.105 against 0.225) and at the horizon itself (0.193 against 0.467). The error-versus-horizon curve shows why: persistence is exact at t = 0 and degrades roughly linearly, while the surrogate carries a small fixed reconstruction error at t = 0 and then degrades far more slowly, overtaking persistence by t ≈ 0.2 and widening the gap thereafter.

`t`	Surrogate error (rad)	Persistence error (rad)
0.0	0.063	0.000
0.2	0.066	0.085
0.4	0.080	0.174
0.6	0.107	0.267
0.8	0.145	0.365
1.0	0.193	0.467

These numbers are deterministic on a fixed host for the recorded dataset, training and evaluation seeds; reproducibility is asserted on the content, never on timings.

Operation count (host-independent)

The surrogate maps (θ₀, t) → θ(t) in a single forward pass for any query time, whereas direct RK4 must traverse every intermediate step to reach a far time. Two statements follow that are fixed by the algorithm, not the host:

Random access (model-free). Reaching the horizon by RK4 costs 4 · n_steps = 80 right-hand-side evaluations, each an O(N²) dense-force evaluation; the surrogate reaches any query time in one forward pass. So a single random-access query replaces 80 right-hand-side evaluations with one pass.
Per-query FLOP ratio (stated model). Under an explicit FLOP model — a matrix multiply of an in-vector to an out-vector counts as 2 · in · out operations, and each transcendental, elementary arithmetic operation and activation counts as one — the per-query FLOP ratio of direct simulation to the surrogate at this configuration is 0.82. At this small network and short horizon the surrogate is not a per-query FLOP win, and there is no amortised break-even: the one-time training cost never pays back through cheaper per-query FLOPs here.

The per-query FLOP ratio is a modelling estimate, not a measurement; the model's assumptions are stated so the arithmetic is reproducible.
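The counting conventions and the break-even argument can be sketched in a few lines. This is not the forecasting.neural_operator_cost_model code: the function names are hypothetical, the training cost is a placeholder, and the costs passed to the break-even helper are expressed in units where the surrogate costs 1.0 per query, so that the direct cost equals the reported ratio.

```python
import math
from typing import Optional


def rk4_rhs_evaluations(n_steps: int) -> int:
    """Classical RK4 evaluates the right-hand side four times per step."""
    return 4 * n_steps


def matvec_flops(n_in: int, n_out: int) -> int:
    """Stated convention: an in-vector to out-vector multiply costs 2 * in * out."""
    return 2 * n_in * n_out


def break_even_queries(training_flops: float,
                       direct_flops: float,
                       surrogate_flops: float) -> Optional[int]:
    """Queries needed before training cost is repaid, or None if it never is."""
    saving_per_query = direct_flops - surrogate_flops
    if saving_per_query <= 0:
        return None  # the surrogate is not cheaper per query, so nothing is repaid
    return math.ceil(training_flops / saving_per_query)


rhs_calls = rk4_rhs_evaluations(20)  # 4 * 20 right-hand-side evaluations

# Ratio 0.82 (direct / surrogate): direct is cheaper per query, no break-even.
no_payback = break_even_queries(training_flops=1000.0,  # placeholder value
                                direct_flops=0.82,
                                surrogate_flops=1.0)
```

With a ratio below one the saving per query is negative, which is why no number of queries amortises the training cost at this configuration.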

Where the FLOP advantage appears

The per-query direct cost grows like n_steps · N² while the surrogate's grows like N · hidden · latent, so the ratio grows roughly like n_steps · N / (hidden · latent), that is, with both the horizon and the network size. The committed artefact records this sweep (latent 32, hidden 96). The ratio crosses one, at which point the surrogate becomes the cheaper per-query option, as the network or horizon grows:

`N` \ `n_steps`	20	40	80	160
16	0.41	0.83	1.66	3.32
32	0.82	1.63	3.26	6.52
64	1.62	3.23	6.46	12.92
128	3.21	6.43	12.86	25.72

The advantage is structural: random access at any N, and, under the stated FLOP model with fixed latent and hidden widths, a per-query FLOP ratio that grows without bound as the network or horizon grows.
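The sweep table can be checked against the stated scaling with a short script. It only re-uses the tabulated ratios; the variable names are illustrative.

```python
n_steps_grid = [20, 40, 80, 160]
ratio = {
    16: [0.41, 0.83, 1.66, 3.32],
    32: [0.82, 1.63, 3.26, 6.52],
    64: [1.62, 3.23, 6.46, 12.92],
    128: [3.21, 6.43, 12.86, 25.72],
}

# Smallest tabulated horizon at which the surrogate is cheaper per query.
crossover = {
    n: next((s for s, r in zip(n_steps_grid, row) if r > 1.0), None)
    for n, row in ratio.items()
}

# If the ratio scales like N * n_steps, dividing it out leaves a near-constant.
normalised = [
    r / (n * s)
    for n, row in ratio.items()
    for s, r in zip(n_steps_grid, row)
]
spread = max(normalised) / min(normalised)
```

In this table the crossover falls at n_steps = 80 for N = 16, at 40 for N = 32, and already at 20 for N = 64 and N = 128. The normalised ratio varies by only a few percent across the grid, consistent with growth roughly proportional to N · n_steps.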

Wall clock (host-dependent, boundary-guarded, not a claim)

For completeness the artefact records advisory millisecond timings, captured on an 11th Gen Intel Core i5-11600K under a one-minute load average of about 27 with all twelve logical cores available (no reserved core, powersave governor). On this host the surrogate's single-query forecast took about 0.27 ms against about 1.07 ms for the full direct trajectory.

These timings are excluded from the reproducible set and are not a performance claim. They depend on the host, the governor, the BLAS backend and the accelerator tier, and they can disagree with the operation-count model — here the model rates the surrogate slightly more expensive per query while the wall clock rates it faster, precisely because a millisecond margin is a host and backend artefact. Clean absolute numbers require a quiesced, core-reserved host; see the isolation guidance under docs/internal.
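One way to keep such timings clearly advisory is to record them together with the host state at capture time. The sketch below is an assumption about how that could look, not the benchmark script's actual code; the function and field names are hypothetical.

```python
import os
import platform
import statistics
import time


def advisory_timing_ms(fn, repeats=20):
    """Time fn() several times and attach host context; advisory only."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    try:
        load_1min = os.getloadavg()[0]  # Unix only
    except (AttributeError, OSError):
        load_1min = None
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "repeats": repeats,
        "load_1min": load_1min,
        "logical_cores": os.cpu_count(),
        "machine": platform.machine(),
        "advisory": True,  # marks the record as excluded from reproducibility checks
    }


record = advisory_timing_ms(lambda: sum(range(1000)), repeats=5)
```

Storing the load average next to the measurement makes it visible when a timing was taken on a busy host, as with the load of about 27 reported above.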

Honest conclusion and limitation

The surrogate's genuine advantages are its held-out forecast fidelity (it beats the persistence baseline by about a factor of two) and its structural random access to any query time in a single pass. Its per-query FLOP advantage is real but regime-dependent: it emerges for larger networks and longer horizons, not at the small demonstrated configuration.

There is an honest tension between the two axes. The operation-count advantage grows with the network size, but a surrogate's forecast fidelity at a larger network requires proportionally more training capacity and data; a fixed training budget tuned to beat persistence at this network size will underfit a substantially larger one. The demonstration is therefore reported at the network size where the recorded budget achieves the fidelity bar, with the operation-count crossover projected arithmetically rather than by inflating the network until the FLOP ratio looks large.

Reproduce

python scripts/bench_neural_operator_advantage.py --n 32 --n-trajectories 256 --epochs 300 --n-eval 40

This requires the optional PyTorch extra (scpn-quantum-control[torch]). The command writes docs/benchmarks/neural_operator_advantage.json, whose payload_sha256 digest covers only the bit-exact cost model and configuration, so it reproduces regardless of the host provenance or timings.
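A digest that ignores host provenance and timings can be built by hashing a canonical serialisation of only the reproducible fields. The sketch below illustrates the idea; the key names (config, cost_model, timings_ms, host) and the canonicalisation choices are assumptions, not the artefact's actual schema.

```python
import hashlib
import json


def payload_digest(artefact, reproducible_keys=("config", "cost_model")):
    """SHA-256 over a canonical JSON dump of the reproducible fields only."""
    payload = {key: artefact[key] for key in reproducible_keys}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical artefacts that differ only in host-dependent fields.
run_a = {
    "config": {"n": 32, "n_steps": 20, "dt": 0.05},
    "cost_model": {"flop_ratio": 0.82},
    "timings_ms": {"surrogate": 0.27, "direct": 1.07},
    "host": "machine-a",
}
run_b = dict(run_a, timings_ms={"surrogate": 0.31, "direct": 0.98}, host="machine-b")

same_digest = payload_digest(run_a) == payload_digest(run_b)
```

Sorting keys and fixing the separators makes the serialisation independent of dictionary order and whitespace, so the digest changes only when a reproducible field changes.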

Kuramoto Competitive Benchmark — the external cross-library comparison of the direct integrators.
Kuramoto Tier Benchmark — the multi-tier (Rust / Julia / Python) performance provenance for the integrators themselves.
Kuramoto Handbook — the full facade the surrogate's ground-truth integrator is part of.