Kuramoto neural-operator advantage study¶
This page reports whether the DeepONet neural-operator surrogate for Kuramoto network dynamics
(the forecasting.kuramoto_neural_operator module) does more than reduce its training loss — whether
it forecasts unseen initial conditions accurately and whether it is cheaper than integrating the
Kuramoto equations directly. The study separates the host-independent claims (forecast fidelity and
the operation-count model) from the host-dependent ones (wall-clock milliseconds), and it states a
claim boundary so nothing here is read as a portable performance number.
The committed artefact is docs/benchmarks/neural_operator_advantage.json, generated by
scripts/bench_neural_operator_advantage.py. The surrogate is evaluated by
forecasting.neural_operator_advantage.evaluate_neural_operator_advantage; the host-independent
cost arithmetic lives in the pure-NumPy forecasting.neural_operator_cost_model.
The network and configuration¶
The network is a complete graph of N = 32 Kuramoto oscillators with Gaussian natural frequencies
(σ = 0.5) and uniform mean-field coupling (strength 2.0, i.e. K_ij = 2.0 / N off-diagonal). It is
integrated with a fixed RK4 step dt = 0.05 over n_steps = 20 (horizon T = 1.0). The surrogate
(latent width 32, hidden width 96) was trained on 256 RK4 rollouts for 300 full-batch Adam epochs; the
training loss fell from 0.623 to 0.0053. The ground-truth trajectories were produced by the
toolkit's dispatched RK4 integrator on the rust tier.
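The configuration above can be sketched in a few lines of plain Python. This is a minimal illustration, not the toolkit's dispatched integrator: it assumes the standard Kuramoto form dθ_i/dt = ω_i + Σ_j K_ij sin(θ_j − θ_i), zero-mean frequencies, uniformly random initial phases and an arbitrary seed, none of which are stated on this page. The function names are hypothetical.

```python
import math
import random

N = 32          # oscillators
SIGMA = 0.5     # standard deviation of the Gaussian natural frequencies
STRENGTH = 2.0  # mean-field coupling strength, so K_ij = STRENGTH / N
DT = 0.05       # fixed RK4 step
N_STEPS = 20    # horizon T = N_STEPS * DT = 1.0


def kuramoto_rhs(theta, omega, strength):
    """Assumed form: dtheta_i/dt = omega_i + (strength / N) * sum_j sin(theta_j - theta_i)."""
    n = len(theta)
    k = strength / n
    # The j == i term contributes sin(0) = 0, so the diagonal needs no special case.
    return [omega[i] + k * sum(math.sin(tj - theta[i]) for tj in theta) for i in range(n)]


def rk4_step(theta, omega, strength, dt):
    """One classical RK4 step (four right-hand-side evaluations)."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]

    k1 = kuramoto_rhs(theta, omega, strength)
    k2 = kuramoto_rhs(shifted(theta, k1, 0.5 * dt), omega, strength)
    k3 = kuramoto_rhs(shifted(theta, k2, 0.5 * dt), omega, strength)
    k4 = kuramoto_rhs(shifted(theta, k3, dt), omega, strength)
    return [
        t + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for t, a, b, c, d in zip(theta, k1, k2, k3, k4)
    ]


def rollout(theta0, omega, strength, dt, n_steps):
    """Return the trajectory as n_steps + 1 phase vectors, initial state included."""
    traj = [list(theta0)]
    for _ in range(n_steps):
        traj.append(rk4_step(traj[-1], omega, strength, dt))
    return traj


rng = random.Random(0)  # illustrative seed, not the recorded one
omega = [rng.gauss(0.0, SIGMA) for _ in range(N)]
theta0 = [rng.uniform(-math.pi, math.pi) for _ in range(N)]
trajectory = rollout(theta0, omega, STRENGTH, DT, N_STEPS)
print(len(trajectory), len(trajectory[0]))  # 21 32
```

Each such rollout corresponds to one training example of the kind described above: an initial phase vector together with the phases at every step up to T = 1.0.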
Fidelity against the persistence baseline (host-independent)¶
On 40 held-out initial conditions — none seen during training — the surrogate's forecast is compared against the naive persistence baseline, which holds the initial phase constant. The error metric is the mean wrapped angular error (radians).
| Quantity | Surrogate | Persistence |
|---|---|---|
| Mean error over the horizon | 0.105 | 0.225 |
| Error at the horizon (t = 1.0) | 0.193 | 0.467 |
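The error metric and the persistence baseline can be sketched as follows. This is an illustration under an assumption: phase differences are wrapped to (−π, π] before taking the absolute value, which is the usual reading of "wrapped angular error"; the exact wrapping used by the evaluation module is not shown on this page, and the helper names are hypothetical.

```python
import math


def wrapped_abs_difference(a, b):
    """Absolute angular distance between two phases, in [0, pi]."""
    d = a - b
    # atan2(sin d, cos d) maps any difference onto (-pi, pi].
    return abs(math.atan2(math.sin(d), math.cos(d)))


def mean_wrapped_error(predicted, truth):
    """Mean wrapped angular error (radians) over the oscillators of one snapshot."""
    return sum(wrapped_abs_difference(p, t) for p, t in zip(predicted, truth)) / len(truth)


def persistence_forecast(theta0, n_times):
    """Naive baseline: repeat the initial phases at every query time."""
    return [list(theta0) for _ in range(n_times)]


# Phases of -3.1 and 3.1 rad are close on the circle, not 6.2 rad apart.
print(round(mean_wrapped_error([-3.1], [3.1]), 4))  # 0.0832
```

Wrapping matters because phases that straddle ±π would otherwise be scored as nearly a full turn apart.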
The surrogate beats persistence by roughly a factor of two, both averaged over the horizon (0.105
against 0.225 rad) and at its end (0.193 against 0.467 rad). The error-versus-horizon curve shows
why: persistence is exact at t = 0 and degrades roughly linearly, while the surrogate carries a
small fixed reconstruction error at t = 0 and then degrades far more slowly, overtaking persistence
by t ≈ 0.2 and widening the gap thereafter.
| t | Surrogate error | Persistence error |
|---|---|---|
| 0.0 | 0.063 | 0.000 |
| 0.2 | 0.066 | 0.085 |
| 0.4 | 0.080 | 0.174 |
| 0.6 | 0.107 | 0.267 |
| 0.8 | 0.145 | 0.365 |
| 1.0 | 0.193 | 0.467 |
These numbers are deterministic on a fixed host for the recorded dataset, training and evaluation seeds; reproducibility is asserted on the content, never on timings.
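The statements about the error curve can be checked directly from the tabulated values. The short script below uses only the numbers in the table above; the variable names are illustrative.

```python
times = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
surrogate = [0.063, 0.066, 0.080, 0.107, 0.145, 0.193]
persistence = [0.000, 0.085, 0.174, 0.267, 0.365, 0.467]

# First tabulated time at which the surrogate error is below persistence.
crossover = next(t for t, s, p in zip(times, surrogate, persistence) if s < p)

# Growth of the persistence error per 0.2 time units.
increments = [round(b - a, 3) for a, b in zip(persistence, persistence[1:])]

# Persistence-to-surrogate error ratio at the horizon.
horizon_ratio = persistence[-1] / surrogate[-1]

print(crossover)                # 0.2
print(increments)               # [0.085, 0.089, 0.093, 0.098, 0.102]
print(round(horizon_ratio, 2))  # 2.42
```

The increments of the persistence error stay between 0.085 and 0.102 rad per 0.2 time units, so its growth is close to linear, and the gap to the surrogate widens at every tabulated time after the crossover.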
Operation count (host-independent)¶
The surrogate maps (θ₀, t) → θ(t) in a single forward pass for any query time, whereas direct RK4
must traverse every intermediate step to reach a far time. Two statements follow that are fixed by the
algorithm, not the host:
- Random access (model-free). Reaching the horizon by RK4 costs 4 · n_steps = 80 right-hand-side evaluations, each an O(N²) dense-force evaluation; the surrogate reaches any query time in one forward pass. So a single random-access query replaces 80 right-hand-side evaluations with one pass.
- Per-query FLOP ratio (stated model). Under an explicit FLOP model (a matrix multiply of an in-vector to an out-vector counts as 2 · in · out operations, and each transcendental, elementary arithmetic operation and activation counts as one), the per-query FLOP ratio of direct simulation to the surrogate at this configuration is 0.82. At this small network and short horizon the surrogate is not a per-query FLOP win, and there is no amortised break-even: the one-time training cost never pays back through cheaper per-query FLOPs here.
The per-query FLOP ratio is a modelling estimate, not a measurement; the model's assumptions are stated so the arithmetic is reproducible.
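The two counting rules can be illustrated with a small sketch. It counts right-hand-side calls of a generic fixed-step RK4 loop and applies the 2 · in · out rule to one matrix-vector product. The scalar test equation is a stand-in, the class and function names are hypothetical, and the pairing of widths 96 and 32 is only an example: the surrogate's layer layout is not given on this page, so the sketch does not attempt to reproduce the 0.82 ratio.

```python
class CountingRHS:
    """Wrap a right-hand side and count how often it is called."""

    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, y):
        self.calls += 1
        return self.func(y)


def rk4_integrate(rhs, y0, dt, n_steps):
    """Classical fixed-step RK4: four right-hand-side evaluations per step."""
    y = y0
    for _ in range(n_steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dt * k1)
        k3 = rhs(y + 0.5 * dt * k2)
        k4 = rhs(y + dt * k3)
        y = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y


def matmul_flops(n_in, n_out):
    """FLOPs of one matrix-vector product under the stated 2 * in * out rule."""
    return 2 * n_in * n_out


# Scalar stand-in dy/dt = -y: the call count does not depend on the equation.
rhs = CountingRHS(lambda y: -y)
rk4_integrate(rhs, 1.0, dt=0.05, n_steps=20)
print(rhs.calls)             # 80
print(matmul_flops(96, 32))  # 6144
```

The call count depends only on the integrator and n_steps, which is why the random-access statement holds without any FLOP model.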
Where the FLOP advantage appears¶
The per-query direct cost grows like n_steps · N² while the surrogate's grows like
N · hidden · latent, so the ratio grows with both the horizon and the network size. The committed
artefact records this sweep (latent 32, hidden 96); the ratio crosses one — the surrogate becomes the
cheaper per-query option — as the network or horizon grows:
| N \ n_steps | 20 | 40 | 80 | 160 |
|---|---|---|---|---|
| 16 | 0.41 | 0.83 | 1.66 | 3.32 |
| 32 | 0.82 | 1.63 | 3.26 | 6.52 |
| 64 | 1.62 | 3.23 | 6.46 | 12.92 |
| 128 | 3.21 | 6.43 | 12.86 | 25.72 |
The advantage is structural — random access at any N, and a per-query FLOP ratio that grows without
bound as the network or horizon grows.
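The scaling claim can be checked against the sweep itself. The sketch below uses only the tabulated ratios; it finds the first tabulated n_steps at which each network size crosses one, and tests how close the ratio is to being proportional to N · n_steps, as the n_steps · N² versus N · hidden · latent growth rates suggest.

```python
n_steps_grid = [20, 40, 80, 160]
ratio = {
    16: [0.41, 0.83, 1.66, 3.32],
    32: [0.82, 1.63, 3.26, 6.52],
    64: [1.62, 3.23, 6.46, 12.92],
    128: [3.21, 6.43, 12.86, 25.72],
}

# Smallest tabulated n_steps at which direct simulation costs more than the surrogate.
crossover = {
    n: next(s for s, r in zip(n_steps_grid, row) if r > 1.0)
    for n, row in ratio.items()
}

# If ratio ~ c * N * n_steps, then ratio / (N * n_steps) is the same constant everywhere.
normalised = [
    r / (n * s)
    for n, row in ratio.items()
    for s, r in zip(n_steps_grid, row)
]
spread = max(normalised) / min(normalised)

print(crossover)      # {16: 80, 32: 40, 64: 20, 128: 20}
print(spread < 1.05)  # True
```

Across the whole sweep the normalised ratio varies by less than 5 %, so within the tabulated range the per-query FLOP ratio is very nearly proportional to N · n_steps.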
Wall clock (host-dependent, boundary-guarded — not a claim)¶
For completeness the artefact records advisory millisecond timings, captured on an 11th Gen Intel
Core i5-11600K under a one-minute load average of about 27 with all twelve logical cores available
(no reserved core, powersave governor). On this host the surrogate's single-query forecast took
about 0.27 ms against about 1.07 ms for the full direct trajectory.
These timings are excluded from the reproducible set and are not a performance claim. They depend
on the host, the governor, the BLAS backend and the accelerator tier, and they can disagree with the
operation-count model — here the model rates the surrogate slightly more expensive per query while the
wall clock rates it faster, precisely because a millisecond margin is a host and backend artefact.
Clean absolute numbers require a quiesced, core-reserved host; see the isolation guidance under
docs/internal.
Honest conclusion and limitation¶
The surrogate's genuine advantages are its held-out forecast fidelity (it beats the persistence baseline by about a factor of two) and its structural random access to any query time in a single pass. Its per-query FLOP advantage is real but regime-dependent: it emerges for larger networks and longer horizons, not at the small demonstrated configuration.
There is an honest tension between the two axes. The operation-count advantage grows with the network size, but a surrogate's forecast fidelity at a larger network requires proportionally more training capacity and data; a fixed training budget tuned to beat persistence at this network size will underfit a substantially larger one. The demonstration is therefore reported at the network size where the recorded budget achieves the fidelity bar, with the operation-count crossover projected arithmetically rather than by inflating the network until the FLOP ratio looks large.
Reproduce¶
python scripts/bench_neural_operator_advantage.py --n 32 --n-trajectories 256 --epochs 300 --n-eval 40
This requires the optional PyTorch extra (scpn-quantum-control[torch]). The command writes
docs/benchmarks/neural_operator_advantage.json, whose payload_sha256 digest covers only the
bit-exact cost model and configuration, so it reproduces regardless of the host provenance or timings.
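After a run, the digest can be read back for comparison between hosts. The sketch below is a minimal helper that assumes the artefact is a JSON object with payload_sha256 as a top-level key; the page names the key but not its position or the rest of the schema, so the function (a hypothetical name) reports the other top-level keys without interpreting them.

```python
import json
from pathlib import Path

ARTEFACT = Path("docs/benchmarks/neural_operator_advantage.json")


def summarise_artefact(path):
    """Return the recorded digest and the top-level keys of the artefact."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        # Assumption: the digest is stored at the top level of the JSON object.
        "payload_sha256": data.get("payload_sha256"),
        "top_level_keys": sorted(data),
    }


if ARTEFACT.exists():
    print(summarise_artefact(ARTEFACT))
```

Two runs on different hosts should then report the same payload_sha256 even when their timing fields differ.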
Related¶
- Kuramoto Competitive Benchmark — the external cross-library comparison of the direct integrators.
- Kuramoto Tier Benchmark — the multi-tier (Rust / Julia / Python) performance provenance for the integrators themselves.
- Kuramoto Handbook — the full facade the surrogate's ground-truth integrator is part of.