SPDX-License-Identifier: AGPL-3.0-or-later
Commercial license available
© Concepts 1996–2026 Miroslav Šotek. All rights reserved.
© Code 2020–2026 Miroslav Šotek. All rights reserved.
ORCID: 0009-0009-3560-0851
Contact: www.anulum.li | protoscience@anulum.li
SCPN Quantum Control: Methods Benchmark Dashboard
This page is the public reproducibility dashboard for the benchmark artefacts supporting the Rust/VQE methods papers and the SCPN/FIM Hamiltonian paper. The rule is artefact-first: tables and manuscript claims should be regenerated from committed scripts, JSON summaries, and CSV summaries.
Repository: https://github.com/anulum/scpn-quantum-control
Current dashboard snapshot
| Area | Current status |
|---|---|
| One-command methods reproduction | `scpn-bench reproduce-methods` regenerates the local Rust/VQE methods artefacts. |
| One-command FIM reproduction | `scpn-bench fim-all` regenerates the committed SCPN/FIM offline artefacts. |
| Full portfolio reproduction | `scpn-bench all` runs both the methods and the FIM offline harness groups. |
| Optional GPU benchmark | `--include-gpu` adds the Vertex/local GPU dense-expectation harness when CUDA dependencies are available. |
| Optional scaling benchmark | `--include-scaling` adds n=4--12 ansatz-scaling and tensor-network diagnostics. |
| Optional readout mitigation | `--include-readout` adds full-basis readout-matrix mitigation where the calibration basis is complete. |
| Hardware spending boundary | `scpn-bench` never submits IBM jobs; it analyses committed artefacts only. |
Quick reproduction recipes
Run these commands from the repository root after installing the package and its development dependencies.
Minimal methods-paper check:

    scpn-bench reproduce-methods

Minimal SCPN/FIM-paper check:

    scpn-bench fim-all

Full offline portfolio check:

    scpn-bench all

List selected harnesses without executing them, for example for the full portfolio:

    scpn-bench all --dry-run

Include heavier optional checks:

    scpn-bench reproduce-methods --include-scaling
    scpn-bench reproduce-methods --include-gpu
    scpn-bench fim-all --include-readout

The CLI reports whether regenerated artefacts differ from committed files. A non-zero diff status means the local run produced artefact drift that should be reviewed before any manuscript number is updated.
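A wrapper script could act on that status along the following lines. This is a minimal sketch that assumes the diff status is surfaced as the process exit code, which this page does not state explicitly; `run_check` is a hypothetical helper and not part of `scpn-bench`.

```python
import subprocess
import sys
from typing import List


def run_check(command: List[str]) -> bool:
    """Run a reproduction command and return True when it exits with status 0.

    Assumption: artefact drift is reported through a non-zero exit code.
    """
    completed = subprocess.run(command, check=False)
    if completed.returncode != 0:
        # Flag the run for review instead of accepting the new numbers.
        print(
            f"{' '.join(command)} exited with {completed.returncode}: "
            "review artefact drift before updating manuscript numbers",
            file=sys.stderr,
        )
    return completed.returncode == 0
```

Under that assumption, `run_check(["scpn-bench", "reproduce-methods"])` would return `False` whenever the run reported drift, and a manuscript build step could refuse to continue.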
Reproducibility commands
The `scpn-bench` entry point is the public one-command interface for local artefact regeneration.
Useful options:
| Option | Purpose |
|---|---|
| `--dry-run` | Print selected harnesses without executing them. |
| `--include-gpu` | Include optional GPU harnesses. |
| `--include-readout` | Include full-basis offline readout-matrix mitigation where the calibration basis is complete. |
| `--include-scaling` | Include n=4--12 ansatz-scaling and tensor-network diagnostics. |
| `--keep-going` | Continue after a failed harness and report all failures. |
| `--no-diff` | Skip the post-run committed-artefact diff summary. |
By default the CLI runs offline harnesses only. IBM preparation and submission
scripts are deliberately excluded from scpn-bench; IBM raw-count analyses are
included only where they consume already committed JSON data.
Current scientific dashboard
| Claim family | Dashboard value | Artefact source | Boundary |
|---|---|---|---|
| Rust/VQE methods | Tables are regenerated from committed JSON/CSV artefacts and scripts. | `data/rust_vqe_methods/` | Opportunistic timing data, not universal hardware constants. |
| Ansatz scaling | n=4--12 scaling rows, tensor-network truncation diagnostics, and committed VQE-reference comparison rows are generated by the optional scaling harness. | `ansatz_scaling_tn_summary_2026-05-05.json` | Dense exact references cover small n; sparse eigensolver references extend the current promoted diagnostics where feasible; missing larger-n VQE rows are marked skipped rather than extrapolated. |
| FIM exact Hamiltonian | Spectra, level spacing, entanglement, sector survival, and VQE scoring are generated offline. | `data/scpn_fim_hamiltonian/` | Exact small-system structure, not a hardware-protection claim. |
| FIM hardware repeated run | Repeated IBM follow-up falsifies the simple lambda=4 hardware-protection hypothesis. | `fim_ibm_repeated_followup_analysis_2026-05-05_ibm-run-cf4835290f607387.json` | Backend/circuit-family specific. |
| FIM full-basis readout mitigation | Full 16-state readout inversion preserves the negative FIM result; matrix condition number is 1.049. | `fim_readout_matrix_mitigation_summary_2026-05-05_ibm-run-cf4835290f607387.json` | Measurement-confusion mitigation only; no gate-error or Trotter correction. |
| Readout-mitigation eligibility | Promoted n<=8 raw-count datasets are marked as full-basis eligible, partial exact-state baseline only, or missing readout calibration. | `readout_mitigation_eligibility_2026-05-06.json` | Marker only; new calibration circuits still require separate QPU approval. |
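The readout-inversion row can be made concrete with a toy example. The sketch below inverts a 2-state confusion matrix in plain Python and reports its condition number. The matrix entries are invented for illustration, the 1-norm is an assumption (this page does not say which norm yields 1.049), and the committed analysis works on the full 16-state matrix rather than this 2-state toy.

```python
from typing import List

Matrix = List[List[float]]


def invert(matrix: Matrix) -> Matrix:
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("confusion matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def one_norm(matrix: Matrix) -> float:
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in matrix) for j in range(len(matrix)))


def condition_number(matrix: Matrix) -> float:
    """1-norm condition number; values near 1 mean inversion barely amplifies noise."""
    return one_norm(matrix) * one_norm(invert(matrix))


def mitigate(matrix: Matrix, measured: List[float]) -> List[float]:
    """Apply the inverse confusion matrix to measured probabilities."""
    inverse = invert(matrix)
    return [sum(a * p for a, p in zip(row, measured)) for row in inverse]


# Illustrative confusion matrix: entry [i][j] is P(measure i | prepared j).
confusion = [[0.98, 0.03],
             [0.02, 0.97]]
print(condition_number(confusion))
print(mitigate(confusion, [0.695, 0.305]))
```

Plain inversion can return slightly negative probabilities on noisy counts, and, as the boundary column notes, it corrects measurement confusion only.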
Current artefact groups
Current combined artefact hashes
| Artefact | SHA256 |
|---|---|
| `combined_methods_benchmark_summary_2026-05-05.json` | `593330a1dd19f495b899be1031ebe3dd4caa07171053aa376c2f761e557c1428` |
| `combined_methods_benchmark_summary_2026-05-05.csv` | `e69b94df590ff06708b3b21245864f74c3df630b514254526dc6c4af3fe24c2f` |
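A local copy of these artefacts can be checked against the published digests with a few lines of standard-library Python. This is a sketch: the `verify` helper is hypothetical, and the directory holding the combined summaries is an assumption that the caller must supply.

```python
import hashlib
from pathlib import Path
from typing import Dict

# Published digests copied from the table above.
EXPECTED_SHA256 = {
    "combined_methods_benchmark_summary_2026-05-05.json":
        "593330a1dd19f495b899be1031ebe3dd4caa07171053aa376c2f761e557c1428",
    "combined_methods_benchmark_summary_2026-05-05.csv":
        "e69b94df590ff06708b3b21245864f74c3df630b514254526dc6c4af3fe24c2f",
}


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory: Path, expected: Dict[str, str] = EXPECTED_SHA256) -> Dict[str, bool]:
    """Return, per artefact name, whether the file exists and matches its digest."""
    results = {}
    for name, digest in expected.items():
        path = directory / name
        results[name] = path.is_file() and sha256_of(path) == digest
    return results
```

Calling `verify(Path("path/to/artefacts"))` returns a name-to-boolean mapping; any `False` entry means the file is missing or differs from the committed snapshot.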
Individual harness commands
The one-command CLI is preferred for reproducibility checks. Individual harnesses remain useful when a single table needs to be regenerated during development:

    python scripts/benchmark_rust_core_methods.py
    python scripts/benchmark_ansatz_methods.py
    python scripts/benchmark_vqe_methods.py
    python scripts/benchmark_multilang_knm_methods.py
    python scripts/benchmark_gpu_methods.py
    python scripts/summarise_rust_vqe_method_artifacts.py
    python scripts/benchmark_ansatz_scaling_tn.py
    python scripts/analyse_fim_spectrum.py
    python scripts/analyse_fim_level_spacing.py
    python scripts/analyse_fim_entanglement.py
    python scripts/analyse_fim_sector_survival.py
    python scripts/benchmark_fim_vqe_ground_state.py
    python scripts/analyse_fim_ibm_pilot.py
    python scripts/analyse_fim_ibm_repeated_followup.py
    python scripts/analyse_fim_readout_matrix_mitigation.py
    python scripts/audit_readout_mitigation_eligibility.py

Remote or non-local machine artefacts should record the machine identity,
hardware context, command, timestamp, and checksum before being promoted into
data/rust_vqe_methods/.
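A provenance record covering those fields could be assembled as sketched below. The field names and the `provenance_record` helper are hypothetical; the repository may use a different schema.

```python
import hashlib
import platform
from datetime import datetime, timezone
from pathlib import Path


def provenance_record(artefact: Path, command: str, hardware_context: str) -> dict:
    """Build a provenance record for one artefact file.

    hardware_context is free text supplied by the operator, for example the
    CPU model or GPU type, because it cannot be detected reliably everywhere.
    """
    return {
        "artefact": artefact.name,
        "machine": platform.node(),          # machine identity (hostname)
        "platform": platform.platform(),     # OS and kernel description
        "hardware_context": hardware_context,
        "command": command,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "sha256": hashlib.sha256(artefact.read_bytes()).hexdigest(),
    }
```

The resulting dictionary can be written next to the artefact with `json.dump`, so the checksum and the command travel together with the data.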
Implemented CLI behaviour
- Regenerate local deterministic artefacts from committed scripts.
- Keep optional GPU harnesses behind `--include-gpu`.
- Rebuild combined JSON and CSV summaries where a summariser exists.
- Compare regenerated artefacts with committed files.
- Report changed artefacts explicitly instead of silently accepting drift.
- Avoid spending QPU time or submitting hardware jobs.
- Keep full-basis readout-matrix mitigation optional behind `--include-readout`, because it is only valid for hardware datasets with a complete calibration basis.
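The completeness condition in the last item can be checked mechanically. The sketch below assumes that a complete calibration basis means one calibration preparation per computational-basis bitstring, so 2^n labels for n qubits, and that labels are stored as strings such as `"0101"`. Both the label format and the helper names are assumptions.

```python
from itertools import product
from typing import Iterable, Set


def missing_basis_states(calibration_labels: Iterable[str], n_qubits: int) -> Set[str]:
    """Return the computational-basis bitstrings absent from the calibration set."""
    required = {"".join(bits) for bits in product("01", repeat=n_qubits)}
    return required - set(calibration_labels)


def is_full_basis(calibration_labels: Iterable[str], n_qubits: int) -> bool:
    """True when every one of the 2**n_qubits basis states has a calibration entry."""
    return not missing_basis_states(calibration_labels, n_qubits)


# A two-qubit dataset calibrated only on |00> and |11> is not eligible.
print(is_full_basis(["00", "11"], 2))
print(sorted(missing_basis_states(["00", "11"], 2)))
```

A dataset that fails this check would fall into one of the non-eligible categories rather than being inverted with an incomplete matrix.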
Machine provenance
Current promoted benchmark artefacts include:
- Local workstation CPU runs.
- ML350 CPU runs.
- Vertex `n1-standard-4` CPU runs.
- Vertex T4 GPU runs for batched dense expectation validation.
These timings are opportunistic and not isolated benchmark-lab measurements. They are useful for reproducibility and cross-machine sanity checks, but the papers should not interpret them as universal hardware performance constants.
Dashboard boundaries
- The dashboard is a reproducibility surface, not a performance leaderboard.
- Shared-machine CPU timings can vary with background load.
- GPU artefacts validate batched dense expectation workloads, not the Rust scalar coupling kernel.
- IBM raw-count analyses consume committed JSON data only.
- New IBM submissions require separate approval, live backend readiness checks, and QPU budget accounting.
Planned extensions
Ansatz scaling with tensor-network baselines
The n=4--12 ansatz-scaling study records circuit scaling, tensor-network
truncation diagnostics, and per-n comparisons against committed VQE aggregate
references where such VQE rows exist. The harness uses dense exact
diagonalisation up to the configured exact limit and sparse eigensolver ground
states above that limit where feasible. Rows beyond the configured sparse limit,
or larger-n VQE comparisons without committed optimisation artefacts, remain
skipped rather than extrapolated. Current outputs:
- `data/rust_vqe_methods/ansatz_scaling_tn_summary_*.json`
- `data/rust_vqe_methods/ansatz_scaling_summary_*.csv`
- `data/rust_vqe_methods/tn_truncation_summary_*.csv`
- `data/rust_vqe_methods/ansatz_tn_reference_comparison_summary_*.csv`
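The reference-selection rule described in this section can be summarised as a small decision function. The limits used below are placeholders chosen for illustration, not the harness's configured values, and `reference_method` is a hypothetical name.

```python
def reference_method(n: int, exact_limit: int, sparse_limit: int) -> str:
    """Pick the ground-state reference for system size n.

    Dense exact diagonalisation is used up to exact_limit, a sparse
    eigensolver up to sparse_limit, and larger sizes are skipped
    rather than extrapolated.
    """
    if n <= exact_limit:
        return "dense_exact"
    if n <= sparse_limit:
        return "sparse_eigensolver"
    return "skipped"


# Placeholder limits for illustration only.
for n in (4, 8, 10, 14):
    print(n, reference_method(n, exact_limit=8, sparse_limit=12))
```

The sketch ignores the "where feasible" caveat: a real run may still skip a size below the sparse limit if the eigensolver does not converge or memory runs out.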
Analog XY bridge
The analog bridge should start as an optional Pulser / Bloqade design spike for neutral-atom XY mappings. It should remain separate from the default digital Qiskit workflow until the mapping assumptions, dependencies, and reproducibility artefacts are documented.