Neural Transport Surrogate Training Recipe¶
This guide describes how to train or retrain the neural transport surrogate
used in scpn-control. The surrogate is a compact Multi-Layer Perceptron (MLP)
that replaces expensive gyrokinetic simulations (like TGLF or QuaLiKiz)
with millisecond-scale inference.
1. Prerequisites¶
- Python: 3.9 or newer.
- Framework: PyTorch (recommended) or a raw NumPy-based trainer.
- Data Science Stack:
numpy,pandas,scikit-learn. - Hardware: CPU is sufficient for this compact architecture (< 100k parameters).
2. Acquire Dataset Metadata¶
The default surrogate training lane targets QLKNN-class QuaLiKiz data.
Normalised public Zenodo file manifests are mirrored in
validation/reference_data/qlknn/ without vendoring raw third-party record
descriptions or multi-GB tensors:
zenodo_3497066: QLKNN10D training set, doi:10.5281/zenodo.3497066.zenodo_7418108: QuaLiKiz v2.6.2 JET linear-instability spectra, doi:10.5281/zenodo.7418108.zenodo_8017522: QLKNN11D training set, doi:10.5281/zenodo.8017522.
Each directory contains a Zenodo record.json and files_manifest.json.
Each directory contains a files_manifest.json with source DOI, record digest,
file sizes, Zenodo API download URLs, and MD5 checksums. Multi-GB NetCDF and
HDF5 tensors remain deferred and must be pulled onto an admitted storage or GPU
target before training or validation:
The acquisition manifests are not validation evidence by themselves. Quantitative
neural-transport claims still require strict
scpn-control.neural-transport-reference.v1 artifacts for the trained weights,
reference tensors, predictions, units, feature ordering, and error metrics.
- File Format: NetCDF or HDF5.
- Columns: QLKNN10D uses 10 input features and transport-flux targets.
- Units: Normalised gradients (R/L), temperatures (keV), and fluxes (gyro-Bohm units or m²/s), as declared by the processed reference artifacts.
3. Data Preparation¶
Feature Selection¶
The 10 input features required by the TransportInputs class are:
1. R/L_Ti: Ion temperature gradient.
2. R/L_Te: Electron temperature gradient.
3. R/L_ne: Electron density gradient.
4. q: Safety factor.
5. s_hat: Magnetic shear.
6. alpha_MHD: Pressure gradient parameter.
7. Te/Ti: Temperature ratio.
8. Z_eff: Effective charge.
9. nu_star: Collisionality.
10. beta_e: Electron beta.
Processing Pipeline¶
- Filtering: Remove non-physical samples (e.g., negative temperatures).
- Normalisation: Use
StandardScalerto reach zero mean and unit variance. Note: Store the mean and scale values; they must be provided toNeuralTransportModelfor inference. - Split: 80% Train, 10% Validation, 10% Test.
4. Architecture¶
We use a three-layer MLP: 10 → 128 → 64 → 3.
- Hidden Layers: 128 and 64 neurons with ReLU activation.
- Output Layer: 3 neurons (
chi_e,chi_i,D_e) with Softplus activation to ensure positive diffusivities. - Design Goal: Compactness. This architecture achieves < 25 µs inference time on a single CPU thread, enabling integration into 10kHz control loops.
5. Training Loop¶
Hyperparameters¶
- Batch Size: 512
- Optimizer: Adam
- Initial Learning Rate: 1e-3
- Scheduler:
ReduceLROnPlateau(factor 0.5, patience 10) - Early Stopping: Patience 20 on validation loss.
Code Snippet (PyTorch)¶
model = nn.Sequential(
nn.Linear(10, 128),
nn.ReLU(),
nn.Linear(128, 64),
nn.ReLU(),
nn.Linear(64, 3),
nn.Softplus()
)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)
6. Weight Export¶
The NeuralTransportModel expects weights in a specific .npz format.
After training, export the state dictionary using NumPy:
import numpy as np
# Extract weights from PyTorch model
w1 = model[0].weight.detach().numpy().T
b1 = model[0].bias.detach().numpy()
w2 = model[2].weight.detach().numpy().T
b2 = model[2].bias.detach().numpy()
w3 = model[4].weight.detach().numpy().T
b3 = model[4].bias.detach().numpy()
np.savez("neural_transport_custom.npz",
w1=w1, b1=b1, w2=w2, b2=b2, w3=w3, b3=b3,
input_mean=scaler.mean_,
input_scale=scaler.scale_)
7. Validation¶
After exporting, verify the new weights using the provided test suite:
- Update the weight path in your local config or environment variable.
- Run
pytest tests/test_neural_transport_core.pyto check for shape and consistency errors. - Run
pytest tests/test_neural_transport_physics.pyto ensure the new model respects fundamental plasma trends (monotonicity, thresholds).
8. Retraining from Custom Data¶
To adapt the surrogate for different regimes:
- ETG-dominant: Include higher resolution in \(R/L_{Te}\) and focus
training on electron flux columns.
- Stellarators: You may need to add additional geometry features
(e.g., helical ripple) and increase hidden layer width to 256.
- Data Sources: For custom tokamak configurations, use the
scpn_control.core.tglf_adapter (if available) to generate
local simulation batches.