NeuroBench Benchmarking¶
A NeuroBench-compatible framework for standardized spiking neural network (SNN) evaluation.
Metrics¶
sc_neurocore.benchmarks.metrics
¶
NeuroBench-compatible metrics: accuracy, compute complexity, spike counts.
Follows the NeuroBench algorithm track specification:

- Correctness metrics: accuracy, mAP, MSE (task-specific)
- Complexity metrics: synaptic operations, activation sparsity, total parameters, classification latency
Reference: NeuroBench (Nature Communications 2025)
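The two complexity metrics above can be sketched in plain NumPy. This is an illustration of the NeuroBench definitions, not the library's implementation: activation sparsity is the fraction of zero activations, and effective synaptic operations count one operation per spike per nonzero outgoing weight.

```python
import numpy as np

def activation_sparsity(spikes: np.ndarray) -> float:
    """Fraction of activation entries that are zero (higher = sparser)."""
    return 1.0 - np.count_nonzero(spikes) / spikes.size

def effective_synops(spikes: np.ndarray, weights: np.ndarray) -> int:
    """Effective synaptic operations: each spike triggers one operation
    per nonzero outgoing weight of the spiking neuron."""
    fanout = np.count_nonzero(weights, axis=1)       # nonzero fan-out per neuron
    spike_counts = np.count_nonzero(spikes, axis=0)  # spikes per neuron over time
    return int(spike_counts @ fanout)

spikes = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 0]])  # (T=4, N=3)
weights = np.array([[0.2, 0.0], [0.5, 0.1], [0.0, 0.3]])         # (N=3, M=2)
print(activation_sparsity(spikes))          # 0.75 -- 9 of 12 entries are zero
print(effective_synops(spikes, weights))    # 4 = 2*1 + 1*2 + 0*1
```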
BenchmarkResult
dataclass
¶
NeuroBench-compatible benchmark result.
Source code in src/sc_neurocore/benchmarks/metrics.py
to_neurobench_json()
¶
Export as NeuroBench-compatible JSON.
Source code in src/sc_neurocore/benchmarks/metrics.py
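The exact JSON schema emitted by `to_neurobench_json()` is defined in the library source. As a hedged illustration of the dataclass-to-JSON pattern it follows, here is a hypothetical stand-in whose field names are assumptions, not the library's actual schema:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical stand-in for BenchmarkResult; field names are assumptions.
@dataclass
class Result:
    task: str
    model: str
    accuracy: float
    synaptic_operations: int
    activation_sparsity: float

    def to_json(self) -> str:
        # Serialize all dataclass fields as one JSON object.
        return json.dumps(asdict(self), indent=2)

r = Result(task='mnist', model='sc_neurocore', accuracy=0.98,
           synaptic_operations=12345, activation_sparsity=0.9)
print(r.to_json())
```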
compute_metrics(predictions, targets, spike_counts=None, weights=None, timesteps=1, latency_ms=0.0, task='classification', model='sc_neurocore')
¶
Compute NeuroBench-compatible metrics from model outputs.
Parameters¶
predictions : ndarray
    Model predictions (class indices for classification).
targets : ndarray
    Ground truth labels.
spike_counts : ndarray, optional
    Per-sample total spike counts.
weights : list of ndarray, optional
    Weight matrices for parameter counting.
timesteps : int
    Number of simulation timesteps.
latency_ms : float
    Inference latency in milliseconds.
task : str
    Task name for the report.
model : str
    Model name for the report.
Returns¶
BenchmarkResult
Source code in src/sc_neurocore/benchmarks/metrics.py
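The quantities `compute_metrics` reports can be sketched directly from its documented inputs. This is an illustrative reconstruction in NumPy, not the library's implementation:

```python
import numpy as np

# Correctness: fraction of predictions matching ground truth.
predictions = np.array([0, 1, 1, 2])
targets     = np.array([0, 1, 2, 2])
accuracy = float(np.mean(predictions == targets))   # 0.75

# Complexity: total parameters from the supplied weight matrices.
weights = [np.ones((784, 128)), np.ones((128, 10))]
total_params = sum(int(w.size) for w in weights)    # 784*128 + 128*10 = 101632

# Spike activity: mean per-sample spike count.
spike_counts = np.array([120, 95, 110, 130])
mean_spikes = float(spike_counts.mean())            # 113.75
```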
Tasks¶
sc_neurocore.benchmarks.tasks
¶
Built-in benchmark task definitions aligned with NeuroBench.
Each task defines: dataset, input shape, number of classes/outputs, evaluation metric, and baseline performance.
TASKS
module-attribute
¶

```python
TASKS = {
    'keyword_spotting': BenchmarkTask(
        name='Keyword Spotting',
        description='12-class spoken keyword classification (Google Speech Commands v2)',
        input_shape=(16000,),
        n_classes=12,
        metric='accuracy',
        neurobench_id='keyword_spotting',
        dataset='speech_commands_v2',
        baseline_accuracy=0.92,
    ),
    'dvs_gesture': BenchmarkTask(
        name='DVS Gesture Recognition',
        description='11-class gesture classification from DVS128 event camera',
        input_shape=(128, 128),
        n_classes=11,
        metric='accuracy',
        neurobench_id='dvs_gesture',
        dataset='dvs_gesture',
        baseline_accuracy=0.95,
    ),
    'heartbeat_anomaly': BenchmarkTask(
        name='Heartbeat Anomaly Detection',
        description='Binary anomaly detection on MIT-BIH ECG dataset',
        input_shape=(187,),
        n_classes=2,
        metric='accuracy',
        neurobench_id='ecg_anomaly',
        dataset='mit_bih',
        baseline_accuracy=0.97,
    ),
    'mnist': BenchmarkTask(
        name='MNIST Classification',
        description='10-class handwritten digit classification',
        input_shape=(784,),
        n_classes=10,
        metric='accuracy',
        neurobench_id='mnist',
        dataset='mnist',
        baseline_accuracy=0.99,
    ),
    'shd': BenchmarkTask(
        name='Spiking Heidelberg Digits',
        description='20-class spoken digit classification (spiking audio)',
        input_shape=(700,),
        n_classes=20,
        metric='accuracy',
        neurobench_id='shd',
        dataset='shd',
        baseline_accuracy=0.85,
    ),
}
```
BenchmarkTask
dataclass
¶
Definition of a benchmark task.
Source code in src/sc_neurocore/benchmarks/tasks.py
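A typical use of a task definition is to validate a model against a task's shape and baseline. The sketch below uses a minimal stand-in dataclass mirroring the `BenchmarkTask` fields shown in the `TASKS` listing above; the real class lives in `sc_neurocore.benchmarks.tasks`:

```python
from dataclasses import dataclass

# Minimal stand-in mirroring the BenchmarkTask fields shown above.
@dataclass(frozen=True)
class BenchmarkTask:
    name: str
    description: str
    input_shape: tuple
    n_classes: int
    metric: str
    neurobench_id: str
    dataset: str
    baseline_accuracy: float

TASKS = {
    'mnist': BenchmarkTask(
        name='MNIST Classification',
        description='10-class handwritten digit classification',
        input_shape=(784,),
        n_classes=10,
        metric='accuracy',
        neurobench_id='mnist',
        dataset='mnist',
        baseline_accuracy=0.99,
    ),
}

task = TASKS['mnist']
# Check a hypothetical model result against the task's baseline.
measured_accuracy = 0.992
beats_baseline = measured_accuracy >= task.baseline_accuracy
print(task.n_classes, beats_baseline)   # 10 True
```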