
Neuromorphic Datasets

Module: sc_neurocore.datasets
Source: src/sc_neurocore/datasets/ — 3 files, 423 LOC
Status (v3.14.0): all 5 public symbols wired; 23 tests pass; pure NumPy I/O — no Rust path needed for the loaders, no synaptic kinetics.
The "Poisson" encoder is actually Bernoulli (§3.1, same wording issue as network/stimulus.PoissonInput).

This page covers the two encoders (poisson_encode, latency_encode) and the three event-camera / cochlear loaders (load_nmnist, load_shd, load_dvs_cifar10), each of which can fall back to synthetic data when the real archive is not on disk.


1. Public surface

sc_neurocore.datasets.__init__ re-exports 5 symbols:

Symbol Source file Role
poisson_encode encoding.py Per-neuron Bernoulli draw → spike train
latency_encode encoding.py Continuous value → first-spike-time
load_nmnist loaders.py N-MNIST (Orchard 2015), 34×34 DVS
load_shd loaders.py Spiking Heidelberg Digits (Cramer 2022), 700 channels
load_dvs_cifar10 loaders.py DVS-CIFAR10 (Li 2017), 128×128 DVS

Each loader accepts synthetic=True to bypass disk reads — useful for unit tests and for CI where the real archives are not stored.


2. Loaders

2.1 load_nmnist

Python
def load_nmnist(
    root: str | Path = "data/nmnist",
    train: bool = True,
    dt_ms: float = 1.0,
    T: int = 300,
    synthetic: bool = False,
    n_samples: int = 100,
    seed: int = 42,
) -> tuple[list[np.ndarray], np.ndarray]:

Loads the N-MNIST dataset:

Orchard G., Cohen G., Jayawant A., Thakor N. "Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades." Front Neurosci 9:437 (2015).

34×34 ATIS DVS recordings of MNIST digits moved across the sensor by saccadic eye movements. 10 classes.

Returns (samples, labels):

  • samples: list of (N_events, 4) float32 arrays with columns [x, y, polarity, timestamp_ms]
  • labels: int64 array of length len(samples)

The real-data path expects the directory layout root/{Train,Test}/<class_id>/<sample>.bin. Each .bin file is a sequence of 5-byte events: [addr_high, addr_low, ts2, ts1, ts0], parsed by the helper _parse_nmnist_bin (loaders.py:150):

Bits Meaning
0–4 x coordinate (5-bit, max 31)
5–9 y coordinate
10 polarity
16-bit ts timestamp in microseconds

dt_ms scales the parsed timestamps to milliseconds via ts_us * (dt_ms / 1000.0).

2.2 load_shd

Loads the Spiking Heidelberg Digits dataset:

Cramer B., Stradmann Y., Schemmel J., Zenke F. "The Heidelberg Spiking Data Sets for the Systematic Evaluation of Spiking Neural Networks." IEEE Transactions on Neural Networks and Learning Systems 33(7):2744-2757 (2022).

20-class English/German digit utterances (0–9 in two languages) spike-encoded through Lauscher's artificial cochlea model. 700 input channels.

Returns (samples, labels):

  • samples: list of (T_per_sample, 700) bool arrays (binned spike rasters). T_per_sample is min(ceil(times.max() / (dt_ms/1000)) + 1, T).
  • labels: int64 array

The real-data path requires h5py (declared in extras) and reads root/shd_{train,test}.h5. The H5 layout is the standard SHD release: /spikes/times[i], /spikes/units[i], /labels.
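The T_per_sample arithmetic above can be checked with a small example (pure NumPy; spike times in seconds, as the real-data path assumes):

```python
import numpy as np

dt_ms, T = 1.0, 1000
times = np.array([0.010, 0.250, 0.8004])  # spike times in seconds
# ceil(0.8004 / 0.001) + 1 = 802 bins; the T cap only bites for long samples
n_bins = min(int(np.ceil(times.max() / (dt_ms / 1000.0))) + 1, T)
print(n_bins)  # 802
```

The +1 keeps the latest spike inside the raster after integer binning; the min(…, T) cap truncates samples that outlast the requested window.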

2.3 load_dvs_cifar10

Loads the DVS-CIFAR10 dataset:

Li H., Liu H., Ji X., Li G., Shi L. "CIFAR10-DVS: An Event-Stream Dataset for Object Classification." Front Neurosci 11:309 (2017).

CIFAR-10 images displayed on a monitor and recorded by a 128×128 DVS camera. 10 classes.

Returns (samples, labels) in the same shape as load_nmnist. The real-data path expects .npy files (one per sample) under root/{train,test}/<class_id>/. Each .npy must be an array with columns [x, y, polarity, timestamp_ms]. Raw .aedat / .mat conversion is left to the caller.
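A pre-conversion step that produces conforming input might look like this (illustrative sketch; the directory and file names are hypothetical, only the layout and column order are the loader's contract):

```python
import tempfile
from pathlib import Path

import numpy as np

# One (N_events, 4) float32 array per sample, columns [x, y, polarity, timestamp_ms],
# saved under root/{train,test}/<class_id>/.
events = np.array([[12.0, 40.0, 1.0, 0.0],
                   [13.0, 41.0, 0.0, 1.5]], dtype=np.float32)

root = Path(tempfile.mkdtemp())
sample_dir = root / "train" / "3"            # class label 3
sample_dir.mkdir(parents=True)
np.save(sample_dir / "sample_0000.npy", events)

loaded = np.load(sample_dir / "sample_0000.npy")
print(loaded.shape, loaded.dtype)  # (2, 4) float32
```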

2.4 Common contracts

All three loaders:

  • Raise FileNotFoundError with the dataset's download URL embedded in the message when root does not exist (_check_root, loaders.py:73).
  • Raise FileNotFoundError when root exists but the train/test subdirectory is missing.
  • Accept synthetic=True to bypass disk reads entirely; the synthetic path uses _synthetic_event_dataset (event-based loaders) or _synthetic_shd (binned-raster loader). Both pin their RNG to seed for reproducibility.
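This contract supports a simple try-real, fall-back-to-synthetic pattern. The wrapper below is a hypothetical convenience (not part of the module), demonstrated with a stand-in loader that mimics the contract:

```python
from pathlib import Path


def load_or_synthesize(loader, root, **kwargs):
    """Try the on-disk dataset first; fall back to the synthetic path."""
    try:
        return loader(root=root, **kwargs)
    except FileNotFoundError:
        return loader(synthetic=True, **kwargs)


def fake_loader(root="missing", synthetic=False, n_samples=5):
    # Stand-in obeying the loader contract: synthetic bypasses disk,
    # a missing root raises FileNotFoundError.
    if synthetic:
        return [None] * n_samples, list(range(n_samples))
    if not Path(root).exists():
        raise FileNotFoundError(root)
    return [], []


samples, labels = load_or_synthesize(fake_loader, root="does/not/exist", n_samples=3)
print(labels)  # [0, 1, 2]
```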

The synthetic generators draw class-conditional rate templates from U(0, 0.3) (event loaders) or U(0, 0.1) (SHD), then expand them through poisson_encode to per-sample spike trains. Polarities for event loaders are randint(0, 2).
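The recipe above can be sketched as follows (illustrative only; _synthetic_event_dataset is the real helper and its internals may differ in detail):

```python
import numpy as np


def synthetic_events(n_samples, res, n_classes, T, dt_ms, seed):
    rng = np.random.default_rng(seed)
    # One per-pixel rate template per class, drawn from U(0, 0.3)
    templates = rng.uniform(0.0, 0.3, size=(n_classes, res * res))
    samples, labels = [], []
    for i in range(n_samples):
        label = i % n_classes
        p = np.clip(templates[label] * dt_ms, 0.0, 1.0)
        spikes = rng.random((T, res * res)) < p        # Bernoulli draw, as in poisson_encode
        t_idx, n_idx = np.nonzero(spikes)
        x, y = n_idx % res, n_idx // res
        pol = rng.integers(0, 2, size=t_idx.size)      # random polarity
        events = np.column_stack([x, y, pol, t_idx * dt_ms]).astype(np.float32)
        samples.append(events)
        labels.append(label)
    return samples, np.array(labels, dtype=np.int64)


samples, labels = synthetic_events(n_samples=4, res=8, n_classes=10, T=50, dt_ms=1.0, seed=0)
```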


3. Encoders

3.1 poisson_encode (actually Bernoulli)

Python
def poisson_encode(
    rates: npt.ArrayLike,
    T: int,
    dt_ms: float = 1.0,
    seed: int | None = None,
) -> np.ndarray:  # shape (T, N), bool

Returns (T, N) boolean spike train: each cell is rng.random() < min(rate * dt_ms, 1). The function name says "Poisson" but the per-step sample is Bernoulli, not a true Poisson draw. For low rate * dt_ms (< 0.1) the Bernoulli / Poisson distinction is < 5 % — the two distributions agree to first order. For high rate * dt_ms (> 0.5) Bernoulli under-counts because it cannot emit more than one spike per timestep; a true Poisson would.

Same wording issue as PoissonInput in network/stimulus.py. Either rename to bernoulli_encode or replace the < scaled line with rng.poisson(scaled, size) and accept fractional spike counts. Tracked as task #26.
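The low-rate agreement and high-rate divergence can be checked numerically (pure NumPy; this re-derives the documented `< min(rate * dt, 1)` rule rather than importing poisson_encode):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10_000
means = {}
for p in (0.05, 1.5):
    bern = float((rng.random(T) < min(p, 1.0)).mean())  # at most 1 spike per step
    pois = float(rng.poisson(p, size=T).mean())          # unbounded counts
    means[p] = (bern, pois)
# At p = 0.05 the two means agree closely; at p = 1.5 the Bernoulli
# draw saturates at exactly 1.0 spikes/step while Poisson averages ~1.5.
```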

3.2 latency_encode (first-spike-time, FIXED by task #27)

Python
def latency_encode(
    values: npt.ArrayLike,
    T: int,
    tau: float = 5.0,
    strict: bool = True,
) -> np.ndarray:  # shape (T, N), bool

Each value v ∈ [0, 1] produces exactly one spike at timestep int(tau * (1 - v)), clamped to [0, T-1]. Higher value → earlier spike.

Input range guard (strict=True default): the function now raises ValueError when any element of values is outside [0, 1]. The error message reports the offending min/max and suggests strict=False for the legacy silent-clip behaviour. This closes the contract gap that the original docstring claimed but did not enforce.

strict=False keeps the v3.14.0 behaviour: values=1.5 clips to spike-time 0, values=-0.5 clips toward T-1.

tau = 5.0 (default) means the latest possible spike (for v=0) is at timestep 5. For larger T, most timesteps are silent.

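A worked instance of the mapping (pure NumPy; this recomputes int(tau * (1 - v)) directly rather than calling latency_encode):

```python
import numpy as np

values = np.array([0.0, 0.5, 1.0])
tau, T = 5.0, 10
# spike_time = tau * (1 - v), clamped to [0, T-1], truncated to int
spike_times = np.clip(tau * (1.0 - values), 0, T - 1).astype(int)
print(spike_times)  # [5 2 0]: v=1.0 fires first, v=0.0 fires last at t=tau
```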


4. Performance — measured (this workstation)

Hardware: Intel i5-11600K, 32 GB DDR4, Python 3.12.3, NumPy 2.2.6.

4.1 Encoder throughput (mean of 20 calls)

Encoder N T Per-call wall Spike-cells/s
poisson_encode 100 300 0.37 ms 81.1 M
poisson_encode 1 000 300 3.33 ms 90.0 M
poisson_encode 10 000 300 37.38 ms 80.3 M
latency_encode 100 300 0.06 ms
latency_encode 1 000 300 0.05 ms
latency_encode 10 000 300 0.63 ms

poisson_encode is dominated by the rng.random((T, N)) call (uniform draw of T*N floats). Throughput is ~80 M spike-cells/s across all sizes, which matches NumPy's PRNG cost (~10 ns/element).

latency_encode is much faster because it draws no random numbers — just one fancy-indexed write per call. The (T=300, N=10000) call still runs in under 1 ms.
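A minimal timing harness along these lines (numbers vary by machine; `encode` below re-derives the documented hot path rather than importing poisson_encode):

```python
import time

import numpy as np


def encode(rates, T, dt_ms=1.0, seed=None):
    # Same hot path as poisson_encode: one uniform draw of T*N floats
    rng = np.random.default_rng(seed)
    scaled = np.clip(np.asarray(rates, dtype=np.float64) * dt_ms, 0.0, 1.0)
    return rng.random((T, scaled.shape[0])) < scaled


rates = np.full(1_000, 0.1)
t0 = time.perf_counter()
for _ in range(20):
    spikes = encode(rates, T=300)
per_call = (time.perf_counter() - t0) / 20
print(f"{per_call * 1e3:.2f} ms/call, {300 * 1_000 / per_call / 1e6:.1f} M cells/s")
```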

4.2 Synthetic loader cost

Loading N-MNIST in synthetic mode at T = 300, single-threaded:

n_samples Wall Total events generated
10 170.7 ms 515 780
100 1 739.8 ms 5 168 651
500 7 566.2 ms 25 894 833

Linear in n_samples: ~17 ms per sample, ~50 k events per sample. The cost is split between poisson_encode and the per-event np.column_stack + dtype cast inside _synthetic_event_dataset.

4.3 No Rust path

These are I/O loaders + per-call NumPy vectorised ops. The hot path (rng.random((T, N))) is already at NumPy/PCG64 speed (~80 M samples/s). A Rust port would gain little on the encoder side; the loader side is dominated by file I/O for real datasets and by NumPy allocation for synthetic data. No Rust path planned.


5. Pipeline wiring

Surface How it's wired Verifier
from sc_neurocore.datasets import load_nmnist, ... __init__.py:8-9 re-export tests/test_datasets.py
Synthetic fallback path each loader checks synthetic first TestSyntheticLoaders
Real-data path _check_root raises with download URL TestNMNISTRealLoader::test_load_nmnist_real_path, etc.
_synthetic_event_dataset calls poisson_encode loaders.py:56 covered transitively
H5 path imports h5py lazily inside load_shd body works without h5py if synthetic=True

No orphan helpers; _parse_nmnist_bin and _check_root are private but reachable from public loaders.


6. Audit (7-point checklist)

# Dimension Status Detail
1 Pipeline wiring ✅ PASS All 5 symbols wired; loaders → encoders → synthetic fallbacks
2 Multi-angle tests ✅ PASS 23 tests across 6 classes (TestCheckRoot, TestSyntheticLoaders, TestEncoding, TestNMNISTRealLoader, TestSHDRealLoader, TestDVSCIFAR10RealLoader); covers shape, reproducibility, file-not-found, real-data parse, encoder rate correlation
3 Rust path N/A I/O + NumPy-vectorised encoders; no compute kernel that would benefit
4 Benchmarks ✅ PASS §4.1 + §4.2 measured this session
5 Performance docs ✅ PASS §4
6 Documentation page ✅ PASS This page
7 Rules followed ⚠️ WARN SPDX header on every file ✅. poisson_encode is misnamed — it is Bernoulli, not Poisson (§3.1). latency_encode's [0, 1] input contract was unenforced in v3.14.0; strict=True now enforces it (§3.2, task #27). British English consistent.

Net: 1 WARN, 0 FAIL. Both WARN items are naming / contract issues, not behavioural bugs. Tasks #26 and #27 track them.


7. Known issues

7.1 poisson_encode is Bernoulli (task #26)

For low rates this is fine; for high rates it under-counts. Either rename to bernoulli_encode (preferred — the function does not implement what the name claims) or replace the body with an actual Poisson draw and accept fractional spike counts (wider behaviour change).

7.2 latency_encode silently clips out-of-range input (FIXED by task #27)

The function now raises ValueError by default when any value is outside [0, 1]. Pass strict=False to keep the legacy silent-clip behaviour. Regression tests: tests/test_datasets.py::TestLatencyEncodeStrict (5 cases — above-1 raises, negative raises, strict=False keeps clip, boundary values 0.0 / 1.0 accepted, interior values correctly ordered).

7.3 N-MNIST _NMNIST_RES constant is unused on the real path

loaders.py:24 declares _NMNIST_RES = 34 but the real-data parser (_parse_nmnist_bin) decodes coordinates straight from the 5-bit address fields without referring to the constant. The constant is used only on the synthetic path. Either delete it from the real path, or assert that decoded coordinates fall inside [0, _NMNIST_RES).

7.4 load_dvs_cifar10 real path requires .npy not raw

The docstring says "DVS-CIFAR10 event-camera dataset" but the loader expects pre-converted .npy files, not the raw .aedat/.mat released by Li et al. 2017. The error message at lines 329-332 makes this clear, but the docstring at lines 271-300 does not. Either add a one-line "Note: requires .npy-converted input" to the docstring, or ship a convert_dvs_cifar10_to_npy utility.

7.5 Synthetic SHD differs from real SHD distributionally

_synthetic_shd draws class templates from U(0, 0.1) independently per channel, then Poisson-encodes them. Real SHD has rich temporal structure (cochlear filter banks, formants). The synthetic data produces correct shapes and labels for unit testing but trains a classifier to chance if used for actual learning. Document this constraint in the loader docstring.


8. Tests

Bash
PYTHONPATH=src python3 -m pytest tests/test_datasets.py -q
# 23 passed in 10.04s (verified 2026-04-17)

Coverage breakdown:

  • TestCheckRoot (2): _check_root returns Path on existing dir, raises FileNotFoundError with URL on missing dir.
  • TestSyntheticLoaders (7): synthetic-shape correctness for all 3 loaders, missing-root paths raise even with bad inputs, reproducibility across two same-seed calls.
  • TestEncoding (6): poisson_encode shape + rate correlation + zero/ones edge cases; latency_encode shape + monotonic earlier-fire-for-higher-value.
  • TestNMNISTRealLoader (3): _parse_nmnist_bin decodes a hand-crafted 5-byte event correctly; load_nmnist real path with a synthesised directory tree; missing-split raises.
  • TestSHDRealLoader (2): real-path with synthesised H5 file; missing-h5 raises.
  • TestDVSCIFAR10RealLoader (3): real-path with synthesised npy tree; missing-split raises; empty-dir raises.

Not covered:

  • High-rate Poisson distinction — no test asserts that poisson_encode(rates=1.5, T=10) saturates at 1 spike/step (the Bernoulli ceiling). A test would document the §3.1 issue.
  • Real SHD H5 format — TestSHDRealLoader::test_load_shd_real_path uses a synthesised H5 file; the actual SHD release format has not been smoke-tested in CI.
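The missing high-rate assertion could be written as follows (hypothetical test; it re-derives poisson_encode's documented `< min(rate * dt, 1)` body instead of importing it):

```python
import numpy as np


def test_bernoulli_ceiling():
    # rate * dt_ms = 1.5 clips to p = 1.0: every step fires, never twice
    rng = np.random.default_rng(0)
    scaled = np.clip(np.array([1.5]) * 1.0, 0.0, 1.0)
    spikes = rng.random((10, 1)) < scaled
    assert spikes.sum() == 10  # saturated: exactly one spike per timestep


test_bernoulli_ceiling()
```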

9. References

Datasets (cited by source):

  • Orchard G. et al. "Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades." Front Neurosci 9:437 (2015). N-MNIST.
  • Cramer B., Stradmann Y., Schemmel J., Zenke F. "The Heidelberg Spiking Data Sets for the Systematic Evaluation of Spiking Neural Networks." IEEE TNNLS 33(7):2744-2757 (2022). SHD.
  • Li H. et al. "CIFAR10-DVS: An Event-Stream Dataset for Object Classification." Front Neurosci 11:309 (2017). DVS-CIFAR10.

Encoders (background):

  • Gerstner W., Kistler W. M. Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge UP (2002). Chapters on rate vs latency coding.
  • Thorpe S., Fize D., Marlot C. "Speed of processing in the human visual system." Nature 381:520-522 (1996). The original motivation for first-spike-time / latency coding.



10. Auto-rendered API

sc_neurocore.datasets

load_nmnist(root='data/nmnist', train=True, dt_ms=1.0, T=300, synthetic=False, n_samples=100, seed=42)

Load N-MNIST spiking vision dataset.

Neuromorphic-MNIST: 34x34 DVS recordings of MNIST digits moved on an ATIS sensor via saccadic eye movements. 10 classes.

Orchard et al., "Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades", Front. Neurosci. 2015.

Parameters

root : path
    Directory containing the extracted dataset.
train : bool
    Load training split if True, test split otherwise.
dt_ms : float
    Temporal resolution for synthetic fallback.
T : int
    Number of timesteps for synthetic fallback.
synthetic : bool
    Force synthetic data generation.
n_samples : int
    Number of synthetic samples to generate.
seed : int
    RNG seed for reproducible synthetic data.

Returns

samples : list of ndarray, each shape (N_events, 4)
    Columns: [x, y, polarity, timestamp_ms].
labels : ndarray of int

Source code in src/sc_neurocore/datasets/loaders.py
Python
def load_nmnist(
    root: str | Path = "data/nmnist",
    train: bool = True,
    dt_ms: float = 1.0,
    T: int = 300,
    synthetic: bool = False,
    n_samples: int = 100,
    seed: int = 42,
) -> tuple[list[np.ndarray], np.ndarray]:
    """Load N-MNIST spiking vision dataset.

    Neuromorphic-MNIST: 34x34 DVS recordings of MNIST digits moved on
    an ATIS sensor via saccadic eye movements. 10 classes.

    Orchard et al., "Converting Static Image Datasets to Spiking
    Neuromorphic Datasets Using Saccades", Front. Neurosci. 2015.

    Parameters
    ----------
    root : path
        Directory containing the extracted dataset.
    train : bool
        Load training split if True, test split otherwise.
    dt_ms : float
        Temporal resolution for synthetic fallback.
    T : int
        Number of timesteps for synthetic fallback.
    synthetic : bool
        Force synthetic data generation.
    n_samples : int
        Number of synthetic samples to generate.
    seed : int
        RNG seed for reproducible synthetic data.

    Returns
    -------
    samples : list of ndarray, each shape (N_events, 4)
        Columns: [x, y, polarity, timestamp_ms].
    labels : ndarray of int
    """
    if synthetic:
        return _synthetic_event_dataset(
            n_samples,
            _NMNIST_RES,
            10,
            T,
            dt_ms,
            seed,
        )
    _check_root(root, "N-MNIST", _NMNIST_URL)
    split_dir = Path(root) / ("Train" if train else "Test")
    if not split_dir.exists():
        raise FileNotFoundError(
            f"Expected split directory {split_dir.resolve()}. Download from {_NMNIST_URL}"
        )
    # Real loader: N-MNIST uses .bin files, one per sample, grouped by class
    samples: list[np.ndarray] = []
    label_list: list[int] = []
    for class_dir in sorted(split_dir.iterdir()):
        if not class_dir.is_dir():
            continue
        class_label = int(class_dir.name)
        for bin_file in sorted(class_dir.glob("*.bin")):
            events = _parse_nmnist_bin(bin_file, dt_ms)
            samples.append(events)
            label_list.append(class_label)
    return samples, np.array(label_list, dtype=np.int64)

load_shd(root='data/shd', train=True, dt_ms=1.0, T=1000, synthetic=False, n_samples=100, seed=42)

Load Spiking Heidelberg Digits (SHD) dataset.

Audio digits 0-9 in English and German, spike-encoded through an artificial cochlea model. 700 input channels, 20 classes.

Cramer et al., "The Heidelberg Spiking Data Sets for the Systematic Evaluation of Spiking Neural Networks", IEEE TNNLS 2022.

Parameters

root : path
    Directory containing shd_train.h5 / shd_test.h5.
train : bool
    Load training split if True, test split otherwise.
dt_ms : float
    Temporal resolution for binning spikes.
T : int
    Number of timesteps for synthetic fallback.
synthetic : bool
    Force synthetic data generation.
n_samples : int
    Number of synthetic samples to generate.
seed : int
    RNG seed for reproducible synthetic data.

Returns

samples : list of ndarray, each shape (T, 700) dtype bool
    Binned spike trains.
labels : ndarray of int

Source code in src/sc_neurocore/datasets/loaders.py
Python
def load_shd(
    root: str | Path = "data/shd",
    train: bool = True,
    dt_ms: float = 1.0,
    T: int = 1000,
    synthetic: bool = False,
    n_samples: int = 100,
    seed: int = 42,
) -> tuple[list[np.ndarray], np.ndarray]:
    """Load Spiking Heidelberg Digits (SHD) dataset.

    Audio digits 0-9 in English and German, spike-encoded through an
    artificial cochlea model. 700 input channels, 20 classes.

    Cramer et al., "The Heidelberg Spiking Data Sets for the Systematic
    Evaluation of Spiking Neural Networks", IEEE TNNLS 2022.

    Parameters
    ----------
    root : path
        Directory containing shd_train.h5 / shd_test.h5.
    train : bool
        Load training split if True, test split otherwise.
    dt_ms : float
        Temporal resolution for binning spikes.
    T : int
        Number of timesteps for synthetic fallback.
    synthetic : bool
        Force synthetic data generation.
    n_samples : int
        Number of synthetic samples to generate.
    seed : int
        RNG seed for reproducible synthetic data.

    Returns
    -------
    samples : list of ndarray, each shape (T, 700) dtype bool
        Binned spike trains.
    labels : ndarray of int
    """
    if synthetic:
        return _synthetic_shd(n_samples, T, dt_ms, seed)

    _check_root(root, "SHD", _SHD_URL)
    fname = "shd_train.h5" if train else "shd_test.h5"
    h5_path = Path(root) / fname
    if not h5_path.exists():
        raise FileNotFoundError(f"{h5_path.resolve()} not found. Download from {_SHD_URL}")
    import h5py

    samples: list[np.ndarray] = []
    with h5py.File(h5_path, "r") as f:
        spike_times = f["spikes"]["times"]
        spike_units = f["spikes"]["units"]
        raw_labels = f["labels"][:]
        for i in range(len(raw_labels)):
            times = np.asarray(spike_times[i])
            units = np.asarray(spike_units[i])
            if len(times) > 0:
                n_bins = min(int(np.ceil(times.max() / (dt_ms / 1000.0))) + 1, T)
            else:
                n_bins = T
            train_arr = np.zeros((n_bins, _SHD_CHANNELS), dtype=bool)
            if len(times) > 0:
                bin_idx = np.clip((times / (dt_ms / 1000.0)).astype(int), 0, n_bins - 1)
                unit_idx = np.clip(units.astype(int), 0, _SHD_CHANNELS - 1)
                train_arr[bin_idx, unit_idx] = True
            samples.append(train_arr)

    return samples, raw_labels.astype(np.int64)

load_dvs_cifar10(root='data/dvs_cifar10', train=True, dt_ms=1.0, T=300, synthetic=False, n_samples=100, seed=42)

Load DVS-CIFAR10 event-camera dataset.

CIFAR-10 images displayed on a monitor and recorded by a DVS camera at 128x128 resolution. 10 classes.

Li et al., "CIFAR10-DVS: An Event-Stream Dataset for Object Classification", Front. Neurosci. 2017.

Parameters

root : path
    Directory containing the extracted dataset.
train : bool
    Load training split if True, test split otherwise.
dt_ms : float
    Temporal resolution for synthetic fallback.
T : int
    Number of timesteps for synthetic fallback.
synthetic : bool
    Force synthetic data generation.
n_samples : int
    Number of synthetic samples to generate.
seed : int
    RNG seed for reproducible synthetic data.

Returns

samples : list of ndarray, each shape (N_events, 4)
    Columns: [x, y, polarity, timestamp_ms].
labels : ndarray of int

Source code in src/sc_neurocore/datasets/loaders.py
Python
def load_dvs_cifar10(
    root: str | Path = "data/dvs_cifar10",
    train: bool = True,
    dt_ms: float = 1.0,
    T: int = 300,
    synthetic: bool = False,
    n_samples: int = 100,
    seed: int = 42,
) -> tuple[list[np.ndarray], np.ndarray]:
    """Load DVS-CIFAR10 event-camera dataset.

    CIFAR-10 images displayed on a monitor and recorded by a DVS camera
    at 128x128 resolution. 10 classes.

    Li et al., "CIFAR10-DVS: An Event-Stream Dataset for Object
    Classification", Front. Neurosci. 2017.

    Parameters
    ----------
    root : path
        Directory containing the extracted dataset.
    train : bool
        Load training split if True, test split otherwise.
    dt_ms : float
        Temporal resolution for synthetic fallback.
    T : int
        Number of timesteps for synthetic fallback.
    synthetic : bool
        Force synthetic data generation.
    n_samples : int
        Number of synthetic samples to generate.
    seed : int
        RNG seed for reproducible synthetic data.

    Returns
    -------
    samples : list of ndarray, each shape (N_events, 4)
        Columns: [x, y, polarity, timestamp_ms].
    labels : ndarray of int
    """
    if synthetic:
        return _synthetic_event_dataset(
            n_samples,
            _DVS_CIFAR10_RES,
            10,
            T,
            dt_ms,
            seed,
        )
    _check_root(root, "DVS-CIFAR10", _DVS_CIFAR10_URL)
    split_dir = Path(root) / ("train" if train else "test")
    if not split_dir.exists():
        raise FileNotFoundError(
            f"Expected split directory {split_dir.resolve()}. Download from {_DVS_CIFAR10_URL}"
        )
    # Real loader: pre-converted .npy files, one per sample, grouped by class
    samples: list[np.ndarray] = []
    label_list: list[int] = []
    for class_dir in sorted(split_dir.iterdir()):
        if not class_dir.is_dir():
            continue
        class_label = int(class_dir.name)
        for event_file in sorted(class_dir.glob("*.npy")):
            events = np.load(event_file).astype(np.float32)
            samples.append(events)
            label_list.append(class_label)
    if not samples:
        raise FileNotFoundError(
            f"No .npy event files found in {split_dir.resolve()}. "
            f"Convert raw data to .npy arrays with columns [x, y, pol, ts_ms]."
        )
    return samples, np.array(label_list, dtype=np.int64)

poisson_encode(rates, T, dt_ms=1.0, seed=None)

Convert firing-rate array to Poisson spike trains.

Parameters

rates : array_like, shape (N,)
    Firing probabilities per timestep, clipped to [0, 1].
T : int
    Number of timesteps.
dt_ms : float
    Timestep duration in ms (scales rates linearly).
seed : int or None
    RNG seed for reproducibility.

Returns

spikes : ndarray, shape (T, N), dtype bool

Source code in src/sc_neurocore/datasets/encoding.py
Python
def poisson_encode(
    rates: npt.ArrayLike,
    T: int,
    dt_ms: float = 1.0,
    seed: int | None = None,
) -> np.ndarray:
    """Convert firing-rate array to Poisson spike trains.

    Parameters
    ----------
    rates : array_like, shape (N,)
        Firing probabilities per timestep, clipped to [0, 1].
    T : int
        Number of timesteps.
    dt_ms : float
        Timestep duration in ms (scales rates linearly).
    seed : int or None
        RNG seed for reproducibility.

    Returns
    -------
    spikes : ndarray, shape (T, N), dtype bool
    """
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, dtype=np.float64)
    scaled = np.clip(rates * (dt_ms / 1.0), 0.0, 1.0)
    return rng.random((T, rates.shape[0])) < scaled

latency_encode(values, T, tau=5.0, strict=True)

Convert normalised values in [0, 1] to first-spike-time trains.

Higher values spike earlier. Each neuron fires exactly once.

Parameters

values : array_like, shape (N,)
    Input values, expected in [0, 1].
T : int
    Number of timesteps.
tau : float
    Time constant controlling the spike-time spread.
strict : bool
    If True (default), raise ValueError when any value lies outside [0, 1]. If False, silently clip the resulting spike times to [0, T-1] (the legacy behaviour). The clip happens regardless of strict; this flag controls only whether the function raises before clipping.

Returns

spikes : ndarray, shape (T, N), dtype bool

Raises

ValueError
    If strict=True (default) and any element of values is outside [0, 1].

Source code in src/sc_neurocore/datasets/encoding.py
Python
def latency_encode(
    values: npt.ArrayLike,
    T: int,
    tau: float = 5.0,
    strict: bool = True,
) -> np.ndarray:
    """Convert normalised values in [0, 1] to first-spike-time trains.

    Higher values spike earlier. Each neuron fires exactly once.

    Parameters
    ----------
    values : array_like, shape (N,)
        Input values, expected in ``[0, 1]``.
    T : int
        Number of timesteps.
    tau : float
        Time constant controlling the spike-time spread.
    strict : bool
        If True (default), raise ``ValueError`` when any value lies
        outside ``[0, 1]``. If False, silently clip the resulting
        spike times to ``[0, T-1]`` (the legacy behaviour). The
        clip happens regardless of ``strict``; this flag controls
        only whether the function raises before clipping.

    Returns
    -------
    spikes : ndarray, shape (T, N), dtype bool

    Raises
    ------
    ValueError
        If ``strict=True`` (default) and any element of ``values``
        is outside ``[0, 1]``.
    """
    values = np.asarray(values, dtype=np.float64)
    if strict and (values.min() < 0.0 or values.max() > 1.0):
        bad_min = float(values.min())
        bad_max = float(values.max())
        raise ValueError(
            f"latency_encode: values must be in [0, 1] when strict=True; "
            f"got min={bad_min}, max={bad_max}. Pass strict=False to "
            f"accept the legacy silent-clip behaviour."
        )
    # spike_time = tau * (1 - value); higher value => earlier spike
    spike_times = np.clip(tau * (1.0 - values), 0, T - 1).astype(int)
    spikes = np.zeros((T, values.shape[0]), dtype=bool)
    neuron_idx = np.arange(values.shape[0])
    spikes[spike_times, neuron_idx] = True
    return spikes