Module cortical_inject


Per-row-parallel CSR sparse matrix-vector add for the Potjans CorticalColumn block-CSR injection path.

parallel_csr_spmv_add(indptr, indices, data, x, y) computes y += W @ x where W is a CSR matrix described by (indptr, indices, data). Rows are processed in parallel via rayon.
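The update above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name `csr_spmv_add_sketch`, the `usize`/`f64` element types, and the `chunk_rows` parameter are assumptions, and `std::thread::scope` stands in for rayon so that the sketch has no external dependencies.

```rust
use std::thread;

/// Sketch of y += W @ x for a CSR matrix (indptr, indices, data).
/// Hypothetical signature: the real function's types are not shown on this page.
pub fn csr_spmv_add_sketch(
    indptr: &[usize],
    indices: &[usize],
    data: &[f64],
    x: &[f64],
    y: &mut [f64],
    chunk_rows: usize,
) {
    assert_eq!(indptr.len(), y.len() + 1, "indptr needs n_rows + 1 entries");
    assert_eq!(indices.len(), data.len());
    assert!(chunk_rows > 0);

    thread::scope(|s| {
        // chunks_mut hands each thread a disjoint slice of y, so no locking is needed.
        for (c, y_chunk) in y.chunks_mut(chunk_rows).enumerate() {
            let first_row = c * chunk_rows;
            s.spawn(move || {
                for (i, y_r) in y_chunk.iter_mut().enumerate() {
                    let r = first_row + i;
                    // Row-local reduction over the stored entries of row r.
                    let mut acc = 0.0;
                    for k in indptr[r]..indptr[r + 1] {
                        acc += data[k] * x[indices[k]];
                    }
                    *y_r += acc;
                }
            });
        }
    });
}
```

Spawning one thread per chunk keeps the sketch short; a work-stealing pool such as rayon's would typically schedule many small chunks over a fixed number of worker threads instead.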

This is the kernel that lets CorticalColumn use the per-(source-type, global-bin) block matrices at scale ≥ 0.5. A single block mat-vec at scale=0.1 takes ≈ 18 ms in single-threaded scipy; with rayon over 8 cores it takes ≈ 2-3 ms. At scale=0.5 the savings extrapolate linearly with nnz, bringing the wall-time for 600 ms of simulated time from ~50 minutes (single-threaded scipy block) into the ~10-minute range and unlocking the full-scale (~77 000-cell) convergence regime documented by van Albada et al. 2015 Fig 5.

Determinism: each row's reduction is local to that row, so the order in which rows are scheduled across threads does not affect the result. The output is bit-identical to the single-threaded scipy reference for matching inputs.
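The determinism argument can be illustrated with a small sequential sketch. Floating-point addition is not associative, so a reduction whose summation order depends on scheduling could change the low bits of the result. Here each row is always summed in ascending `k` order, and only the order in which row chunks are visited changes. The helper names `row_sum` and `spmv_add_reversed_chunks` are hypothetical, and reversing the chunk order is just a stand-in for an arbitrary parallel schedule.

```rust
/// Sum one CSR row sequentially, always in ascending `k` order.
fn row_sum(indptr: &[usize], indices: &[usize], data: &[f64], x: &[f64], r: usize) -> f64 {
    let mut acc = 0.0;
    for k in indptr[r]..indptr[r + 1] {
        acc += data[k] * x[indices[k]];
    }
    acc
}

/// y += W @ x, visiting row chunks in reverse order to mimic an arbitrary schedule.
fn spmv_add_reversed_chunks(
    indptr: &[usize],
    indices: &[usize],
    data: &[f64],
    x: &[f64],
    y: &mut [f64],
    chunk_rows: usize,
) {
    assert!(chunk_rows > 0);
    let n_rows = y.len();
    let starts: Vec<usize> = (0..n_rows).step_by(chunk_rows).collect();
    for &start in starts.iter().rev() {
        let end = (start + chunk_rows).min(n_rows);
        for r in start..end {
            // Each y[r] is written exactly once, from a sum with a fixed internal order.
            y[r] += row_sum(indptr, indices, data, x, r);
        }
    }
}
```

Because every `y[r]` receives a single contribution computed in a fixed order, any chunk size and any visiting order give the same bits in this sketch. Comparing outputs with `f64::to_bits` rather than `==` is one way to check that.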

Constants§

CHUNK_SIZE 🔒: y[r] += sum_k data[k] * x[indices[k]] for k in indptr[r]..indptr[r+1], processing rows in chunks in parallel via rayon.

Functions§

parallel_csr_multi_spmv_add: Batched per-row-parallel CSR spmv add: y += sum_b W_b @ x_b across n_blocks (matrix, vector) pairs, all sharing the same row dimension. Used by CorticalColumn._inject_block(dt) to do 2 × n_delay_bins (= 10) spmv calls in one FFI call instead of 10 separate FFI calls per step. The per-row reduction is local so chunking still parallelises cleanly.
parallel_csr_spmv_add
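
The batched form described for parallel_csr_multi_spmv_add can be sketched along the same lines. This is an illustration under stated assumptions, not the module's code: the `CsrBlock` struct, the function name `csr_multi_spmv_add_sketch`, the element types, and the `chunk_rows` parameter are all hypothetical, and `std::thread::scope` again stands in for rayon.

```rust
use std::thread;

/// One (matrix, vector) pair. A hypothetical container used only in this sketch.
pub struct CsrBlock<'a> {
    pub indptr: &'a [usize],
    pub indices: &'a [usize],
    pub data: &'a [f64],
    pub x: &'a [f64],
}

/// Sketch of y += sum_b W_b @ x_b, with all blocks sharing the row dimension of y.
pub fn csr_multi_spmv_add_sketch(blocks: &[CsrBlock<'_>], y: &mut [f64], chunk_rows: usize) {
    assert!(chunk_rows > 0);
    for b in blocks {
        assert_eq!(b.indptr.len(), y.len() + 1, "all blocks must share the row dimension");
    }

    thread::scope(|s| {
        // Each thread owns a disjoint chunk of y and walks every block for its rows.
        for (c, y_chunk) in y.chunks_mut(chunk_rows).enumerate() {
            let first_row = c * chunk_rows;
            s.spawn(move || {
                for (i, y_r) in y_chunk.iter_mut().enumerate() {
                    let r = first_row + i;
                    // Blocks are visited in the same fixed order for every row.
                    for b in blocks {
                        let mut acc = 0.0;
                        for k in b.indptr[r]..b.indptr[r + 1] {
                            acc += b.data[k] * b.x[b.indices[k]];
                        }
                        *y_r += acc;
                    }
                }
            });
        }
    });
}
```

Looping over blocks inside the per-row loop means a single parallel pass covers all blocks, which mirrors the idea of replacing several separate calls with one.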