Tutorial 55: O(1) Memory Online Learning

Train SNNs on arbitrarily long sequences without running out of memory. Standard BPTT unrolls all timesteps, storing every intermediate activation. E-prop and online learning maintain only local eligibility traces — constant memory regardless of sequence length.

The Problem

BPTT stores activations for all T timesteps before computing gradients:

  • T=25 (typical SNN training): manageable, ~100 MB for a small network
  • T=1,000 (1 second at 1 ms): starts filling GPU memory
  • T=100,000 (real-time BCI, 100 seconds): impossible with BPTT

Real-world spiking applications (BCI, robotics, always-on sensors) require continuous learning on potentially infinite streams. BPTT cannot do this. Online learning can.
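The scaling above can be made concrete with a back-of-envelope calculation. The network size and float width here are illustrative assumptions, not measurements of any real model:

```python
# Back-of-envelope activation memory: BPTT vs online learning.
# Assumed toy network: 128 neurons, float32 activations (4 bytes each).
N_NEURONS = 128
BYTES_PER_ACT = 4

def bptt_activation_bytes(T, n=N_NEURONS):
    """BPTT stores one activation vector per timestep: O(T * N)."""
    return T * n * BYTES_PER_ACT

def online_trace_bytes(n=N_NEURONS):
    """Online learning keeps only the current traces: O(N), independent of T."""
    return n * BYTES_PER_ACT

for T in (25, 1_000, 100_000):
    print(f"T={T:>7}: BPTT {bptt_activation_bytes(T):>12,} B, "
          f"online {online_trace_bytes():,} B")
```

The online figure never changes as T grows, which is the whole point of the O(1)-memory claim.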

E-prop: Eligibility Propagation

E-prop (Bellec et al. 2020) maintains a per-synapse eligibility trace that summarises how past presynaptic activity contributed to the neuron's current state, which is all the history a weight update needs. The trace decays exponentially, so only recent history matters — O(1) memory.

Python
from sc_neurocore.online_learning import EpropTrainer

trainer = EpropTrainer(
    n_inputs=64,
    n_neurons=128,
    n_outputs=10,
    tau_mem=20.0,     # membrane time constant (ms)
    tau_trace=20.0,   # eligibility trace decay (ms)
    lr=0.001,
)

# Train on a 1000-step sequence — O(1) memory per step
import numpy as np
inputs = np.random.rand(1000, 64).astype(np.float32)
targets = np.zeros((1000, 10), dtype=np.float32)
targets[:, 3] = 1.0  # target class 3

loss = trainer.train_sequence(inputs, targets)
print(f"Sequence loss: {loss:.4f}")

# Memory usage is constant regardless of sequence length
print(f"Memory per step: {trainer.memory_per_step} parameters")
# This doesn't change if you train on 100 steps or 100,000 steps

How E-prop Works

At each timestep t, three quantities are updated locally:

  1. Eligibility trace e(t): tracks how each synapse contributed to the neuron's membrane potential. Decays with time constant tau_trace.
Text Only
e(t) = alpha * e(t-1) + spike_pre(t) * pseudo_derivative(t)
  2. Learning signal L(t): error signal broadcast from the output layer (analogous to backpropagated error, but local in time).

  3. Weight update: dW = lr * L(t) * e(t) — applied immediately, no accumulation across time.

The key insight: the eligibility trace summarises the credit assignment problem locally. No need to store the full unrolled computation graph.
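The three updates above can be sketched in a few lines of NumPy. This is an illustrative single-layer fragment, not the internals of EpropTrainer; the decay factor, the constant stand-in for the pseudo-derivative, and the linear readout are all simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 64, 10
dt, tau_trace = 1.0, 20.0
alpha = np.exp(-dt / tau_trace)   # per-step trace decay
lr = 1e-3

W = rng.normal(0, 0.1, (n_in, n_out)).astype(np.float32)
e = np.zeros((n_in, n_out), dtype=np.float32)  # per-synapse eligibility traces

def eprop_step(spikes_pre, pseudo_deriv, target, W, e):
    """One online update: decayed trace accumulation, then a 3-factor dW."""
    # 1. eligibility trace: e(t) = alpha * e(t-1) + spike_pre(t) * h'(t)
    e = alpha * e + np.outer(spikes_pre, pseudo_deriv)
    # 2. learning signal: output error, broadcast back but local in time
    y = spikes_pre @ W               # simplified linear readout
    L = target - y                   # per-output error signal
    # 3. weight update: dW = lr * L(t) * e(t), applied immediately
    W = W + lr * L[np.newaxis, :] * e
    return W, e

for t in range(100):
    pre = (rng.random(n_in) < 0.1).astype(np.float32)  # random input spikes
    h = np.full(n_out, 0.3, dtype=np.float32)          # stand-in pseudo-derivative
    tgt = np.eye(n_out, dtype=np.float32)[3]           # target class 3
    W, e = eprop_step(pre, h, tgt, W, e)
```

Note that the loop carries only W and e between timesteps — nothing grows with sequence length.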

OnlineTrainer: Feedforward Stacks

For feedforward architectures without recurrence:

Python
from sc_neurocore.online_learning import OnlineTrainer

trainer = OnlineTrainer(
    layer_sizes=[784, 256, 128, 10],  # MNIST architecture
    tau_mem=20.0,
    threshold=1.0,
    lr=0.001,
)

# Train online, one timestep at a time
trainer.reset()
total_loss = 0.0
for t in range(1000):
    x_t = np.random.rand(784).astype(np.float32)
    target_t = np.zeros(10, dtype=np.float32)
    target_t[t % 10] = 1.0

    result = trainer.step(x_t, target=target_t)
    total_loss += result["loss"]

    if (t + 1) % 100 == 0:
        print(f"t={t+1}, avg loss={total_loss / (t+1):.4f}")

Comparison: BPTT vs E-prop vs Online

Property                  BPTT                 E-prop                    OnlineTrainer
Memory                    O(T × N)             O(N) per step             O(N) per step
Gradient quality          Exact                Approximate (3-factor)    Approximate (local)
Max sequence              ~1000 steps (GPU)    Unlimited                 Unlimited
Recurrence                Yes                  Yes                       Feedforward only
Hardware target           GPU only             CPU, FPGA, neuromorphic   CPU, FPGA
Biological plausibility   No                   Yes (3-factor rule)       Partial

When to Use Which

Use BPTT (standard train_epoch()) when:

  • Sequence length ≤ 100 timesteps
  • GPU memory is available
  • Maximum accuracy matters more than deployment constraints

Use E-prop when:

  • Sequence length > 1000 timesteps or unbounded
  • Deploying on neuromorphic hardware (Loihi, BrainScaleS)
  • Biological plausibility is a requirement
  • Training on streaming data (BCI, robotics)

Use OnlineTrainer when:

  • Feedforward architecture (no recurrence)
  • Continuous online adaptation needed
  • Edge deployment with limited memory

Accuracy Gap: Honest Assessment

E-prop is an approximation. On benchmarks requiring long-range credit assignment (sequential MNIST, where digit class depends on all 784 pixels presented sequentially), BPTT achieves ~97% while E-prop typically reaches ~93-95%.

For tasks with temporal locality (most real-world signals — speech, motor control, sensor processing), the gap narrows to <1%.

The trade-off is memory vs accuracy: E-prop runs on 1 MB of memory where BPTT would need 100+ MB. For deployment on neuromorphic chips or FPGAs, E-prop is the only viable option.

Hardware Deployment

E-prop's local learning rule maps directly to on-chip learning on neuromorphic hardware:

Python
# Export trained weights for FPGA deployment
weights = trainer.export_weights()

# The learning rule itself can run on-chip:
# each synapse maintains its own trace register
# weight updates happen locally, no global backprop bus
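As a sketch of what a per-synapse trace register could look like in fixed-point on-chip logic, here is a software model of one synapse. The bit widths, shift amount, and increment are illustrative assumptions and are unrelated to the actual generated SystemVerilog:

```python
# Illustrative fixed-point trace register, modelled in Python.
# 16-bit trace; exponential decay approximated by a right-shift,
# so alpha ≈ 1 - 2**-4 = 0.9375 per clock tick.
TRACE_BITS = 16
TRACE_MAX = (1 << TRACE_BITS) - 1
DECAY_SHIFT = 4          # e -= e >> 4 each tick
SPIKE_INCREMENT = 256    # added when the presynaptic neuron spikes

def trace_tick(e, spike):
    """One clock tick of a single synapse's eligibility register."""
    e -= e >> DECAY_SHIFT                        # shift-based decay, no multiplier
    if spike:
        e = min(e + SPIKE_INCREMENT, TRACE_MAX)  # saturating add on a spike
    return e

e = 0
for t in range(50):
    e = trace_tick(e, spike=(t % 10 == 0))  # presynaptic spike every 10 ticks
```

Because the decay is a shift and the update is a saturating add, each synapse needs only a small register and no multiplier — which is why rules of this shape map well to FPGA and neuromorphic fabrics.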

In the Studio:

  1. Train with E-prop in the Training Monitor
  2. Export weights via the Code Generator
  3. Compile the network to SystemVerilog
  4. Deploy with on-chip learning enabled

References

  • Bellec et al. (2020). "A solution to the learning dilemma for recurrent networks of spiking neurons." Nature Communications 11:3625.
  • Zenke & Neftci (2021). "Brain-Inspired Learning on Neuromorphic Substrates." Proceedings of the IEEE 109(5):935-950.
  • Kaiser et al. (2020). "Synaptic Plasticity Dynamics for Deep Continuous Local Learning (DECOLLE)." Frontiers in Neuroscience 14:424.