Tutorial 54: Spiking Transformers & State-Space Models¶
Build spike-driven transformers with zero multiplications in attention.
Spike-Driven Self-Attention (SSA)¶
Traditional attention computes softmax(Q·Kᵀ/√d)·V — O(n²) pairwise dot products, each costing d floating-point multiplications.
SSA computes (SpikeFn(Q) AND SpikeFn(K)ᵀ)·V on binary spikes: the Q–K interaction reduces to AND gates and popcounts, and applying V to binary spikes is a masked accumulation — zero multiplications in the attention core.
```python
import numpy as np

from sc_neurocore.transformers import SpikeDrivenAttention

ssa = SpikeDrivenAttention(
    embed_dim=64,
    num_heads=4,
    T=8,            # simulation timesteps
    threshold=1.0,  # spike threshold for Q/K
)

# Input: 10 tokens, 64 dimensions (values in [0, 1])
x = np.random.rand(10, 64)
output = ssa.forward(x)
# output.shape == (10, 64)

# Zero multiplications in the attention core
print(f"Multiply ops: {ssa.num_multiply_ops}")  # 0
```
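To make the idea concrete, here is a minimal NumPy sketch of a spike-driven attention core (illustrative only, not the `SpikeDrivenAttention` implementation). It assumes a simple Heaviside spike function; the input projections still use floating-point matrix products, but the attention core itself is only ANDs, popcounts, and integer accumulations:

```python
import numpy as np

def spike_fn(x, threshold=1.0):
    """Heaviside spike function: emit 1 where input reaches threshold."""
    return (x >= threshold).astype(np.uint8)

def spike_driven_attention(x, Wq, Wk, Wv, threshold=1.0):
    """Multiplication-free attention core on binary spikes.

    Q, K, V are binarized; Q Kᵀ reduces to AND + popcount, and the
    product with binary V is an integer-weighted accumulation.
    """
    q = spike_fn(x @ Wq, threshold)  # (n, d) binary
    k = spike_fn(x @ Wk, threshold)  # (n, d) binary
    v = spike_fn(x @ Wv, threshold)  # (n, d) binary
    # AND + popcount replaces the dot product Q Kᵀ
    attn = (q[:, None, :] & k[None, :, :]).sum(-1)  # (n, n) integer counts
    # Integer counts times binary V: adds/shifts on hardware
    return attn @ v

rng = np.random.default_rng(0)
n, d = 10, 64
x = rng.random((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out = spike_driven_attention(x, Wq, Wk, Wv)
print(out.shape)  # (10, 64)
```

Because every operand in the core is a 0/1 value, the "multiplications" degenerate to logic gates — which is exactly what the hardware compilation below exploits.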
Spiking State-Space Model (SSM)¶
Combines linear state dynamics with a spiking output, using O(1) memory per timestep:
```python
import numpy as np

from sc_neurocore.transformers import SpikyStateSpace

ssm = SpikyStateSpace(
    d_model=32,
    d_state=64,     # hidden state dimension
    threshold=1.0,
    dt=0.01,
)

# Process a 200-step sequence
x_seq = np.random.rand(200, 32)
spike_output = ssm.forward(x_seq)  # (200, 32) binary spikes

# Or step-by-step (online, O(1) memory)
ssm.reset()
for t in range(200):
    spikes, y = ssm.step(x_seq[t])
```
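A spiking SSM step can be sketched as a discretized linear recurrence feeding a threshold-and-reset readout. The sketch below is a hypothetical reimplementation, not the `SpikyStateSpace` internals; it assumes diagonal state dynamics and a soft-reset membrane for simplicity:

```python
import numpy as np

class TinySpikySSM:
    """Minimal spiking state-space sketch: diagonal linear state
    dynamics with a threshold-and-reset spiking readout."""

    def __init__(self, d_model=32, d_state=64, threshold=1.0, dt=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Discretized diagonal dynamics: h <- a * h + dt * (B x)
        self.a = np.exp(-dt * rng.uniform(1.0, 10.0, d_state))  # stable decays
        self.B = rng.standard_normal((d_state, d_model)) / np.sqrt(d_model)
        self.C = rng.standard_normal((d_model, d_state)) / np.sqrt(d_state)
        self.dt, self.threshold = dt, threshold
        self.reset()

    def reset(self):
        self.h = np.zeros_like(self.a)          # hidden state
        self.v = np.zeros(self.C.shape[0])      # membrane potential

    def step(self, x):
        """O(1)-memory update for one timestep."""
        self.h = self.a * self.h + self.dt * (self.B @ x)  # state update
        y = self.C @ self.h                                # linear readout
        self.v += y                                        # integrate
        spikes = (self.v >= self.threshold).astype(np.uint8)
        self.v[spikes == 1] -= self.threshold              # soft reset
        return spikes, y

ssm = TinySpikySSM()
rng = np.random.default_rng(1)
x_seq = rng.random((200, 32))
out = np.stack([ssm.step(x)[0] for x in x_seq])
print(out.shape)  # (200, 32)
```

Only `h` and `v` persist between calls, which is why the online path needs constant memory regardless of sequence length.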
CPG Positional Encoding¶
Biologically inspired positional encoding using Central Pattern Generator oscillators:
```python
from sc_neurocore.transformers import CPGPositionalEncoding

cpe = CPGPositionalEncoding(d_model=64, max_len=512)

# Continuous encoding (values in [0, 1])
pos_enc = cpe.encode(seq_len=100)  # (100, 64)

# Spike-encoded (binary)
pos_spikes = cpe.encode_spikes(seq_len=100)  # (100, 64) binary
```
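One way to picture a CPG encoding is a bank of weakly coupled phase oscillators integrated over the sequence. The sketch below is a hypothetical stand-in for `CPGPositionalEncoding`, not its actual implementation; the geometric frequency ladder and the coupling constant are assumptions:

```python
import numpy as np

def cpg_positional_encoding(seq_len, d_model, coupling=0.1, dt=0.1):
    """Sketch of a CPG-style positional encoding: a bank of weakly
    coupled phase oscillators stepped once per sequence position."""
    # One oscillator per channel, geometrically spaced frequencies
    # (echoing the sinusoidal-PE frequency ladder)
    freqs = 1.0 / (10000 ** (np.arange(d_model) / d_model))
    phase = np.zeros(d_model)
    enc = np.empty((seq_len, d_model))
    for t in range(seq_len):
        # Weak coupling pulls each phase toward the population mean
        phase += dt * (freqs + coupling * np.sin(phase.mean() - phase))
        enc[t] = 0.5 * (1.0 + np.sin(phase))  # map oscillation to [0, 1]
    return enc

pos_enc = cpg_positional_encoding(100, 64)      # continuous, in [0, 1]
pos_spikes = (pos_enc >= 0.5).astype(np.uint8)  # crude binary spike version
print(pos_enc.shape, pos_spikes.shape)  # (100, 64) (100, 64)
```

The coupling term gives nearby positions correlated phases, so each row still encodes where in the sequence it was generated.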
Why SC + SSA is a Natural Match¶
Stochastic Computing multiplies values by ANDing independent Bernoulli bitstreams; SSA computes attention weights by ANDing binary spike vectors — the same primitive operation. SC-NeuroCore's FPGA pipeline can therefore compile SSA attention directly to AND-gate arrays, making it one of the most energy-efficient ways to run transformer attention in hardware.
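The SC side of this equivalence is easy to demonstrate: encode two values in [0, 1] as random bitstreams, AND them, and the output stream's mean approximates their product. A self-contained sketch:

```python
import numpy as np

def sc_multiply(a, b, n_bits=100_000, seed=0):
    """Stochastic-computing multiply: encode a, b in [0, 1] as
    Bernoulli bitstreams and AND them; the mean of the output
    bitstream approximates a * b."""
    rng = np.random.default_rng(seed)
    sa = rng.random(n_bits) < a  # P(bit = 1) = a
    sb = rng.random(n_bits) < b  # P(bit = 1) = b
    # For independent streams, P(1 AND 1) = a * b
    return np.mean(sa & sb)

approx = sc_multiply(0.6, 0.5)
print(round(approx, 3))  # ≈ 0.3, i.e. 0.6 * 0.5
```

In SSA the "bitstreams" are spike trains and the AND is the Q–K interaction, which is why the same gate array serves both roles.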