Tutorial 26: Zero-Multiplication Predictive Coding (Conjecture C9)¶
SC-NeuroCore implements predictive coding with ZERO multiplications:
- Prediction error = XOR(predicted, actual) — one gate per bit
- Error magnitude = popcount(XOR result) / L — Hamming distance
- Precision update = STDP — push weights toward actual input
This maps entirely to XOR gates + a popcount tree on FPGA. No DSP blocks needed.
What is Predictive Coding?¶
Predictive coding is a theory of neural computation where each layer predicts the activity of the layer below. The difference between prediction and reality (the prediction error) is all that propagates upward. When predictions are accurate, error signals are small — the brain only processes what's surprising.
In conventional digital hardware, computing prediction error requires subtraction and multiplication. SC-NeuroCore eliminates both by exploiting stochastic bitstream properties.
The SC Insight¶
Given two stochastic bitstreams $a$ and $b$ encoding probabilities $p_a$ and $p_b$:
- $\text{XOR}(a, b)$ produces a bitstream encoding $|p_a - p_b|$, provided the streams are correlated (generated by comparing a shared random sequence against each probability); for independent streams XOR encodes $p_a + p_b - 2 p_a p_b$ instead
- $\text{popcount}(\text{XOR}(a, b)) / L$ gives the Hamming distance $\approx |p_a - p_b|$
XOR IS subtraction in the stochastic domain. One gate per bit, no carry chain, no DSP block, no multiplier.
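This is easy to verify numerically. The sketch below is standalone NumPy, independent of the SC-NeuroCore API: both bitstreams compare the *same* uniform samples against their probabilities (the correlated encoding), so their bits disagree exactly where a sample falls between $p_b$ and $p_a$.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 4096
p_a, p_b = 0.7, 0.4

# Correlated encoding: one shared uniform sequence, two thresholds.
u = rng.random(L)
a = (u < p_a).astype(np.uint8)
b = (u < p_b).astype(np.uint8)

# XOR + popcount = Hamming distance; dividing by L recovers
# |p_a - p_b| up to sampling noise (~1/sqrt(L)).
error = np.count_nonzero(a ^ b) / L
print(error)  # ≈ 0.3 = |0.7 - 0.4|
```

Longer streams tighten the estimate: the standard error shrinks as $1/\sqrt{L}$, which is why the tutorials below use `length=512` or more.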
1. Basic Usage¶
from sc_neurocore.layers.predictive_coding import PredictiveCodingSCLayer
layer = PredictiveCodingSCLayer(
    n_inputs=4,
    n_neurons=2,
    length=512,
    lr=0.1,
    seed=42,
)
# Train on a repeated pattern
errors = []
for epoch in range(30):
    result = layer.forward([0.3, 0.7, 0.5, 0.9])
    errors.append(result['prediction_error'])
    if epoch % 5 == 0:
        print(f"Epoch {epoch:2d}: error = {result['prediction_error']:.4f}")
# Error decreases as the layer learns to predict the input
print(f"Initial error: {errors[0]:.4f}")
print(f"Final error: {errors[-1]:.4f}")
print(f"Reduction: {(1 - errors[-1]/errors[0])*100:.0f}%")
2. Novelty Detection¶
After learning a pattern, novel inputs produce high prediction error:
layer = PredictiveCodingSCLayer(n_inputs=3, n_neurons=2, length=512, lr=0.2, seed=42)
# Learn a familiar pattern
for _ in range(50):
    layer.forward([0.8, 0.2, 0.5])
# Test familiar vs novel
familiar = layer.forward([0.8, 0.2, 0.5])["prediction_error"]
slightly_novel = layer.forward([0.7, 0.3, 0.5])["prediction_error"]
very_novel = layer.forward([0.2, 0.8, 0.5])["prediction_error"]
print(f"Familiar: {familiar:.4f}")
print(f"Slightly novel: {slightly_novel:.4f}")
print(f"Very novel: {very_novel:.4f}")
# Prediction error scales with input novelty
This is the computational basis for surprise detection — the same mechanism used by ArcaneNeuron's predictor compartment (Tutorial 34).
3. Multi-Pattern Learning¶
layer = PredictiveCodingSCLayer(n_inputs=4, n_neurons=4, length=1024, lr=0.05, seed=42)
# Two alternating patterns
pattern_a = [0.9, 0.1, 0.8, 0.2]
pattern_b = [0.1, 0.9, 0.2, 0.8]
for epoch in range(100):
    pattern = pattern_a if epoch % 2 == 0 else pattern_b
    layer.forward(pattern)
# After training, both patterns should have low error
err_a = layer.forward(pattern_a)["prediction_error"]
err_b = layer.forward(pattern_b)["prediction_error"]
err_new = layer.forward([0.5, 0.5, 0.5, 0.5])["prediction_error"]
print(f"Pattern A error: {err_a:.4f}")
print(f"Pattern B error: {err_b:.4f}")
print(f"Novel pattern: {err_new:.4f}") # higher than A or B
4. FPGA Resource Comparison¶
| Operation | Conventional | SC Predictive Coding |
|---|---|---|
| Subtraction | 16-bit ripple subtract (16 LUTs) | XOR gate (1 LUT per bit) |
| Multiplication | Array multiplier (256 LUTs) | Not needed |
| Error magnitude | Absolute value circuit | Popcount tree (log₂(L) LUTs) |
| Weight update | MAC unit + accumulator | STDP counter |
| Total per synapse | ~300 LUTs | ~10 LUTs |
For a 100-input, 10-output predictive coding layer:

- Conventional: ~300K LUTs (won't fit on most FPGAs)
- SC: ~10K LUTs (fits easily on an iCE40 HX8K)
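The layer-level totals follow directly from the table's rounded per-synapse figures. As a back-of-the-envelope check (these are the table's estimates, not measured synthesis results):

```python
n_inputs, n_outputs = 100, 10
synapses = n_inputs * n_outputs  # 1,000 synapses

# Per-synapse LUT estimates from the table above.
conventional = synapses * 300
stochastic = synapses * 10

print(f"Conventional: {conventional:,} LUTs")   # 300,000
print(f"SC:           {stochastic:,} LUTs")     # 10,000
print(f"Savings:      {conventional // stochastic}x")  # 30x
```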
5. How the Learning Rule Works¶
The STDP-based update rule adjusts predictions toward actual inputs:
- Compute the prediction error: error = XOR(predicted_bits, actual_bits)
- For each input-output pair with high error: increase the weight (LTP)
- For each input-output pair with low error: decrease the weight slightly (LTD)
- This asymmetry drives weights toward values that minimize prediction error
This is biologically plausible — real cortical circuits adjust synaptic strength based on prediction error signals from higher areas.
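To make the rule concrete, here is a minimal NumPy sketch of one update step. It is an illustrative reconstruction, not the `PredictiveCodingSCLayer` implementation: the error threshold of 0.5 and the LTP/LTD step sizes are assumptions chosen for the sketch.

```python
import numpy as np

def stdp_update(weights, per_input_error, lr=0.1, threshold=0.5):
    """Push weights toward values that lower prediction error.

    per_input_error: per-input Hamming-distance rates in [0, 1],
    i.e. popcount(XOR(predicted, actual)) / L for each input line.
    """
    w = weights.copy()
    high = per_input_error > threshold
    w[high] += lr * per_input_error[high]           # LTP: large error, strengthen
    w[~high] -= 0.1 * lr * per_input_error[~high]   # LTD: small error, weaken slightly
    return np.clip(w, 0.0, 1.0)                     # weights stay valid probabilities

w = np.array([0.5, 0.5, 0.5])
err = np.array([0.8, 0.1, 0.6])
w_new = stdp_update(w, err)
print(w_new)  # high-error weights move up, low-error weights move down
```

Note the asymmetry the text describes: LTD uses a much smaller step than LTP, so well-predicted inputs decay only gently while surprising inputs drive fast adaptation.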
6. Hierarchical Predictive Coding¶
Stack multiple layers for hierarchical prediction:
# Layer 1: predicts raw input
l1 = PredictiveCodingSCLayer(n_inputs=8, n_neurons=4, length=512, lr=0.1, seed=1)
# Layer 2: predicts L1's residual errors
l2 = PredictiveCodingSCLayer(n_inputs=4, n_neurons=2, length=512, lr=0.05, seed=2)
inputs = [0.3, 0.7, 0.5, 0.9, 0.2, 0.8, 0.4, 0.6]
for epoch in range(50):
    r1 = l1.forward(inputs)
    # Feed L1's prediction error as L2's input
    r2 = l2.forward(list(r1['surprises'][:4]))
    if epoch % 10 == 0:
        print(f"Epoch {epoch}: L1 error={r1['prediction_error']:.4f}, "
              f"L2 error={r2['prediction_error']:.4f}")
Each layer compresses the residual from the layer below. This is analogous to hierarchical processing in visual cortex: V1 predicts retinal input, V2 predicts V1 residuals, V4 predicts V2 residuals.
Further Reading¶
- Tutorial 34: ArcaneNeuron — uses predictive coding for self-modeling
- Tutorial 27: Fault Tolerance — SC's inherent noise robustness
- API: Layers — PredictiveCodingSCLayer API docs