
Tutorial 41: Pre-Silicon Energy Estimation

Predict FPGA resource usage, power consumption, and energy per inference in under one second — before running any synthesis tool. No Vivado, no Yosys, no hardware. Just answers.

Why This Matters

Every SNN paper claims "energy efficiency," but most either:

  • Estimate from FLOPs (ignoring memory access, which dominates energy)
  • Use Loihi/TrueNorth numbers (not comparable to custom FPGA designs)
  • Omit hardware numbers entirely

SC-NeuroCore's estimator gives concrete numbers for your specific network on your specific FPGA target, instantly. These are estimates, not measurements — but they're architecture-aware and calibrated against Yosys synthesis results.

Basic Estimation

Python
from sc_neurocore.energy import estimate

report = estimate(
    layer_sizes=[(784, 128), (128, 10)],
    target="ice40",           # iCE40 UP5K
    bitstream_length=256,     # SC bitstream length
)

print(report.summary())
# Target: iCE40 UP5K
# Layers: 2 (784→128, 128→10)
# Total parameters: 101,760
#
# Resource estimate:
#   LUTs:  2,340 / 5,280 (44.3%)
#   FFs:   1,280 / 5,280 (24.2%)
#   BRAMs: 12 / 30 (40.0%)
#   DSPs:  0 / 0
#
# Power estimate:
#   Static:  1.2 mW
#   Dynamic: 3.4 mW (at 12 MHz, 10% activity)
#   Total:   4.6 mW
#
# Energy per inference:
#   Time: 21.3 μs (256 clock cycles at 12 MHz)
#   Energy: 98 nJ per inference
#
# Comparison:
#   GPU (RTX 3080): ~5000 nJ per inference (50× more)
#   CPU (i5):       ~50000 nJ per inference (500× more)
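The energy figure in the summary is just power multiplied by inference time. A quick sanity check of the arithmetic, using the numbers from the report above (this is plain arithmetic, not a call into the estimator):

```python
# Values copied from the example summary above
cycles = 256            # one inference = one pass over the SC bitstream
f_hz = 12e6             # iCE40 clock frequency from the report
total_power_w = 4.6e-3  # static 1.2 mW + dynamic 3.4 mW

t_inference = cycles / f_hz             # time per inference
energy_j = total_power_w * t_inference  # energy = power x time

print(f"{t_inference * 1e6:.1f} us, {energy_j * 1e9:.0f} nJ")  # -> 21.3 us, 98 nJ
```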

Compare FPGA Targets

Python
for target in ["ice40", "ecp5", "artix7", "zynq"]:
    r = estimate([(784, 128), (128, 10)], target=target, bitstream_length=256)
    print(f"{target:8s}: {r.total_luts:>5,} LUTs ({r.utilization_pct:>4.0f}%), "
          f"{r.total_dynamic_power_mw:>5.2f} mW, {r.energy_per_inference_nj:>6.0f} nJ")

# ice40   : 2,340 LUTs ( 44%), 3.40 mW,    98 nJ
# ecp5    : 2,340 LUTs ( 10%), 4.20 mW,   121 nJ
# artix7  : 2,340 LUTs ( 11%), 5.80 mW,   167 nJ
# zynq    : 2,340 LUTs (  5%), 8.50 mW,   245 nJ

Smaller FPGAs are more power-efficient: fewer resources mean less static leakage, so a mostly idle large device still burns power. Use the smallest FPGA that fits your network.
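Picking the best target from a sweep like the one above reduces to "lowest energy among the targets that fit." A self-contained sketch over the example figures printed above (the tuples are copied from that output, not queried from the library):

```python
# (target, utilization %, energy per inference in nJ) from the sweep above
results = [
    ("ice40", 44, 98),
    ("ecp5", 10, 121),
    ("artix7", 11, 167),
    ("zynq", 5, 245),
]

# Keep only targets under 100% utilization, then take the lowest-energy one
fitting = [r for r in results if r[1] < 100]
best = min(fitting, key=lambda r: r[2])
print(best[0])  # -> ice40
```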

Event-Driven vs Clock-Driven

Event-driven FPGA designs only consume dynamic power when spikes occur:

Python
clock = estimate([(784, 128)], target="ice40", event_driven=False)
event = estimate([(784, 128)], target="ice40", event_driven=True, activity_rate=0.05)

savings = (1 - event.total_dynamic_power_mw / clock.total_dynamic_power_mw) * 100
print(f"Clock-driven: {clock.total_dynamic_power_mw:.2f} mW")
print(f"Event-driven (5% activity): {event.total_dynamic_power_mw:.2f} mW")
print(f"Power reduction: {savings:.0f}%")
# ~85% power reduction at 5% activity rate
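Note the saving is 85%, not the 95% a naive "power scales with activity" model would predict. One way to account for the gap is a floor of always-on dynamic power (clock tree, counters) that event gating cannot remove. The model below is a hypothetical sketch consistent with the numbers above, not the estimator's actual internals; `clock_overhead` is an assumed parameter:

```python
def event_driven_fraction(activity, clock_overhead=0.105):
    """Fraction of clock-driven dynamic power remaining in event-driven mode.

    clock_overhead models always-on logic that gating cannot remove;
    the rest of the dynamic power scales with spike activity.
    Hypothetical model, tuned to match the ~85% figure above.
    """
    return clock_overhead + (1 - clock_overhead) * activity

saving = 1 - event_driven_fraction(0.05)
print(f"{saving:.0%}")  # -> 85%
```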

Does It Fit?

Quick check before running synthesis:

Python
report = estimate([(784, 256), (256, 128), (128, 10)], target="ice40")
if report.fits_on_target:
    print(f"Fits! {report.utilization_pct:.0f}% utilization")
else:
    print(f"Too large: {report.utilization_pct:.0f}% ({report.total_luts:,} LUTs)")
    print(f"Options: reduce hidden size, use compression, or target larger FPGA")
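When "reduce hidden size" is the chosen option, the largest hidden layer that fits can be found by bisection over the LUT budget. The sketch below is self-contained: `luts_for` is a hypothetical toy cost model standing in for a real `estimate(...).total_luts` call, which you would substitute in practice.

```python
def luts_for(hidden, out_dim=10, per_neuron=12):
    """Toy stand-in for the estimator: fixed LUT cost per neuron
    plus a small routing overhead. Illustrative numbers only."""
    return per_neuron * (hidden + out_dim) + hidden // 2

def max_hidden(budget, lo=1, hi=4096):
    """Largest hidden size whose LUT estimate fits within budget
    (cost model is monotone in hidden size, so bisection applies)."""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if luts_for(mid) <= budget:
            lo = mid
        else:
            hi = mid - 1
    return lo

print(max_hidden(5280))  # -> 412 under an iCE40 UP5K-sized LUT budget
```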

How Estimates Are Computed

Resource      | Formula                                                  | Calibration
------------- | -------------------------------------------------------- | --------------------
LUTs          | 2 per weight (Q8.8 multiply) + 12 per neuron (LIF step)  | Yosys ice40 reports
FFs           | 1 per weight (pipeline) + 8 per neuron (state)           | Yosys ice40 reports
BRAMs         | ceil(total_weight_bits / bram_capacity)                  | Device datasheet
Dynamic power | toggle_rate × capacitance × V² × frequency               | Vendor power models
Static power  | device-dependent baseline                                | Vendor datasheets
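The dynamic-power row is the standard CMOS switching-power identity, P = α·C·V²·f. A worked example with illustrative inputs (the effective capacitance and core voltage here are assumptions for demonstration, not iCE40 datasheet figures):

```python
def dynamic_power_w(toggle_rate, capacitance_f, voltage_v, freq_hz):
    """P_dyn = alpha * C * V^2 * f: classic CMOS switching power."""
    return toggle_rate * capacitance_f * voltage_v**2 * freq_hz

# Illustrative values: 10% toggle rate, 2 nF effective switched
# capacitance, 1.2 V core, 12 MHz clock
p = dynamic_power_w(0.10, 2e-9, 1.2, 12e6)
print(f"{p * 1e3:.2f} mW")  # -> 3.46 mW, in the ballpark of the 3.4 mW above
```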

Estimates are typically within 20-30% of actual synthesis results. For exact numbers, use the Synthesis Dashboard (Yosys) or sc-neurocore deploy.

Integration with Studio

The Synthesis Dashboard's Estimate button uses this same engine:

  1. Build IR from your ODE equations
  2. Click Estimate on the FPGA tab
  3. See resource bars without running Yosys
  4. If it fits, click Synthesise for exact numbers

References

  • Horowitz (2014). "Computing's Energy Problem (and what we can do about it)." ISSCC 2014 Keynote.
  • Sze et al. (2017). "Efficient Processing of Deep Neural Networks: A Tutorial and Survey." Proceedings of the IEEE 105(12):2295-2329.