Multi-Target Deployment Guide

SC-NeuroCore can deploy a single SNN model to any combination of 175 hardware profiles across 38 platform classes. This guide covers the complete multi-target deployment workflow from portability scoring through target recommendation, heterogeneous dispatch, multi-die floorplanning, and chiplet protocol mapping, with formal cost models for network partitioning and inter-chip bandwidth optimisation.

1. Mathematical Formalism

1.1 Portability Score

The portability score is the percentage of hardware profiles that can execute a given neuron model without modification:

$$ S_{\text{port}} = \frac{\left|\{p \in \mathcal{P} : \text{compatible}(p, M)\}\right|}{|\mathcal{P}|} \times 100 $$

where $\mathcal{P}$ is the set of all profiles and $M$ is the model.
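
The formula can be illustrated with a minimal pure-Python sketch. The `Profile` class, the `supported_ops` sets, and the subset test used as the compatibility predicate are hypothetical stand-ins chosen for illustration; they are not the checks performed by `score_portability()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical hardware profile: a name plus the operations it supports."""
    name: str
    supported_ops: frozenset


def portability_score(profiles, required_ops):
    """Return (compatible_count, total_count, score out of 100)."""
    # Assumption: a profile is compatible when it supports every required op.
    compatible = [p for p in profiles if required_ops <= p.supported_ops]
    total = len(profiles)
    score = 100.0 * len(compatible) / total if total else 0.0
    return len(compatible), total, score


# Illustrative profile set (invented for this example)
profiles = [
    Profile("fpga_a", frozenset({"add", "mul", "div"})),
    Profile("fpga_b", frozenset({"add", "mul"})),
    Profile("mcu_c", frozenset({"add"})),
    Profile("asic_d", frozenset({"add", "mul", "div", "exp"})),
]
model_ops = frozenset({"add", "mul"})

n_ok, n_total, score = portability_score(profiles, model_ops)
print(f"Portable to {n_ok}/{n_total} profiles, score = {score:.0f}/100")
```

Here three of the four invented profiles support both operations, so the score is 75.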

1.2 Target Recommendation Scoring

Each candidate target $t$ is scored based on user constraints $C$:

$$ \text{score}(t) = \sum_{i} w_i \cdot \text{match}_i(t, C) $$

where $w_i$ are importance weights and $\text{match}_i$ measures how well the target satisfies constraint $i$ (power, frequency, width, cost).
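
As a rough sketch of the weighted sum, the example below scores three invented targets against the constraint keys used elsewhere in this guide (`max_power_mw`, `min_freq_mhz`, `max_width`). The match functions, the weights, and the target records are assumptions made for illustration, and the cost term is omitted; the scoring inside `recommend_target()` may differ.

```python
def match_max(value, limit):
    """1.0 when value is within an upper limit, shrinking as it overshoots."""
    return 1.0 if value <= limit else limit / value


def match_min(value, floor):
    """1.0 when value meets a lower bound, proportionally less otherwise."""
    return 1.0 if value >= floor else value / floor


def score_target(target, constraints, weights):
    """Weighted sum of per-constraint match values, each in [0, 1]."""
    matches = {
        "power": match_max(target["power_mw"], constraints["max_power_mw"]),
        "freq": match_min(target["freq_mhz"], constraints["min_freq_mhz"]),
        "width": match_max(target["width"], constraints["max_width"]),
    }
    return sum(weights[name] * m for name, m in matches.items())


constraints = {"max_power_mw": 500, "min_freq_mhz": 100, "max_width": 16}
weights = {"power": 0.5, "freq": 0.3, "width": 0.2}  # hypothetical weights

# Invented candidate targets, not real profile data
targets = [
    {"name": "small_fpga", "power_mw": 200, "freq_mhz": 250, "width": 16},
    {"name": "big_fpga", "power_mw": 1000, "freq_mhz": 450, "width": 18},
    {"name": "slow_mcu", "power_mw": 50, "freq_mhz": 50, "width": 16},
]

ranked = sorted(
    targets, key=lambda t: score_target(t, constraints, weights), reverse=True
)
for t in ranked:
    print(f"{t['name']:12s} score={score_target(t, constraints, weights):.2f}")
```

Because the weights sum to 1, a target that satisfies every constraint scores exactly 1.0, which makes scores comparable across constraint sets.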

1.3 Multi-Die Bin Packing

Assigning neuron blocks to dies is a variant of the bin-packing problem. Given $n$ blocks of sizes $s_1, \ldots, s_n$ and $k$ dies of capacity $D$:

$$ \min \sum_{j=1}^{k} \left(D - \sum_{i: \text{die}(i)=j} s_i\right)^2 $$

Subject to: $\sum_{i: \text{die}(i)=j} s_i \leq D$ for all $j$.
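
Since the total slack is fixed once all blocks are placed, minimising the squared slack amounts to balancing die loads. One simple heuristic for that is to place the largest remaining block on the least-loaded die. The sketch below applies it to the block sizes from the floorplanning example in Section 4.5; it is a swapped-in greedy heuristic for illustration, not necessarily the algorithm behind `plan_multi_die_floorplan()`, and the function names are hypothetical.

```python
def assign_blocks(blocks, die_capacity, num_dies):
    """Greedy balancing: largest block first, onto the least-loaded die."""
    loads = [0] * num_dies
    assignment = {}
    for name, size in sorted(blocks.items(), key=lambda kv: kv[1], reverse=True):
        die = min(range(num_dies), key=lambda j: loads[j])
        # If the least-loaded die cannot take the block, no die can.
        if loads[die] + size > die_capacity:
            raise ValueError(f"block {name!r} does not fit on any die")
        loads[die] += size
        assignment[name] = die
    return assignment, loads


def squared_slack(loads, die_capacity):
    """Objective from the formula above: sum of squared unused capacity."""
    return sum((die_capacity - load) ** 2 for load in loads)


blocks = {
    "visual_cortex": 800,
    "auditory_cortex": 600,
    "motor_cortex": 400,
    "prefrontal": 500,
    "cerebellum": 900,
    "hippocampus": 300,
}
assignment, loads = assign_blocks(blocks, die_capacity=1000, num_dies=4)
for name, die in assignment.items():
    print(f"  {name:16s} -> die {die}")
print("loads:", loads, "objective:", squared_slack(loads, 1000))
```

The greedy pass is not guaranteed to be optimal, but it respects the capacity constraint and runs in O(n log n + nk) time.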

1.4 Network Partitioning (Inter-Chip Bandwidth)

For graph $G = (V, E)$ partitioned into $k$ chips, the inter-chip bandwidth is:

$$ B_{\text{inter}} = \sum_{(u,v) \in E} \mathbb{1}[\text{chip}(u) \neq \text{chip}(v)] \cdot w(u,v) $$

The optimisation minimises $B_{\text{inter}}$ using spectral partitioning or METIS-style multi-level algorithms.
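
To make the cut cost concrete, the sketch below evaluates $B_{\text{inter}}$ on the six-region graph used in Section 4.7, assuming unit edge weights. It finds the best balanced two-chip split by exhaustive search, which is only practical for tiny graphs and stands in for the spectral or multi-level methods named above. The helper name `inter_chip_bandwidth` is hypothetical.

```python
from itertools import combinations


def inter_chip_bandwidth(edges, chip):
    """Sum the weights of edges whose endpoints sit on different chips."""
    return sum(w for (u, v), w in edges.items() if chip[u] != chip[v])


adjacency = {
    "V1": ["V2", "V4"],
    "V2": ["V1", "V4", "IT"],
    "V4": ["V1", "V2", "IT"],
    "IT": ["V2", "V4", "PFC"],
    "PFC": ["IT", "M1"],
    "M1": ["PFC"],
}

# Collapse the symmetric adjacency list into undirected edges (unit weight assumed).
edges = {}
for u, neighbours in adjacency.items():
    for v in neighbours:
        edges[tuple(sorted((u, v)))] = 1.0

# Exhaustive search over balanced bipartitions (3 nodes per chip).
nodes = sorted(adjacency)
best_cut, best_chip = None, None
for group in combinations(nodes, len(nodes) // 2):
    chip = {n: (0 if n in group else 1) for n in nodes}
    cut = inter_chip_bandwidth(edges, chip)
    if best_cut is None or cut < best_cut:
        best_cut, best_chip = cut, chip

print(f"edges: {len(edges)}, best balanced cut: {best_cut}")
print("chip assignment:", best_chip)
```

The best balanced split keeps the tightly connected V1, V2, V4 triangle on one chip and cuts only the two edges into IT.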

1.5 UCIe Bandwidth Model

UCIe die-to-die links provide:

$$ B_{\text{lane}} = R_{\text{data}} \times W_{\text{lane}} \times E_{\text{encoding}} $$

For UCIe 2.0 at 32 GT/s with 64-bit lanes and 128b/130b encoding:

$$ B_{\text{lane}} = 32 \times 64 \times \frac{128}{130} \approx 2016 \text{ Gbps} $$
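
The arithmetic can be checked with a few lines of Python. The function and parameter names are illustrative only; the numbers are the ones quoted above.

```python
def ucie_bandwidth_gbps(rate_gtps, width_bits, payload_bits, coded_bits):
    """Data rate x width x encoding efficiency, as in the formula above."""
    return rate_gtps * width_bits * payload_bits / coded_bits


bw = ucie_bandwidth_gbps(rate_gtps=32, width_bits=64, payload_bits=128, coded_bits=130)
print(f"{bw:.1f} Gbps")  # prints "2016.5 Gbps"
```

The raw rate is 32 × 64 = 2048 Gbps; the 128b/130b encoding overhead removes about 1.5% of it.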

2. Architecture

2.1 Deployment Decision Flow

flowchart TB
    A["SNN Model"] --> B["score_portability()"]
    B --> C["recommend_target()"]
    C --> D["score_supply_chain_risk()"]
    D --> E{"Multi-Target?"}
    E -->|"Single"| F["compile_to_verilog()"]
    E -->|"Multi"| G["compile_multi_target()"]
    G --> H["plan_heterogeneous_dispatch()"]
    H --> I["plan_multi_die_floorplan()"]
    I --> J["map_ucie_protocol()"]
    J --> K["optimize_network_topology()"]
    K --> L["Deploy"]
    F --> L

    style A fill:#e1f5fe
    style L fill:#e8f5e9

2.2 Multi-Target Compilation Stack

Text Only

┌──────────────────────────────────────────────────┐
│ Layer 1: Model Definition (equations)             │
├──────────────────────────────────────────────────┤
│ Layer 2: Portability Analysis + Target Selection  │
├──────────────────────────────────────────────────┤
│ Layer 3: Per-Target Compilation                   │
│   ├─ compile_to_verilog(target="artix7")         │
│   ├─ compile_to_verilog(target="loihi2")         │
│   └─ compile_to_verilog(target="asic_16")        │
├──────────────────────────────────────────────────┤
│ Layer 4: format_comparison_table()                │
├──────────────────────────────────────────────────┤
│ Layer 5: Deployment (constraints, drivers, SBOMs) │
└──────────────────────────────────────────────────┘

3. Supported Configurations

3.1 Platform Classes (31 total)

Class	Profiles	Example
Xilinx/AMD FPGA	25	Artix-7, Kintex U+, Versal
Intel/Altera FPGA	15	Cyclone V, Arria 10, Agilex
Lattice FPGA	8	iCE40, ECP5, CrossLink-NX
Efinix FPGA	5	Trion T20, Titanium Ti375
Gowin FPGA	4	GW1N, GW2A, GW5A
Neuromorphic	6	Intel Loihi 2, SpiNNaker2
ASIC	10	TSMC 7nm, Samsung 5nm
Compute-in-Memory	5	TSMC CIM, RRAM
RISC-V SoC	12	PolarFire SoC, SiFive X280
MCU	8	MAX78000, RP2040
Space-Qualified	6	BAE RAD750, RT PolarFire

3.2 Multi-Target Comparison Metrics

Metric	Unit	Source
Data width	bits	Profile
Estimated LUTs	count	`estimate_resources()`
Estimated DSPs	count	`estimate_resources()`
Estimated FFs	count	`estimate_resources()`
Max frequency	MHz	Profile
Guard bits	count	`prove_overflow_free()`
Overflow safe	bool	`prove_overflow_free()`

4. Python API

4.1 Auto-Target Recommendation

Python

from sc_neurocore.compiler.intelligence import recommend_target

recs = recommend_target(
    constraints={
        "max_power_mw": 500,
        "min_freq_mhz": 100,
        "max_width": 16,
    },
    top_k=5,
)
for r in recs:
    print(f"  {r['name']:30s} score={r['score']:.2f}")

4.2 Portability Scoring

Python

from sc_neurocore.compiler.intelligence import score_portability

# Simple LIF — runs on almost everything
s = score_portability({"v": "-(v - v_rest) / tau + I"})
print(f"Portable to {s.compatible_profiles}/{s.total_profiles} profiles")
print(f"Score: {s.score}/100")

# Complex model — may have blockers
s = score_portability({"v": "g*m*m*m*h + g*n*n*n*n"})
if s.blockers:
    print("Blockers:")
    for b in s.blockers:
        print(f"  ⚠️  {b}")

4.3 Multi-Target Compilation

Python

from sc_neurocore.compiler.deployment import (
    compile_multi_target,
    format_comparison_table,
)
from sc_neurocore.neurons.equation_builder import from_equations

neuron = from_equations(
    "dv/dt = -(v - E_L)/tau_m + I/C",
    threshold="v > -50", reset="v = -65",
    params=dict(E_L=-65, tau_m=10, C=1),
    init=dict(v=-65),
)

results = compile_multi_target(
    neuron,
    targets=["artix7", "ecp5", "ice40", "asic_16", "loihi2"],
    module_name="sc_lif",
)

table = format_comparison_table(results)
print(table)

4.4 Heterogeneous Dispatch

Python

from sc_neurocore.compiler.intelligence import plan_heterogeneous_dispatch

plan = plan_heterogeneous_dispatch(
    populations={
        "retina": "max78000",           # Edge MCU (sensor)
        "visual_cortex": "artix7",      # FPGA (processing)
        "decision": "loihi2",           # Neuromorphic (learning)
        "motor": "rp2040",              # MCU (actuation)
    },
    connections=[
        ("retina", "visual_cortex"),
        ("visual_cortex", "decision"),
        ("decision", "motor"),
    ],
)
print(f"Populations: {len(plan.populations)}")
print(f"Inter-chip links: {len(plan.inter_chip_links)}")
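
One way to think about the inter-chip links in such a plan: a connection needs a physical link whenever its two populations are mapped to different targets. The standalone sketch below derives links on that assumption, using the same populations and connections as the call above. The `derive_inter_chip_links` helper is hypothetical and does not reproduce the planner's protocol selection.

```python
def derive_inter_chip_links(populations, connections):
    """Return (src, dst, src_target, dst_target) for cross-target connections."""
    links = []
    for src, dst in connections:
        for name in (src, dst):
            if name not in populations:
                raise KeyError(f"connection references unknown population {name!r}")
        if populations[src] != populations[dst]:
            links.append((src, dst, populations[src], populations[dst]))
    return links


populations = {
    "retina": "max78000",
    "visual_cortex": "artix7",
    "decision": "loihi2",
    "motor": "rp2040",
}
connections = [
    ("retina", "visual_cortex"),
    ("visual_cortex", "decision"),
    ("decision", "motor"),
]

links = derive_inter_chip_links(populations, connections)
for src, dst, src_target, dst_target in links:
    print(f"  {src} ({src_target}) -> {dst} ({dst_target})")
```

In this example every population sits on a different device, so all three connections become inter-chip links.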

4.5 Multi-Die Floorplanning

Python

from sc_neurocore.compiler.intelligence import plan_multi_die_floorplan

result = plan_multi_die_floorplan(
    blocks={
        "visual_cortex": 800,
        "auditory_cortex": 600,
        "motor_cortex": 400,
        "prefrontal": 500,
        "cerebellum": 900,
        "hippocampus": 300,
    },
    die_capacity=1000,
    num_dies=4,
)
for block, die in result.die_assignment.items():
    print(f"  {block:20s} → Die {die}")
print(f"\nDie utilisation:")
for die, util in result.die_utilization.items():
    print(f"  Die {die}: {util:.0%}")

4.6 UCIe Chiplet Protocol Mapping

Python

from sc_neurocore.compiler.intelligence import map_ucie_protocol

mapping = map_ucie_protocol(
    {"visual_cortex": 256, "motor_cortex": 128, "prefrontal": 64},
    lane_bandwidth_gbps=32.0,
    protocol_version="UCIe 2.0",
)
for block, lanes in mapping.lanes.items():
    print(f"  {block}: {lanes} UCIe lanes")
print(f"Total: {mapping.total_bandwidth_gbps} Gbps")
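
For intuition about lane allocation, the sketch below assumes that each value in the input dictionary is a required bandwidth in Gbps and that lanes are allocated by rounding up to whole lanes. Both are assumptions made for illustration: the guide does not state the units, and `map_ucie_protocol()` may allocate lanes differently.

```python
import math


def lanes_needed(required_gbps, lane_bandwidth_gbps):
    """Whole lanes needed to carry the required bandwidth (assumed in Gbps)."""
    return math.ceil(required_gbps / lane_bandwidth_gbps)


demand_gbps = {"visual_cortex": 256, "motor_cortex": 128, "prefrontal": 64}
lane_bw = 32.0

lane_plan = {block: lanes_needed(bw, lane_bw) for block, bw in demand_gbps.items()}
total_lanes = sum(lane_plan.values())
for block, lanes in lane_plan.items():
    print(f"  {block}: {lanes} lanes")
print(f"Total: {total_lanes} lanes, {total_lanes * lane_bw:.0f} Gbps provisioned")
```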

4.7 Network Topology Optimisation

Python

from sc_neurocore.compiler.intelligence import optimize_network_topology

result = optimize_network_topology(
    adjacency={
        "V1": ["V2", "V4"],
        "V2": ["V1", "V4", "IT"],
        "V4": ["V1", "V2", "IT"],
        "IT": ["V2", "V4", "PFC"],
        "PFC": ["IT", "M1"],
        "M1": ["PFC"],
    },
    num_chips=2,
)
print(f"Bandwidth reduction: {result.bandwidth_reduction:.1%}")

4.8 Supply Chain Risk Assessment

Python

from sc_neurocore.compiler.intelligence import score_supply_chain_risk

for target in ["artix7", "loihi2", "bae_rad750_sq", "tsmc_cim_n7"]:
    risk = score_supply_chain_risk(target)
    print(f"  {target:20s} Risk: {risk.overall_risk}")

4.9 Partial Reconfiguration

Python

from sc_neurocore.compiler.intelligence import plan_partial_reconfiguration

plan = plan_partial_reconfiguration(
    regions={
        "conv_layer_1": 5000,  # LUTs
        "conv_layer_2": 4000,
        "fc_layer": 3000,
    },
    total_luts=10000,
)
print(f"Schedule: {plan.schedule}")
print(f"Context switches: {plan.num_contexts}")

5. CLI Usage

5.1 Multi-Target Compilation

Bash

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.deployment import compile_multi_target, format_comparison_table

n = from_equations('dv/dt = -(v-E_L)/tau_m + I/C',
    threshold='v > -50', reset='v = -65',
    params=dict(E_L=-65, tau_m=10, C=1), init=dict(v=-65))

results = compile_multi_target(n, ['artix7', 'ecp5', 'ice40'], 'sc_lif')
print(format_comparison_table(results))
"

5.2 Portability Check

Bash

python -c "
from sc_neurocore.compiler.intelligence import score_portability
s = score_portability({'v': '-(v)/tau + I'})
print(f'Portable: {s.compatible_profiles}/{s.total_profiles} ({s.score}/100)')
"

6. Generated Output Structure

6.1 Comparison Table Format

Text Only

╔══════════════╦════════╦════════╦═══════╦═══════╦════════╗
║ Target       ║ Bits   ║ DSPs   ║ LUTs  ║ Fmax  ║ Safe?  ║
╠══════════════╬════════╬════════╬═══════╬═══════╬════════╣
║ artix7       ║ 18     ║ 3      ║ ~120  ║ 450   ║ ✓      ║
║ ecp5         ║ 16     ║ 2      ║ ~100  ║ 400   ║ ✓      ║
║ ice40        ║ 16     ║ 0      ║ ~130  ║ 250   ║ ✓      ║
║ loihi2       ║ 24     ║ N/A    ║ N/A   ║ N/A   ║ ✓      ║
║ asic_16      ║ 16     ║ N/A    ║ ~80   ║ N/A   ║ ✓      ║
╚══════════════╩════════╩════════╩═══════╩═══════╩════════╝

6.2 Heterogeneous Dispatch Plan

JSON

{
  "populations": {
    "retina": {"target": "max78000", "neurons": 64},
    "visual_cortex": {"target": "artix7", "neurons": 1024},
    "decision": {"target": "loihi2", "neurons": 256}
  },
  "inter_chip_links": [
    {"src": "retina", "dst": "visual_cortex", "protocol": "SPI"},
    {"src": "visual_cortex", "dst": "decision", "protocol": "AER"}
  ]
}
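
A plan in this shape is easy to sanity-check before deployment. The sketch below parses the JSON shown above with the standard library and verifies that every link refers to a declared population. The checks are illustrative and are not part of the SC-NeuroCore API.

```python
import json

plan_json = """
{
  "populations": {
    "retina": {"target": "max78000", "neurons": 64},
    "visual_cortex": {"target": "artix7", "neurons": 1024},
    "decision": {"target": "loihi2", "neurons": 256}
  },
  "inter_chip_links": [
    {"src": "retina", "dst": "visual_cortex", "protocol": "SPI"},
    {"src": "visual_cortex", "dst": "decision", "protocol": "AER"}
  ]
}
"""

plan = json.loads(plan_json)
known = set(plan["populations"])

# Links whose endpoints are not declared populations
dangling = [
    link for link in plan["inter_chip_links"]
    if link["src"] not in known or link["dst"] not in known
]
total_neurons = sum(p["neurons"] for p in plan["populations"].values())
protocols = sorted({link["protocol"] for link in plan["inter_chip_links"]})

print(f"populations: {len(known)}, neurons: {total_neurons}")
print(f"protocols in use: {protocols}, dangling links: {len(dangling)}")
```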

6.3 SBOM Output (EU CRA Compliance)

Python

from sc_neurocore.compiler.intelligence import generate_sbom

for target in ["artix7", "loihi2", "sifive_x280_ai"]:
    sbom = generate_sbom("sc_cortex", target)
    print(f"  {target}: {sbom.total_components} components")

7. Performance Characteristics

7.1 Compilation Time by Target Count

Targets	Compile time (cold)	Compile time (cached)
1	~50 ms	—
3	~150 ms	~100 ms
5	~250 ms	~120 ms
10	~500 ms	~150 ms
All 175	~8 s	~2 s

7.2 Platform Comparison (LIF Q8.8)

Target	LUTs	DSPs	FFs	Max Freq	Power
iCE40 HX8K	120	0	30	250 MHz	15 mW
Artix-7 100T	80	1	25	450 MHz	20 mW
ECP5 85F	95	1	28	400 MHz	18 mW
Kintex U+	75	1	22	550 MHz	25 mW
ASIC 16nm	60	N/A	20	1 GHz	0.5 mW

7.3 Multi-Die Floorplan Quality

Blocks	Dies	Balancing	Runtime
4	2	95%	< 1 ms
8	4	92%	< 5 ms
16	8	88%	< 20 ms
32	16	85%	< 100 ms

8. Test Suite and Verification

8.1 Multi-Target Compilation Test

Bash

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.deployment import compile_multi_target

n = from_equations('dv/dt = -(v-E_L)/tau_m + I/C',
    threshold='v > -50', reset='v = -65',
    params=dict(E_L=-65, tau_m=10, C=1), init=dict(v=-65))

results = compile_multi_target(n, ['artix7', 'ecp5', 'ice40'], 'sc_lif')
assert len(results) == 3
for r in results:
    assert r.verilog_lines > 0
    print(f'{r.target}: {r.verilog_lines} lines, {r.estimated_luts} LUTs — PASS')
"

8.2 Portability Score Test

Bash

python -c "
from sc_neurocore.compiler.intelligence import score_portability

s = score_portability({'v': '-(v)/tau + I'})
assert s.score > 50  # LIF is very portable
assert s.compatible_profiles > 100
print(f'Portability: {s.score}/100 — PASS')
"

8.3 Target Recommendation Test

Bash

python -c "
from sc_neurocore.compiler.intelligence import recommend_target

recs = recommend_target({'max_power_mw': 100, 'min_freq_mhz': 200}, top_k=3)
assert len(recs) == 3
assert all(r['score'] > 0 for r in recs)
print(f'Top recommendation: {recs[0][\"name\"]} (score={recs[0][\"score\"]:.2f}) — PASS')
"

8.4 Comparison Table Format Test

Bash

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.deployment import compile_multi_target, format_comparison_table

n = from_equations('dv/dt = -(v-E_L)/tau_m + I/C',
    threshold='v > -50', reset='v = -65',
    params=dict(E_L=-65, tau_m=10, C=1), init=dict(v=-65))
results = compile_multi_target(n, ['artix7', 'ice40'], 'sc_lif')
table = format_comparison_table(results)
assert 'artix7' in table
assert 'ice40' in table
print('Comparison table: PASS')
"

8.5 E2E Pipeline Test

Bash

python -m pytest tests/e2e/test_e2e_pipeline.py -v -k "multi_target"

8.6 Troubleshooting

Symptom	Cause	Fix
Zero compatible targets	Model too complex	Simplify equations or widen constraints
Unbalanced die mapping	Uneven block sizes	Split large blocks
High inter-chip bandwidth	Poor partitioning	Use `optimize_network_topology()`
Missing profile	Custom hardware	Use `from_constraints()` in extensibility

8.7 Digital Twin Generation

Python

from sc_neurocore.compiler.intelligence import generate_digital_twin

# Example equations dict, in the same form passed to score_portability()
# (assumed here to be the format generate_digital_twin() expects)
equations = {"v": "-(v - v_rest) / tau + I"}

twin = generate_digital_twin("sc_cortex", equations, "artix7")
# Deploy the twin alongside the hardware and compare on every cycle

8.8 Memory Map Generation

Python

from sc_neurocore.compiler.intelligence import generate_memory_map

mmap = generate_memory_map(
    "sc_cortex",
    {"v": "expr", "u": "expr", "I_syn": "expr"},
    num_neurons=4096,
    data_width=16,
    base_address=0x4000_0000,
)
print(f"Address space: {mmap.total_bytes:,} bytes")
for e in mmap.entries[:5]:
    print(f"  0x{e['address']:08X}: {e['name']} ({e['width']}b)")

8.9 Complete Deployment Workflow

Text Only

┌─────────────────────────────────────────────────┐
│  1. score_portability()             — §48       │
│  2. recommend_target()              — §34       │
│  3. score_supply_chain_risk()       — §36       │
│  4. estimate_carbon_footprint()     — §45       │
│  5. plan_heterogeneous_dispatch()   — §33       │
│  6. plan_multi_die_floorplan()      — §54       │
│  7. map_ucie_protocol()             — §64       │
│  8. optimize_network_topology()     — §41       │
│  9. generate_memory_map()           — §47       │
│ 10. generate_power_intent()         — §44       │
│ 11. generate_sbom()                 — §61       │
│ 12. generate_digital_twin()         — §63       │
│ 13. generate_compilation_report()   — §59       │
└─────────────────────────────────────────────────┘

References

METIS graph partitioning: Karypis, G. & Kumar, V. "A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs." SIAM J. Sci. Comput., 20(1):359–392, 1998.
UCIe specification: UCIe Consortium. "Universal Chiplet Interconnect Express Specification." v1.1, 2023.
Bin-packing algorithms: Coffman, E.G. et al. "Bin Packing Approximation Algorithms: Survey and Classification." Handbook of Combinatorial Optimization, Springer, 2013.
Heterogeneous SNN deployment: Davies, M. et al. "Loihi: A Neuromorphic Manycore Processor with On-Chip Learning." IEEE Micro, 38(1), 2018.
