Run-time Mapping of Spiking Neural Networks to Neuromorphic Hardware by Balaji, Adarsha et al.
Adarsha Balaji · Thibaut Marty · Anup Das · Francky Catthoor
Abstract Neuromorphic architectures implement bio-
logical neurons and synapses to execute machine learn-
ing algorithms with spiking neurons and bio-inspired
learning algorithms. These architectures are energy efficient
and therefore suitable for cognitive information processing in
resource- and power-constrained environments, such as those in
which the sensor and edge nodes of the internet-of-things (IoT)
operate. To map a spiking neural network
(SNN) to a neuromorphic architecture, prior works have
proposed design-time based solutions, where the SNN is
first analyzed offline using representative data and then
mapped to the hardware to optimize some objective
functions such as minimizing spike communication or
maximizing resource utilization. In many emerging ap-
plications, machine learning models may change based
on the input using some online learning rules. In online
learning, new connections may form or existing connec-
tions may disappear at run-time based on input excita-
tion. Therefore, an already mapped SNN may need to
be re-mapped to the neuromorphic hardware to ensure
optimal performance. Unfortunately, due to the high
computation time, design-time based approaches are
not suitable for remapping a machine learning model
at run-time after every learning epoch.
Adarsha Balaji
Drexel University, Philadelphia, Pennsylvania, USA, 19104
E-mail: adarsha.balaji@drexel.edu
Thibaut Marty
E-mail: thibaut.marty@ens-rennes.fr
Anup Das
Drexel University, Philadelphia, Pennsylvania, USA, 19104
E-mail: anup.das@drexel.edu
Francky Catthoor
Neuromorphic Division, IMEC, 3001 Leuven, Belgium
E-mail: francky.catthoor@imec.be
In this paper, we propose a design methodology to
partition and map the neurons and synapses of online
learning SNN-based applications to neuromorphic ar-
chitectures at run-time. Our design methodology oper-
ates in two steps – step 1 is a layer-wise greedy approach
to partition SNNs into clusters of neurons and synapses
incorporating the constraints of the neuromorphic ar-
chitecture, and step 2 is a hill-climbing optimization al-
gorithm that minimizes the total spikes communicated
between clusters, improving energy consumption on the
shared interconnect of the architecture. We conduct experiments
to evaluate the feasibility of our algorithm using synthetic and
realistic SNN-based applications. We demonstrate that our
algorithm reduces SNN mapping time by 780x on average compared to
a state-of-the-art design-time SNN partitioning approach, with
only 6.25% lower solution quality.
Keywords Spiking Neural Networks (SNN) · Neu-
romorphic Computing · Internet of Things (IoT) ·
Run-Time · Mapping
1 Introduction
Internet of things (IoT) is an emerging computing para-
digm that enables the integration of ubiquitous sensors
over a wireless network [3]. Recent estimates predict
that over 50 billion IoT devices will be interconnected
via the cloud over the next decade [22]. In a conven-
tional IoT, data collected from sensors and actuators
are transferred to the cloud and processed centrally [34].
However, with an increase in the number of connected
IoT devices, processing on the cloud becomes the per-
formance and energy bottleneck [41].
arXiv:2006.06777v1 [cs.NE] 11 Jun 2020
Edge computing is emerging as a scalable solution to process
large volumes of data by executing machine learning tasks closer
to the data source, e.g., on a sensor or an edge node [40].
Processing on edge devices allows real-time data processing and
decision making, and offers network scalability and privacy
benefits, as the data transferred to the cloud over a possibly
insecure communication channel is minimized [32,31].
Spiking neural networks (SNNs) [29] are extremely
energy efficient in executing machine learning tasks on
event-driven neuromorphic architectures such as True-
North [2], DYNAP-SE [36], and Loihi [19], making them
suitable for machine learning-based edge computing.
A neuromorphic architecture is typically designed using
crossbars, which can accommodate only a limited number of
synapses per neuron to reduce energy consumption. To build a
large neuromorphic chip, multiple crossbars are integrated using
a shared interconnect such as a network-on-chip (NoC) [7]. To map
an SNN to these architectures, the common practice is to
partition the neurons and synapses of the SNN into clusters and
map these clusters to the crossbars while optimizing a hardware
metric, for example minimizing the number of spikes communicated
between crossbars, which reduces energy consumption [16].
Most prior works on machine learning-based edge
computing focus on supervised approaches, where neu-
ral network models are first trained offline with repre-
sentative data from the field and then deployed on edge
devices to perform inference in real-time [39]. However,
data collected by IoT sensors constantly evolve over
time and may not resemble the representative data used
to train the neural network model. This change in the
relation between the input data and an offline trained
model is referred to as concept drift [23]. Eventually, the
concept drift will reduce the prediction accuracy of the
model over time, lowering its quality. Therefore, there is
a clear need to periodically re-train the model using recent
data with adaptive learning algorithms. Examples of such
algorithms include transfer learning [38], lifelong learning [43]
and deep reinforcement learning [33].
Mapping decisions for a supervised SNN are made at
design-time before the initial deployment of the trained
model. However, in the case of online learning, when the
model is re-trained, (1) synaptic connections within the SNN may
change, i.e., new connections may form and existing connections
may be removed as new events are learned, and (2) weights of
existing synaptic connections may change after every learning
epoch.
In order to ensure the optimal hardware performance at
all times, a run-time approach is required that remaps
the SNN to the hardware after every learning epoch.
Prior methods to partition and map an SNN to neuromorphic
hardware, such as PSOPART [16], SpiNeMap [6], PyCARL [4],
NEUTRAMS [25] and DFSynthesizer [42], are design-time approaches
that require significant exploration time to generate a good
solution. Although suitable for mapping supervised machine
learning models, these approaches cannot be used at run-time to
remap SNNs frequently. For online learning, we propose
an approach to perform run-time layer-wise mapping
of SNNs on to crossbar-based neuromorphic hardware.
The approach is implemented in two steps. First, we
perform a layer-wise greedy clustering of the neurons
in the SNN. Second, we use an instance of hill-climbing
optimization (HCO) to lower the total number of spikes
communicated between the crossbars.
Contributions: Following are our key contributions.
– We propose an algorithm to partition and map online
learning SNNs on to neuromorphic hardware for IoT applications
at run-time;
– We demonstrate the suitability of our approach for online
mapping in terms of the exploration time and the total number of
spikes communicated between the crossbars, when compared to a
state-of-the-art design-time approach.
The remainder of this paper is organized as follows.
Section 2 presents the background. Section 3 discusses the
problem of partitioning a neural network into clusters to map on
to the crossbars of neuromorphic hardware and describes our
two-step approach. Section 4 describes the evaluation setup.
Section 5 presents the experimental results based on synthetic
and realistic applications. Section 6 concludes the paper,
followed by a discussion in Section 7.
2 Background
Spiking neural networks are event-driven computational
models inspired by the mammalian brain. Spiking neu-
rons are typically implemented using Integrate-and-Fire
(I&F) models [9] and communicate using short impulses,
called spikes, via synapses. Figure 1(a) illustrates an
SNN with two pre-synaptic neurons connected to a post-
synaptic neuron via synaptic elements with weights w1,
w2 respectively. When a pre-synaptic neuron generates
a spike, current is injected into the post-synaptic neu-
ron, proportional to the product of the spike voltage
and the conductance of the respective synapse. SNNs
are trained by adjusting the synaptic weights using a
supervised, a semi-supervised, or an unsupervised ap-
proach [27,28,37].
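The integrate-and-fire behaviour described above can be sketched in a few lines. This is a minimal leaky integrate-and-fire illustration, not the neuron model of [9]; the threshold, leak factor, spike voltage and weights are hypothetical values chosen only for the example.

```python
def simulate_if_neuron(pre_spikes, weights, v_spike=1.0, threshold=1.5, leak=0.9):
    """Simulate one post-synaptic integrate-and-fire neuron.

    pre_spikes: one entry per time step, each a list of 0/1 flags
                (one flag per pre-synaptic neuron).
    weights:    synaptic conductances (w1, w2, ...); hypothetical values.
    Returns the time steps at which the post-synaptic neuron spikes.
    """
    membrane = 0.0
    out_spikes = []
    for t, flags in enumerate(pre_spikes):
        # Injected current: spike voltage times conductance, summed over
        # the synapses whose pre-synaptic neuron spiked in this step.
        current = sum(v_spike * w for flag, w in zip(flags, weights) if flag)
        membrane = leak * membrane + current
        if membrane >= threshold:
            out_spikes.append(t)
            membrane = 0.0  # reset the membrane potential after firing
    return out_spikes


# Two pre-synaptic neurons with weights w1 = 1.0 and w2 = 0.5.
spike_times = simulate_if_neuron(
    pre_spikes=[[1, 0], [0, 1], [0, 0], [1, 1]],
    weights=[1.0, 0.5],
)
```

In this run the post-synaptic neuron fires only when both pre-synaptic neurons spike together on top of the residual membrane potential.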
Due to the ultra-low power footprint of neuromorphic hardware,
several machine learning applications based on SNNs have been
implemented. In [18], the authors propose a multi-layer
perceptron (MLP) based SNN to classify heartbeats using
electrocardiogram (ECG) data. In [21], the authors propose
handwritten digit recognition using unsupervised SNNs. In [15], a
spiking liquid state machine for heart-rate estimation is
proposed. An SNN-based liquid state machine (LSM) for facial
expression recognition is proposed in [24]. In [5], the authors
propose a technique to convert a convolutional neural network
(CNN) model for heartbeat classification into an SNN, with a
minimal loss in accuracy.
[Figure 1 labels: pre-synaptic neurons, synapse, weights w1 and w2, post-synaptic neuron, n input neurons, n output neurons]
Fig. 1: Overview of SNN hardware: (a) connection of pre- and post-synaptic neurons via synapses in a spiking
neural network, (b) a crossbar organization with fully connected pre- and post-synaptic neurons, and (c) a modern
neuromorphic hardware with multiple crossbars and a time-multiplexed interconnect.
Typically, SNNs are executed on special-purpose neuromorphic
hardware. Such hardware can (1) reduce energy consumption, due to
its low-power design, and (2) improve application throughput, due
to its distributed computing architecture. Several digital and
mixed-signal neuromorphic hardware platforms have recently been
developed to execute SNNs, such as Neurogrid [8], TrueNorth [1]
and DYNAP-SE [35]. Although these platforms differ in their
operation (analog vs. digital), they all support crossbar-based
architectures. A crossbar is a two-dimensional arrangement of
synapses (n² synapses for n neurons). Figure 1(b) illustrates a
single crossbar with n pre-synaptic neurons and n post-synaptic
neurons. The pre- and post-synaptic neurons are connected via
synaptic elements. The crossbar size (n) is limited (<512), as
scaling the size of the crossbar will lead to an exponential
increase in dynamic and leakage energy. Therefore, to build large
neuromorphic hardware, multiple crossbars are integrated using a
shared interconnect, as illustrated in Figure 1(c).
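As a rough illustration of why multiple crossbars are needed, the sketch below computes the synapse capacity of one n × n crossbar and a lower bound on the number of crossbars for a given network. It assumes that each neuron occupies exactly one crossbar slot and ignores fan-in limits; the helper names are hypothetical. The neuron count is taken from the MLP-MNIST topology (784, 100, 10) and the crossbar size from the 256-neuron DYNAP-SE model used in the evaluation.

```python
import math


def crossbar_synapse_capacity(n):
    """Number of synapses in one n x n crossbar."""
    return n * n


def min_crossbars(num_neurons, n):
    """Lower bound on crossbars when each neuron occupies one of n slots."""
    return math.ceil(num_neurons / n)


capacity = crossbar_synapse_capacity(256)
needed = min_crossbars(784 + 100 + 10, 256)
```

The bound only counts neuron slots; an actual mapping may need more crossbars once synapse placement is taken into account.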
In order to execute an SNN on neuromorphic hardware, the SNN is
first partitioned into clusters of neurons and synapses. The
clustered (local) synapses are then mapped to the crossbars and
the inter-cluster synapses to the time-multiplexed interconnect.
Several design-time partitioning approaches are presented in the
literature. In [46,45,44], the authors propose techniques to
efficiently map the neurons and synapses on a crossbar. The aim
of these techniques is to maximize the utilization of the
crossbar. NEUTRAMS partitions the SNN for crossbar-based
neuromorphic hardware [26]. The NEUTRAMS approach also looks to
minimize the energy consumption of the neuromorphic hardware
executing the SNN. PyCARL [4] facilitates the hardware-software
co-simulation of SNN-based applications. The framework allows
users to analyze and optimize the partitioning and mapping of an
SNN on cycle-accurate models of neuromorphic hardware.
DFSynthesizer [42] uses a greedy technique to partition the
neurons and synapses of an SNN. The SNN partitions are mapped to
the neuromorphic hardware using an algorithm that adapts to the
available resources of the hardware. SpiNeMap [6] uses a greedy
partitioning technique to partition the SNN followed by a
meta-heuristic-based technique to map the partitions on the
hardware. PSOPART maps SNNs to a crossbar architecture [17]. The
objective of SpiNeMap and PSOPART is to minimize the spike
communication on the time-multiplexed interconnect in order to
improve the overall latency and power consumption of the DYNAP-SE
hardware. Table 1 compares our contributions to the
state-of-the-art techniques.
As these partitioning approaches aim to find the optimal
hardware performance, their exploration time is relatively large,
and they are therefore not suitable for partitioning and
re-mapping of online learning SNNs. Run-time approaches have been
proposed for task mapping on multiprocessor systems. A
heuristic-based run-time manager is proposed in [12]. The
run-time manager controls the thread allocation and
voltage/frequency scaling for energy-efficient execution of
applications on multiprocessor systems. In [30], the authors
propose a genetic algorithm-based run-time manager to schedule
real-time tasks on Dynamic Voltage Scaling (DVS) enabled
processors, with the aim of minimizing energy consumption.
[Figure 2 panels: (a) Application, (b) Spiking Neural Network, (c) Neuromorphic Hardware, (d) HCO Algorithm; flows: design-time partition and map, run-time partition and map]
Fig. 2: Mapping of online learning SNN on Neuromorphic Hardware.
Related Works        Run-time Mapping   Objective
[46,45,44]           ×                  Maximize single crossbar utilization
NEUTRAMS [25]        ×                  Minimize number of crossbars utilized
SpiNeMap [6]         ×                  Minimize spikes on time-multiplexed interconnect
PSOPART [16]         ×                  Minimize spikes on time-multiplexed interconnect
DFSynthesizer [42]   ×                  Optimize the hardware utilization in run-time
Proposed             √                  Reduces energy consumption of online learning SNNs on hardware.
Table 1: Summary of related works.
A workload-aware thread scheduler is proposed in [20] for
multiprocessor systems. In [14], the authors propose a
multinomial logistic regression model to partition the input
workload at run-time. Each partition is then executed at
pre-determined frequencies to ensure minimum energy consumption.
In [13], the authors propose a technique to remap tasks run on
faulty processors with a minimal migration overhead. The
technique performs an extensive design-time analysis of fault
scenarios and determines the optimal mapping of tasks at
run-time. A thermal-aware task scheduling approach is proposed in
[11] to estimate and reduce the temperature of the multiprocessor
system at run-time. However, such run-time techniques have not
been proposed for remapping SNNs on neuromorphic hardware. To the
best of our knowledge, this is the first work to propose a
run-time mapping approach with a significantly lower execution
time when compared to existing design-time approaches. Our
technique reduces the spikes communicated on the time-multiplexed
interconnect, thereby reducing the energy consumption.
[Figure 3 flow: inputs (Neurons, Spikes, Crossbar Size) from the SNN-based Application; Step 1: Build Sublists, producing Naive Sublists; Step 2: Local Search (HCO) + Cost Function, to reduce inter-crossbar spike communication; Neuron and Synapse Mapping; Deploy SNN-based Application to Neuromorphic hardware]
Fig. 3: Overview of proposed partitioning algorithm.
3 Methodology
The proposed method to partition and map an SNN at run-time is
illustrated in Figure 2. The network model is built as a directed
graph, wherein each edge represents a synapse whose weight is the
total number of spikes communicated between the two SNN neurons.
The input to the mapping algorithm is a list of all the neurons
(A), the total number of spikes communicated over each synapse
and the size of a crossbar (k). The mapping algorithm is split
into two steps, as shown in Figure 3.
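The inputs listed above can be held in a small weighted directed graph. The sketch below is one possible representation, assuming spike counts arrive as (pre, post, spikes) tuples; the function name, the tuple layout and the sample values are hypothetical and not taken from the paper's implementation.

```python
from collections import defaultdict


def build_spike_graph(spike_counts):
    """Build graph[pre][post] = total spikes sent from neuron pre to post."""
    graph = defaultdict(dict)
    for pre, post, spikes in spike_counts:
        # Accumulate in case the same synapse is reported more than once.
        graph[pre][post] = graph[pre].get(post, 0) + spikes
    return graph


# Hypothetical spike counts for a 4-neuron network.
edges = [(0, 2, 3), (1, 2, 6), (2, 3, 1)]
graph = build_spike_graph(edges)

# The list of all neurons (A) and the crossbar size (k).
neurons = sorted({n for pre, post, _ in edges for n in (pre, post)})
k = 2

total_spikes = sum(w for targets in graph.values() for w in targets.values())
```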
Figure 4 illustrates the partitioning of an SNN with 6 neurons
into 3 sub-lists. The number of spikes communicated between the
neurons is indicated on each synapse. First, we divide the input
list of neurons into sub-lists (Section 3.1), such that each
sub-list can be mapped to an available crossbar. Second, we
reduce the number of spikes communicated between the sub-lists
(Section 3.2) by moving neurons between the sub-lists (indicated
in blue).
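The quantity that the second step tries to reduce can be computed directly from a partition. The sketch below counts the spikes that cross sub-list boundaries for a 6-neuron chain; the topology, the spike counts and the crossbar size of 3 are hypothetical and do not reproduce Figure 4.

```python
def inter_sublist_spikes(edges, sublists):
    """Sum the spikes on synapses whose endpoints lie in different sub-lists."""
    owner = {neuron: idx for idx, sub in enumerate(sublists) for neuron in sub}
    return sum(s for pre, post, s in edges if owner[pre] != owner[post])


# Hypothetical (pre, post, spikes) synapses of a 6-neuron chain.
edges = [(0, 1, 3), (1, 2, 6), (2, 3, 1), (3, 4, 3), (4, 5, 24)]

# Initial partition into 3 sub-lists, then the partition after moving
# neurons 1 and 3 into their neighbouring sub-lists.
before = inter_sublist_spikes(edges, [[0, 1], [2, 3], [4, 5]])
after = inter_sublist_spikes(edges, [[0], [1, 2], [3, 4, 5]])
```

Moving the two neurons keeps the heavily used synapses inside a sub-list, so fewer spikes travel over the shared interconnect.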
[Figure 4: a spiking neural network of 6 neurons, with per-synapse spike counts, partitioned into Sub-list 1, Sub-list 2 and Sub-list 3, before and after HCO]
Fig. 4: Partitioning of an SNN.
3.1 Building Sub-lists
Algorithm 1 describes the greedy partitioning approach. The
objective is to greedily cut the input list of neurons (A) into s
sub-lists, where s is the total number of crossbars in the given
design. The size of a sub-list is determined by the size of the
crossbars (k) on the target hardware. A variable margin (line 3)
is defined to store the unused neuron slots available in each
sub-list. The mean (line 4) number of spikes generated per
crossbar is computed using the total number of spikes
communicated in the SNN-based application. A cost function
(Algorithm 2) is defined to compute the total number of spikes
communicated (cost) between each of the sub-lists.
The algorithm iterates over the neurons (ni) in the input list
(A) and updates the slots in the current sub-list (line 8).
Neurons are added to the current sub-list until one of the
following two criteria is met: (1) the length of the sub-list
equals k, or (2) the cost (number of spikes) is greater than the
mean value and sufficient extra slots (margin) are still
available. When either criterion is met, the current sub-list is
validated and its boundary is stored. When the penultimate
sub-list is validated, the execution ends because the boundary of
the last sub-list is already known (the nth element in the list).
The list p contains the sub-list boundaries.
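The greedy cut described above can be sketched as follows. This is one reading of the prose, not the authors' implementation: it assumes that each neuron carries a single spike count, that the margin is the pool of unused slots over all crossbars (s × k − n), and that a cut is allowed only if the slots it leaves empty fit in the remaining margin. The function and parameter names are hypothetical.

```python
def build_sublists(neuron_spikes, k, s):
    """Greedily cut an ordered neuron list into at most s contiguous sub-lists.

    neuron_spikes[i]: spike count attributed to neuron i (an assumption).
    k: crossbar size, s: number of crossbars.
    Returns p, the exclusive end index (boundary) of each sub-list.
    """
    n = len(neuron_spikes)
    if n > s * k:
        raise ValueError("not enough crossbar slots for all neurons")
    margin = s * k - n              # unused slots that may still be spent
    mean = sum(neuron_spikes) / s   # mean number of spikes per crossbar
    p, start, cost = [], 0, 0
    for i, spikes in enumerate(neuron_spikes):
        if len(p) == s - 1:
            break                   # last boundary is known: end of list
        cost += spikes
        length = i - start + 1
        slack = k - length          # slots left empty if we cut here
        if length == k or (cost > mean and slack <= margin):
            p.append(i + 1)         # validate the sub-list, store boundary
            margin -= slack
            start, cost = i + 1, 0
    if not p or p[-1] != n:
        p.append(n)
    return p


boundaries = build_sublists([5, 1, 1, 5, 1, 1, 5, 1], k=4, s=3)
```

Because a cut never spends more slots than the margin holds, the final sub-list is guaranteed to fit in the last crossbar.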
3.2 Local Search
The solution obtained from Algorithm 1 is naive and not optimal.
Although each sub-list obtained from Algorithm 1 meets the cost
criteria, the costs may be unevenly distributed across the
sub-lists. We search for a better solution by performing multiple
local searches to balance the cost. This is done by using the
hill-climbing optimization technique to iterate through the
sub-lists and move their boundaries.
Algorithm 3 describes the hill-climbing optimization technique.
The technique relies on a cost function (line 2) to compute and
evaluate a solution. The cost function used in the optimization
process is shown in Algorithm 2. The cost function computes the
maximum cost (number of spikes) across the sub-lists. The optimal
solution should have the lowest cost. The algorithm iterates
through each sub-list to search for the best solution (cost)
among its neighbors. The algorithm begins by moving the boundary
of a sub-list one position to the left or one position to the
right. Each neuron (ni) in the sub-list is moved across the
boundary to a neighboring sub-list and the cost of each neighbor
is computed. The algorithm selects the solution with the local
minimum cost. The process is repeated for every neuron in the
list (A) until the set of sub-lists with the minimum cost is
found.
Algorithm 1: Building Sublists
1 procedure FUNCTION (A[1→ n])
2   foreach Crossbar s ∈ p do
      /* iterate over all crossbars in p */
3     Input the variable margin;
      /* Mean spikes per crossbar */
4     Compute Mean;
      /* iterate over all neurons in A */
5     foreach ni ∈ A do
        /* Cost is the number of spikes in current cluster */
6       Compute Cost;
7       while Cost ≤ Mean do
8         Assign ni to crossbar p;
9       end
10    end
11  end
Algorithm 2: Cost Function.
1 procedure FUNCTION (A[1→ n], p[1→ s])
2   Max ← 0;
3   foreach Cluster (p[i]) do
4     Sum ← 0;
5     foreach n in p[i] do
        /* total spikes communicated */
6       compute Sum;
7     end
8     if Sum > Max then
9       Max ← Sum;
10    end
11  end
Algorithm 3: Hill Climbing Algorithm.
1 procedure FUNCTION (A[1→ n], p[1→ s])
    /* compute the initial cost */
2   compute Cost;
3   foreach n in A do
4     move n across cluster boundary;
5     compute new Cost Cn;
6     select min(Cn);
7   end
/* end 2-part procedure */
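The local search can be sketched as a steepest-descent hill climb over the sub-list boundaries. The sketch follows Algorithm 2 as printed, so the cost is the largest per-sub-list spike total. It assumes one spike count per neuron and boundaries stored as exclusive end indices; all names are hypothetical, and this is an illustration rather than the authors' implementation.

```python
def max_sublist_cost(neuron_spikes, p):
    """Algorithm 2 style cost: the largest per-sub-list spike total."""
    cost, start = 0, 0
    for end in p:
        cost = max(cost, sum(neuron_spikes[start:end]))
        start = end
    return cost


def is_valid(p, k):
    """Every sub-list must be non-empty and fit in a crossbar of size k."""
    start = 0
    for end in p:
        if not 0 < end - start <= k:
            return False
        start = end
    return True


def hill_climb(neuron_spikes, p, k):
    """Move one boundary by one position per step while the cost drops."""
    current, current_cost = list(p), max_sublist_cost(neuron_spikes, p)
    while True:
        neighbours = []
        for b in range(len(current) - 1):   # the last boundary stays at n
            for step in (-1, 1):
                cand = list(current)
                cand[b] += step
                if is_valid(cand, k):
                    cand_cost = max_sublist_cost(neuron_spikes, cand)
                    neighbours.append((cand_cost, cand))
        if not neighbours:
            return current, current_cost
        best_cost, best = min(neighbours)
        if best_cost >= current_cost:       # local minimum reached
            return current, current_cost
        current, current_cost = best, best_cost


spikes = [4, 4, 4, 1, 1, 1, 1, 1]
solution, cost = hill_climb(spikes, p=[3, 6, 8], k=4)
```

Each step evaluates every one-position boundary move and keeps the cheapest, stopping as soon as no neighbour improves on the current cost, so the result is a local rather than a global minimum.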
Category    Application      Synapses   Topology                               Spikes
synthetic   S_1000           240,000    FeedForward (400, 400, 100)            5,948,200
synthetic   S_2000           640,000    FeedForward (800, 400, 800)            45,807,200
realistic   EdgeDet [10]     272,628    FeedForward (4096, 1024, 1024, 1024)   22,780
realistic   MLP-MNIST [21]   79,400     FeedForward (784, 100, 10)             2,395,300
Table 2: Applications used for evaluation.
4 Evaluation
4.1 Simulation environment
We conduct all experiments on a system with 8 CPUs, 32GB RAM,
and an NVIDIA Tesla GPU, running Ubuntu 16.04. We use the
following tools:
– CARLsim [10]: A GPU-accelerated simulator used to train and
test SNN-based applications. CARLsim reports spike times for
every synapse in the SNN.
– DYNAP-SE [36]: Our approach is evaluated using the DYNAP-SE
model, with 256-neuron crossbars interconnected using a NoC [47].
4.2 Evaluated applications
In order to evaluate the online mapping algorithm, we use 2
synthetic and 2 realistic SNN-based applications. Synthetic
applications are indicated with an 'S_' followed by the number of
neurons in the application. Edge detection (EdgeDet) and
MLP-based digit recognition (MLP-MNIST) are the two realistic
applications used. Table 2 lists the applications and indicates
the number of synapses (column 3), the topology (column 4) and
the number of spikes (column 5) for each application, obtained
through simulations using CARLsim [10].
4.3 Evaluated design-time vs run-time approach
In order to evaluate the performance of our proposed run-time
approach, we choose a state-of-the-art design-time approach as
the baseline. The crossbar size for both algorithms is set to 256
(k = 256). In this paper, we compare the following approaches:
– PSOPART [16]: The PSOPART approach is a design-time
partitioning technique that uses an instance of particle swarm
optimization (PSO) to minimize the number of spikes communicated
on the time-multiplexed interconnect.
– HCO-Partitioning: Our HCO-partitioning approach is a two-step
layer-wise partitioning technique with a greedy partitioning
followed by an HCO-based local search approach to reduce the
number of spikes communicated between the crossbars.
5 Results
Table 3 reports the execution time (in seconds) of the
design-time and run-time mapping algorithms for the synthetic and
realistic applications. We make the following two observations.
First, on average, our HCO partitioning algorithm has an
execution time 780x lower than that of the PSOPART algorithm.
Second, the significantly lower execution time of the HCO
partitioning algorithm (<50 seconds) allows the online learning
SNN to be re-mapped on the edge devices before the start of the
next training epoch.
Category    Application   PSOPART (sec)   HCO-Partition (sec)
synthetic   S_1000        20011.33        19.10
synthetic   S_2000        45265.00        24.68
realistic   EdgeDet       6771.02         45.62
realistic   MLP-MNIST     5153.41         11.03
Table 3: Execution time of design-time and proposed
run-time approach in seconds.
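The per-application speedups can be derived directly from Table 3. The sketch below only computes these ratios and the slowest HCO run; the averaging method behind the reported 780x figure is not stated in the text, so no attempt is made to reproduce it. The variable names are hypothetical.

```python
# Execution times from Table 3: (PSOPART seconds, HCO-Partition seconds).
execution_time_sec = {
    "S_1000": (20011.33, 19.10),
    "S_2000": (45265.00, 24.68),
    "EdgeDet": (6771.02, 45.62),
    "MLP-MNIST": (5153.41, 11.03),
}

# Ratio of design-time to run-time execution time for each application.
speedup = {app: pso / hco for app, (pso, hco) in execution_time_sec.items()}

# Longest HCO-Partition run, to compare against the "<50 seconds" bound.
slowest_hco = max(hco for _, hco in execution_time_sec.values())
```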
Figure 5 shows the lifetime of an online learning ap-
plication with respect to the execution times of each
training epoch (t) and the HCO partitioning algorithm
(h). The execution time of the partitioning algorithm
needs to be significantly lower than the time interval be-
tween training epochs. This is achieved with the HCO-
partitioning algorithm as its execution time is signif-
icantly (780x) lower than the state-of-the-art design-
time approaches.
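The timing requirement can be expressed as a simple check. In the sketch below, the epoch interval of 600 seconds and the 10% headroom are hypothetical values chosen for illustration; only the two execution times are taken from Table 3 (EdgeDet).

```python
def can_remap_between_epochs(epoch_interval_sec, remap_time_sec, headroom=0.1):
    """True if remapping fits within a fraction `headroom` of the interval
    between two training epochs (both parameters are assumptions)."""
    return remap_time_sec <= headroom * epoch_interval_sec


# Hypothetical 10-minute gap between training epochs.
hco_fits = can_remap_between_epochs(600.0, 45.62)
psopart_fits = can_remap_between_epochs(600.0, 6771.02)
```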
In Figure 6, we compare the number of spikes communicated
between the crossbars when the SNN is partitioned using the HCO
partitioning algorithm and using the design-time PSOPART
approach. We see that, on average, the PSOPART algorithm reduces
the number of spikes by a further 6.25% when compared to the HCO
partitioning algorithm. PSOPART will therefore contribute to a
further reduction in the overall energy consumed on the
neuromorphic hardware. However, this outcome is expected, as the
design-time partitioning approach is afforded far more
exploration time to minimize the number of spikes communicated
between the crossbars. Also, the effects of concept drift will
soon lead to the design-time solution becoming outmoded.
Therefore, a run-time partitioning and re-mapping of the SNN will
significantly improve the performance of the SNN on the
neuromorphic hardware and mitigate the effects of concept drift.
[Figure 5 timeline: application lifetime from the start of execution, alternating learning epochs (t0, t1, ..., tn) and HCO partitioning runs (h0, h1, ..., hn)]
Fig. 5: Life-time of online learning SNN
[Figure 6 bar chart: x-axis S_1000, S_2000, EdgeDet, MLP-MNIST, Average; y-axis average spike count normalized to the total number of spikes generated (0.0 to 1.2); series Total, HCO-Partition, PSOPART]
Fig. 6: Number of spikes communicated on the time-multiplexed interconnect normalized to the total number of
spikes generated.
6 Conclusion
In this paper, we propose an algorithm to re-map online learning
SNNs on neuromorphic hardware. Our approach performs the run-time
mapping in two steps: (1) a layer-wise greedy partitioning of SNN
neurons, and (2) a hill-climbing based optimization of the greedy
partitions with the aim of reducing the number of spikes
communicated between the crossbars. We demonstrate the
infeasibility of using a state-of-the-art design-time approach to
re-map online learning SNNs at run-time. We evaluate our approach
using synthetic and realistic SNN applications. Our algorithm
reduces SNN mapping time by 780x on average when compared to a
state-of-the-art design-time approach, with only 6.25% lower
solution quality.
7 Discussion
In this section, we discuss the scalability of our approach.
Each iteration of Algorithm 1 performs basic math operations. The
hill-climbing algorithm computes as many as 2 × (s − 2)
solutions, and performs a comparison to find the minimum cost
across all the solutions. In our case, the co-domain of the cost
function is the well-ordered set of positive integers. The cost
function is also linear in n; however, the hill-climbing
optimization algorithm only terminates when a local minimum of
the cost function is found. Therefore, it is in our interest to
reduce the number of times the cost function is evaluated.
Acknowledgment
This work is supported by 1) the National Science Foundation
Award CCF-1937419 (RTML: Small: Design of System Software to
Facilitate Real-Time Neuromorphic Computing) and 2) the National
Science Foundation Faculty Early Career Development Award
CCF-1942697 (CAREER: Facilitating Dependable Neuromorphic
Computing: Vision, Architecture, and Impact on Programmability).
References
1. Akopyan, F., Sawada, J., Cassidy, A., Alvarez-Icaza, R.,
Arthur, J., Merolla, P., Imam, N., Nakamura, Y., Datta,
P., Nam, G.J., others: TrueNorth: Design and tool flow
of a 65 mW 1 million neuron programmable neurosynap-
tic chip. IEEE transactions on computer-aided design of
integrated circuits and systems 34(10), 1537–1557 (2015)
2. Akopyan, F., Sawada, J., et al.: TrueNorth: Design and
tool flow of a 65 mW 1 million neuron programmable
neurosynaptic chip. IEEE transactions on computer-aided
design of integrated circuits and systems 34(10), 1537–1557
(2015)
3. Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari,
M., Ayyash, M.: Internet of things: A survey on enabling
technologies, protocols, and applications. IEEE Com-
munications Surveys Tutorials 17(4), 2347–2376 (2015).
DOI 10.1109/COMST.2015.2444095
4. Balaji, A., Adiraju, P., Kashyap, H.J., Das, A., Krich-
mar, J.L., Dutt, N.D., Catthoor, F.: Pycarl: A pynn inter-
face for hardware-software co-simulation of spiking neu-
ral network. In: 2020 International Joint Conference on
Neural Networks (IJCNN) (2020)
5. Balaji, A., Corradi, F., Das, A., Pande, S., Schaafsma, S.,
Catthoor, F.: Power-Accuracy Trade-Offs for Heartbeat
Classification on Neural Networks Hardware. Journal of
Low Power Electronics 14(4), 508–519 (2018)
6. Balaji, A., Das, A., Wu, Y., Huynh, K., Dell’Anna, F.,
Indiveri, G., Krichmar, J.L., Dutt, N., Schaafsma, S.,
Catthoor, F.: Mapping Spiking Neural Networks on Neu-
romorphic Hardware. IEEE Transactions on VLSI Sys-
tems (2019)
7. Benini, L., De Micheli, G.: Networks on chip: A new
paradigm for systems on chip design. In: Proceedings
2002 Design, Automation and Test in Europe Conference
and Exhibition, pp. 418–419. IEEE (2002)
8. Benjamin, B.V., Gao, P., McQuinn, E., Choudhary, S.,
Chandrasekaran, A.R., Bussat, J., Alvarez-Icaza, R.,
Arthur, J.V., Merolla, P.A., Boahen, K.: Neurogrid:
A mixed-analog-digital multichip system for large-scale
neural simulations. Proceedings of the IEEE 102(5), 699–
716 (2014)
9. Chicca, E., Badoni, D., Dante, V., et al.: A vlsi recurrent
network of integrate-and-fire neurons connected by plas-
tic synapses with long-term memory. IEEE Transactions
on Neural Networks 14 (2003)
10. Chou, T., Kashyap, H.J., Xing, J., Listopad, S., Rounds,
E.L., Beyeler, M., Dutt, N., Krichmar, J.L.: Carlsim 4:
An open source library for large scale, biologically de-
tailed spiking neural network simulation using heteroge-
neous clusters. In: 2018 International Joint Conference
on Neural Networks (IJCNN), pp. 1–8 (2018). DOI
10.1109/IJCNN.2018.8489326
11. Cui, J., Maskell, D.L.: A fast high-level event-driven ther-
mal estimator for dynamic thermal aware scheduling.
IEEE Transactions on Computer-Aided Design of Inte-
grated Circuits and Systems 31(6), 904–917 (2012)
12. Das, A., Al-Hashimi, B.M., Merrett, G.V.: Adaptive and
hierarchical runtime manager for energy-aware thermal
management of embedded systems. ACM Trans. Embed.
Comput. Syst. 15(2) (2016). DOI 10.1145/2834120. URL
https://doi.org/10.1145/2834120
13. Das, A., Kumar, A.: Fault-aware task re-mapping for
throughput constrained multimedia applications on noc-
based mpsocs. In: International Symposium on Rapid
System Prototyping (RSP). IEEE (2012)
14. Das, A., Kumar, A., Veeravalli, B., Shafik, R., Merrett,
G., Al-Hashimi, B.: Workload uncertainty characteriza-
tion and adaptive frequency scaling for energy minimiza-
tion of embedded systems. In: Design, Automation &
Test in Europe Conference & Exhibition (DATE) (2015)
15. Das, A., Pradhapan, P., Groenendaal, W., Adiraju, P.,
Rajan, R.T., Catthoor, F., Schaafsma, S., Krichmar, J.L.,
Dutt, N., Van Hoof, C.: Unsupervised heart-rate estima-
tion in wearables with liquid states and a probabilistic
readout. arXiv preprint arXiv:1708.05356 (2017)
16. Das, A., Wu, Y., Huynh, K., Dell’Anna, F., Catthoor, F.,
Schaafsma, S.: Mapping of local and global synapses on
spiking neuromorphic hardware. In: Design, Automation
& Test in Europe Conference & Exhibition (DATE), pp.
1217–1222 (2018). DOI 10.23919/DATE.2018.8342201
17. Das, A., Wu, Y., Huynh, K., Dell’Anna, F., Catthoor,
F., Schaafsma, S.: Mapping of local and global synapses
on spiking neuromorphic hardware. In: 2018 De-
sign, Automation Test in Europe Conference Exhibition
(DATE), pp. 1217–1222 (2018). DOI 10.23919/DATE.
2018.8342201
18. Das, A.K., Catthoor, F., Schaafsma, S.: Heartbeat
classification in wearables using multi-layer percep-
tron and time-frequency joint distribution of ecg. In:
2018 IEEE/ACM International Conference on Connected
Health: Applications, Systems and Engineering Technolo-
gies (CHASE), pp. 69–74. IEEE (2018)
19. Davies, M., Srinivasa, N., Lin, T.H., Chinya, G., Cao, Y.,
Choday, S.H., Dimou, G., Joshi, P., Imam, N., Jain, S.,
et al.: Loihi: A neuromorphic manycore processor with
on-chip learning. IEEE Micro 38(1), 82–99 (2018)
20. Dhiman, G., Ayoub, R., Rosing, T.: PDRAM: A hybrid
PRAM and DRAM main memory system. In: Proceedings of the
Annual Design Automation Conference (DAC), pp. 664–669 (2009)
21. Diehl, P.U., Cook, M.: Unsupervised learning of
digit recognition using spike-timing-dependent plasticity.
Frontiers in computational neuroscience 9 (2015)
22. Evans, D.: The internet of things: How the next evolu-
tion of the internet is changing everything. CISCO white
paper 1(2011), 1–11 (2011)
23. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy,
M., Bouchachia, A.: A survey on concept drift adaptation.
ACM Comput. Surv. 46(4) (2014). DOI 10.1145/2523813.
URL https://doi.org/10.1145/2523813
24. Grzyb, B.J., Chinellato, E., Wojcik, G.M., Kaminski,
W.A.: Facial expression recognition based on liquid state
machines built of alternative neuron models. In: 2009
International Joint Conference on Neural Networks, pp.
1011–1017. IEEE (2009)
25. Ji, Y., Zhang, Y., Li, S., Chi, P., Jiang, C., Qu, P., Xie, Y.,
Chen, W.: NEUTRAMS: Neural network transformation
and co-design under neuromorphic hardware constraints.
In: International Symposium on Microarchitecture (MI-
CRO). IEEE (2016)
26. Ji, Y., Zhang, Y., Li, S., Chi, P., Jiang, C., Qu, P., Xie, Y.,
Chen, W.: NEUTRAMS: Neural network transformation
and co-design under neuromorphic hardware constraints.
In: International Symposium on Microarchitecture (MI-
CRO) (2016)
27. Kasabov, N.: Evolving fuzzy neural networks for su-
pervised/unsupervised online knowledge-based learning.
IEEE Transactions on Systems, Man, and Cybernetics,
Part B (Cybernetics) 31(6), 902–918 (2001)
28. Lee, J.H., Delbruck, T., Pfeiffer, M.: Training deep spik-
ing neural networks using backpropagation. Frontiers in
neuroscience 10, 508 (2016)
29. Maass, W.: Networks of spiking neurons: the third gener-
ation of neural network models. Neural networks 10(9),
1659–1671 (1997)
30. Mahmood, A., Khan, S.A., Albalooshi, F., Awwad, N.:
Energy-aware real-time task scheduling in multiprocessor
systems using a hybrid genetic algorithm. Electronics
6(2), 40 (2017)
31. Mao, Y., You, C., Zhang, J., Huang, K., Letaief, K.B.:
Mobile edge computing: Survey and research outlook.
arXiv preprint arXiv:1701.01090 (2017)
32. Mao, Y., You, C., Zhang, J., Huang, K., Letaief, K.B.: A
survey on mobile edge computing: The communication
perspective. IEEE Communications Surveys & Tutorials
19(4), 2322–2358 (2017)
33. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A.,
Veness, J., Bellemare, M.G., Graves, A., Riedmiller,
M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level
control through deep reinforcement learning. Nature
518(7540), 529 (2015)
34. Mohammadi, M., Al-Fuqaha, A., Sorour, S., Guizani, M.:
Deep learning for IoT big data and streaming analytics:
A survey. IEEE Communications Surveys & Tutorials
20(4), 2923–2960 (2018)
35. Moradi, S., Qiao, N., Stefanini, F., Indiveri, G.: A Scal-
able Multicore Architecture with Heterogeneous Mem-
ory Structures for Dynamic Neuromorphic Asynchronous
Processors (DYNAPs). IEEE Transactions on Biomed-
ical Circuits and Systems 12(1), 106–122 (2018). DOI
10.1109/TBCAS.2017.2759700
36. Moradi, S., Qiao, N., Stefanini, F., Indiveri, G.: A scal-
able multicore architecture with heterogeneous memory
structures for dynamic neuromorphic asynchronous pro-
cessors (DYNAPs). Biomedical Circuits and Systems,
IEEE Transactions on 12(1), 106–122 (2018). DOI
10.1109/TBCAS.2017.2759700
37. Mostafa, H.: Supervised learning based on temporal cod-
ing in spiking neural networks. IEEE transactions on
neural networks and learning systems 29(7), 3227–3235
(2018)
38. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE
Transactions on knowledge and data engineering 22(10),
1345–1359 (2009)
39. Shafique, M., Hafiz, R., Javed, M.U., Abbas, S., Sekanina,
L., Vasicek, Z., Mrazek, V.: Adaptive and energy-efficient
architectures for machine learning: Challenges, opportu-
nities, and research roadmap. In: 2017 IEEE Computer
Society Annual Symposium on VLSI (ISVLSI), pp. 627–
632 (2017). DOI 10.1109/ISVLSI.2017.124
40. Shi, W., Cao, J., Zhang, Q., Li, Y., Xu, L.: Edge com-
puting: Vision and challenges. IEEE Internet of Things
Journal 3(5), 637–646 (2016). DOI 10.1109/JIOT.2016.2579198
41. Shi, W., Dustdar, S.: The promise of edge computing.
Computer 49(5), 78–81 (2016)
42. Song, S., Balaji, A., Das, A., Kandasamy, N., Shackle-
ford, J.: Optimizing tensor contractions for embedded
devices with racetrack memory scratch-pads. In: Pro-
ceedings of the 21st ACM SIGPLAN/SIGBED Interna-
tional Conference on Languages, Compilers, and Tools
for Embedded Systems, LCTES 2020 (2020)
43. Thrun, S.: Lifelong learning algorithms. In: Learning to
learn, pp. 181–209. Springer (1998)
44. Wen, W., Wu, C.R., Hu, X., Liu, B., Ho, T.Y.,
Li, X., Chen, Y.: An EDA framework for large scale
hybrid neuromorphic computing systems. In: 2015
52nd ACM/EDAC/IEEE Design Automation Conference
(DAC), pp. 1–6. IEEE (2015)
45. Wijesinghe, P., Ankit, A., Sengupta, A., Roy, K.: An all-
memristor deep spiking neural computing system: A step
toward realizing the low-power stochastic brain. IEEE
Transactions on Emerging Topics in Computational In-
telligence 2(5), 345–358 (2018)
46. Xia, Q., Yang, J.J.: Memristive crossbar arrays for brain-
inspired computing. Nature materials 18(4), 309 (2019)
47. Zhao, W., Cao, Y.: New generation of predictive tech-
nology model for sub-45 nm early design exploration.
IEEE Transactions on Electron Devices 53(11), 2816–
2823 (2006)
