Forward Table-Based Presynaptic Event-Triggered Spike-Timing-Dependent
  Plasticity by Pedroni, Bruno U. et al.
Forward Table-Based Presynaptic Event-Triggered
Spike-Timing-Dependent Plasticity
Bruno U. Pedroni∗, Sadique Sheik∗, Siddharth Joshi∗, Georgios Detorakis†,
Somnath Paul‡, Charles Augustine‡, Emre Neftci†, and Gert Cauwenberghs∗
∗University of California, San Diego
La Jolla, CA, USA 92093, Email: bpedroni@eng.ucsd.edu, gert@ucsd.edu
†University of California, Irvine
Irvine, CA, USA 92697, Email: eneftci@uci.edu
‡Intel Corporation - Circuit Research Lab
Hillsboro, OR, USA 97124, Email: somnath.paul@intel.com
Abstract—Spike-timing-dependent plasticity (STDP) incurs
both causal and acausal synaptic weight updates, for negative
and positive time differences between pre-synaptic and post-
synaptic spike events. For realizing such updates in neuromorphic
hardware, current implementations either require forward and
reverse lookup access to the synaptic connectivity table, or rely
on memory-intensive architectures such as crossbar arrays. We
present a novel method for realizing both causal and acausal
weight updates using only forward lookup access of the synaptic
connectivity table, permitting memory-efficient implementation.
A simplified implementation in FPGA, using a single timer
variable for each neuron, closely approximates exact STDP
cumulative weight updates for neuron refractory periods greater
than 10 ms, and reduces to exact STDP for refractory periods
greater than the STDP time window. Compared to conventional
crossbar implementation, the forward table-based implementa-
tion leads to substantial memory savings for sparsely connected
networks supporting scalable neuromorphic systems with fully
reconfigurable synaptic connectivity and plasticity.
I. INTRODUCTION
Neuromorphic systems are electronic instantiations of bi-
ological nervous systems which seek to mimic behavioral
and structural aspects of real neural networks [1]. The three
main components of neuromorphic systems are: neurons (rep-
resenting processors), synapses (representing memory), and a
learning rule. Neuromorphic systems differentiate themselves
from traditional von Neumann architectures mainly due to
distributed memory and parallel processing. Distributing the
entire neural network of a neuromorphic chip into many
smaller networks (i.e. cores) permits more efficient spike event
routing and memory access [2].
Learning in neuromorphic systems is usually realized by
adapting the synaptic strength (or weight) between neurons.
Among observed forms of synaptic plasticity in biological ner-
vous systems, spike-timing-dependent plasticity (STDP) [3],
[4] is particularly attractive for neuromorphic systems due to its
locality property: weight adaptation requires only information
of neighboring neurons. STDP relies on the relative temporal
difference between pre-synaptic and post-synaptic spike events
to adjust the direction and intensity of synaptic strength
between neurons [3]. In the original STDP formulation, the
intensity of weight updates depends on the temporal difference
between spikes and on the current value of the weight. For
simplicity, digital neuromorphic implementations of STDP
usually consider a variant of the original STDP where the
weight updates are simply a function of the temporal difference
[5]. Fig. 1 illustrates three typical STDP kernels implemented
in digital neuromorphic systems. In the three cases, causal
events (i.e. when pre- precedes post-synaptic spikes) produce
weight increase, while acausal events (i.e. when post- precedes
pre-synaptic spikes) produce weight decrease.
exponential ramp boxᶦpre-ᶦpost
Δw
acausalcausal
Fig. 1: The causal and acausal weight update regions of the
STDP kernels (left) and three typical kernels implemented in
digital neuromorphic systems.
In biological systems, synaptic connectivity is instantiated
at a physical level: there is a physical path between neurons.
Though there have been multiple neuromorphic implementa-
tions of physically-connected neurons using, mainly, memris-
tor [6] and phase-change memory approaches [7], most of the
large scale systems developed to date make use of (digital)
random access memories representing “virtual” synaptic con-
nections. These virtual connections are basically composed of
two tables: routing and weight. The routing tables (RTs) are
used to store the destination address of newly generated spikes,
indicating to which core inputs they should be delivered. The
weight tables (WTs) include pairs of synaptic connectivity
and strength between input spikes (from other post-synaptic
neurons) and their destination post-synaptic neurons in the
core. Lastly, by organizing the WTs in specific manners
inside each core, different neuromorphic architectures can be
obtained. The most straightforward representation is by means
of a crossbar architecture, in which all incoming pre-synaptic
neurons are connected to all post-synaptic neurons in a core
[2]. However, when we deal with sparse representations (i.e.
less than 70% connection density) or non-systematic connec-
tivity patterns between neurons (i.e. each neuron connects
to a different number of neurons inside the same core), we
show that a more optimized solution is to use an index-based
architecture. The specificities of this architecture are described
in the following section. The remainder of the paper includes
ar
X
iv
:1
60
7.
03
07
0v
2 
 [c
s.N
E]
  2
4 J
ul 
20
16
outgoing
post-synaptic events
Pre-syn.
Processor
Weight
Table
Post-syn.
Processor
Routing
TableCore
Pointer
Table
incoming
pre-synaptic events
&pre1
1 w10 0 2 1 w4 0
1 1
w0 0
w10 0 2 1 w4 0
1 1w0 0 w10 0 2 1 w4 0
1
&pre2
2
&pre3
&preA
A
3
...
w1,1
2 3
...
 w10,2
wB,A
0 w4,112 ...
...
1
10
1
n
0 6 1
n+1 n+2
1
m
9
m+1
wB,A-7
(a)
Average connectivity density (d)
0 0.2 0.4 0.6 0.8 1
M
em
or
y 
us
ag
e 
(b
its
)
×105
0
2
4
6
8
10
12
index-based
crossbar
d c
d c
d c
16-bit
4-bit
9-bit
= 0.87
= 0.37
= 0.70
(b)
Fig. 2: (a) The index-based architecture. Besides the weight table (WT) and routing table (RT), data compression is obtained by
means of the pointer table (PT). In the example, the WT uses run-length encoding (RLE) with leading bit ‘0’ indicating the run
count (i.e. the number of consecutive post-synaptic neurons which the pre-synaptic neuron is not connected to), and leading bit
‘1’ indicating an existing connection. (b) Memory usage comparison between the crossbar and the index-based core architectures
based on the connectivity density between pre- and post-synaptic neurons.
a detailed description of our proposed method, along with
simulation and emulation results, and finally conclusions.
II. CORE ARCHITECTURE AND PROCESSING
A digital neuromorphic core can be interpreted as a neu-
rosynaptic processor which includes an input-output weight
map between pre- and post-synaptic neurons. The spikes
produced by post-synaptic neurons are delivered to their des-
tinations (on the same core or another core in the system)
according to the core’s routing table (RT). At the destination
core, an incoming spike is received by one of its inputs,
which in turn is connected to the post-synaptic neurons in
the core via the weight table (WT). Therefore, each core input
line can be interpreted as an axon, capable of fanning-out to
multiple post-synaptic neurons via synaptic connections [2].
Digital neuromorphic systems normally operate in discrete
time steps, or “ticks”. These ticks are usually in the order
of a millisecond, which is similar to the processing timescale
of biological neurons. At the occurrence of a tick, the cores
process only the inputs which have received a spike (i.e.
event-based processing). Analogously, inputs which have not
received spikes are skipped, reducing processing time and
power consumption.
A. Index-based core architecture
The organization of the memory in the WT defines the
type of core architecture. The two basic architectures are:
crossbar and index-based. In the crossbar architecture every
connection between an input and a post-synaptic neuron has
a reserved space in the memory, even if the connection is not
used in a given neural network. For very dense networks, the
crossbar is the ideal solution. On the other hand, for sparser
networks, where some connections between neurons are non-
existent (i.e. they are of weight zero and will remain this
way after STDP learning), or for topologies requiring each
neuron to have different fan-outs, the index-based architecture
is a better solution. Data stored in index-based architectures
is compressed such that only WT memory for connections
that exist in the neural network is used. This flexibility arises,
however, at the cost of an additional table, the pointer table
(PT), which contains a pointer for each core input to indicate
its starting position in the WT. Fig. 2a exemplifies an index-
based core with A inputs and B post-synaptic neurons.
To further improve processing and memory demands,
lossless compression of the connections in the WT can be
realized by using run-length encoding (RLE), where sequences
of consecutive non-existent connections can be stored as run
counts – instead of a zero-valued weight per connection.
An additional bit before each existing weight to indicate if
the connection between an input and a post-synaptic neuron
exists. For non-existent connections, the opposite valued bit
can precede the run-length to indicate how many post-synaptic
neurons should be skipped. In Fig. 2a, the WT was constructed
with RLE using a ‘0’ to indicate how many post-synaptic
neurons to skip, and a ‘1’ to indicate a connection that actually
exists. With this, using RLE compression has the additional
advantage over the crossbar architecture in that it decreases
the number of post-synaptic neurons which must be analyzed
for every incoming spike, reducing once again processing time
and power consumption.
B. Core architectures comparison
For comparison purposes between the crossbar and index-
based approaches, Fig. 2b shows the plots of memory usage
(in bits) based on the connectivity density, d, between inputs
and post-synaptic neurons. Core parameters of 256 inputs
× 256 neurons with 9-bit weights were chosen as reference
based on the work in [2]; 4-bit and 16-bit weights were also
analyzed. For networks with density greater than the critical
density (dc), the crossbar architecture consumes less memory
to map the connections and weights; this is due mainly to the
additional pointer table and to the RLE overhead in the index-
based approach. However, for connectivity densities below the
critical density (e.g., dc = 0.70 for 9-bit weights), the index-
based architecture shows better results. Naturally, cores with
high bit-precision weights favor the index-based architecture
since compression of non-existent connections becomes even
more meaningful. The “left over” memory obtained using the
index-based approach can be used, for example, to increase
the weight bit-precision further, to better represent weights in
the network, or perhaps, if an entire sector of the memory will
not be required, power gating might also be a possibility as a
means of reducing power consumption.
In theory, a neuromorphic system could use either type
of memory organization: crossbar or index-based. This simply
requires that the compiler efficiently choose the best archi-
tecture based on the core parameters and network topology
being mapped, along with hardware being able to realize both
types of memory accesses. One important aspect is that the
crossbar approach has the advantage of being able to realize a
reverse lookup, which is vital for the original STDP algorithm.
However, as Fig. 2b shows, many networks are more efficiently
mapped using the index-based approach, where we have access
only to the forward connectivity. Therefore, as we will see in
the next section, our novel implementation of STDP can be
realized using simply this forward mapping and still produce
results nearly identical to those of the original algorithm.
III. SPIKE-TIMING DEPENDENT PLASTICITY WITHOUT
REVERSE CONNECTIVITY TABLE LOOKUP
A fundamental aspect of the crossbar and index-based
architectures for STDP is that a core has information about
the pre-synaptic neuron spike times: an event delivered to an
input in the core carries the temporal information about the
pre-synaptic neuron, independently of where it is located in
the system. This feature is vital for the STDP algorithm since
it requires knowledge of spike times of both the post- and
pre-synaptic neurons. In the digital neuromorphic core, this
is realized by including an STDP timer for each input and
post-synaptic neuron, representing the STDP learning window.
Pre-synaptic (i.e. input) timers are initialized at the arrival
of a spike, while post-synaptic timers are initialized at the
generation of a new spike by post-synaptic neurons. If the
causal and acausal STDP windows are not symmetric, then the
STDP timer is set to the longest of the two windows. At each
system tick, the STDP timer is decremented until it reaches
zero (or a value greater than zero if the specific window is the
shorter of the two). This implementation using a single timer
for each neuron realizes the nearest-neighbor STDP learning
rule since only the most recent spike events are considered.
In the traditional STDP algorithm, whenever a post-
synaptic neuron fires, the causal weight updates are imme-
diately executed. This is only possible, however, by knowing
all the pre-synaptic neurons the specific post-synaptic neuron
is connected to (i.e., by being able to perform reverse lookup
access to the connectivity table) [8]. Since the index-based
architecture has access only to forward connectivity, in our
method the causal update is delayed until the pre-synaptic
STDP timer expires. With this, all weight updates are per-
formed at the start or the end events of the pre-synaptic timer.
Previous work [9] demonstrated forward table-based STDP
by deferring both causal and acausal updates, making use of
system delays in spike time storage allowed by constraints on
firing rates of pre- and post-synaptic neurons. The proposed
method here performs the acausal weight updates at the onset
of a pre-synaptic spike. This results in simplified spike time
storage and reduces memory requirements by using the same
STDP window for both causal and acausal updates.
Figure 3 illustrates the moments when weight updates are
performed using our method; for clarity, we used causal and
Acausal updates
Every leaving post-synaptic event (i.e. spike) 
increases the weight at the destination core.
pre1
Epre
New post-synaptic event (i.e. spike):
- causal decreases the weight at the 
destination core.
Epre
Epost
Ending pre-synaptic event:
- causal weight update
Epost
STDP window
Causal updates
time
Causal and 
acausal updates
post n1 n2 n3 n3 n5n4 n6 n7 n8
newold
Fig. 3: The proposed STDP weight update rule. (left) New pre-
synaptic input events produce acausal updates. (center) Causal
updates must be processed just before acausal updates when
a previous event is still present. (right) Ending events produce
delayed causal updates.
acausal windows of same duration. In the figure, the spike
event queue of pre-synaptic input 1 is represented in the top
row (pre1), while the spike event queue of all the post-synaptic
neurons in this same core is represented in the bottom row
(post). The indices below the bottom row spikes in icate the
post-synaptic neuron address. The algorithm for our STDP
learning rule is summarized by the following cases:
1) When a new pre-synaptic input spike arrives, perform
the acausal weight updates;
2) If the pre-synaptic STDP timer has not expired as
this same pre-synaptic input receives a new spike,
perform the causal weight updates, followed by the
acausal updates in case 1;
3) When a pre-synaptic STDP timer expires, perform
the causal weight updates.
As shown in Fig. 3 (left), upon a new event at pre-synaptic
input 1 (depicted by the red arrow in pre1), we realize
the acausal updates by sweeping through this input’s entries
in the WT using forward lookup and updating the weights
between all its post-synaptic neurons (post) which present
STDP timers greater than zero (depicted by black arrows).
If, however, pre1’s timer were greater than zero, as depicted
in Fig. 3 (center), then we would first realize the causal
updates between pre1 and post-synaptic neurons who spiked
after preold1 , followed by the acausal updates for pre
new
1 (as
performed in Fig. 3 (left)). Lastly, the causal updates shown
in Fig. 3 (right) occur in a similar fashion, except only when
the pre-synaptic input STDP timer expires (i.e. when the red
arrow leaves the STDP window).
The proposed method delays the causal weight updates
since they only occur at the end of pre-synaptic timers.
Nonetheless, this effect on the convergence of the learning
rule can be mediated by choosing small enough learning rates.
The only other aspect in which our method differs from the
original STDP algorithm is for a particular causal update: In
the case when a pre-synaptic input event has already entered
the STDP window, and was followed by a post-synaptic spike,
no updates should occur between the input and this neuron
until the pre-synaptic input timer expires. However, if the post-
synaptic neuron spikes again while its timer is greater than
zero, the “old” post-synaptic spike would be overwritten by
the new one (by reinitializing the timer). With this, the causal
update of the pre-synaptic input with the first (i.e. nearest-
neighbor) post-synaptic spike would be lost.
A rather costly solution to realizing these nearest-neighbor
causal updates would be to go through every pre-synaptic input
whose timer is greater than zero yet smaller than the STDP
timer of the post-synaptic neuron of interest, and going through
each of these inputs’ table to verify if they are connected to the
specific post-synaptic neuron. The worst case time expenditure
for this scenario is “number of inputs” × “number of post-
synaptic neurons”. A more objective solution is to ignore these
updates altogether if we consider that a single pre-synaptic
spike should not have a strong causal relation to a post-synaptic
neuron which is firing very frequently. As we will later show
in our results, this can indeed be considered. A final and exact
solution would be to configure the neurons in the system with
a refractory period of same or longer duration than that of the
STDP time window. This would, therefore, impede a bursting
behavior, and neurons would not be able to spike more than
once during the STDP time window.
IV. RESULTS
For validation of our proposed method, we designed a
64-input × 64-neuron core, operating at 1-ms timesteps, on
a Xilinx Spartan-6 FPGA. The emulation parameters were
chosen to match that of biological scale. The anti-symmetric
ramp STDP kernel depicted in Fig. 1, with causal and acausal
windows, Tstdp, of 20 ms each and max(∆w) = ±1, was
used for our learning. The 4096 9-bit weights were initialized
to value zero. Poisson spike trains with firing rate of 10 Hz
and refractory periods, Tref , varying between 5 and 20 ms
were used to externally produce the pre- and post-synaptic
spikes. The system was run for 60 seconds, after which we
compared the weights obtained from the FPGA with simulation
results of the original STDP algorithm. Fig. 4 compares the
evolution of training with the original algorithm (in blue)
and our proposed method (in red) for 4 different connections,
with varying refractory periods. The figure shows how, as the
refractory period approximates the duration of the causal and
acausal windows, both algorithms converge.
Time (s)
W
ei
gh
t
Time
W
ei
gh
t
Time
T
ref=15 ms
T
ref=5 ms
T
ref=10 ms
T
ref=20 ms
Fig. 4: Evolution of training for the original STDP algorithm
(blue) and our proposed method (red). As the refractory period
approximates the STDP window duration of 20 ms, both
methods converge.
After running the system, using Tref = 5 ms, for 60 sec-
onds, we also compared the final value of the weights produced
by both algorithms. In Fig. 5, the blue points are pairs of
weights produced by the original STDP algorithm (wo) versus
those produced by the FPGA after training (wp); the red dashed
line is the ideal scenario, where both methods match. The
figure shows that our method produces slightly more negative
weights. This is because in the FPGA implementation we are
not performing the nearest-neighbor causal updates for post-
synaptic neurons firing in rapid succession, resulting in a not
so intense final update. However, as we see in the histogram
in the figure, final weights with small difference between the
ideal case and our method are much more frequent, and final
weights with difference larger than −4 seldom occur.
Original algorithm weight (w
o
)
-30 -20 -10 0 10 20 30Pr
op
os
ed
 m
et
ho
d 
w
ei
gh
t (
w p
)
-30
-20
-10
0
10
20
30
-5 0
0
0.1
0.2
wp - wo
Fig. 5: Comparison of final weights. Each blue point is
a (wo,wp)-pair of weights produced by the original STDP
algorithm and our proposed method, respectively.
V. CONCLUSIONS
While the traditional spike-timing-dependent plasticity al-
gorithm uses both forward and reverse connectivity tables, we
presented a novel method of implementing STDP which ben-
efits from event-driven operation in a very memory-efficient
environment. Using simply forward lookup access to the
connectivity table (halving the normal memory requirements
for STDP implementation), causal weight updates can be
performed at the moment of the pre-synaptic STDP timer
expiration. Though this delayed update possesses an inherent
caveat, we showed that, by ignoring certain weight updates
because of the stronger influence of neighboring spikes, this
issue can be mitigated. Additionally, if we are to configure
neurons with refractory period equal to or longer than the
STDP time window, then our method converges to the original
algorithm.
REFERENCES
[1] G. Indiveri, B. Linares-Barranco, T. J. Hamilton, A. Van Schaik,
R. Etienne-Cummings, T. Delbruck, S.-C. Liu, P. Dudek, P. Ha¨fliger,
S. Renaud, et al., “Neuromorphic silicon neuron circuits,” Frontiers in
neuroscience, vol. 5, p. 73, 2011.
[2] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada,
F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, et al., “A
million spiking-neuron integrated circuit with a scalable communication
network and interface,” Science, vol. 345, no. 6197, pp. 668–673, 2014.
[3] G. Q. Bi and M. M. Poo, “Synaptic modifications in cultured hip-
pocampal neurons: dependence on spike timing, synaptic strength, and
postsynaptic cell type,” The Journal of neuroscience, vol. 18, no. 24,
pp. 10464–10472, 1998.
[4] J. Sjo¨stro¨m and W. Gerstner, “Spike-timing dependent plasticity,” Spike-
timing dependent plasticity, p. 35, 2010.
[5] A. Cassidy, A. G. Andreou, and J. Georgiou, “A combinational digital
logic approach to STDP,” in 2011 IEEE International Symposium of
Circuits and Systems (ISCAS), pp. 673–676, IEEE, 2011.
[6] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu,
“Nanoscale memristor device as synapse in neuromorphic systems,”
Nano letters, vol. 10, no. 4, pp. 1297–1301, 2010.
[7] D. Kuzum, R. G. Jeyasingh, B. Lee, and H.-S. P. Wong, “Nanoelectronic
programmable synapses based on phase change materials for brain-
inspired computing,” Nano letters, vol. 12, no. 5, pp. 2179–2186, 2011.
[8] R. J. Vogelstein, F. Tenore, R. Philipp, M. S. Adlerstein, D. H. Gold-
berg, and G. Cauwenberghs, “Spike timing-dependent plasticity in the
address domain,” in Advances in Neural Information Processing Systems,
pp. 1147–1154, 2002.
[9] X. Jin, A. Rast, F. Galluppi, S. Davies, and S. Furber, “Implementing
spike-timing-dependent plasticity on SpiNNaker neuromorphic hard-
ware,” in The 2010 International Joint Conference on Neural Networks
(IJCNN), pp. 1–8, IEEE, 2010.
