LETTER Communicated by Dan Hammerstrom Programmable Logic Construction Kits for Hyper-Real-Time Neuronal Modeling by Ruben Guerrero-rivera et al.
LETTER Communicated by Dan Hammerstrom
Programmable Logic Construction Kits for Hyper-Real-Time
Neuronal Modeling
Ruben Guerrero-Rivera
rg66@leicester.ac.uk
Center for Bioengineering, University of Leicester, Leicester LE1 7RH, U.K.
Abigail Morrison
abigail@biologie.uni-freiburg.de
Markus Diesmann
diesmann@biologie.uni-freiburg.de
Computational Neurophysics and Bernstein Center for Computational Neuroscience,
Institute of Biology III, Albert-Ludwigs-University, Freiburg, Germany
Tim C. Pearce
t.c.pearce@le.ac.uk
Center for Bioengineering, University of Leicester, Leicester LE1 7RH, U.K.
Programmable logic designs are presented that achieve exact integration
of leaky integrate-and-ﬁre soma and dynamical synapse neuronal mod-
els and incorporate spike-time dependent plasticity and axonal delays.
Highly accurate numerical performance has been achieved by modifying
simpler forward-Euler-based circuitry requiring minimal circuit alloca-
tion, which, as we show, behaves equivalently to exact integration. These
designs have been implemented and simulated at the behavioral and
physical device levels, demonstrating close agreement with both numer-
ical and analytical results. By exploiting ﬁnely grained parallelism and
single clock cycle numerical iteration, these designs achieve simulation
speeds at least ﬁve orders of magnitude faster than the nervous system,
termed here hyper-real-time operation, when deployed on commercially
available ﬁeld-programmable gate array (FPGA) devices. Taken together,
our designs form a programmable logic construction kit of commonly
used neuronal model elements that supports the building of large and
complex architectures of spiking neuron networks for real-time neuro-
morphic implementation, neurophysiological interfacing, or efﬁcient pa-
rameter space investigations.
1 Introduction
Numerical simulation of large-scale networks of spiking neurons and
real-time neuromorphic system implementation requires signiﬁcant
Neural Computation 18, 2651–2679 (2006) C   2006 Massachusetts Institute of Technology2652 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
computational power. On the other hand, existing general-purpose micro-
processors, although extremely versatile, rely largely on serial processing
of data, which severely limits their computational throughput. This serial
dependence comes about through centralized arithmetic resources (such as
arithmetic logic unit or ﬂoating point unit), which are restricted to one or
a small number of concurrent operations. This class of numerical simula-
tion problem, due to its inherent parallelism, is challenging for single-core
processorarchitectures,renderingreal-timeoperationimpossibleforallbut
the simplest of neural systems. The fact that cluster computing approaches
are so successful in speeding up neuronal simulations demonstrates this
serial bottleneck in computation. Clearly, then, single-core microprocessor-
based neural simulators offer ﬂexibility but are limited by serial processing
constraints.
Fully customized hardware (such as analog VLSI) potentially offers
high computational power at the expense of ﬂexibility and design itera-
tion times. Consequently there is a need for a rapid prototyping platform
forneuronalmodelsthatisextremelyfastwithsimilarﬂexibilitytogeneral-
purpose microprocessors. Field-programmable hardware (in the forms of
ﬁeld programmable gate arrays (FGPAs) or ﬁeld programmable analog ar-
rays (FPAAs)) is an ideal technology to fulﬁll these requirements, since
devices are fast (up to 1 GHz clock speeds), can be reprogrammed in mil-
liseconds, and offer vast numbers of individual circuit elements that are
inherently parallel in nature and may be conﬁgured arbitrarily.
Programmable logic promises to deliver computational speeds ap-
proaching that of custom silicon solutions while providing a ﬂexible sub-
strate for numerical simulation. Delivering on this promise, however, re-
quires that ﬁeld-programmable neuronal element circuit designs exploit
the inherent parallelism of both the problem domain and target technol-
ogy. Without this ﬁnely grained parallelism, this approach cannot compete
in terms of processing performance with single-core microprocessors, in
the form of pipelined and reduced instruction set compute architectures
optimized for serial computation. Hence, to achieve the necessary perfor-
mance in programmable logic requires deploying robust, extremely low-
complexity neuron element designs that are inherently self-contained (i.e.,
no shared resources) and operate independently and in parallel. Further-
more, to achieve efﬁcient operation, one should consider only single clock
cycle iteration solutions (one clock cycle equals one numerical iteration)
without sacriﬁcing numerical performance. Furthermore, to achieve efﬁ-
cient operation, one should consider only single clock cycle iteration solu-
tions without sacriﬁcing numerical performance. Multicycle architectures
of neuronal models have been previously discussed (Graas, Brown, & Lee,
2004). Single-cycle architectures have independently been investigated by
Weinstein and Lee (2006).
FPGAs are digital integrated circuits (IC) that internally contain con-
ﬁgurable blocks of logic, as well as programmable interblock connectionsProgrammable Logic Construction Kit for Neuronal Modeling 2653
(Xilinx, 2002). Such devices can be conﬁgured in an arbitrary fashion to
perform a variety of signal and data processing tasks. As a ﬁrst step in
the implementation, designs must satisfy speciﬁc criteria to guarantee a
functional circuit, which is free from logical errors. Generally designs are
speciﬁed in a hardware description language (HDL), such as VHDL or
Verilog, although schematic-level entry is also possible. The HDL program
must then be synthesized, which implies a process of minimization and op-
timization,whichendsintheconversionofthesequentialcodeddescription
intoacollectionofparallelregisters(memorystorage)andBooleanrelation-
ships.Thegate-levelfunctionsobtainedfromthissynthesisprocessarethen
mapped onto the physical layer of the device, which means assigning the
logical design to available chip resources, depending on the chosen target
device. From this process, a map ﬁle is created, which deﬁnes the placing
of the logic onto the physical device, as well as the routing of the signals
between logic elements (so-called place and route). Finally a “bitstream” is
created, which is a ﬁle containing the information to internally conﬁgure
the FPGA device (see Xilinx, 2003, and Maxﬁeld, 2004). Design tools to aid
the FPGA synthesis process are in a rapid state of development.
Herewepresentasetofreducedcomplexityprogrammablelogicdesigns
for exponential decay and alpha and beta function synapse with spike-time
dependent plasticity (STDP) learning, as well as integrate-and-ﬁre soma
complete with axonal delays. Together these designs form a neuronal mod-
eling kit, simulating all the major components commonly used in neuronal
modeling of large-scale networks. The designs are of sufﬁciently low com-
plexitytoberealizableinmassivenumbersonasingleFPGAdevicethrough
multiplexing, yet feature single clock cycle numerical iteration.
We begin by making mathematical comparisons between forward-Euler
(FE) and exact integration (EI) numerical iteration schemes in the case
of linear ordinary differential equation (ODE) solvers. We show that this
comparison leads to a surprisingly simple solution for implementing neu-
ronal elements with exact numerical properties. By simulating and eval-
uating these designs, we demonstrate FPGAs as a viable technology for
large-scale spiking neuron model implementation. The designs are ﬁnally
tested against numerical and analytical results, verifying exact integration
behavior.
2 Numerical Considerations
Neuronal dynamics are commonly modeled in digital hardware using nu-
merical methods for solving ODEs. The simplest scheme for numerical
simulation of dynamical system behavior is the FE approach. For an initial
value problem (IVP) of the form
˙ y(t) = f (y,t), y(0) = y0, (2.1)2654 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
a ﬁrst-order (linear) approximation to its solution is given by the FE ap-
proximation (Lambert, 1973)
˜ yk+1 = ˜ yk +  f (yk,tk), (2.2)
which is an iteration scheme that begins from an initial value (y0), where k
is an integer deﬁning the iteration number (k = 0,1,···), and   is the ﬁxed
stepsize,inpartdeterminingtheaccuracyoftheapproximation.Ingeneral,
this numerical integration scheme has a truncation error of order O( 2).
More sophisticated iteration schemes can be used that reduce the single
step error (such as Runge-Kutta), but these require greater computational
effort and hence more complex implementations.
In this context, considering a homogeneous ﬁrst-order ODE of the form
˙ y(t) = ay(t), y(0) = y0, (2.3)
we can approximate its solution by applying the FE method as follows:
˜ yk+1 = ˜ yk(1 +  a), ˜ y(0) = y0. (2.4)
As an alternative, there is an exact way to perform digital simulation of
general linear-time-invariant systems, which is described by Rotter and
Diesmann (1999). In order to obtain an EI solution of equation 2.3, we make
use of this method, yielding the following expression:
yk+1 = e ayk, y(0) = y0. (2.5)
In general, both methods will yield different values. However, in order
to ﬁnd a relation between the FE and the EI method for the linear ODE case,
we make use of the parameter a, which determines the time constant of the
system as the relational parameter. Let us now deﬁne ˜ a and a as the factors
determining the time constants for the FE and EI solutions, respectively.
If both iteration schemes begin from the same initial value y0, then clearly
they will give identical results if the following condition is met:
1 +  ˜ a = e a. (2.6)
To satisfy this condition, we must ﬁrst solve for ˜ a and then substitute
this value into equation 2.4 instead of a. Hence, for ﬁrst-order linear-time-
invariant dynamical systems such as we consider here, the FE solution is
simply a timescaled version of the exact solution, which can be corrected
by adopting the ˜ a parameter in equation 2.4. In the following circuit im-
plementations, we will exploit this fact to derive extremely low-complexityProgrammable Logic Construction Kit for Neuronal Modeling 2655
circuit designs capable of implementing an EI of common neuronal model-
ing components.
3 Neuron Model
We ﬁrst consider the two main neuronal elements: a synapse model that
reproducespostsynapticdendriticcurrentdynamicsresultingfromapresy-
napticactionpotentialandasomamodelthatintegratesthesecurrentsover
time to generate a membrane potential. We describe the implementation of
both models in turn, providing the numerical details on which they are
based, as well as their optimized programmable logic circuit implementa-
tions.
3.1 Synapse Model
3.1.1 Exponential Decay Synapse Model. One of the most common meth-
ods to model dendritic currents generated by a synapse in response to
a spike train is through an exponential decay over multiple spike inputs
occurring at times (t1,t2,...,tj,...tl) to give the dendritic current
I(t) = w
l  
j=1
H(t − tj)e
−
t−tj
τe , (3.1)
where H(•) is the Heaviside function, w is the synaptic efﬁcacy (weight)
that determines the current increment in response to the arrival of each
presynaptic action potential, and τe is the time constant of the exponential
decay resulting from the arrival of each action potential.
Equation 3.1 is the solution to the ﬁrst-order ODE of the form
˙ I(t) = w
l  
j=1
δ(t − tj) −
1
τe
I(t), (3.2)
which can be approximated using the FE scheme for a =−1
τe as
Ik+1 = wδk+1,j + Ik
 
1 −
 
τe
 
. (3.3)
This numerical scheme may be implemented directly in programmable
logic by keeping a running accumulation of I over time by adding the
current value to itself and subtracting a fraction of Ik at each time step.
When a time step occurs in which a spike is received, the constant factor
w must also be added to the accumulated value. This scheme makes for a2656 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
very simple architecture with the exception of the multiplication involved
in the fractioning process, which in its most general form is expensive to
implement in programmable logic.
The resulting register transfer level (RTL) description of the synapse
that can be directly implemented in programmable logic is shown in
Figure 1A, as derived from the FE iteration scheme given in equation 3.3.
The implementation assumes that within a single clock cycle (identical to
the step time  ), only a single action potential may be received at the input.
This is reasonable if we assume that every synapse is connected to a single
presynaptic cell, which typically has an absolute refractory period far ex-
ceeding a single time step. n−bit integer arithmetic may be used without
loss of precision as long as we scale the circuit weight input wB according
to wB = kIw, by choosing kI such that the bit count output of the circuit,
B, extends across the integer number line {0,1,2,...,2n−1}. In this case, the
dendritic current can be recovered from the circuit output via I(t) = B/kI.
The real-time response of this circuit to a single action potential shown in
Figure 1B is an exponential decay, with starting value equal to the weight of
the synapse as shown in Figure 1C. The simulation proceeds at least three
orders of magnitude faster than biological time, requiring a dimensionless
speed up to factor kt to translate between biological time constants and
those achieved by the FPGA. The speed-up factors kt are calculated based
on a biological time constant for the synapse of 5 ms for Figures 1 and 2 and
a biological time constant for the soma of 20 ms for Figures 3, 4, and 7.
Using the same arguments as previously, it is clear that the FE scheme
describing the synaptic dynamics is equivalent to the EI solution as long as
the following corrected time constant is substituted for τe in equation 3.3:
˜ τe =
 
1 − e
−  
τe
. (3.4)
ThebehaviorofthiscircuittotheregularspiketrainshowninFigure1Dis
showninFigure1E.Thelimitingvalueofpeaksynapticcurrentiscompared
to the asymptotic value (dotted line), which in the case of a regular spiking
input at ﬁxed frequency, fsp,c a nb es h o w nt ob e
Ipeak =
w
1 − e
− 1
fsp˜ τe
. (3.5)
The asymptotic peak response of the circuit is found to be within one least
signiﬁcant bit (LSB) of the theoretical value, Ipeak.
3.1.2 Alpha and Beta Function Synapse Models. Alpha and beta functions
are also popular physical models for synaptic dynamics (Jack, Noble, &
Tsien, 1975; Gerstner & Kistler, 2002; Tuckwell, 1988). The dendritic currentProgrammable Logic Construction Kit for Neuronal Modeling 2657
0 2 4 6 8 10 12
0
5000
10000
15000
20000
25000
30000
Time (µs)
S
y
n
a
p
t
i
c
c
u
r
r
e
n
t
(
B
i
t
c
o
u
n
t
)
E
1
0
024681 0 1 2
Time (µs)
D
S
y
n
a
p
s
e
i
n
p
u
t
(
B
i
t
v
a
l
u
e
)
C
0
4000
8000
12000
16000
S
y
n
a
p
t
i
c
c
u
r
r
e
n
t
(
B
i
t
c
o
u
n
t
)
Time (µs)
0 1 02 03 0 51 5 2 5
B
0 10 20 30
1
Time (µs)
51 5 2 5
S
y
n
a
p
s
e
i
n
p
u
t
(
B
i
t
v
a
l
u
e
)
++
10
X
Ik+1
Ik
Zeros
Synapse
input
Synaptic current
X
W
∆
τe
~
A
+-
Figure 1: Exponential decay synapse implementation. (A) RTL description of
the synapse architecture. Spiking synaptic input is represented by 0-1 logic
levels, which controls the addition of weight value at each clock cycle. Thick,
solid lines represent m-bit digital buses (representing unsigned integers), and
the thin, solid line represents an individual control line. (B) A single spike event
at time t = 0 occurs at the input of the synapse. (C) An exponential decay is
generated in the circuit as response to the input. For a synapse with a current
incrementof50pA,thebitcountoutput, B,canbeconvertedtosynapticcurrent
using I(t) = B/kI,w h e r ekI = 3.28 × 1014 counts A−1. (D) Spiking input at a
regular ﬁring frequency, fsp = 1 MHz, used to test the synapse implementation.
(E) Real-time synapse response to this regular spiking input for FPGA synapse
implementation. The theoretical asymptotic value of peak synaptic current is
shown by the upper dotted line. The clock frequency was set to fclk = 100 MHz,
giving a step time,   = 10 ns, and a speed-up factor kt = 1953 resulting in time
constants, ˜ τe = 2.56µs and weight w = 16,384 for C and w = 8192 for E.2658 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
generatedbyanalphafunctionsynapserespondingtomultiplespikeinputs
occurring at times (t1,t2,...,tj,...tl) is described by
I(t) =
we
τα
l  
j=1
H(t − tj)(t − tj)e
−
t−tj
τα , (3.6)
where w is again the efﬁcacy of the synapse and τα is the characteristic time
constant for the synapse.
In a beta function model, current in a postsynaptic dendrite gen-
erated in response to multiple presynaptic spikes occurring at times
(t1,t2,...,tj,...tl) is described by
I(t) = wβ
l  
j=1
H(t − tj)
 
e
−
t−tj
τβ1 − e
−
t−tj
τβ2
 
, (3.7)
where τβ1 and τβ2 now determine the time constant of the synapse and β is
a parameter adjusted to produce a (dimensionless) beta function with unit
amplitude.
Rotter and Diesmann (1999) show how a matrix exponential can be
used to describe the exact solution to greater than ﬁrst-order linear ODEs
(see appendix A), such as alpha and beta function dynamics. Due to their
second-order dynamics, the matrix exponential for either the alpha (see
equation A.2) or beta function (see equation A.3), consists of two expo-
nential decay terms in the diagonal and a third off-diagonal term. This
fact provides us with a simple method for implementing either func-
tion by deploying in cascade two exponential decay elements detailed in
section 3.1.1.
There is, however, a nonzero off-diagonal term in both matrices, which
act as a scaling function that must ordinarily be applied when coupling
both exponential decay elements. We see that this matrix element is in fact
a constant factor and so can be implemented by means of a multiplier in
our circuit. Yet to keep the design as simple as possible, this constant factor
may be directly applied to the weight of the synapse instead. Adjusting the
synaptic weight in this way does not change the dynamics of the function
but eliminates the necessity of multipliers between exponential decay ele-
ments, resulting in a simpler circuit. Thus, the resulting adjusted weight for
the alpha function is
wadj =  e
−  
τα w, (3.8)Programmable Logic Construction Kit for Neuronal Modeling 2659
C
0 5 10 15 20 25 30
0
1000
2000
3000
4000
5000
6000
7000
S
y
n
a
p
t
i
c
c
u
r
r
e
n
t
(
B
i
t
c
o
u
n
t
)
Time (µs)
Time (µs)
1
B
0 5 10 15 20 25 30
S
y
n
a
p
t
i
c
i
n
p
u
t
(
B
i
t
v
a
l
u
e
)
Output
1
τe1
τe2
A
10
0
Wadj
ρ
++
Figure 2: Alpha and beta function synapse implementation. (A) Block diagram
of the implementation of an alpha and beta function generator implemented
by connecting in cascade two exponential decay synapses. When τe1 = τe2, the
function generated is an alpha function; otherwise, it is a beta function. Each
time an action potential arrives at the synapse, both the adjusted weight and
the value of ρ are added to the circuit. The parameter ρ is used only when
the alpha or beta function is ﬁtted to a soma circuit to produce a combined
neuron model; in such cases, the value of ρ is determined using equations 3.12
and 3.13. When simply performing isolated alpha function synapses, ρ has no
effect in the circuit; therefore, set ρ = 0, and no value of ρ will be added during
spikes, whereas wadj is determined by equations 3.8 or 3.9 depending on the
type of synapse. (B) A single spike event at time t = 0 occurs at the input of
the synapse. (C) An alpha function is generated as real time response to the
input. A 32-bit representation was used in the circuit with parameters clock
frequency fclk = 100 MHz, step time,   = 10 ns, kt = 1953, ˜ τα = 2.56µs, ρ = 0,
and adjusted weight wadj = 64.
whereas for the beta function, it is
wadj =
wτβ1τβ2
τβ1 − τβ2
 
e
−  
τβ1 − e
−  
τβ2
 
. (3.9)
The schematic for this class of synapse is shown in Figure 2A. In this
case, action potentials, acting as input, are received at the ﬁrst exponential
decay element, while the output from the second element follows alpha
and beta function dynamics. The parameter ρ, shown, will be proved to2660 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
be an important parameter when the circuit is used in combination with a
soma circuit to reproduce the membrane potential of a neuron, but is not
relevant for isolated synapses and can then be set to zero. Figure 2C shows
an alpha function generated by this circuit as a response to a single input
spikeappliedattimet = 0(seeFigure2B).Notethatwehaveagainadjusted
each time constant according to equation 3.4 to guarantee an EI solution.
3.2 Soma Model. The popular integrate-and-ﬁre model that receives
synapticinputoftheform I(t)hasamembranepotential,V(t),thedynamics
of which are described in between generated action potentials by
˙ V(t) =−
V(t)
τm
+
I(t)
Cm
, (3.10)
where the membrane capacitance, Cm,i sac o n s t a n ta n dτm is the character-
istic time constant for the cell. Once V(t) reaches a ﬁxed threshold potential,
Vth,s a y ,a tt i m et , V(t )− = Vth, the soma then resets to the afterhyperpolar-
ization potential, V(t )+ = Vahp, and a spike is generated by the soma and
integration of the current input continues. Again, using the FE approach,
the solution of equation 3.10 may be approximated as
Vk+1 = Vk
 
1 −
 
τm
 
+
 
Cm
Ik, (3.11)
resulting in a similar implementation to that of the exponential decay
synapse, except that the dendritic input is added at each time step. In
the next section, we show how, again, EI can be implemented within the FE
scheme if the dynamics of I are appropriately taken into account.
The RTL description for the soma implementation, complete with spike
generating mechanism, is shown in Figure 3A. A comparator is used to
detect threshold crossings, which gates a single ﬂip-ﬂop producing a 0 →
1 → 0 transition on the axon output lasting one clock cycle. If required,
a ﬁxed input bias may be added at each time step in order to determine
the resting potential of the cell. As before, the bit count held in the soma
potentialregister, B,isalinearlyscaledrepresentationofthesomapotential,
such that V(t) = B/kV. The soma model responding to two current pulses
of different magnitudes (shown in Figure 3B) is shown in Figure 3C.
3.3 CombinedNeuronModel. Wehavenowdevelopedprogrammable
logic circuits that implement EI solutions for all the main neuronal model-
ingcomponentsdescribedbyequations3.1,3.6,3.7,and3.10.Thesesynapse
a n ds o m ac i r c u i t sm u s tn e x tb ec o m b i n e di ns u c haw a ya st oo b t a i na nE I
solution for the complete neuron model. This process is not immediate,
since, unfortunately, combining individual elements with EI performanceProgrammable Logic Construction Kit for Neuronal Modeling 2661
0 5 10 15
0
10000
20000
30000
40000
Time (µs)
S
o
m
a
p
o
t
e
n
t
i
a
l
(
B
i
t
c
o
u
n
t
) C
0
250
350
0 5 10 15
Time (µs)
B
S
o
m
a
t
i
c
i
n
p
u
t
c
u
r
r
e
n
t
(
B
i
t
c
o
u
n
t
)
A
X
++
10
+-
Somatic input
current
Afterhyper-
polarization
potential
Comparator
Threshold voltage
Soma
output
Soma potential
Vk+1
Vk
Vth
>= Vk Vth
∆
τm
~
Figure3: Integrate-and-ﬁresomaimplementation.(A)RTLdescriptionofsoma
architecture.Somaticinputcurrent, I(t),afterhyperpolarization(reset)potential,
V ahp, and the threshold potential, V th, are each represented by a p-bit signed
integer(thick,solidlines).Aﬁreevent(spike)isrepresentedbya0→1transition
on the soma output line lasting a single clock cycle. (B) Somatic current pulses
(see Rotter & Diesmann, 1999, sec. 3.2.3, for the appropriate EI coefﬁcient) of
I = 250 and 350 bit count, respectively, lasting and separated by 4 µsw e r eu s e d
to test the soma implementation. (C) Real-time soma response to the current
input for the FPGA implementation with a speed-up factor kt = 7813. For a
soma with a threshold of 20 mV above resting potential, the soma potential bit
count, B, can be converted to soma potential V(t) = B/kV,w h e r ekV = 2 × 106
counts V−1. A 32-bit representation was used with parameters clock frequency
fclk = 100MHz,steptime,  = 10ns,resultingin ˜ τm = 2.56µs,and V th =40,000
which is indicated by the horizontal dotted line.
does not guarantee EI system performance when coupled together. Thus,
some important considerations must be taken into account before connect-
ing these components.
First, we use the matrix exponential of the combined system given by
Diesmann, Gewaltig, Rotter, and Aertsen (2001), as in equation B.1. In this
case the matrix exponential describes the state-space representation of a
combined neuron comprising synapse and soma. We see from this repre-
sentation that a simple cascade connection of these components will not
sufﬁce, since there exist nonvanishing off-diagonal elements that act as
constant factors between stages, again requiring the use of multipliers.
In order to optimize the combined neuron model for the case of alpha
function synapse and soma, we apply a linear transformation (see
appendix B), which permits the direct cascade of synapse and soma2662 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
elements without the need of coupling factors. An important consequence
of this transformation, however, is the necessity to again adjust the synap-
tic weight and also apply a constant addition factor (ρ) in between stages
according to
wadj =
τατm e
−  
τα (e
−  
τα − e
−  
τm )
C(τα − τm)
w, (3.12)
and
ρ =
τατm((τα − τm) e
−  
τα − τατm(e
−  
τα − e
−  
τm ))
C(τα − τm)2 w. (3.13)
These parameters may again be calculated off-line to avoid the necessity
of multipliers in the circuit, yielding the far simpliﬁed combined neuron
circuit shown in Figure 4A.
Using similar arguments for the exponential decay synapse and soma
case (see appendix C), we can again ﬁnd an adjusted weight that results in
combined EI performance:
wadj =
wτeτm
C(τe − τm)
(e
−  
τe − e
−  
τm ). (3.14)
Now that we have determined the conditions required to perform an
exact integration for each case, synapse and soma may now be combined
to form a self-contained, single clock cycle operation, spiking neuron im-
plementation. Figure 4A shows a generalized multisynapse scheme for a
singleneuron. Theneuron comprisesr synapses,which receivespikesfrom
different presynaptic cells, generating dendritic currents that are summed
in a single clock cycle and integrated to produce the membrane potential of
the soma. In general, a greater number of bits will be need to be deployed at
the soma as compared to the synapse, since integration of the input occurs
at each time step (not just after spike arrival) and multiple synapse inputs
may be summed. In order to minimize the total number of bits required in
thesoma,wecanlimitthemaximumandminimumweightsinthesynapses
to avoid overloading the soma, which, as we will see, is also a desirable
characteristic for the learning algorithm we consider later.
In order to demonstrate the EI performance of the combined designs,
we tested an exponential decay type synapse and soma. Input spikes were
applied to the combined neuron model as seen in Figure 4B which gave
rise to the resulting membrane potential shown in Figure 4C. We see that
in this case, the membrane potential crosses the threshold (Vth)t h r e et i m e s ,
generating the same number of spikes (see Figure 4D).Programmable Logic Construction Kit for Neuronal Modeling 2663
C
0 10 20
0
10000
20000
30000
40000
50000
60000
51 5
Time (µs)
S
o
m
a
p
o
t
e
n
t
i
a
l
(
B
i
t
c
o
u
n
t
)
0
Time (µs)
01 02 0 51 5
1
B
S
y
n
a
p
s
e
i
n
p
u
t
(
B
i
t
v
a
l
u
e
)
0
1
Time (µs)
D
01 0 2 0 51 5
S
o
m
a
o
u
t
p
u
t
(
B
i
t
v
a
l
u
e
)
Synapse
model
Synapse
model
Synapse 1
Synapse r
Zeros
Excitatory/
Inhibitory
Excitatory/
Inhibitory
Threshold potential
Afterhyperpolariza-
tion voltage
Soma
model
Soma
output
Presynaptic
cell r
Presynaptic
cell 1
+/-
+/-
+/-
+/-
+
+
A wadj1
wadjr
Figure 4: Combined neuron implementation. (A) RTL description of the com-
bined neuron architecture. Spiking input is represented by 0-1 logic levels re-
ceived at r synapses. Weights, wi, for synapses (i = 1,...,r), the afterhyperpo-
larization(reset)potential, V ahp,andthethresholdpotential, V th,arerepresented
bym-bitunsignedintegers(thicksolidlines).(B)Spikinginputataregularﬁring
frequency, fsp = 200 KHz, used to test the combined neuron implementation
(only one synapse input (i = r = 1) was considered). (C) Real-time soma re-
sponse to the synapse input for the FPGA combined neuron implementation.
(D) Spikes generated by the combined neuron model in response to the spik-
ing input deﬁned in B. A 32-bit representation was used for both circuits with
clock frequency fclk = 100 MHz, step time,   = 10 ns, and kt = 7813. For the
synapse, time constant of ˜ τe = 2.56µs and adjusted weight wadj ≈ 0.61681; for
the soma, ˜ τm = 2.56µs, C = 12 pF, and V th = 60,000 which is again indicated by
the horizontal dashed line.
3.4 Axonal Delays. Axonal delays play an important role in the dy-
namics of spiking neuron network models (Crook, Ermentrout, Vanier, &
Bower, 1997) and are simple to implement in our design. In general, axonal
delayswillnotbeconstantacrossallcellsofthenetwork.Instead,eachaxon
should have associated with it a particular delay (see Figure 5A).
When a spike occurs at a soma output, this should not be delivered in-
stantly to the target synapse. Rather, the action potential must be presented
in some n  time steps after the spike occurred. In our design, axonal delays
areimplementedusingaringbuffer(seeFigure5B).Thespikehistory(upto
time n ) is stored in the ring buffer from each soma, and at each time step,
the nth element is transmitted to the target synapse, while the current state
of the soma is written to the buffer. Ring buffers have the advantage that
spikehistorydataareautomaticallyoverwrittenastheybecomeredundant.2664 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
Figure 5: Axonal delay and STDP learning implementation. (A) Each axon is
programmed to have a speciﬁc delay (deﬁned by Dn for the n-axon), which is
implemented using a ring buffer. (B) Action potentials are read from the ring
buffer at time t; the synapses from other neurons receive the delayed spikes
according with the programmed delay for each neuron. (C) RTL design of the
STDP learning unit. Superpositions of exponential decays with height equal to
the amount of weight modiﬁcation ( W+ to increment or  W− to decrement)
are generated; the order of presynaptic and postsynaptic ﬁre times deﬁnes both
the amount and the sign of the weight modiﬁcation. (D) Neuron model with
STDP learning. The STDP unit receives presynaptic spikes arriving at a speciﬁc
synapse as well as the action potentials generated by its respective soma. The
adjusted weight is fed to the synapse every time a presynaptic or postsynaptic
spike takes place.Programmable Logic Construction Kit for Neuronal Modeling 2665
3.5 Learning by STDP. There are many Hebbian (correlation)-based
plasticity mechanisms that can be used for learning purposes in spik-
ing neuron models (Abbott & Nelson, 2000; G¨ utig, Aharonov, Rotter, &
Sompolinsky, 2003). One of the most common of these is spike-timing de-
pendent plasticity (STDP) proposed by Song, Miller, and Abbott (2000).
This plasticity mechanism relies on the difference in presynaptic and post-
synaptic spike times to adjust the strength of excitatory synapses.
As an example of how plasticity mechanisms may be conveniently com-
bined with our neuronal element designs for on-chip learning in real time,
we have implemented STDP with our synapses. Figure 5C shows the RTL
design of the STDP unit, which contains two exponential decay elements
reused from our synapse designs. The exponential decay unit situated in
the upper section of the diagram receives presynaptic spiking input in the
same way as the synapse itself. Each time a spike is received at this ex-
ponential decay unit, a value equal to the maximum amount of change of
the synapse strength ( W+) is added to its output, while during latency
periods, this value decays exponentially according to equation 3.1. Once a
spike is generated in the postsynaptic neuron, the strength of the synapse
is increased by the current value output by this exponential decay unit. The
second exponential decay unit behaves in the opposite way. That is, the
second element has as input postsynaptic action potentials that add to the
current value an amount equal to the maximum possible change of synapse
weakening ( W−). When a presynaptic action potential is received, the
current value of this exponential decay unit is subtracted from the weight
of the synapse. In this way, synapses compete to control the postsynaptic
spike timing of the neuron. Figure 5D shows the implementation of the
STDP unit in the generalized neuron model.
The weight of the synapse is limited to the range [0,Wmax]. Stable synap-
tic modiﬁcation requires that  W+ <  W− < Wmax, and experimental re-
sultssuggestthat W− W+ ≈ 1.1(Songetal.,2000).Themaximumweight
Wmax should be chosen so as not to overload the soma.
4 Results
RTLdesignsforeachneuronalcomponentwereimplementedinVHDLand
executed on a PCI-based FPGA development board (model ADM-XRC-
II Pro, manufactured by Alpha Data Systems, UK), which hosts a Xilinx
Virtex II Pro device (product code XC2VP100-5). Numerical results were
tested against the equivalent exact numerical solution as calculated on a
standard PC with Intel Pentium IV running at a clock speed of 3.06 GHz
with 1 GB of external memory, programmed in C++. All responses shown
in Figures 1B and 1C, 2B and 2C, and 3B and 3C were compared against
theircorrespondingEIsolutionandthedifferencesplottedinFigure6(left).
In all cases, the errors show a random behavior about zero, which suggests
that the difference is due to round-off as a result of the ﬁnite-length number2666 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
representation. This conclusion is conﬁrmed by the histograms of the error
(see Figure 6, right), which show that in all cases, the distributions closely
resemble the shape of the uniform probability density function, which is
the expected behavior generated by a round-off process (Barnes, Tran, &
Leung, 1985).
With increasing time, the response error becomes more regular as the
output of the circuits approaches zero, since the time derivative becomes
very small (itself an exponential decay). In the case of the alpha function
synapse (see Figure 6B), although we see a random uniform distribution
about zero of ±0.5 bits for each of the two stages, when combined, the two
errors may accumulate in the positive direction leading to an error greater
than±0.5bitsforasingleneuron.Howeverthetotalerrorwillneverexceed
±1 bit and is not systematic, since the asymptotic behavior is toward zero.
An additional experiment was carried out using a combined neuron
model comprising an alpha function synapse and a soma, which was ex-
cited by a single spike at time (t = 0). The membrane potential was again
compared against the numerical solution given by the EI method. The error
between the circuit implementation and the EI solution shows the same be-
havior as in the preceding case: a uniform distribution due to the round-off
process (see Figure 6D).
Tofurthertestthenumericalperformanceandrobustnessofourdesigns,
a combined neuron simultaneously integrating signals from ﬁve exponen-
tial decay synapses driven by Poisson processes (see Figure 7A) was im-
plemented. The total resulting synaptic current is shown in Figure 7B. The
difference between the membrane potential obtained by the EI scheme and
t h ev a l u eg e n e r a t e db yt h ec i r c u i t( s e eF i g u r e7 C ) ,a g a i ns h o w sar a n d o m
behavior about zero, ruling out any systematic error in the circuit behavior
Figure 6: Differences between the numerical solution and the circuit responses.
Difference over time (left) and histograms of their corresponding error distribu-
tion (right) for (A) exponential decay synapse response, (B) the alpha function
synapse response, (C) the soma response, and (D) combined neuron model
made up of alpha function synapse and integrate-and-ﬁre model. The error
corresponds to those responses shown in Figures 1 to 3 except for the com-
bined neuron model, whose response comes from a single spike at (t = 0). The
parameters of the three ﬁrst circuits are the same as the ones speciﬁed for
each response, while for the combined model, the parameters are ˜ τα = 25.6 µs,
˜ τm = 4.096 ms, C = 250 pF, adjusted weight wadj ≈ 0.5089, and ρ ≈ 0.2553. All
errors show the same characteristics: a random behavior about zero and a uni-
formly distributed probability density function. The clock frequency was set
to fclk = 100 MHz, giving a time step of   = 10 ns. Numerical solutions were
carried out in C++ using long double precision, using an 80 bits representation
with 64 bits for the mantissa and 14 for the exponent.Programmable Logic Construction Kit for Neuronal Modeling 2667
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
0 5 10 15 20 25
0
0.5
1.0
-1.0
-0.5
Time (µs)
A
0 1 02 03 04 05 06 07 0
Frequency
0
-0.4
-0.8
0.8
0.4
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
B
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
0 1 02 03 04 05 0
0
0.5
1.0
-1.0
-0.5
Time (µs)
02 04 06 08 01 0 0
Frequency
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
0
0.4
0.8
-0.4
-0.8
Exponential decay synapse
Alpha function synapse
C
0 5 10 15
-0.8
-0.4
0
0.4
0.8
Time (µs)
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
01 0 20 30 40 50
0.8
0.4
0.0
-0.4
-0.8
Frequency
Soma
D
0 500 1000 1500
-0.8
-0.4
0
0.4
0.8
Time (µs)
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
D
i
f
f
e
r
e
n
c
e
(
B
i
t
C
o
u
n
t
)
0.8
0.4
0.0
-0.4
-0.8
Frequency
Combined neuron model with alpha synapse
0 200 400 6002668 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
(see Figure 7D). In this case, 308 action potentials were generated by the
circuits (see Figure 7E), exactly the number of spikes obtained through the
EI scheme. Critically, there were no differences in ﬁring times for both the
circuit and the numerical solution, all spikes were time coincident within a
singleclockcycle(seeFigure7F).Thisconﬁrmsthatforthepurposesofsim-
ulation of integrate-and-ﬁre neuron and exponential decay–based synapse
dynamics, our circuit produces EI performance under realistic simulation
conditions, limited only by the restricted bit length of the representation,
which is the case for any digital neuronal model implementation.
In order to further test the comparative numerical performance of our
FPGA implementation with PC technology, we deployed a medium-scale
neuronal model described previously (Pearce et al., 2005) and tested it
against its software equivalent. Brieﬂy, this model was composed of 100
integrate-and-ﬁre somas and 675 exponential decay synapses describing
two classes of neuron in the mammalian olfactory bulb (see Figure 8A).
Mitral/tufted (M/T) neuron outputs in the vertebrate olfactory system
are under tight regulation through lateral inhibition, mediated by dendro-
dendritic interaction. This interaction effects complex synchronous ﬁring
behavior across the bulb output that is stimulus speciﬁc, reminiscent of
that observed in electrophysiological studies (Friedrich & Laurent, 2001;
Friedrich, Habermann, & Laurent, 2004) and demonstrated in our model
(see Figures 8B and 8C). Due to the similarity of synapse time constants in
this model, it was possible to implement the design on a single Virtex II
Pro device (Xilinx) running at a very conservative clock speed of 33 MHz.
Figure 7: Numerical error for the combined neuron. (A) Random spikes trains
applied at the input of each synapse. (B) Total synaptic current generated
by the circuit. (C) The error between the exact integration numerical imple-
mentation and the circuit response for the membrane potential again shows a
random behavior about zero. (D) Histogram of the error distribution, which
uniform distribution corroborates the round-off effects as the only cause of
the error. (E) Over 300 spikes were generated by the soma. (F) Spike timing
for both the exact integration numerical implementation and circuit response.
All spikes from the circuit were exactly coincident with the spike times given
by the numerical solution. An m-bit representation was used with parameters
˜ τe1 = 5.12µs, ˜ τe2 = 2.56µs, ˜ τe3 = 1.28µs, ˜ τe4 = 0.64µs, ˜ τe5 = 1.28µs, adjusted
weights wadj1 ≈ 0.077026, wadj2 ≈ 0.077101, wadj3 ≈ 0.077253, wadj4 ≈ 0.077558,
andwadj5 ≈ 0.077253,whereas ˜ τm = 2.56µs,forthesoma,capacitanceC = 12pF,
thresholdpotential,V th =65,000.Theclockfrequencywassetto fclk = 100MHz,
giving a step time   = 10 ns and a speed-up factor kt = 7813. Numerical so-
lutions were carried out in C++ using long double precision using an 80 bits
representationwith64bitsoutofthemforthemantissaand14fortheexponent.Programmable Logic Construction Kit for Neuronal Modeling 26692670 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
The same network was coded in C++ using the same set of equations as
implemented by the hardware, compiled using GNU gcc, and executed
on an Intel Pentium IV 3.06 GHz with 1 GB RAM. As before, the error
between the circuit and the software implementation is within a single bit
anduniformlydistributed(seeFigure8D).Spiketiminganalysisrevealsthat
no incorrect translation occurs between the FPGA- and PC-based solutions.
5C o n c l u s i o n
Designs for leaky integrate-and-ﬁre soma and exponential, alpha, and beta
function synapses with STDP learning and axonal delays are presented in
this article, which are suitable for implementation in programmable logic.
Together these designs provide a neuronal modeling construction kit of the
most popular elements that can be deployed in arbitrary conﬁgurations for
Figure 8: Simpliﬁed olfactory bulb neuronal model implementation compris-
ing 100 somas and 675 exponential decay synapses. (A) Schematic of the ol-
factory bulb architecture. Twenty-ﬁve mitral/tufted (M/T) cells, represented
by integrate-and-ﬁre elements provide the main olfactory bulb output (corre-
sponding to the lateral olfactory tract). These cells are reciprocally connected
by exponential decay synapses, which mediate lateral inhibition in the model
representing inhibitory coupling between M/T cells in vertebrates (open cir-
cles: excitatory synapse; closed circles: inhibitory synapse). Excitatory input
to the model is provided by olfactory receptor neurons (ORNs) driven by
population-coded receptor input shown by irregular polygons. Synaptic input
from identical ORNs is summed to represent receptor input convergence at a
single glomerulus (ellipse) (after Pearce et al., 2005). (B) Firing behavior of all
25 M/T cells in the network over time. (C) Membrane potential of a randomly
selected M/T cell. (D) The error between the exact integration numerical im-
plementation and the circuit response for the membrane potential and spike
timing analysis. A 32-bit representation was used with parameters ˜ τeE ≈ 2.1m s
fortheexcitatorysynapsesand ˜ τeI ≈ 17.1msfortheinhibitorysynapses.Thead-
justed weights for the excitatory synapses were ﬁxed to wadjE = 256.8, whereas
the inhibitory weights wadjIwere randomly chosen in a range between 1.5a n d
32, inclusive. The time constants of both the ORN and mitral cells were set
to ˜ τm ≈ 10.2 ms, with a capacitance C = 250 pF and threshold potential V th =
432,640, where kV = 21.6 × 106 counts V−1. For the ORNs, constant values ran-
domly selected in a range between 4240 and 6400 were used to represent a
constant concentration of an odor stimulus. The clock frequency was set to
fclk = 33 MHz, giving a step time   ≈ 30.3 ns and a speed-up factor kt = 3300.
Numerical solutions were carried out in C++ using long double precision using
an 80 bits representation with 64 bits out of them for the mantissa and 14 for the
exponent.Programmable Logic Construction Kit for Neuronal Modeling 26712672 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
hyper-real-time implementation. This programmable logic construction kit
supports the building of large and complex architectures of spiking neuron
networksbymeansofanefﬁcientcommunicationandmultiplexingscheme
of the neuronal elements. The circuits proposed here have the advantage
that they work in parallel rather than depending on serial computation,
being capable of simulating neuronal models in hyper real time. The source
codes (VHDL) for these designs have been made freely available for down-
load at http://www.neurolab.le.ac.uk/fpga/.
The implementation we present is restricted to the class of integrate-
and-ﬁre models where synapses are described by time-dependent currents.
However, with conductance-based synapses, only the differential equation
for the membrane potential is no longer linear time invariant. Future work
will need to explore whether an implementation with exact integration
(EI) for the conductances and an approximate method for the membrane
equation (e.g., FE as described in section 2) can be effectively combined.
Our circuit designs have been demonstrated to perform exact integra-
tion in a singleclockcycleand are also self-contained (no shared resources).
The advantage of single clock cycle operation is that designs may operate
faster than biological timescales (milliseconds); depending on their com-
plexity and the extent of optimization, processing speeds may be able to
approach the clock frequency of the FPGA. This hyper-real-time operation
provides us with the opportunity of multiplexing the physical neuron ar-
chitecture to achieve far greater neuron numbers. This is made possible
since only digital spike events need to be communicated between neurons,
thereby simplifying connectivity circuitry. Thus, programmable logic offers
neuronal simulation speeds approaching those of fully custom silicon solu-
tions, but not at the expense of ﬂexibility, design times, or network capacity,
since once the design process outlined in section 1 is complete, large-scale
models may be downloaded and adapted in milliseconds.
Most important, by using fully parallel single clock cycle implementa-
tions, our neuronal modeling approach leverages the impressive advances
in programmable logic, in terms of clock speeds and device capacities,
specialized DSP components, cost, and ongoing development of advanced
design tools. FPGA capabilities and operating speeds are currently under
exponential growth. Over the past 10 years, capacity has increased more
than200-foldwitheveryindicationthatMoore’slawwillcontinueforthese
devicessomewayintothefuture(Alfke&Hitesh,2005).Thisshouldensure
the possibility for growth of neuronal modeling capability that is commen-
surate with advances in programmable logic.
WecanestimatethetotalneuronalcapacityofanygivenFPGAdeviceby
calculating the amount of resources each neuronal element would require.
FPGA resources are measured in terms of slices, each of which contains
the fundamental digital elements required to implement arbitrary combi-
national logic functions (Xilinx, 2002). Thus, the number of slices required
per neuron element limits the total physical numbers implementable on aProgrammable Logic Construction Kit for Neuronal Modeling 2673
single device. Using a 16 bit representation for the synapse and a 32 bit
representation for the soma requires a total of 70 slices for the (exponential
decay) synapse circuit and 90 slices for the soma. Current FPGA devices
exceed 50,000 slices. Therefore, it is possible to conﬁgure, for instance, over
500 physical synapses without STDP (or 250 with STDP) and 100 physical
somas, running in parallel with single clock cycle iteration.
Such an arrangement would be ideal in the case of small and medium
networkdesignsforwhichwerequirehyper-real-timeoperationinorderto
understand or optimize its performance in different portions of its parame-
ter space. Over 100,000 prototypes of the same network could be simulated
in the time taken for one iteration of the biological network it represents.
Hence, programmable logic provides a powerful technology for under-
standing and optimizing small and medium spiking neuronal models.
Synapses with a common target soma often have similar time constants
in the brain. We can exploit this fact to make further gains in total synapse
numbers by using only one physical synapse and convolving the incom-
ing spikes of many virtual synapses. This is mathematically equivalent
to implementing large numbers of separate synapses with identical time
constants. In this way, a very large degree of synaptic convergence can be
achieved, which is particularly relevant to cortical models.
In order to test the comparative speed performance of our FPGA imple-
mentationwithasinglePC,weranparalleltrialsoftheolfactorybulbmodel
(see Figure 8) on both PC and FPGA. The identical network was coded in C
using the same set of equations as implemented by the hardware, compiled
using GNU gcc and executed on an Intel Pentium IV 3.06 GHz with 1 GB
RAM.ThePCperformed10,000numericaliterationsofthenetworkin0.76s
withoutdiskaccess.Duetosingleclockcyclenumericaliteration,theFPGA
performed the same number of steps on-chip in 303.03 µs, giving a speed-
up factor of about 2500. This provides a speed-up factor of three to four
orders of magnitude, which may be further optimized either by improving
the placement and routing of the logic of the design implementation or by
employing a faster FPGA (e.g., Virtex IV devices offer faster internal speeds
and shorter propagation delays; Xilinx, 2004). In further experiments, we
wereabletoincreasetheclockfrequencyupto50MHz.Inbothcases,about
55% of the FPGA resources were utilized.
We may also choose to exploit this hyper-real-time operation to increase
network sizes through the use of a multiplexing scheme. In this case, we
create virtual exponential decay units by storing and replacing the current
state at each numerical step and, optionally, the associated parameters such
as weight, afterhyperpolarization potential, threshold potential, and time
constant. This requires storage either on or off chip, which, depending on
the access time, will determine the communication overhead associated
with applying a multiplexing scheme.
FPGAshaveon-chipRAMblockstoragewithparallelaccesssingleclock
cycle operation, which minimizes this overhead (Xilinx XC2VP125 device2674 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
has 556 such blocks, which may be variously conﬁgured). In principle, for
this Xilinx device, we can use 556, 1024 × 18-bit buffers to create 105 to
106 virtual exponential decay elements operating in biological time (18-bit
precision).However,inpractice,therearetwoissuesthatmustbeaddressed
when using FPGAs to build large-scale neuronal models that exploit such
multiplexed architectures. Firstly, at each numerical step, we must store,
update,andreplacethestateofeachelement.Thisimposesatimeoverhead,
which is at least one additional clock cycle per numerical iteration step to
move the data to and from on-chip memory. This reduces the total number
of virtual elements that can be operated in biological time. Second, we must
consider connectivity between neurons, which imposes constraints on the
scale of architectures that can be deployed on a single device.
Cortical models represent a particular challenge due to highly conver-
gent synaptic input, which is often thousands of synapses targeting each
neuron.Sincemostcorticalmodelsadoptonlyoneortwotimeconstantsfor
theseconvergentsynapses,arrivinginputspikesmaybeconvolvedthrough
only one virtual synapse for each time constant, leading to a massive re-
duction in resource use. Such scaling issues represent an important focus of
futureresearchforimplementingFPGAneuronalmodelstoachievecortical
model scales in real time.
Appendix A: Alpha and Beta Functions Matrices Exponentials
The general solution of the exact integration (EI) scheme (Rotter &
Diesmann, 1999) for an nth order system of linear-time-invariant ODEs
is described by
yk+1 = eA yk, y(0) = y0, (A.1)
where eA  is the matrix exponential of the system. The diagonal elements
of this matrix describe the step dynamics for each of the exponential decay
stages to be used as part of the solution, which can be cascaded to simulate
thenthordersystem.However,tomaintainanexactsolution,theremaining
off-diagonal elements must be applied as coupling factors between the
cascaded stages.
A.1 Matrix Exponential of the Alpha Function Synapse.
Pα = eA  =
 
e
−  
τα 0
 e
−  
τα e
−  
τα
 
. (A.2)Programmable Logic Construction Kit for Neuronal Modeling 2675
A.2 Matrix Exponential of the Beta Function Synapse.
Pβ = eA  =



e
−  
τβ1 0
τβ1τβ2(e
−  
τβ1 −e
−  
τβ2 )
τβ1−τβ2
e
−  
τβ2


. (A.3)
Appendix B: Matrix Exponential of the Combined Neuron Model
Based on Alpha Function Synapse
B.1 Matrix Exponential of the Combined Neuron Model Based on
Alpha Function Synapse.
Pm = eA  =




e
−  
τα 00
 e
−  
τα e
−  
τα 0
τατm((τα−τm) e
−  
τα −τατm(e
−  
τα −e
−  
τm ))
C(τα−τm)2
τατm(e
−  
τα −e
−  
τm )
C(τα−τm) e
−  
τm



.
(B.1)
B.2 LinearTransformationoftheMatrixExponentialoftheCombined
Neuron Model Based on Alpha Function Synapse. Here we describe a
lineartransformationthatmaybeappliedtothematrixexponentialinorder
to simplify the combined neuron circuit. The goal of this transformation is
to reduce the hardware required to implement the combined neuron model
by removing the requirement for multipliers.
First, for simplicity, let us rewrite the matrix exponential (Pm = eA )i n
a more general form as
Pm =



α 00
r β 0
qp γ


, (B.2)
where α, β,a n dγ deﬁne the three exponential decay circuits connected in
cascade and p, q,a n dr are constant coupling factors, which call for the use
of multipliers. We express the matrix exponential in the form



α 00
1 β 0
01γ


, (B.3)
whichmeansthattheresultingcombinedneuronmodelcircuitwillbemade
up of three exponential decay elements connected in cascade but, crucially,
with no multiplication factors coupling them. The best way to achieve this2676 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
form without altering the dynamics is by applying a linear transformation.
We must ﬁnd a linear transformation that, when applied to the matrix
exponential, such as equation B.1, yields a transformed matrix exponential
of the form shown in equation B.3. The following transformation matrix
fulﬁlls this requirement:
Q =



b 00
fc0
00d


. (B.4)
Thus, we must now ﬁnd the values of b, c, d,a n d f that transform equa-
tion B.1 into the form of equation B.3.
Applying this transformation matrix to the general solution, equa-
tion A.1, gives
y = Qz, (B.5)
which gives rise to the linear transformation of equation A.1 given by
zk+1 = Q−1PmQ · zk, z(0) = z0, (B.6)
where
Q−1PmQ =



α 00
 
−
f
bcα + 1
cr
 
b +
f
c ββ0
b
dq +
f
d p c
d p γ


. (B.7)
As was already mentioned, in order to keep simplicity in the circuit, the
transformed matrix exponential should be in the form of equation B.3. The
following conditions ensure that this is accomplished:
− f α + br + fβ = c,
bq + fp= 0, (B.8)
cp = d.
To maintain the dynamic of the membrane potential, it is required that
d = 1. Now we can easily solve for b, c,a n d f , which results in
b =
−1
(β − α)q − pr
(B.9)Programmable Logic Construction Kit for Neuronal Modeling 2677
c =
1
p
(B.10)
d =1 (B.11)
f =
q
p((β − α)q − pr)
. (B.12)
In turn, the transformed initial conditions for a single presynaptic action
potential should then be
z(0) =



1
b
we
τα
−
f
bc
we
τα
0


. (B.13)
In the context of the combined circuit, the ﬁrst and second row of equa-
tion B.13 correspond to the initial conditions of the exponential decay
units (in the same order), which produces the alpha function synapse (see
Figure2A).Theﬁrstrowrequiresustoadjustthesynapticweightaccording
to
wadj =
1
b
we
τα
, (B.14)
where b is deﬁned in equation B.9 and w is the synaptic weight.
In turn, the second row gives the initial condition that must be applied
to the second exponential decay unit of the alpha function synapse, which
is carried out by making
ρ =−
f
bc
we
τα
, (B.15)
where b, c,a n d f are deﬁned in equations B.9 to B.12. It is important to
point out that ρ must be nonzero in combined neuron models based on
alpha and beta function synapses; otherwise, ρ = 0 and can be neglected.
Furthermore, since equation A.1 describes the dynamics of the combined
neuron model in between spikes, both wadj and ρ should be added every
time an action potential is received at the synapse (see Figure 2A).
Finally, the soma model requires initial conditions set to zero. Meeting
these conditions will guarantee an EI of a leaky integrate-and-ﬁre model
receiving dendritic currents modeled by alpha functions.2678 R. Guerrero-Rivera, A. Morrison, M. Diesmann, and T. Pearce
Appendix C: Exponential Decay Synapse-Based Combined Neuron
Model Matrix Exponential
C.1 Matrix Exponential of the Combined Neuron Model Based on
Exponential Decay Synapse.


e
−  
τe 0
τeτm(e
−  
τe −e
−  
τm )
C(τe−τm) e
−  
τm

. (C.1)
Acknowledgments
This work was funded by the EU-FPV IST program in Future and
Emerging Technologies IST-2001-33066 (to T.C.P.) and ESPRC grant
GR/R37968/01(P)(to T.C.P.). R.G. is funded by CONACYT, Mexico. A.M.
and M.D. are partially funded by DIP F1.2, EU Grant 15879 (FACETS), and
BMBF Grant 01GQ0420 to BCCN Freiburg. Special thanks go to Manuel
S´ anchez-Monta˜ n´ es with whom we had many useful discussions regard-
ing exact integration solutions and axonal delay implementations. Thanks
also go to Carlo Fulvi-Mari, who provided the analytical expression for the
asymptotic peak synaptic output.
References
Abbott, L. F., & Nelson, F. B. (2000). Synaptic plasticity: Taming the beast. Nature
Neuroscience, 3, 1179–1183.
Alfke, P., & Hitesh, P. (2005). Achieving breakthrough performance with Virtex-
4, the world’s fastest FPGA. (Online conference), Accessed February 2005 at
http://www.techonline.com/community/ed resource/webcast/37558.
Barnes, C. W., Tran, B. N., & Leung, S. H. (1985). On the statistics of ﬁxed-point
roundoff error. IEEE Trans. Acoustic, Speech, and Signal Proc., 33, 595–606.
Crook, S. M., Ermentrout, G. B., Vanier, M. C., & Bower, J. M. (1997). The role of
axonal delay in the synchronization of networks of coupled cortical oscillators. J.
Comp. Neurosci., 4, 161–172.
Diesmann, M., Gewaltig, M., Rotter, S., & Aertsen, A. (2001). State space analysis of
synchronousspikingincorticalneuralnetworks.Neurocomputing,38–40,565–571.
Friedrich, R. W., Habermann, C. J., & Laurent, G. (2004). Multiplexing using syn-
chrony in the zebraﬁsh olfactory bulb. Nat. Neuroscience, 7, 862–871.
Friedrich,R.W.,&Laurent,G.(2001).Dynamicoptimizationofodorrepresentations
by slow temporal patterning of mitral cell activity. Science, 291, 889–894.
Gerstner, W., & Kistler, W. (2002). Spiking neuron models. Cambridge: Cambridge
University Press.
Graas, E. L., Brown, E. A., & Lee, R. H. (2004). An FPGA-based approach to high-
speed simulation of conductance-based neuron models. Neuroinformatics, 2(4),
417–436.Programmable Logic Construction Kit for Neuronal Modeling 2679
G¨ utig, R., Aharonov, R., Rotter, S., & Sompolinsky, H. (2003). Learning input corre-
lations through nonlinear temporally asymmetric Hebbian plasticity. J. Neurosci.,
23, 3697–3714.
Jack, J. J. B., Noble, D., & Tsien, R. W. (1975). Electric current ﬂow in excitable cells. New
York: Oxford University Press.
Lambert, J. D. (1973). Computational methods in ordinary differential equations. New
York: Wiley.
Maxﬁeld, C. (2004). The design warrior’s guide to FPGAs. Burlington, MA: Newnes.
Pearce, T. C., Koickal, T., Fulvi-Mari, C., Covington, J. A., Tan, F. S., Gardner, J. W.,
& Hamilton, A. (2005). Silicon-based neuromorphic implementation of the ol-
factory pathway. In 2005 2nd International IEEE Conference on Neural Engineering
(pp. 307–312). Piscataway, NJ: 2005.
Rotter, S., & Diesmann, M. (1999). Exact digital simulation of time-invariant linear
systems with applications to neuronal modeling. Biological Cybernetics, 81, 381–
402.
Song, S., Miller, K. D., & Abbott, L. F. (2000). Competitive Hebbian learning through
spike-timing dependent synaptic plasticity. Nature Neuroscience, 3, 919–926.
Tuckwell,H.C.(1988). Introductiontotheoreticalneurobiology.Cambridge:Cambridge
University Press.
Weinstein, R. K., & Lee, R. H. (2006). Architecture for high-performance FPGA im-
plementations of neural models. Journal of Neural Engineering, 3, 21–34.
Xilinx. (2002). Virtex-II pro: Platform FPGA handbook.S a nJ o s e ,C A .
Xilinx. (2003). Development system reference guide. San Jose, CA. Retrieved
March 17, 2005, from http://toolbox.xilinx.com/docsan/xilinx6/books/docs/
dev/dev.pdf
Xilinx (2004). Virtex-4 family overview. In Virtex-4 handbook. San Jose, CA. Re-
trieved October 21, 2004, from http://direct.xilinx.com/bvdocs/publications/
ds112.pdf
Received August 10, 2005; accepted April 27, 2006.