Floating-Point Multiplication Using Neuromorphic Computing by Dubey, Karn et al.
arXiv:2008.13245v1 [cs.ET] 30 Aug 2020
Floating-Point Multiplication
Using Neuromorphic Computing
Karn Dubey Urja Kothari Shrisha Rao
Abstract
Neuromorphic computing describes the use of VLSI systems to
mimic neuro-biological architectures and is also looked at as a promis-
ing alternative to the traditional von Neumann architecture. Any new
computing architecture would need a system that can perform floating-
point arithmetic. In this paper, we describe a neuromorphic system
that performs IEEE 754-compliant floating-point multiplication. The
complex process of multiplication is divided into smaller sub-tasks per-
formed by components Exponent Adder, Bias Subtractor, Mantissa
Multiplier and Sign OF/UF. We study the effect of the number of neu-
rons per bit on accuracy and bit error rate, and estimate the optimal
number of neurons needed for each component.
Keywords: IEEE 754, floating point arithmetic, neuromorphic computing,
Neural Engineering Framework (NEF)
1 Introduction
Neuromorphic computing has recently become prominent as a possible fu-
ture alternative to the traditional Von Neumann architecture (Zargham,
1996) of computing. Some of the problems that are commonly faced when
working with classical CMOS-based Von Neumann machines are the limita-
tions on their energy efficiencies, and also the absolute limits to speed and
scaling on account of physical limits (Mead, 1990; Koch and Segev, 2003).
Though Moore’s Law held for long and made possible rapid and sustained
progress in hardware performance (Moore, 1965), it is now quite clear that
this will not last. Hence, there is a need to look for alternative computing
architectures, including neuromorphic computing (Wang et al.,
2017; Kim et al., 2015; Esser et al., 2016). The Von Neumann architecture
also has an inherent problem, commonly called the “Von Neumann bottle-
neck,” because of the limited bandwidth between the CPU and the main
device memory. Thus, newer architectures often avoid a wide gap between
processing and main memory (Monroe, 2014; Moore, 1965).
Rapid growth in cognitive applications is one of the important motiva-
tions for interest in neuromorphic computing, which promises the ability
to perform a high number of complex functions through parallel operation.
Neural solutions are possible for machine learning problems that involve
complex mathematical calculations (Eliasmith, 2013; Pastur-Romay et al.,
2017). There have been some attempts to develop systems of computation
on neuromorphic architectures (Koch and Segev, 2003; Gosmann and Elia-
smith, 2016) but not much has been done in the specific area of numerical
computations, particularly for floating-point arithmetic.
Floating-point arithmetic (IEEE, 2019) is ubiquitous in scientific as well
as general computing. It is a basic operation that should be supported
by any computational architecture. In this paper, we describe a system
which can perform the multiplication of two IEEE 754-compliant floating-point
numbers on a neuromorphic architecture. Our work extends that of
George et al. (2019), who showed how floating-point addition can be
achieved using neuromorphic computing. We have designed a modular
architecture which performs the conventional multiplication process (Erle et al.,
2009), but instead of logic gates it uses groups of neurons as the basic unit.
The architecture is easily scalable to double-precision floating point num-
bers.
The system is designed on the basis of the Neural Engineering Framework
(NEF) which, as the name suggests, provides a basic framework to develop
a neuromorphic system. For the implementation, simulation and testing of
our design we used Nengo (Nengo, c; Bekolay et al., 2014), a graphical and
scripting-based software package for simulating large-scale neural systems.
To use Nengo, we define groups of neurons called ensembles, and then form
connections between them based on what computation (Nengo, a,b) should
be performed.
The architecture is divided into four components: Exponent Adder, Bias
Subtractor, Mantissa Multiplier, and Sign/Overflow and Underflow. The
Exponent Adder uses a stage-wise adder which takes 8-bit exponents and
produces an 8-bit output along with carry. The Bias Subtractor takes the
output of the Exponent Adder and subtracts the bias and produces 8-bit
output. The subtraction is done using 2’s complement method. The Man-
tissa Multiplier is the core of our system design; it follows a stage-wise
process, taking two 23-bit mantissa inputs, and outputs a 23-bit resultant
mantissa (see Section 3.3). Our system also indicates if there is an overflow
or underflow during the exponent addition process (see Section 3.5).
Figure 1: IEEE 754 32-bit floating-point representation
We used two performance analysis metrics, Mean Absolute Error (MAE)
and Mean Encoded Error (MEE), to estimate the performance of our system.
We also observed how accuracy changes as the number of neurons in each
component of our system is varied.
The rest of the paper is structured as follows. We first give a brief de-
scription of the IEEE 754 floating-point multiplication process in Section 2.1,
and then briefly describe the Neural Engineering Framework (NEF) and its
three basic principles: representation, transformation and dynamics, in
Section 2.2. After this we explain the overall architecture in Section 3 using
Figure 3. Section 4 presents the two metrics that we used to evaluate our
system: the Mean Absolute Error (MAE) and the Mean Encoded Error (MEE).
In Section 4.1 we describe the relationship between the number of neurons
and accuracy, and in Section 4.2 the relationship between the number of
neurons and bit error.
In Section 4.3 we describe how we estimated the optimal number of neurons
required for all the ensembles, and list them in Table 1. Finally, we present
the conclusions of our work in Section 5.
2 Background
First we briefly discuss the floating-point multiplication process as per the
IEEE 754 standard (Erle et al., 2009), then we describe the Neural Engineer-
ing Framework (NEF) which we have used to design, simulate and evaluate
our system (Stewart, 2012).
2.1 IEEE 754 floating-point multiplication
Figure 2 illustrates the overall process of multiplication of two floating-point
numbers Input1 and Input2 represented in binary format. Figure 1 is an ex-
ample of how a 32-bit floating-point number is represented according to the
IEEE 754 standard (IEEE, 2019). A sign bit is used to represent whether
the number is positive or negative. 8 and 23 bits are used to represent
Figure 2: Process for multiplication of floating point numbers
the exponent and mantissa values respectively. While designing this sys-
tem we assumed that both inputs, i.e., the two floating-point numbers, are
represented according to the IEEE 754 standard in binary representation.
In Figure 2, the exponents E1 and E2 are added. The bias value (127)
is subtracted from the sum of E1 and E2. The difference is placed in the
Exponent field (see Figure 1). Each mantissa is of 24 bits (23 bits + 1 hidden
bit). Mantissas M1 and M2 are multiplied to give a 48-bit output; if the
48th (most significant) bit is 1, the result is normalized by right shifting
and incrementing the resultant exponent (if it is 0, nothing further is to be
done). To find the resultant mantissa, we take the first 24 bits (23 bits + 1
hidden bit). The resultant sign field is the XOR of the two sign bits S1 and S2.
For a better understanding of the above algorithm, see Yi and Ding
(2009).
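The steps above can be sketched in ordinary Python, operating directly on the sign, exponent and fraction fields. This is only an illustration of the conventional algorithm, not the neuromorphic implementation: the function names are hypothetical, inputs are assumed to be normal (non-zero, finite) single-precision values, the result is assumed not to overflow or underflow, and the mantissa is truncated rather than rounded.

```python
import struct

BIAS = 127  # single-precision exponent bias


def float_to_fields(value):
    """Split a value (stored as IEEE 754 single precision) into its fields."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def fields_to_float(sign, exponent, fraction):
    """Reassemble sign, 8-bit exponent and 23-bit fraction into a float."""
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def multiply_fields(a, b):
    """Multiply two normal floats field by field (truncating, no OF/UF check)."""
    s1, e1, f1 = float_to_fields(a)
    s2, e2, f2 = float_to_fields(b)

    # Restore the hidden bit to obtain the 24-bit mantissas.
    m1 = (1 << 23) | f1
    m2 = (1 << 23) | f2

    exponent = e1 + e2 - BIAS   # add exponents, subtract the bias
    product = m1 * m2           # up to 48 bits

    if product >> 47:           # 48th bit set: normalize
        product >>= 1
        exponent += 1

    # The leading 1 is now at bit 46; keep the 23 bits that follow it.
    fraction = (product >> 23) & 0x7FFFFF
    sign = s1 ^ s2              # XOR of the sign bits
    return fields_to_float(sign, exponent, fraction)


print(multiply_fields(1.5, 2.5))
print(multiply_fields(3.0, 3.0))   # exercises the normalization branch
```

The second call exercises the normalization branch: 1.1b times 1.1b gives 10.01b, so the 48th bit of the product is set and the exponent is incremented.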
2.2 Neural Engineering Framework
The Neural Engineering Framework (NEF) (Stewart, 2012; Voelker and
Eliasmith, 2017; Voelker et al., 2017) is a computational framework for
mapping computations onto networks of spiking neurons. It provides a
general way to generate circuits whose analytically determined synaptic
weights provide the desired functionality. The NEF consists of three
principles: representation, transformation, and dynamics (Nengo, c;
Eliasmith and Anderson, 2002). These principles can be used to construct
complex neural models.
2.2.1 Representation
Neural representations are defined by the combination of nonlinear encod-
ing and weighted linear decoding. (We use the notation given by Stewart
(2012).) If x is the value represented by a neural ensemble and ei is the
encoding vector for which that neuron fires most strongly, then activity ai
for each neuron can be represented as follows:
$$a_i = G_i[\alpha_i e_i \cdot x + b_i], \quad i = 1, \ldots, n \tag{1}$$
where $G_i$ is the neural non-linearity, $\alpha_i$ is the gain parameter, and
$b_i$ is the constant background bias current for the neuron. Given the
activities, the value of x can be estimated using linear decoders $d_i$:
$$\hat{x} = \sum_i a_i d_i \tag{2}$$
The decoding weights $d_i$ are found by solving a least-squares minimization
problem: d is the set of weights that minimizes the difference between x and
its estimate (Stewart, 2012).
$$d = \Gamma^{-1}\Upsilon \tag{3}$$
$$\Gamma_{ij} = \sum_x a_i a_j \tag{4}$$
$$\Upsilon_j = \sum_x a_j x \tag{5}$$
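The step from the least-squares criterion to Equation (3) is not spelled out; a standard derivation is sketched below. The cost symbol J is introduced here for illustration only and does not appear in the source, and the matrix Γ is assumed to be invertible.

```latex
\begin{align*}
J(d) &= \sum_x \Bigl( x - \sum_i a_i d_i \Bigr)^2 \\
\frac{\partial J}{\partial d_j}
  &= -2 \sum_x a_j \Bigl( x - \sum_i a_i d_i \Bigr) = 0 \\
\sum_i \Bigl( \sum_x a_i a_j \Bigr) d_i &= \sum_x a_j x
  \quad\Longrightarrow\quad \Gamma d = \Upsilon
  \quad\Longrightarrow\quad d = \Gamma^{-1} \Upsilon
\end{align*}
```

Setting the gradient to zero for every j gives one linear equation per neuron, and collecting them yields the matrix form with Γ and Υ as defined in Equations (4) and (5).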
2.2.2 Transformation
Section 2.2.1 shows how to encode and decode a vector in the distributed
activity of a population of neurons. To perform computation, these neurons
need to be connected and information needs to be transferred from one group
of neurons to another. This is done via synaptic connections. In other words,
we want our connections to compute some functions. Transformation is
used to approximate these functions (Stewart, 2012). It is another
weighted linear decoding, this time approximating a function f(x); the
decoding weights $d^{f(x)}$ can be computed as:
$$d^{f(x)} = \Gamma^{-1}\Upsilon^{f(x)} \tag{6}$$
$$\Gamma_{ij} = \sum_x a_i a_j \tag{7}$$
$$\Upsilon_j^{f(x)} = \sum_x a_j f(x) \tag{8}$$
In general, the more non-linear and discontinuous a function is, the lower
is the accuracy of its computation. Accuracy also depends on other factors
like neuron properties, number of neurons, and the encoding method. The
NEF uses the same trick seen in support vector machines (Cristianini
and Shawe-Taylor, 2000) to allow complex functions to be computed in a
single set of connections as we choose ei, αi and bi. The function f(x) is
constructed by a linear sum of tuning curves of neurons, so a wider variety
of tuning curves leads to better function approximation (Stewart, 2012).
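To make Equations (6) to (8) concrete, the following sketch computes function decoders for f(x) = x² in plain Python. Several things here are assumptions made for illustration and are not from the source: the neurons are rate-based rectified-linear units rather than spiking LIF neurons, gains are fixed to 1, intercepts are evenly spaced, and a small regularization term `reg` is added to the diagonal of Γ so that the linear system is well conditioned.

```python
def tuning_curves(x, encoders, intercepts):
    """Rectified-linear stand-in for a_i = G_i[alpha_i e_i x + b_i] (gain 1)."""
    return [max(0.0, e * x - c) for e, c in zip(encoders, intercepts)]


def solve(matrix, rhs):
    """Solve matrix * sol = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def function_decoders(f, xs, encoders, intercepts, reg=1e-3):
    """Compute d^f = Gamma^{-1} Upsilon^f over the sample points xs."""
    acts = [tuning_curves(x, encoders, intercepts) for x in xs]
    n = len(encoders)
    gamma = [[sum(a[i] * a[j] for a in acts) + (reg if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    upsilon = [sum(a[j] * f(x) for a, x in zip(acts, xs)) for j in range(n)]
    return solve(gamma, upsilon)


def square(x):
    return x * x


# 20 hypothetical neurons: 10 preferring +x and 10 preferring -x.
base = [-0.9 + 0.2 * k for k in range(10)]
encoders = [1.0] * 10 + [-1.0] * 10
intercepts = base + base
xs = [-1.0 + 0.01 * k for k in range(201)]

decoders = function_decoders(square, xs, encoders, intercepts)
estimates = [sum(a * d for a, d in zip(tuning_curves(x, encoders, intercepts),
                                       decoders)) for x in xs]
rmse = (sum((est - square(x)) ** 2
            for est, x in zip(estimates, xs)) / len(xs)) ** 0.5
print("RMSE of decoded x^2:", rmse)
```

Passing a different `f` changes only Υ, which is why a single set of tuning curves can support many different connections.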
2.2.3 Dynamics
Dynamics of the neural systems can also be modeled in NEF using control-
theoretic state variables. However, NEF also provides a direct method for
computing dynamic functions of the form:
$$\frac{dx}{dt} = F(x) + H(u) \tag{9}$$
where x is the value being represented, u is some input, and F and H are
arbitrary functions.
3 System Architecture
We have designed a system that performs floating-point multiplication ac-
cording to the IEEE standard (IEEE, 2019). Figure 3 illustrates the system
architecture. The two inputs are represented as (S1,M1,E1) and (S2,M2,E2)
and the output is represented as (Sout,Mout,Eout). Here Si represents the
sign bit, Mi represents the mantissa bits, and Ei represents the exponent bits,
where i ∈ {1, 2, out}. This representation follows the IEEE-754 32-bit
floating point standard (IEEE, 2019). Each of the components is described
in the following subsections.
3.1 Simulation
For simulation we use the Leaky Integrate-and-Fire (LIF) neural model.
We create the neural ensembles using the Nengo library to represent input
information. The values of two properties, radius and dimension of the
ensemble are set in the same way as George et al. (2019). We have also used
the same encoding scheme as George et al. (2019) to transfer the output
of one ensemble as an input to another ensemble. For the AND ensemble
(Section 3.3) we have used the following encoding scheme:
$$E(\hat{x}_i) = \begin{cases} 1, & \hat{x}_i \ge 1.5 \\ 0, & \text{otherwise} \end{cases} \tag{10}$$
Figure 3: Architecture diagram for single precision IEEE floating point num-
ber multiplication
3.2 Exponent Adder
As shown in Figure 3, the Exponent Adder takes three inputs: E1, E2 and a
normalization bit produced by the Mantissa Multiplier (see Section 3.3). It
adds the 8-bit exponents E1 and E2 and the normalization bit (as Cin), and
produces an 8-bit output E′ and a carry bit Cout. To implement this stage-wise
addition process, we construct a network that takes two inputs (the
corresponding bits of the two exponents, i.e., ai and bi, where 0 ≤ i ≤ 7) and
represents them using two different ensembles, say A ensemble and B ensemble.
These two ensembles are then connected to another ensemble, say C ensemble,
through synaptic connections. Now the sum of A ensemble and B ensemble is
represented by C ensemble. The adder is implemented in the same way as in prior
literature (George et al., 2019; Nengo, a). The Cout bit produced by the
Exponent Adder is used in the calculation of overflow and underflow (see
Section 3.5).
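The logical behaviour of the stage-wise adder can be sketched as an ordinary ripple-carry adder. This is a plain-Python illustration, not the neural implementation: each bit is an exact integer here rather than a value decoded from an ensemble, and the function names and the little-endian bit order are assumptions made for the example.

```python
def to_bits(value, width=8):
    """Little-endian list of bits (index 0 is the least significant bit)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def full_adder(a, b, c_in):
    """One stage: the represented sum a + b + c_in lies in 0..3."""
    total = a + b + c_in
    return total % 2, 1 if total >= 2 else 0


def exponent_adder(e1_bits, e2_bits, c_in=0):
    """Add two 8-bit exponents plus a carry-in (the normalization bit)."""
    out, carry = [], c_in
    for a, b in zip(e1_bits, e2_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry  # 8-bit sum E' and carry bit Cout


bits, c_out = exponent_adder(to_bits(130), to_bits(125), c_in=1)
print(from_bits(bits), c_out)
```

In this example 130 + 125 + 1 = 256, so the 8-bit sum wraps to 0 and the carry bit is set.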
3.3 Mantissa Multiplier
The Mantissa Multiplier component is the core of our system. It is a stage-
wise process. Figure 5 shows its working. We use an AND ensemble and
adders as building blocks for multiplication (see Figure 4). The AND En-
semble is used to implement neuromorphic AND logic. The encoding scheme
for it is given in (10). In the AND ensemble we connect two inputs. If both
inputs are 1 then the output is more than 1.5, so the output is set to 1;
Figure 4: Building block of Mantissa Multiplier component consisting of
AND Ensemble and Adder
Figure 5: Process for multiplication of floating point numbers
otherwise it is 0. The operation and connections of each block at every stage
are described below in detail for two mantissas A and B:
• Each block j of stage i is given four inputs Ai, Bj, sum sin produced
by block (j + 1) of stage (i− 1) and carry cin from block (j − 1) of
stage i, where 0 ≤ i, j ≤ 23.
• As shown in Figure 5, the last block of each stage i takes cout of the
previous stage’s last block as sin.
• The AND Ensemble of each block of every stage performs AND oper-
ation on Ai and Bj and outputs AiBj .
• The adder of each block adds the three bits AiBj, sin and cin and
produces sout and cout (George et al., 2019; Nengo, a).
• sout and cout produced as outputs are fed as input to the next stage
and next block respectively.
The first block of every stage is given cin as 0. The output obtained at
each stage ensemble is encoded and fed to the next stage ensemble as input.
Encoding of the output at each stage helps to filter and boost up the output
signal. At each stage the first block’s sout represents the output bit of the
mantissa as shown in Figure 5. At the end of this process we get a 48-bit
product. If the 48th bit is 1, we set the normalization bit and right shift
the product by one; the normalization bit increments the exponent by
one (see Section 3.2). The resultant product is in the 1.M form as per the IEEE
standard. We take the first 23 bits of M and store them as the resultant
mantissa Mout.
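The block wiring described above corresponds to a conventional array multiplier, which can be sketched in plain Python as follows. This is a logical model only: bits are exact integers rather than decoded ensemble outputs, the AND block is modelled by summing its two inputs and applying the 1.5 threshold of (10), and the function names and little-endian bit order are assumptions for illustration.

```python
def to_bits(value, width):
    """Little-endian list of bits (index 0 is the least significant bit)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


def and_ensemble(a, b, threshold=1.5):
    """AND via thresholding: the summed input exceeds 1.5 only for 1 and 1."""
    return 1 if a + b >= threshold else 0


def full_adder(a, b, c_in):
    total = a + b + c_in
    return total % 2, 1 if total >= 2 else 0


def multiply_mantissas(a_bits, b_bits):
    """Stage-wise multiplication; stage i handles bit A_i, block j handles B_j."""
    n = len(b_bits)
    s_in = [0] * n                    # sums handed down from the previous stage
    product = []
    for a_i in a_bits:                # one stage per bit of A
        carry, s_out = 0, []          # first block of every stage gets c_in = 0
        for j in range(n):
            s, carry = full_adder(and_ensemble(a_i, b_bits[j]), s_in[j], carry)
            s_out.append(s)
        product.append(s_out[0])      # first block's s_out is an output bit
        s_in = s_out[1:] + [carry]    # block j receives s_out of block j + 1;
                                      # the last block receives the final carry
    return product + s_in             # 48 bits for two 24-bit mantissas


def normalize(product_bits):
    """Return (normalization bit, 23 fraction bits, most significant first)."""
    norm = product_bits[47]
    top = 47 if norm else 46          # position of the leading 1
    return norm, [product_bits[k] for k in range(top - 1, top - 24, -1)]


m = 0xC00000                          # 1.1b with the hidden bit restored
norm, fraction = normalize(multiply_mantissas(to_bits(m, 24), to_bits(m, 24)))
print(norm, fraction[:4])
```

Each stage adds one partial product to the running sum and emits one finished product bit, so 24 stages of 24 blocks yield the 48-bit result.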
3.4 Bias Subtractor
As shown in Figure 3, this component subtracts the bias from the result
which we get from exponent addition. The subtraction is done using the
2’s complement method (Lilja and Sapatnekar, 2005). This is achieved by
taking the 2’s complement of the bias and then performing addition. To
perform the 2’s complement, we design a converter which takes the 8-bit bias
and represents it using a neural ensemble. We take the 1’s complement of the
bias by flipping its bits, and then use the 8-bit adder to add 1 to the 1’s
complement of the bias. The final output is stored as the resultant exponent Eout.
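The 2’s complement subtraction can be illustrated with the same kind of bit-level sketch. This is an assumption-laden illustration rather than the authors' code: the helper names are hypothetical, bits are exact integers, and the 8-bit result is only meaningful when the true difference fits in the range 0 to 255.

```python
def to_bits(value, width=8):
    """Little-endian list of bits (index 0 is the least significant bit)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_add(a_bits, b_bits, c_in=0):
    """8-bit ripple-carry addition; returns (sum bits, carry out)."""
    out, carry = [], c_in
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        out.append(total % 2)
        carry = 1 if total >= 2 else 0
    return out, carry


def twos_complement(bits):
    """Flip every bit (1's complement), then add 1 using the adder."""
    flipped = [1 - bit for bit in bits]
    one = [1] + [0] * (len(bits) - 1)
    result, _ = ripple_add(flipped, one)
    return result


def subtract_bias(exp_sum_bits, bias=127):
    """Compute (E1 + E2) - bias as an addition of the 2's complement of bias."""
    diff, _ = ripple_add(exp_sum_bits, twos_complement(to_bits(bias)))
    return diff  # carry out is discarded (arithmetic modulo 256)


print(from_bits(twos_complement(to_bits(127))))
print(from_bits(subtract_bias(to_bits(255))))
```

Because the arithmetic is modulo 256, the sketch also gives the right answer when the exponent sum itself overflowed 8 bits, for example 260 stored as 4 yields 133.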
Figure 6: Accuracy vs. Number of neurons/Ensemble Graph of Mantissa
Multiplier
3.5 Sout and OF/UF
This component computes the Sout bit of the output along with the OF/UF
(overflow/underflow) flag, which can then be used for rounding. It computes
the output sign bit Sout by performing a neuromorphic XOR operation on the two
sign bits S1 and S2 (George et al., 2019). Overflow is indicated by setting
the OF/UF flag to 1 if a carry is found during exponent addition.
4 Observations and Results
We simulated the individual components of the system and integrated them
to arrive at fully functional IEEE floating point multiplication. We probed
the outputs of each component at a time interval of 10ms and computed
errors in each of them. We used the following two techniques for evaluating
the performance of each component.
$$\text{Mean Absolute Error} = \frac{\sum |\text{Computed val} - \text{Actual val}|}{\text{number of values}}$$
$$\text{Accuracy} = (1 - \text{Mean Absolute Error}) \times 100$$
The Mean Absolute Error is the measure of the absolute difference be-
tween the actual bit value and the value computed by our system, averaged
over all the bits. In our case, MAE arises from approximating a discontinuous
function using the NEF, plus noise and randomness in spiking neurons.
$$\text{Mean Encoded Error} = \frac{\sum |\text{Actual bit} \oplus \text{Encoded val}|}{\text{number of bits}}$$
We encoded the output value of each component and compared it with the
actual bit value. In other words, we calculated the Hamming distance between
the encoded bit values and the actual bit values, then averaged it over all the bits.
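Both metrics are simple to compute once the decoded outputs are available. The sketch below is illustrative only: the function names are hypothetical, and the encoding threshold of 0.5 is an assumption made for the example (the source specifies a threshold of 1.5 for the AND ensemble and defers to George et al. (2019) for the other ensembles).

```python
def mean_absolute_error(computed, actual):
    """Average absolute difference between decoded values and true bits."""
    return sum(abs(c - a) for c, a in zip(computed, actual)) / len(actual)


def accuracy(computed, actual):
    """Accuracy in percent, as defined from the Mean Absolute Error."""
    return (1.0 - mean_absolute_error(computed, actual)) * 100.0


def encode(value, threshold=0.5):
    """Map a decoded value to a bit; the threshold is an assumed value."""
    return 1 if value >= threshold else 0


def mean_encoded_error(computed, actual, threshold=0.5):
    """Hamming distance between encoded and true bits, averaged over bits."""
    wrong = sum(a ^ encode(c, threshold) for c, a in zip(computed, actual))
    return wrong / len(actual)


decoded = [0.9, 0.1, 0.4, 1.0]   # hypothetical decoded ensemble outputs
truth = [1, 0, 1, 1]             # the corresponding correct bits

print(mean_absolute_error(decoded, truth))
print(accuracy(decoded, truth))
print(mean_encoded_error(decoded, truth))
```

The two metrics can disagree: a decoded value of 0.4 for a true bit of 1 contributes 0.6 to the absolute error but a full bit error after encoding, while a value of 0.9 contributes 0.1 to the absolute error and no bit error.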
4.1 Accuracy versus number of neurons
Figure 6 illustrates the accuracy of the Mantissa Multiplier. (For the Bias
Subtractor and Exponent Adder we get very similar graphs.)
We varied the number of neurons from 100 to a maximum of 800 per bit,
and observed the accuracy across all components. We observed that
the accuracy initially increases with the number of neurons, but beyond some
threshold number of neurons the increase in accuracy is not significant. In the
Mantissa Multiplier component we can see that accuracy increases rapidly
until the number of neurons reaches 300; after that there is no significant
improvement.
4.2 Bit error versus number of neurons
For the Mantissa Multiplier component we observed that bit error is high
when the number of neurons is very low. When the number of neurons is
below 200, we got 1 bit error out of 48 bits, which is roughly 2%. After
increasing the number of neurons to 300 we get no bit errors. For the
Exponent Adder and Bias Subtractor we get no bit errors even when the
number of neurons is below 200.
4.3 Total number of neurons
We observed in Section 4.1 that the accuracy increases with an increase
in the number of neurons. We estimated the optimal number of neurons
required for each ensemble, as listed in Table 1.

Table 1: Number of neurons for each ensemble

Component            Number of neurons
Exponent Adder       300
Bias Subtractor      300
Mantissa Multiplier  600
Sign and OF/UF       100

5 Conclusion
In this paper we describe an approach to build an IEEE-754 standard
floating point unit using neuromorphic hardware with spiking neurons. Such
devices can mimic aspects of the brain’s structure, and may be an
energy-efficient alternative to the classical Von Neumann architecture. Such a
neuromorphic floating-point unit is a critical step in developing an alternative,
neuromorphic CPU architecture.
Our architecture comprises a complex floating-point multiplication pro-
cess. The most complex part of the process is the Mantissa Multiplier,
which we have realized successfully by using stage-wise multiplication and
a robust encoding scheme. The architecture is easily scalable to
double-precision floating point numbers. We have checked for the presence of
overflow and underflow errors, which can then be handled separately. We
have studied the effect of the number of neurons on accuracy and bit error.
Finally we derive the optimal number of neurons required for each component,
giving an indication of the hardware resources required to implement this
approach.
References
Qian Wang, Youjie Li, Botang Shao, Siddhartha Dey, and Peng Li. Energy
efficient parallel neuromorphic architectures with approximate arithmetic
on FPGA. Neurocomputing, 221:146–158, January 2017.
Trevor Bekolay, James Bergstra, Eric Hunsberger, Travis DeWolf, Terrence
Stewart, Daniel Rasmussen, Xuan Choo, Aaron Voelker, and Chris Elia-
smith. Nengo: a Python tool for building large-scale functional brain
models. Frontiers in Neuroinformatics, 7, January 2014.
Nello Cristianini and John Shawe-Taylor. An Introduction to Support Vector
Machines and Other Kernel-Based Learning Methods. Cambridge Univer-
sity Press, 2000. doi: 10.1017/CBO9780511801389.
Chris Eliasmith. How to Build a Brain: A Neural Architecture for Biological
Cognition. Oxford Series on Cognitive Models and Architectures,
September 2013. ISBN 9780199794546.
Chris Eliasmith and Charles H. Anderson. Neural Engineering: Compu-
tation, Representation, and Dynamics in Neurobiological Systems. MIT
Press, 2002. ISBN 9780262050715.
Mark A. Erle, Brian J. Hickmann, and Michael J. Schulte. Decimal Floating-
Point Multiplication. IEEE Trans. Comput., 58(7):902–916, July 2009.
doi: 10.1109/TC.2008.218.
Steven K. Esser, Paul A. Merolla, John V. Arthur, Andrew S. Cassidy,
Rathinakumar Appuswamy, Alexander Andreopoulos, David J. Berg, Jef-
frey L. McKinstry, Timothy Melano, Davis R. Barch, Carmelo di Nolfo,
Pallab Datta, Arnon Amir, Brian Taba, Myron D. Flickner, and Dhar-
mendra S. Modha. Convolutional networks for fast, energy-efficient neu-
romorphic computing. PNAS, 113(41):11441–11446, October 2016. URL
http://www.pnas.org/content/113/41/11441.
Arun M. George, Rahul Sharma, and Shrisha Rao. IEEE 754 Floating-
Point Addition for Neuromorphic Architecture. Neurocomputing, 366:74–
85, November 2019. URL http://doi.org/10.1016/j.neucom.2019.
05.093.
Jan Gosmann and Chris Eliasmith. Optimizing semantic pointer repre-
sentations for symbol-like processing in spiking neural networks. PLoS
ONE, 11, February 2016. URL https://doi.org/10.1371/journal.
pone.0149928.
IEEE. IEEE Standard for Floating-Point Arithmetic, July 2019. URL
http://doi.org/10.1109/IEEESTD.2008.4610935.
Yongtae Kim, Yong Zhang, and Peng Li. Energy Efficient Approximate
Arithmetic for Error Resilient Neuromorphic Computing. IEEE Trans.
VLSI Syst., 23(11):2733–2737, November 2015. doi: 10.1109/TVLSI.2014.
2365458.
Christof Koch and Idan Segev, editors. Methods in Neuronal Modeling:
From Ions to Networks. MIT Press, Cambridge, MA, 2 edition, January
2003.
David J. Lilja and Sachin S. Sapatnekar. Designing Digital Computer Sys-
tems with Verilog. Cambridge University Press, 2005.
Carver Mead. Neuromorphic Electronic Systems. Proc. IEEE, 78(10):1629–
1636, October 1990.
Don Monroe. Neuromorphic computing gets ready for the (really) big time.
Communications of the ACM, 57(6):13–15, 2014.
Gordon E. Moore. Cramming more components onto integrated circuits.
Electronics, 38(8):114–117, April 1965.
Nengo. Addition example. https://www.nengo.ai/nengo/examples/
addition.html, a. Accessed June 27, 2020.
Nengo. Multiplication example. https://www.nengo.ai/nengo/examples/
basic/multiplication.html, b. Accessed June 27, 2020.
Nengo. Documentation. nengo.ai/documentation, c. Accessed June 27,
2020.
L. A. Pastur-Romay, A. B. Porto-Pazos, F. Cedron, and A. Pazos. Par-
allel computing for brain simulation. Current Topics in Medicinal
Chemistry, 17(14):1646–1668, 2017. ISSN 1568-0266/1873-4294. doi:
10.2174/1568026617666161104105725. URL http://www.eurekaselect.
com/node/147056/article.
Terrence C. Stewart. A technical overview of the neural engineering frame-
work. AISB Quarterly, 35, October 2012. URL http://compneuro.
uwaterloo.ca/files/publications/stewart.2012d.pdf.
Aaron R. Voelker and Chris Eliasmith. Methods for applying the Neural
Engineering Framework to neuromorphic hardware. arXiv:1708.08133 [q-
bio.NC], August 2017.
Aaron R. Voelker, Ben V. Benjamin, Terrence C. Stewart, Kwabena Boahen,
and Chris Eliasmith. Extending the Neural Engineering Framework for
nonideal silicon synapses. In 2017 IEEE International Symposium on
Circuits and Systems (ISCAS), Baltimore, MD, May 2017.
Kui Yi and Yue-Hua Ding. 32 bit Multiplication and Division ALU Design
Based on RISC Structure. In Twenty-First International Joint Conference
on Artificial Intelligence (IJCAI 2009), Hainan Island, China, April 2009.
Mehdi Zargham. Computer Architecture. Prentice Hall, 1996.
