Stochastic Rounding: Algorithms and Hardware Accelerator by Mikaitis, Mantas
Mantas Mikaitis
Department of Mathematics
The University of Manchester
Email: mantas.mikaitis@manchester.ac.uk
arXiv:2001.01501v4 [cs.AR], 29 Jun 2020
Abstract: Algorithms and a hardware accelerator for performing
stochastic rounding (SR) are presented. The main goal is to augment
the ARM M4F based multi-core processor SpiNNaker2 with more flexible
rounding functionality than is available in the ARM processor itself.
The motivation for adding such an accelerator in hardware comes from
our previous results, which showed improved numerical accuracy of ODE
solvers in fixed-point arithmetic with SR compared to the standard
round-to-nearest or bit-truncation rounding modes. Furthermore,
performing SR purely in software can be expensive, because it requires
a pseudorandom number generator (PRNG), multiple masking and shifting
instructions, and an addition. Saturation of the rounded values is
also included, since rounding is usually followed by saturation, which
is especially important in fixed-point arithmetic due to the narrow
dynamic range of representable values. The main intended use of the
accelerator is to round fixed-point multiplier outputs, which the ARM
processor returns unrounded in a wider fixed-point format than the
arguments.
Index Terms: stochastic rounding, fixed-point arithmetic,
floating-point arithmetic, bfloat16.
I. INTRODUCTION
SpiNNaker is a chip with 18 ARM968 cores (integer only) for
simulating neural networks, including the ordinary differential
equations (ODEs) of neurons [1]. Previous work on SpiNNaker [2]
explored numerical accuracy issues in ODE solvers run in fixed-point
arithmetic. Its main conclusions were that rounding errors are a major
factor in the divergence of the solution from the reference
double-precision solution, and that stochastic rounding (SR) helps to
reduce this divergence. It was also shown that fixed point with SR is
2.6–4.2× faster than emulating floating-point arithmetic in software.
The next-generation SpiNNaker, SpiNNaker2, will be based on an ARM
Cortex-M4F processor [3], which cannot round a fixed-point number to
a specified number of bits. Three instructions with rounding are
available: SMMLAR (multiply two numbers, add a third number to the
top 32 bits of the result and return the rounded top 32 bits), SMMLSR
(the same as SMMLAR, but subtract the third argument), and SMMULR
(multiply and return the rounded top 32 bits of the result) [4].
Rounding is done by adding 0x80000000 to the product, so the
tie-breaking rule is round up [5]. While this works well for
s16.15 × u0.32 multiplications (where {s/u}X.Y is a signed/unsigned
2's complement fixed-point format with X integer bits and Y
fractional bits), it does not cover the other mixed-format
multiplications demonstrated in [2]. Implementing round to nearest
(RN) and stochastic rounding in software would require multiple
instructions, usually working on two registers that hold a 64-bit
unrounded value. Furthermore, there is no mention of instructions on
the Cortex-M4F that perform saturation after rounding (returning the
maximum representable value on overflow). Saturation instructions for
32-bit values, with a configurable saturation bit position, and
saturated addition are available on the M4F, but saturating a 64-bit
multiplication result has to be done by comparison, and because the
value is spread across two registers, multiple instructions are
required to obtain a rounded and saturated value from somewhere in
the middle of the 64 bits. Additionally, since the ARM M4F has a
single-precision floating-point (binary32) unit [6], it is beneficial
to add rounding from binary32 to bfloat16 [7] (equivalent to binary32
with the bottom 16 bits removed: 1 sign bit, 8 exponent bits and 7
significand bits), an elegant format for storage that can be operated
on using binary32 hardware.
The contributions of this paper are as follows.
• Two bit-level algorithms for stochastic rounding and saturation
(Section II).
• The architecture of the accelerator for rounding and saturation
(Sections IV and V).
• An evaluation of three accelerators with 8/16/32-bit random number
precision in stochastic rounding in 22 nm technology, with leakage
and area comparisons (Section VI).
II. ALGORITHMS
Stochastic rounding has recently been explored in machine
learning due to substantial improvements in reducing rounding
errors in low precision numerical formats [8], [9], [10], [11].
The first mention of it can be traced back to [12].
Stochastic rounding differs from the standard rounding
modes, such as round to nearest, in that instead of always
rounding to the nearest number, the decision about which way
to round is non-deterministic and the probability of rounding
up is proportional to the residual (value of the trailing bits
that do not fit into the destination format interpreted to be in
the range [0, 1)). Given a real number x, an output fixed-point
format to round the value to, < s, i, p > (where s tells us
whether it is signed or unsigned format, i defines the number
of integer bits and p defines the number of fractional bits);
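defining $\lfloor x \rfloor$ as the truncation operation (cancelling
the bottom bits with values smaller than $\varepsilon = 2^{-p}$ and
leaving p fractional bits), which returns a number in < s, i, p >
format less than or equal to x; and given a random value
$P \in [0, 1)$ drawn from a uniform random number generator, SR is
defined as
$$
\mathrm{SR}(x, \langle s, i, p \rangle) =
\begin{cases}
\lfloor x \rfloor & \text{if } P \ge \dfrac{x - \lfloor x \rfloor}{\varepsilon}, \\[4pt]
\lfloor x \rfloor + \varepsilon & \text{if } P < \dfrac{x - \lfloor x \rfloor}{\varepsilon}.
\end{cases}
\tag{1}
$$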
For floating-point numbers the definition of SR is slightly
different since it does not use 2’s complement, see for example
[10].
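As a quick check of definition (1), consider an illustrative value that is chosen here and does not come from the paper: x = 0.3 rounded to a format with p = 2 fractional bits, so that ε = 0.25 and the truncated value is 0.25. The scaled residual and the expected rounded value then work out as follows.

```latex
\frac{x - \lfloor x \rfloor}{\varepsilon}
  = \frac{0.3 - 0.25}{0.25} = 0.2,
\qquad
\mathbb{E}\bigl[\mathrm{SR}(x)\bigr]
  = (1 - 0.2)\cdot 0.25 + 0.2 \cdot 0.5
  = 0.3 = x .
```

In this example the value is rounded up to 0.5 with probability 0.2 and down to 0.25 with probability 0.8, so the rounding error has zero mean.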
This can be implemented by inspecting residuals and using a
pseudorandom number generator (PRNG); in [2] a simple linear
congruential generator and more complex generators were used, without
any significant differences in the numerical results. For the current
work, SpiNNaker2 already has a generator from the KISS family
(proposed by George Marsaglia) implemented in hardware with
configurable seeds [3], from which we fetch 32-bit random bit streams
for rounding. The SpiNNaker2 hardware implementation follows [13,
p. 3] (the algorithm called JKISS32), except that one of the internal
variables was widened to 64 bits to improve the quality of the
generator.
For a fully configurable stochastic rounding routine we have to be
able to round off a specified number n of trailing bits of a 64-bit
number. This means that bits n − 1 to 0 are zeroed and 0x1 (hex
value) is added at the n-th bit location if the value is rounded up.
Usually there is no need to return the rounded value in the original
bit width with the bottom bits zeroed, so the rounding routine returns
its output in lower precision, at which point it also has to be
saturated if the value is too large to be represented. For SpiNNaker
use cases we are interested in rounding a 64-bit multiplication result
to various 32-bit fixed-point formats, so we explored an algorithm and
an implementation for this routine.
There are two simple ways to round a value stochastically: by
comparing a random number to the residual and rounding up if it is
smaller, or by adding a random number to the residual and letting the
carry out from that addition control the rounding. Algorithms 1 and 2
demonstrate how to do both (in the algorithms, << and >> stand for
binary shifts left and right). Stochastic rounding by addition looks
shorter, but both algorithms require 5 operations in the main rounding
parts (saturation is the same in both cases). The saturation logic is
a standard check for overflow at both ends of the dynamic range; note
that if the input and output numbers were unsigned, only one
comparison instead of two would be required. It is also worth noting
that RN mode can be implemented similarly to Algorithm 2, as shown in
Algorithm 3.
Algorithm 1 Stochastic rounding by comparison
function SATSR_INT64_INT32(X, n)
    P ← PRNG32()
    MASK ← ((1 << n) − 1)
    P ← P & MASK
    RESIDUAL ← X & MASK
    X ← X >> n
    if P < RESIDUAL then
        X ← X + 1
    if X > MAX_INT32 then
        return MAX_INT32
    if X < MIN_INT32 then
        return MIN_INT32
    return X
Algorithm 2 Stochastic rounding by addition
function SATSR_INT64_INT32(X, n)
    P ← PRNG32()
    P ← P & ((1 << n) − 1)
    X ← (X + P) >> n
    if X > MAX_INT32 then
        return MAX_INT32
    if X < MIN_INT32 then
        return MIN_INT32
    return X
Algorithm 3 Rounding to nearest with round up on a tie
function SATRN_INT64_INT32(X, n)
    X ← (X + (1 << (n − 1))) >> n
    if X > MAX_INT32 then
        return MAX_INT32
    if X < MIN_INT32 then
        return MIN_INT32
    return X
By comparing Algorithms 2 and 3, notice that SR has an
overhead of a PRNG plus one operation to mask off the top
bits of the random number, compared with RN.
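The three algorithms can be sketched in Python as follows. This is a minimal illustration and not the SpiNNaker2 implementation: Python's arbitrary-precision integers stand in for the 64-bit input, `random.getrandbits(32)` stands in for the hardware PRNG, and the function names and the `rng` parameter are chosen here for the example. It assumes 1 ≤ n ≤ 32.

```python
import random

MAX_INT32 = 2**31 - 1
MIN_INT32 = -(2**31)


def saturate_int32(x):
    """Clamp x to the signed 32-bit range (shared by all three algorithms)."""
    return max(MIN_INT32, min(MAX_INT32, x))


def satsr_by_comparison(x, n, rng=random):
    """Algorithm 1: round up if the masked random number is below the residual."""
    mask = (1 << n) - 1
    p = rng.getrandbits(32) & mask
    residual = x & mask          # trailing n bits (two's complement for x < 0)
    x >>= n                      # arithmetic shift: rounds towards -infinity
    if p < residual:
        x += 1
    return saturate_int32(x)


def satsr_by_addition(x, n, rng=random):
    """Algorithm 2: add the masked random number, the carry decides the rounding."""
    p = rng.getrandbits(32) & ((1 << n) - 1)
    return saturate_int32((x + p) >> n)


def satrn(x, n):
    """Algorithm 3: round to nearest, ties rounded up."""
    return saturate_int32((x + (1 << (n - 1))) >> n)


if __name__ == "__main__":
    rng = random.Random(0)
    # 0b0100 with n = 4 has residual 4/16, so about a quarter of the
    # samples should round up to 1 and the rest down to 0.
    samples = [satsr_by_addition(0b0100, 4, rng) for _ in range(20000)]
    print(sum(samples) / len(samples))
```

Both SR variants round up with probability residual / 2^n; they differ only in whether that decision is made by a comparison or by the carry out of an addition, which is why Algorithm 2 maps naturally onto an adder in hardware.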
III. NUMERICAL EXPERIMENTS
The algorithm chosen for the proposed hardware accelerator is
Algorithm 2, and here we first test it in a software simulation on an
ARM968 processor using fixed-point arithmetic. SR is most beneficial
in summation algorithms on data for which the rounding errors are
biased in one direction and dominate the final error of the sum.
Following the approach taken in [14], we applied Algorithm 2 in
software to a basic recursive summation algorithm evaluating the
harmonic series. This series is divergent, but it converges when
implemented in limited-precision arithmetic using recursive summation.
The series is defined as
$\sum_{i=1}^{\infty} 1/i = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$.
The addends get smaller while the total sum keeps increasing, and, as
reported in [14], the sum converges in floating-point arithmetic when
the addends become so small that they no longer change the total sum
(due to very different exponents and round-off error on addition).
This issue, which occurs in summation algorithms in floating-point
arithmetic, is called stagnation in [15].
TABLE I
ITERATIONS UNTIL CONVERGENCE OF THE HARMONIC SERIES FOR DIFFERENT
ARITHMETICS. SUMS AND ERRORS RELATIVE TO BINARY64 RESULT (DOUBLE
PRECISION FLOATING-POINT) AT THE FIVE MILLIONTH ITERATION.
FLOATING-POINT DATA FROM [14]. FLOATING-POINT FORMATS ARE DEFINED IN
[6]. AVERAGED SUMS ARE FROM RUNNING THE EXPERIMENT 50 TIMES IN S16.15
AND S8.7 ARITHMETICS WITH SR, EACH TIME WITH A DIFFERENT PRNG SEED.
RD REFERS TO ROUND DOWN MODE.
Arithmetic   Sum at i = 5 × 10^6    Error at i = 5 × 10^6   Iterations to converge
binary64     16.002                 0                       2.81... × 10^14
binary32     15.404                 0.598                   2097152
binary16     7.086                  8.916                   513
s16.15 RN    11.938                 4.064                   65537
s16.15 RD    10.553                 5.449                   32769
s8.7 RN      6.414                  9.588                   257
s8.7 RD      5.039063               10.963                  129
s16.15 SR    Mean = 16.002,         −0.000135765            2^32 + 1
(50 runs)    std. dev. = 0.012
s8.7 SR      Mean = 11.205,         4.797                   2^16 + 1
(50 runs)    std. dev. = 0.242
This experiment was run in 32- and 16-bit fixed-point
arithmetics. The sum has a numerical type of s16.15 or s8.7
and is initialized to 1. Then the series is started from i = 2 and
the division is done in either 32-bit or 16-bit fractional type
(u0.32 or u0.16), followed by the addend being rounded to
the sum’s format with various rounding routines. While fixed-
point addition is known to be exact if there are no overflows,
in this case it is not since the addends have more fractional
precision than the sum.
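The s16.15 version of this experiment can be sketched in Python as below. Several details are assumptions made for the sketch rather than facts from the paper: the division 1/i is truncated to u0.32, Python's `random.Random` replaces the hardware PRNG, saturation is left out because the sum stays far below the s16.15 maximum, and all function names are illustrative.

```python
import random

FRAC_IN = 32                   # addends are held in u0.32
FRAC_OUT = 15                  # the sum is held in s16.15
N_DROP = FRAC_IN - FRAC_OUT    # 17 trailing bits are rounded away


def round_rn(a):
    """Round to nearest, ties up (Algorithm 3 without saturation)."""
    return (a + (1 << (N_DROP - 1))) >> N_DROP


def round_rd(a):
    """Round down (bit truncation)."""
    return a >> N_DROP


def make_round_sr(seed):
    """Stochastic rounding by addition (Algorithm 2 without saturation)."""
    rng = random.Random(seed)
    mask = (1 << N_DROP) - 1

    def round_sr(a):
        return (a + (rng.getrandbits(32) & mask)) >> N_DROP

    return round_sr


def harmonic_sum(n_terms, round_fn):
    """Recursive summation of 1 + 1/2 + ... + 1/n_terms, result in s16.15."""
    total = 1 << FRAC_OUT                 # sum initialised to 1.0
    for i in range(2, n_terms + 1):
        addend = (1 << FRAC_IN) // i      # 1/i truncated to u0.32 (assumption)
        total += round_fn(addend)
    return total


def first_zero_addend(round_fn):
    """Smallest i >= 2 whose rounded addend is zero (only for RN and RD)."""
    i = 2
    while round_fn((1 << FRAC_IN) // i) != 0:
        i += 1
    return i


if __name__ == "__main__":
    n = 200_000
    print("RN:", harmonic_sum(n, round_rn) / 2**FRAC_OUT)
    print("SR:", harmonic_sum(n, make_round_sr(1)) / 2**FRAC_OUT)
    print("first zero addend, RN:", first_zero_addend(round_rn))
    print("first zero addend, RD:", first_zero_addend(round_rd))
```

Under these assumptions the first addend that rounds to zero is i = 65537 for RN and i = 32769 for RD, which agrees with the iteration counts in Table I, while the SR sum keeps growing past that point.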
Table I demonstrates the results with various fixed-point
types; floating-point results are also provided for comparison.
Five million iterations were chosen to have a manageable
run time, but the number of iterations to convergence is also
reported. As expected, most of the fixed-point types converge
as soon as the addends in the series become small enough to
be evaluated at lower than s16.15 precision, when the values
cross 0.5ε. However, it can be seen that fixed point with SR can
accurately replicate the binary64 sum over 5 million iterations
without converging. Because stochastic rounding is probabilistic, it
can still change the sum in later iterations, so the harmonic sum with
stochastic rounding can be said never to converge: there is a
diminishing, but nonzero, probability of rounding up the addends and
affecting the sum. In practice it does converge, once the numerical
type of the addends runs out of bits and the probability of rounding
up becomes 0. This can also happen if only a limited number of random
bits is available for performing stochastic rounding, which can be
especially significant when rounding the results of a double-precision
adder/accumulator, as these can be held in thousands of bits before
rounding [16].
This experiment provides a confirmation that Algorithm 2
works as expected. In summary, running the harmonic series
50 times in s16.15 with SR, it is shown that the averaged
result has a very small error compared to the sum computed
in binary64, while s16.15 with RN stagnates just after 65536
steps.
IV. SPECIFICATION
In this section we describe the specification of the proposed
accelerator that was designed and included in the upcoming
SpiNNaker2 neuromorphic chip [3].
The rounding and saturation accelerator is a memory-mapped unit
connected through an AHB bus: a set of memory addresses is allocated
for the different rounding routines and numerical formats, to which
arguments are written and from which the rounded values are read out.
For rounding multiplication results, it is useful to have 64-bit →
32-bit rounding with a configurable rounding bit position from 0 to
31. Given that the ARM M4F processor has 32-bit wide interfaces, two
memory cycles are required to input a 64-bit argument through AHB
into the accelerator. For other use cases, 32-bit → 32-bit, 32-bit →
16-bit, and 16-bit → 16-bit round-and-saturate operations are also
supported; these configurations require one memory cycle for input.
Therefore, aiming at a single cycle for the main part of rounding,
the accelerator has either a 4-cycle or a 3-cycle delay for a
write-round-read operation with 64-bit or 32-bit arguments,
respectively.
Both signed and unsigned number types are supported,
given a wide range of use cases for both types shown in
[2]. Furthermore, as the ARM M4F has single precision
floating-point hardware support, it might be beneficial to
round binary32 to bfloat16 [7]. This format can be useful for
representing and storing neural network weights for example,
which can be operated on using the floating-point unit by
inputting into a higher part of the floating-point registers and
then rounded back to bfloat16 before writing to memory.
Finally, given that an adder is required in stochastic rounding,
we can also include RN mode (with rounding up on ties, for
cheaper implementation), which can reuse the adder to add
0x1 shifted to the required rounding bit position.
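The paper does not spell out the bit-level procedure for the binary32 to bfloat16 path, so the following is only one plausible sketch of it. It applies the idea of Algorithm 2 to the raw bit pattern: a 16-bit random value is added to the binary32 word and the bottom 16 bits are dropped. The handling of infinities and NaNs, the overflow to infinity instead of saturation, and all function names are assumptions made for this example.

```python
import random
import struct


def f32_bits(x):
    """Bit pattern of x after conversion to binary32, as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """binary32 value of a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def binary32_to_bfloat16_sr(x, rng=random):
    """Return the 16-bit bfloat16 pattern of x using stochastic rounding."""
    bits = f32_bits(x)
    if (bits & 0x7F800000) == 0x7F800000:
        # Infinity or NaN: truncate without rounding (assumption).
        top = bits >> 16
        if bits & 0x007FFFFF:
            top |= 0x0040      # keep a NaN a NaN after dropping payload bits
        return top
    # Sign-magnitude format: adding to the bit pattern rounds the magnitude
    # up with probability (discarded bits) / 2**16. The largest finite
    # values can carry into the exponent and become infinity here.
    return (bits + rng.getrandbits(16)) >> 16


def bfloat16_to_float(b):
    """Widen a bfloat16 pattern back to a float by appending 16 zero bits."""
    return bits_f32(b << 16)


if __name__ == "__main__":
    rng = random.Random(1)
    x = 1.0 + 2.0**-8          # halfway between two bfloat16 neighbours
    results = [bfloat16_to_float(binary32_to_bfloat16_sr(x, rng))
               for _ in range(8)]
    print(results)             # a mix of 1.0 and 1.0078125
```

Because bfloat16 is the top half of binary32, the rounded value can be written back to memory as 16 bits and later widened again by appending zeros, which is what `bfloat16_to_float` models.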
V. DESIGN
Figure 1 gives an architectural diagram of the accelerator that
performs rounding and saturation. The functionality of the accelerator
includes a combination of SR and RN modes, as in Algorithms 2 and 3.
The signals signed arithmetic and round mode are derived from the
address supplied by the AHB bus, depending on which address was
written by the processor.
[Figure 1: block diagram. Inputs: Data in (64/32 bits) and Config
(5 bit). Blocks: sign extension of the data input, "Pick 32/16 bits of
unrounded result", "Pick top 32 bits of residual to round", addition
of the PRNG value producing c_out, selection between c_out and bit_31
by round mode, a 32-bit adder, "Detect overflow in the input", and
"Saturation".]
Fig. 1. Architectural diagram for the rounding and saturation accelerator.
The main mechanism is to pick the top 32 bits of the residual
according to the configuration register, which is set up beforehand
and contains the number of bits to round, encoded as 0 to 31 (0 means
round 1 bit, 31 means round 32 bits). The 32 bits above these
residual bits are then also isolated; at this point they form the
unrounded result. The minimum number of bits to round is 1, so the
data input is extended to the right by 31 bits to support Verilog's
"base minus 32" bit slicing (an indexed part-select of 32 bits
downwards from a base index). For the same reason the input data is
extended by 32 bits to the left, so that overflow can be detected by
operating on the 32 bits above the result.
[Figure 2: plot of area (normalized, 1 to 2.5) against the clock
frequency constraint fclk (MHz, 100 to 400) for 8-bit SR, 16-bit SR,
and 32-bit SR.]
Fig. 2. Circuit area of the accelerator when synthesized with different clock
constraints. Three accelerator versions are shown with 8-, 16-, and 32-bit
stochastic rounding.
[Figure 3: plot of leakage (normalized, logarithmic axis from 10^0 to
10^3) against the clock frequency constraint fclk (MHz, 100 to 400)
for 8-bit SR, 16-bit SR, and 32-bit SR.]
Fig. 3. Leakage of the accelerator when synthesized with different clock
constraints. Three accelerator versions are shown with 8-, 16-, and 32-bit
stochastic rounding.
A pseudorandom number is added to the residual and the carry bit
c_out is captured from that addition. Then, depending on the round
mode, either the top bit of the residual (in RN mode) or c_out (in SR
mode) is added to the unrounded result, which performs a round-up if
it is 1 and a round-down if it is 0. Finally, the rounded result and
the overflow bits are used to saturate the result if required.
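The datapath described above can be modelled behaviourally in a few lines of Python. This is a reading of the text and of Figure 1, not the accelerator's RTL: it covers only the signed 64-bit → 32-bit case, takes the 32-bit random value as an argument so that the behaviour is reproducible, and uses names (`round_datapath`, `config`, `stochastic`, `prng32`) invented for the sketch.

```python
MAX_INT32 = 2**31 - 1
MIN_INT32 = -(2**31)


def round_datapath(x, config, stochastic, prng32):
    """Model one signed 64-bit -> 32-bit round-and-saturate operation.

    x          : signed integer holding the unrounded 64-bit input
    config     : 0..31, meaning that config + 1 trailing bits are rounded off
    stochastic : True selects SR mode, False selects RN mode
    prng32     : 32-bit pseudorandom value (unused in RN mode)
    """
    n = config + 1
    residual = x & ((1 << n) - 1)
    # Left-align the residual in a 32-bit field, mirroring the
    # "top 32 bits of residual" selection.
    residual32 = residual << (32 - n)
    unrounded = x >> n

    c_out = (residual32 + (prng32 & 0xFFFFFFFF)) >> 32   # carry of the SR adder
    bit_31 = residual32 >> 31                            # top residual bit (RN)
    increment = c_out if stochastic else bit_31

    rounded = unrounded + increment
    return max(MIN_INT32, min(MAX_INT32, rounded))       # saturation


if __name__ == "__main__":
    # 12 = 0b1100 with 4 bits rounded off has residual 12/16 = 0.75.
    print(round_datapath(12, 3, True, 0x40000000))   # carry is set: rounds up
    print(round_datapath(12, 3, True, 0x3FFFFFFF))   # no carry: rounds down
    print(round_datapath(12, 3, False, 0))           # RN mode: rounds up
```

Left-aligning the residual lets a full-width random value be used for any rounding position: the carry is set with probability residual / 2^n, the same probability that Algorithm 2 obtains by masking the random number instead.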
VI. EVALUATION
The main logical path of the accelerator contains two adders
— one 32-bit for rounding and one 8-, 16-, or 32-bit for the
stochastic rounding part when a random number is added to the
residual. The architectural diagram in Figure 1 demonstrates a
32-bit version, but it is worth evaluating the three versions as
there is some evidence that not all of the 32 bits are needed in
SR, as shown in the previous work [2, Sec. 5c(iii)]. All of the
logic, except some saturation checks, is performed in a single cycle.
The saturation logic contains basic checks of the overflow flags,
depending on the address input from the AHB bus, and in our
implementation is done on the AHB output cycle.
Following the synthesis study approach that we took before [17], we
synthesized the current accelerator using the makeChip hosted design
service platform [18] for the GLOBALFOUNDRIES 22FDX technology [19],
for which the SpiNNaker2 chip is being developed. An ultra-low
voltage 8t-CNRX standard-cell library with multiple threshold-voltage
options is used for the implementation. The standard cells use the
adaptive body biasing (ABB) technique for post-silicon adaptation of
the transistor threshold voltage [20], [21]. Two main categories of
cells are used: Low-Voltage-Threshold (LVT) and
Super-Low-Voltage-Threshold (SLVT) cells. LVT cells have a larger
propagation delay but significantly less leakage than the much faster
SLVT cells. A nominal supply voltage of 0.5 V is considered for
low-power operation. To account for manufacturing variations,
synthesis is performed at the worst-case speed condition of 0.45 V and
−40 °C. The three versions of the accelerator are synthesized with
the clock frequency constraints
fclk = {50, 100, 150, 200, 250, 300, 350, 400} MHz,
and the leakage power and area are measured.
Figure 2 shows the area of the three accelerators for the different
clock constraints and Figure 3 shows their leakage. At low clock
frequencies, a narrower adder saves some area and leakage (more than
an order of magnitude less leakage with 8-bit SR at f = 150 MHz), but
at higher frequencies other costs dominate and the savings are no
longer evident. This is especially true for leakage: the leakage of
the circuit apart from the adder dominates the total, and changing to
a smaller adder does not produce significant improvements. Notice that
the circuit area is largest at the intermediate frequency of
fclk = 250 MHz. This can most likely be explained by LVT cells being
replaced with SLVT cells (smaller cells or a lower number of cells) on
the critical path in the fclk > 250 MHz settings, although a more
thorough investigation of the synthesis would be required to confirm
this.
Figure 4 shows the rounding accelerator highlighted in the layout of a
single PE. The area of the accelerator is estimated at 1004 µm².
Fig. 4. Layout of a processing element (PE) after place and route. Cells
marked ...macro bundled at the north-west corner belong to local SRAM. The
rest of the cells, at the south-east corner, belong to an ARM M4F based PE.
Of those, the cells highlighted in white belong to the rounding accelerator
(picture provided by Stefan Scholze).
VII. CONCLUSION
We have presented algorithms and an accelerator for rounding and
saturating numbers of up to 64 bits, including stochastic rounding,
which is becoming popular in machine learning. This includes rounding
of fixed-point/integer values at any bit position as well as binary32
to bfloat16 rounding. The chosen SR algorithm was tested on the
harmonic series computed with basic recursive summation, demonstrating
how SR can help to avoid numerical stagnation in fixed-point
arithmetic. The accelerators were evaluated with different precisions
of the SR step, showing an order of magnitude leakage improvement
with 8-bit SR at f = 150 MHz. The accelerator will be included in the
SpiNNaker2 chip, which is scheduled for release in 2020 and is based
on an ARM Cortex-M4F processor. Since this processor does not provide
a wide array of rounding and saturation instructions to support fast
fixed-point arithmetic, especially mixed-format fixed-point
arithmetic, the accelerator will complement it and provide that
functionality.
The presented results should also be applicable to implementing
stochastic rounding in floating-point arithmetic, such as rounding the
extended-precision results from a floating-point adder or multiplier
[22], [23].
VIII. ACKNOWLEDGEMENTS
The author thanks Sebastian Höppner, Stefan Scholze, and Andreas
Dixius of the Technical University of Dresden for their help with the
synthesis tools, and Nicholas J. Higham for his comments on the
manuscript. The work was funded by the Kilburn studentship and an
EPSRC Doctoral Prize Fellowship.
REFERENCES
[1] S. B. Furber, F. Galluppi, S. Temple, and L. A. Plana, “The SpiNNaker
project,” Proceedings of the IEEE, vol. 102, no. 5, pp. 652–665, May
2014.
[2] M. Hopkins, M. Mikaitis, D. R. Lester, and S. Furber, “Stochastic
rounding and reduced-precision fixed-point arithmetic for solving
neural ordinary differential equations,” Philosophical Transactions
of the Royal Society A: Mathematical, Physical and Engineering
Sciences, vol. 378, no. 2166, Jan. 2020. [Online]. Available:
https://royalsocietypublishing.org/doi/abs/10.1098/rsta.2019.0052
[3] S. Höppner and C. Mayr, “SpiNNaker2 – towards extremely efficient
digital neuromorphics and multi-scale brain emulation,” in 2018
Neuro-inspired Computational Elements Workshop, Portland, OR, US, 2018.
[Online]. Available:
https://niceworkshop.org/wp-content/uploads/2018/05/2-27-SHoppner-SpiNNaker2.pdf
[4] ARM, “ARM Cortex-M4 processor, technical reference manual,” 2015.
[5] ——, “Cortex-M4 devices, generic user guide,” 2010.
[6] IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2019 (re-
vision of IEEE Std 754-2008). Piscataway, NJ, USA: Institute of
Electrical and Electronics Engineers, Jul. 2019.
[7] Intel, “BFLOAT16 – hardware numerics definition,” Nov. 2018.
[Online] (accessed 15/6/20). Available:
https://software.intel.com/sites/default/files/managed/40/8b/bf16-hardware-numerics-definition-white-paper.pdf
[8] M. Höhfeld and S. E. Fahlman, “Probabilistic rounding in neural
network learning with limited precision,” Neurocomputing, vol. 4, no. 6,
pp. 291–299, Dec. 1992.
[9] S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan, “Deep
learning with limited numerical precision,” in Proceedings of the 32nd
International Conference on Machine Learning, ser. JMLR Workshop and
Conference Proceedings, vol. 37, Lille, France, Jul. 2015. [Online].
Available: http://proceedings.mlr.press/v37/gupta15.pdf
[10] N. Wang, J. Choi, D. Brand, C.-Y. Chen, and K. Gopalakrishnan,
“Training deep neural networks with 8-bit floating point numbers,”
in Advances in Neural Information Processing Systems 31, Montréal,
Canada, Dec. 2018.
[11] N. Mellempudi, S. Srinivasan, D. Das, and B. Kaul, “Mixed precision
training with 8-bit floating point,” arXiv preprint arXiv:1905.12334, May
2019.
[12] G. Forsythe, “Reprint of a note on rounding-off errors,” SIAM
Review, vol. 1, no. 1, pp. 66–67, 1959. [Online]. Available:
https://epubs.siam.org/doi/10.1137/1001011
[13] D. Jones, “Good practice in (pseudo) random number generation for
bioinformatics applications,” May 2010. [Online] (accessed 15/6/20).
Available: http://www0.cs.ucl.ac.uk/staff/D.Jones/GoodPracticeRNG.pdf
[14] N. J. Higham and S. Pranesh, “Simulating low precision floating-point
arithmetic,” SIAM J. Sci. Comput., vol. 41, no. 5, pp. C585–C602, 2019.
[15] P. Blanchard, N. J. Higham, and T. Mary, “A class of fast and accurate
summation algorithms,” SIAM J. Sci. Comput., vol. 42, no. 3, pp. A1541–
A1557, Jan. 2020.
[16] Y. Uguen and F. de Dinechin, “Design-space exploration for the
Kulisch accumulator,” Mar. 2017, working paper or preprint. [Online].
Available: https://hal.archives-ouvertes.fr/hal-01488916
[17] M. Mikaitis, D. R. Lester, D. Shang, S. B. Furber, G. Liu, J. D.
Garside, S. Scholze, S. Höppner, and A. Dixius, “Approximate
fixed-point elementary function accelerator for the SpiNNaker-2
neuromorphic chip,” in IEEE 25th Symposium on Computer Arithmetic,
Amherst, MA, USA, Jun. 2018, pp. 37–44.
[18] “makeChip.” [Online]. Available: www.makechip.design
[19] R. Carter, J. Mazurier, L. Pirro, J. U. Sachse, P. Baars, J. Faul,
C. Grass, G. Grasshoff, P. Javorka, T. Kammler, A. Preusse, S. Nielsen,
T. Heller, J. Schmidt, H. Niebojewski, P. Y. Chou, E. Smith, E. Erben,
C. Metze, C. Bao, Y. Andee, I. Aydin, S. Morvan, J. Bernard, E. Bourjot,
T. Feudel, D. Harame, R. Nelluri, H. J. Thees, L. M-Meskamp, J. Kluth,
R. Mulfinger, M. Rashed, R. Taylor, C. Weintraub, J. Hoentschel,
M. Vinet, J. Schaeffer, and B. Rice, “22nm FDSOI technology for
emerging mobile, internet-of-things, and RF applications,” in 2016 IEEE
International Electron Devices Meeting, San Francisco, CA, USA, Dec.
2016.
[20] S. Höppner, H. Eisenreich, D. Walter, U. Steeb, A. S. Clifford Dmello,
R. Sinkwitz, H. Bauer, A. Oefelein, F. Schraut, J. Schreiter, R. Niebsch,
S. Scherzer, U. Hensel, J. Winkler, and M. Orgis, “How to achieve
world-leading energy efficiency using 22FDX with adaptive body biasing
on an Arm Cortex-M4 IoT SoC,” in 49th European Solid-State
Device Research Conference, Cracow, Poland, Sep. 2019, pp. 66–69.
[21] S. Höppner, H. Eisenreich, D. Walter, A. Scharfe, A. Oefelein,
F. Schraut, J. Schreiter, T. Riedel, H. Bauer, R. Niebsch, S. Scherzer,
T. Hocker, S. Scholze, S. Henker, M. Nossmann, U. Hensel, and
H. Prengel, “Adaptive body bias aware implementation for
ultra-low-voltage designs in 22FDX technology,” IEEE Transactions on
Circuits and Systems II: Express Briefs, Dec. 2019.
[22] IBM Corporation, “Stochastic rounding floating-point add
instruction using entropy from a register,” 2017. [Online]. Available:
http://patents.com/us-20170220344.html
[23] IBM Corporation, “Stochastic rounding floating-point multiply
instruction using entropy from a register,” 2017. [Online]. Available:
https://patents.com/us-20170220343.html
