Truncated Binary Multipliers with minimum Mean Square Error: analytical characterization, circuit implementation and applications by Garofalo, Valeria
PH.D. THESIS
UNIVERSITÀ DEGLI STUDI DI NAPOLI “FEDERICO II”
DEPARTMENT OF BIOMEDICAL, ELECTRONIC
AND TELECOMMUNICATIONS ENGINEERING
PH.D. PROGRAMME IN
ELECTRONIC AND TELECOMMUNICATIONS ENGINEERING
TRUNCATED BINARY MULTIPLIERS
WITH MINIMUM
MEAN SQUARE ERROR:
ANALYTICAL CHARACTERIZATION,
CIRCUIT IMPLEMENTATION
AND APPLICATIONS
VALERIA GAROFALO
Coordinator of the Ph.D. Programme: Prof. Niccolò RINALDI
Tutor: Prof. Ettore NAPOLI
Academic Year 2008–2009

To Fabrizio and
to my grandparents

Acknowledgments
First of all I would like to thank my tutor, Professor Ettore Napoli, for his
constant support, his helpful suggestions and his teaching, which has been
crucial to achieving this goal.
Many thanks to Davide De Caro, who has always been helpful and able to
solve all my problems.
A special thanks to Nicola Petra, for stimulating my activity and my interest
with suggestions and discussions, for his friendship and support. The time
spent in conferences wouldn’t have been so great without him!
Many thanks to Professor Antonio Strollo for his priceless lessons.
I also want to thank Professor Florin Udrea for his decisive help during the
time I spent in Cambridge and for the very useful meetings.
Without these people my experience through the years of the Ph.D. wouldn’t
have been possible at all.
Thanks to the HVM group of the University of Cambridge: Sumita, Prasanta,
Marina, Maryline, Hatice, Wesley, Zeeshan, Floran, and to Serena and Dominik.
A really big thanks goes to the dear friends of the DIBET, Ilaria, Grazia,
Michele, Maurizio, Pierluigi, Marino, Enzo, Salvatore, Matteo, Lucio, Dino,
for the pleasant and funny moments we spent together, but most of all for their
friendship.
I would like to thank all the people who have always supported me, with great
patience and great love: Alessio, Marco, Sara, Daniela, Gianfranco, Federica,
Valentina, Marianne.
The final most important thanks goes to my parents and my sister. They make
everything possible, they are my strength, my joy.

Contents
Acknowledgments
Contents
List of Figures
List of Tables
Notations
Introduction
1 Binary Multiplication
1.1 Partial-Product Generation
1.1.1 Unsigned Multiplication
1.1.2 Two’s Complement Multiplication
1.1.3 Mixed-Operand Multiplication
1.2 Full-width Multiplier
1.2.1 Wallace-Tree Multiplier
1.2.2 Dadda-Tree Multiplier
1.2.3 Array Multiplier
1.2.4 Three Dimensional Minimization (TDM) method
1.3 Truncated Multiplier
1.3.1 Full Rounded (Round-to-Nearest) Multiplier
1.3.2 Constant Correction Methods
1.3.3 Variable Correction Methods
1.4 Conclusion
2 LMS Truncated Multiplier
2.1 Definitions and Assumptions
2.2 Error Analysis for Truncated Multiplier
2.2.1 Statistical Properties of the Truncation Error ($e_{trunc}$)
2.2.2 Statistical Properties of the Erasing Error ($e_{erasing}$)
2.2.3 Optimal Compensation Function and Error Lower Bound
2.3 Optimal Compensation Function
2.3.1 The Intrinsic Error
2.4 Linear compensation function
2.5 Linear Coefficients Quantization
2.5.1 Optimal Quantized Coefficients
2.6 Mean Square Error
2.6.1 Analytical calculation of $\varepsilon^2_{total}$ for LMS1b function
2.6.2 Analytical calculation of $\varepsilon^2_{total}$ for LMS2b function
2.6.3 Results
2.7 Maximum Absolute Error
2.8 Signed and Mixed-Operand Multipliers
2.9 Conclusions
3 VLSI implementation and Performances
3.1 Truncated Multipliers Implementations
3.1.1 Implementation of LMS1b truncated multipliers
3.1.2 Implementations of LMS2b truncated multipliers
3.2 Truncated Multipliers Performances
3.2.1 Mean Square Error Performances
3.2.2 Maximum Absolute Error Performances
3.2.3 Electrical Performances (Area Occupation, Power Dissipation, Propagation Delay)
3.2.4 Area versus Accuracy Trade-off
3.3 Experimental Verification
3.4 Conclusions
4 LMS Truncated Squarer
4.1 Folded squarer
4.1.1 Unsigned folded squarer
4.1.2 Signed folded squarer
4.2 Truncated squarer
4.3 Optimal compensation function
4.3.1 Optimal compensation function $f_{opt}(IC)$ when $n_{eq}$ is even
4.3.2 Optimal compensation function $f_{opt}(IC)$ when $n_{eq}$ is odd
4.4 Optimal Linear Compensation function
4.5 VLSI Implementation
5 FIR filter
5.1 FIR filter with truncated MAC
5.2 VLSI implementation
5.2.1 FIR synthesis
5.2.2 Comparison with the analytical results
5.2.3 Effect of the probability distribution of the input on the error
5.2.4 Example of FIR filter
5.3 Conclusion
6 Temperature Control for Gas Sensors
6.1 Resistive gas sensors
6.1.1 Interface Circuitry
6.2 Temperature Control
6.2.1 On/Off control
6.2.2 PI control
6.2.3 Mixed control
6.3 Silicon Implementation
Conclusion
A Intrinsic Error
A.1 Computing 2i;j
A.2 Computing the covariance $COV_{i,j,l,m}(A)$
A.3 Computing the intrinsic error
B Calculation of Linear function

List of Figures
1.1 Unsigned Partial Products matrix, n = 8.
1.2 Two’s complement Partial Products matrix, n = 8.
1.3 Mixed-operand Partial Products matrix, n = 8. Y is an n-bit two’s
complement fractional number, hence $y_1$ has weight equal to $-2^{-1}$.
1.4 Wallace reduction for an 8 × 8 multiplier. Full circle: Full Adder.
Dashed circle: Half Adder.
1.5 Dadda reduction for an 8 × 8 multiplier. Full circle: Full Adder.
Dashed circle: Half Adder.
1.6 Unsigned array multiplier, n = 8.
1.7 Unsigned array multiplier, n = 8, rectangular shape.
1.8 Signed array multiplier, n = 8.
1.9 Subdivision of the matrix of Partial Products, for unsigned multiplier,
n = 8. MSP (Most Significant Part) of the PPs Matrix is constituted by the
first n columns. LSP (Less Significant Part) of the PPs Matrix is constituted
by the last n columns. IC (Input Correction) vector is the (n + h + 1)th column.
1.10 Unsigned array truncated multiplier, n = 8. Dashed lines mark the cells
that are not computed.
1.11 Full-rounded multiplier, n = 8. $K_{round}$ is the rounding constant. The
n less significant bits of the result are discarded.
1.12 General scheme of a truncated multiplier. The compensation function
f(IC) is estimated from the value of IC. The $LSP_{minor}$ is not computed.
The result is reported on n bits with a truncation operation.
2.1 Partial Products matrix for unsigned truncated multiplier, n = 8 and
h = 2, with variable-correction scheme.
2.2 Computation of LSP for n = 8 unsigned multiplier with IC =
0. a) Computation of the mean value of x6y7. b) Mean value
of the elements of LSPminor. . . . . . . . . . . . . . . . . . . 32
2.3 Mean values of the elements of the LSP when IC is non-zero
for a 8  8 multiplier (h = 2). a) Only one element of the IC
is equal to 1. b) Two or more elements of the IC are equal to 1. 33
2.4 Coefficients of the optimal error compensation function
fopt(IC) for n = 12; h = 1. . . . . . . . . . . . . . . . . . . . 34
2.5 Comparison between the optimal error compensation function
fopt(IC), red dashed line, and the SLSPminor , green full line, for
all the possible input values (n = 6; h = 0, collected accord-
ing to the IC value. . . . . . . . . . . . . . . . . . . . . . . . 35
2.6 Comparison between the exact value of "2intrinsic (red full line)
and the approximate value of "2intrinsic (black dashed line). . . 37
2.7 "2low bound values varying n and h, compared with "
2
round. . . . 38
2.8 Comparison between the coefficients of the linear compensa-
tion function flin(IC), black line with asterisk, and of the opti-
mum compensation function fopt(IC), red line with circle, for
n = 12; h = 1. . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.9 Comparison between the optimal error compensation function
and the linear compensation function, for all the possible in-
put values (n = 6, h = 0), collected according to the IC value.
Red full line: fopt(IC). Green full line: flin(IC). . . . . . . . 41
2.10 Normalized total mean square error as a function of n and h.
Red full line: error of optimal (quadratic) compensation func-
tion. Blue dashed line: linear compensation function with op-
timal coefficient. Green dotted line: mean square error of the
full-rounded multiplier. . . . . . . . . . . . . . . . . . . . . . 42
2.11 Levels of quantization for li. When lsbq = lsbIC, only the
blue full lines are admitted. When lsbq = 1=2lsbIC, also the
green dashed levels can be used. . . . . . . . . . . . . . . . . 43
2.12 Comparison between the optimal error compensation function
fopt(IC), red full line, flin(IC), green full line, fLMS1b(IC),
blue dashed-dotted line, fLMS2b(IC), black dashed line. . . . . 47
2.13 Comparison between total mean square errors. Theoretical
values are computed by using the formulae described in the
chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
LIST OF FIGURES xiii
2.14 Correlations between the terms of the IC and the AP ($LSP_{minor} - IC$).
The gray part of the AP is correlated with the blue part of the IC. As an
example, the partial products shown in bold depend on $x_6$ and $y_7$. If
$x_6 y_6 = 0$ this means that the bold row or the bold diagonal or both are
identically equal to zero.
2.15 $LSP_{minor}$ partial product matrix for a 10 × 10 bit multiplier with
h = 2. The partial products of the AP that are uncorrelated with the central
terms of the IC have been fixed to 1.
2.16 $LSP_{minor}$ partial product matrix for a 10 × 10 bit multiplier with
h = 2. The central terms of the IC are fixed to zero. Each i term fixed to
zero fixes one row and one diagonal of the AP to zero, alternately.
2.17 $LSP_{minor}$ partial product matrix for a 10 × 10 bit multiplier with
h = 2. The remaining available bits of the AP shown in Fig. 2.16 are fixed to 1.
2.18 One of the possible configurations of the $LSP_{minor}$ for a 10 × 10 bit
LMS1b truncated multiplier with h = 2 that maximizes the punctual error.
2.19 One of the possible configurations of the $LSP_{minor}$ for a 10 × 10 bit
LMS1b truncated multiplier with h = 1 that maximizes the punctual error.
2.20 Probability distribution of the punctual error for an n = 12 bit LMS1b
truncated multiplier, h = 0. The maximum absolute error is only present twice
on the $2^{24} \approx 16.8 \times 10^{6}$ different inputs.
2.21 Partial Products matrix for a signed multiplier with n = 8, h = 2.
3.1 Signed truncated multiplier with n = 8, h = 2. In this example
$lsb_q = \frac{1}{2} lsb_{IC}$.
3.2 Implementation of the proposed (signed) LMS1b truncated multiplier,
$lsb_q = lsb_{IC}$, n = 8 and h = 2.
3.3 Straightforward implementation of the proposed signed LMS2b truncated
multiplier, $lsb_q = \frac{1}{2} lsb_{IC}$, n = 8 and h = 2.
3.4 Straightforward implementation of the proposed signed LMS2b truncated
multiplier, $lsb_q = \frac{1}{2} lsb_{IC}$, n = 8 and h = 3.
3.5 Implementation Method 1 (IM1) of the proposed (signed) LMS2b truncated
multiplier, $lsb_q = \frac{1}{2} lsb_{IC}$, n = 8, h = 2.
3.6 Implementation Method 1 (IM1) of the proposed (signed) LMS2b truncated
multiplier, $lsb_q = \frac{1}{2} lsb_{IC}$, n = 8, h = 3.
3.7 Implementation Method 2 (IM2) of the proposed (signed) LMS2b truncated
multiplier, $lsb_q = \frac{1}{2} lsb_{IC}$, n = 8, h = 2.
3.8 Implementation Method 1 (IM1) of the proposed (signed) LMS2b truncated
multiplier, $lsb_q = \frac{1}{2} lsb_{IC}$, n = 8, h = 3.
3.9 Signed Multiplier performances for n = 16 and h = 1 by varying the delay
constraint imposed during circuit synthesis (0.18 µm technology).
3.10 Signed Multiplier performances for n = 16 and h = 1 by varying the delay
constraint imposed during circuit synthesis (0.18 µm technology).
3.11 Trade-off between Area Occupation and Error (mean square total error,
$\varepsilon^2_{total}$) for different Signed Truncated Multipliers, n = 8.
The mean square error is obtained through simulation.
3.12 Trade-off between Area Occupation and Error (mean square total error,
$\varepsilon^2_{total}$) for different Signed Truncated Multipliers, n = 12.
The mean square error is obtained through simulation.
3.13 Trade-off between Area Occupation and Error (mean square total error,
$\varepsilon^2_{total}$) for different Signed Truncated Multipliers, n = 16.
The mean square error is obtained through simulation.
3.14 Trade-off between Area Occupation and Error (mean square total error,
$\varepsilon^2_{total}$) for different Signed Truncated Multipliers, n = 24.
The mean square error is obtained by the theoretical formulas of Ch. 2.
3.15 Trade-off between Area Occupation and Error (mean square total error,
$\varepsilon^2_{total}$) for different Signed Truncated Multipliers. For
n = 16 the mean square error is simulated. For n = 24 the mean square error
is obtained by the theoretical formulas of Ch. 2.
4.1 Partial Products Matrix of unsigned squarer, n even (n = 8). The bold
elements form the antidiagonal. a) Full original matrix of the squarer.
b) Reduced matrix after applying (4.3).
4.2 Partial Products Matrix of unsigned squarer during the folding process,
n even (n = 8). a) Partial Product Matrix to which (4.4) can be applied; in
each column the circled elements will be grouped. b) Final folded matrix.
4.3 Final folded matrix for unsigned squarer, n odd (n = 9).
4.4 Partial Products Matrix of signed squarer during the folding process,
n even (n = 8). a) Partial Product Matrix to which (4.8) can be applied; in
each column the circled elements will be grouped. b) Final folded matrix.
4.5 Partial Products Matrix of unsigned squarer (n = 10). a) Partial Product
Matrix before the last folding operation. In green, the matrix elements formed
by a single bit. b) Subdivision of the partial product matrix in MSP,
$LSP_{major}$ and $LSP_{minor}$ when $n_{eq}$ is even (n = 10, h = 2).
c) Subdivision of the partial product matrix in MSP, $LSP_{major}$ and
$LSP_{minor}$ when $n_{eq}$ is odd (n = 10, h = 1).
4.6 $LSP_{minor}$ of unsigned squarer (n = 10, h = 0). a) $LSP_{minor}$. In
yellow, elements of the IC vector; in green, elements of the matrix composed
by a single bit. b) $LSP_{minor}$ in which the dependence on the elements of
IC is shown.
4.7 $LSP_{minor}$ of unsigned squarer (n = 10, h = 1). a) $LSP_{minor}$. In
yellow, elements of the IC vector; in green, elements of the matrix formed by
a single bit. b) $LSP_{minor}$ in which the dependence on the elements of IC
is shown.
4.8 Optimal coefficients and constant of $f_{lin}(IC)$ when $n_{eq}$ is odd
(red line) and even (blue line). The green lines represent the possible
quantization levels.
4.9 Mean square error obtained using a full-width folded squarer with final
rounding (blue line), LMS truncated squarer (red line), Walters [33] squarer
(green line) and LMS1b truncated multiplier (violet line), varying n for h = 0.
4.10 Mean square error obtained using a full-width folded squarer with final
rounding (blue line), LMS truncated squarer (red line), Walters [33] squarer
(green line) and LMS1b truncated multiplier (violet line), varying n for h = 1.
5.1 MAC with n-bit inputs and 2n-bit output. The error provided by this
configuration is zero.
5.2 MAC with n-bit inputs and n-bit output. It uses a full-width multiplier
and the output is rounded to n bits by summing a rounding constant to the
2n-bit output. In this configuration a rounding error is present.
5.3 MAC with n-bit inputs and w-bit output. It uses a truncated multiplier
with n-bit inputs and w < 2n bit output. This configuration presents the
rounding error as in Fig. 5.2 and the error introduced by the truncated
multiplier.
5.4 Mean square error of low pass filter implemented with different truncated
multipliers varying the number of output bits w = n + h (n = 16; h = 0, 2, 4,
6, 8, 10). The applied input is a signal with uniform distribution. The lines
represent the theoretical mean square error, obtained with (5.11), the symbols
the simulation results. The theoretical value is experimentally verified. The
orange line is the mean square error of the full precision MAC with a final
rounding.
5.5 Mean square error of FIR implemented with different truncated multipliers
varying the number of output bits w = n + h (n = 16; h = 0, 2, 4, 6, 8, 10).
The applied input is a sinusoid. The orange line is the mean square error of
the full precision MAC with a final rounding.
5.6 Mean square error of FIR implemented with different truncated multipliers
varying the number of output bits w = n + h (n = 16; h = 0, 2, 4, 6, 8, 10).
The applied input has a Gaussian probability distribution (raised cosine
frequency spectrum). The orange line is the mean square error of the full
precision MAC with a final rounding.
5.7 Mean square error of FIR implemented with different truncated multipliers
varying the number of output bits w = n + h (n = 16; h = 0, 2, 4, 6, 8, 10).
The applied input has an exponential probability distribution. The orange line
is the mean square error of the full precision MAC with a final rounding.
5.8 Frequency Response of the FIR filter implemented using the architecture of
Fig. 5.2 and Fig. 5.3 with w = n = 20, h = 2.
6.1 Tungsten SOI chip: gas sensor and integrated CMOS circuitry.
6.2 Layout of the microhotplate device.
6.3 Schematic of the circuit composed by the gas sensor (membrane), cascode
current mirror to drive heater and temperature sensor, A/D converter and a
temperature controller. y(t) is the measured temperature, $R(kT_S)$ the desired
temperature, u(t) is the control variable and $T_S$ is the sampling period.
6.4 On/Off controller.
6.5 Digital implementation of PI algorithm.
6.6 Digital implementation of an optimized PI algorithm.
6.7 Mixed controller.
6.8 Digital chip with controllers.
6.9 Mixed signal chip.

List of Tables
3.1 Optimal quantized coefficient (Ch. 2). The REM(x, y) symbol indicates the
remainder of the integer division x/y.
3.2 Comparison of the performance of the proposed truncated (signed)
multipliers. Circuits are implemented in TSMC 0.18 µm technology.
3.3 Truth table describing the Boolean relationship between (02i 1; 0 2i) and
(2i; 1i; 0i).
3.4 Previously proposed truncated multipliers considered in the comparison.
3.5 Theoretical and simulated mean square errors of proposed and previously
proposed truncated multipliers (S = signed multiplier; U = unsigned
multiplier), n = 8, 10, 12 and h = 0, 1, 2, 3.
3.6 Theoretical and simulated mean square errors of proposed and previously
proposed truncated multipliers (S = signed multiplier; U = unsigned
multiplier), n = 14, 16, 32, 64 and h = 0, 1, 2, 3.
3.7 Theoretical and simulated maximum absolute error of proposed and
previously proposed truncated multipliers (S = signed multiplier; U = unsigned
multiplier), n = 10, 12, 14, 20, 24, 32, 64 and h = 0, 1, 2, 3.
3.8 Experimental performance of the signed 16 bit multipliers realized in the
test chip shown in Fig. 3.15.
4.1 Comparison between squarers, h = 0.
5.1 Characteristics of the considered FIR Filters.
5.2 Performances of the FIR Filters implemented using various Truncated
Multipliers, with n = 16. The results have been obtained with a uniformly
distributed signal. Bold numbers indicate the best performing circuit for each
w(h) value.
5.3 Performances of the Low-Pass Filters implemented using various Fixed-width
Multipliers, for n = 12. The results have been obtained with a uniformly
distributed signal. Bold numbers indicate the best performing circuit for each
m(h) value.
5.4 Performances of the Low-Pass Filters implemented using various Fixed-width
Multipliers, for n = 20. The results have been obtained with a uniformly
distributed signal. Bold numbers indicate the best performing circuit for each
m(h) value.
6.1 Comparison Between Controllers.

Notations
$X$  Multiplicand $(x_1 x_2 \ldots x_{n-1} x_n)$
$Y$  Multiplier $(y_1 y_2 \ldots y_{n-1} y_n)$
$P$  Product Full-width Multiplier $(p_1 p_2 \ldots p_{2n-1} p_{2n})$
$P_t$  Product Truncated Multiplier $(p_{t,1} p_{t,2} \ldots p_{t,n-1} p_{t,n})$
$\bigvee$  Boolean operator OR
$\bigwedge$  Boolean operator AND
$\oplus$  Boolean operator XOR
MSP  n-1 most significant columns of the Partial Products matrix
LSP  n less significant columns of the Partial Products matrix
IC  Input Correction vector ((n + h + 1)th column)
i  Generic element of the IC
$K_{round}$  Rounding constant
$f(IC)$  Compensation function
$lsb$  Weight of the less-significant bit of the result [$2^{-n}$]
$lsb_{IC}$  Weight of the IC vector [$2^{-n-h-1}$]
$lsb_q$  Weight of the less-significant bit of $f(IC)$ [$2^{-n-m}$]
$n_{eq} = n - h$  Number of columns in $LSP_{minor}$
Set of all possible values of the vector IC
$\Omega(A)$  Subset of all couples $(x, y)$ which give IC = A
$\mu_{LSP}(A)$  Mean value of $LSP_{minor}$ in $\Omega(A)$
$\sigma^2_{LSP}(A)$  Variance of $LSP_{minor}$ in $\Omega(A)$
$e_{round}$  Rounding error
$\mu_{round}$  Mean value of the rounding error
$\varepsilon^2_{round}$  Variance of the rounding error
$e_{total}$  Error of the truncated multiplier [$e_{total} = P - P_t$]
$\mu_{total}$  Mean value of $e_{total}$
$\varepsilon^2_{total}$  Mean square error of $e_{total}$
$\sigma^2_{total}$  Variance of $e_{total}$
$e_{trunc}$  Truncation error
$\mu_{trunc}$  Mean value of $e_{trunc}$
$\sigma^2_{trunc}$  Variance of $e_{trunc}$
$e_{erasing}$  Error related to $f(IC)$ [$e_{erasing} = f(IC) - S_{LSP_{minor}}$]
$\mu_{erasing}$  Mean value of $e_{erasing}$
$\varepsilon^2_{erasing}$  Mean square error of $e_{erasing}$
$\sigma^2_{erasing}$  Variance of $e_{erasing}$
$\varepsilon^2_{intrinsic}$  Intrinsic Error, independent of $f(IC)$
$e_{comp}$  Compensation error ($f(IC) - \mu_{LSP}(IC)$)
$\mu_{comp}$  Mean value of $e_{comp}$
$\varepsilon^2_{comp}$  Mean square value of $e_{comp}$
$\sigma^2_{comp}$  Variance of $e_{comp}$
$f_{opt}(IC)$  Optimal Compensation Function
$\varepsilon^2_{low\,bound}$  Lower error bound of truncated multiplier
$f_{lin}(IC)$  Linear Correction Function
$\mu_{lin}$  Mean value of ($f_{lin}(IC) - \mu_{LSP}(IC)$)
$\sigma^2_{lin}$  Variance of ($f_{lin}(IC) - \mu_{LSP}(IC)$)
$f_q(IC)$  Quantized-Coefficient correction function
$\Delta f_{lin}(IC)$  Difference between $f_q(IC)$ and $f_{lin}(IC)$
$\mu_{quant}$  Mean value of $\Delta f_{lin}(IC)$
$\sigma^2_q$  Variance of $\Delta f_{lin}(IC)$
$\varepsilon_{max}$  Maximum absolute error
$\varepsilon_{FIR}$  Error of the FIR filter with truncated MAC
$\mu_{FIR}$  Mean error of $\varepsilon_{FIR}$
$\varepsilon^2_{FIR}$  Mean square error of $\varepsilon_{FIR}$
$\sigma^2_{FIR}$  Variance of $\varepsilon_{FIR}$
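
Each error term in the list above is described by three quantities: a mean
value, a variance and a mean square error. As a reminder (a standard identity,
stated here for a generic error $e$ rather than quoted from the thesis), the
three are related by:

\[
\varepsilon^2 = E[e^2] = \mu^2 + \sigma^2, \qquad
\mu = E[e], \qquad
\sigma^2 = E\left[(e - \mu)^2\right].
\]

A compensation function that drives the mean error to zero therefore leaves a
mean square error equal to the variance.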

Introduction
In the wireless multimedia world, DSP systems are ubiquitous. DSP algorithms
are computationally intensive and test the limits of battery life in portable
devices such as cell phones, hearing aids, MP3 players, digital video recorders
and so on. Multiplication and squaring are the main operations in many signal
processing algorithms (filtering, convolution, FFT, DCT, Euclidean distance,
etc.), hence efficient parallel multipliers are desirable.
A full-width digital n × n bit multiplier computes the 2n-bit output as a
weighted sum of partial products. A multiplier whose output is represented on
n bits is useful, for example, in DSP datapaths that save the output in the
same n-bit registers as the input. Note that truncated multipliers are useful
not only for DSP but also for computationally intensive digital ASICs, where
the bit widths at the output of the arithmetic blocks are chosen on the basis
of system-related accuracy issues. Hence 2n bits of precision at the
multiplier output are very often more than required.
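
The "weighted sum of partial products" can be made concrete with a short
Python sketch. It is only an illustration: it assumes unsigned integer
operands with bit i weighted by 2^i (the thesis itself uses a fractional
notation), and the function names are hypothetical.

```python
def partial_products(x: int, y: int, n: int) -> list[tuple[int, int]]:
    """Return (column, bit) for every partial product x_i AND y_j.

    The column index i + j is the exponent of the weight 2**(i + j).
    """
    return [(i + j, ((x >> i) & 1) & ((y >> j) & 1))
            for i in range(n) for j in range(n)]


def full_width_product(x: int, y: int, n: int) -> int:
    """Sum all n*n partial products, each shifted to its column weight."""
    return sum(bit << column for column, bit in partial_products(x, y, n))


n = 8
x, y = 0b10110101, 0b01101110
# The weighted sum reproduces the exact 2n-bit product.
assert full_width_product(x, y, n) == x * y
```

An n × n multiplier therefore handles n² partial products spread over 2n − 1
columns, which is the cost that truncation tries to reduce.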
A truncated multiplier is an n × n multiplier with an n-bit output. Since in
a truncated multiplier the n less-significant bits of the full-width product
are discarded, some of the partial products are removed and replaced by a
suitable compensation function, to trade off accuracy against hardware cost.
Several techniques have been proposed in the literature following this basic
idea. The various circuits differ in the choice and the implementation of the
compensation circuit.
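
The idea of removing partial products and compensating for them can be
sketched as follows. This is a minimal illustration under stated assumptions,
not the compensation function developed in this thesis: it uses unsigned
integer operands, keeps h extra columns below the output, and replaces the
n − h dropped columns with a single constant equal to their mean value for
uniformly distributed inputs (each partial product has mean 1/4), rounded to
the resolution of the kept columns. The name `truncated_product` is
hypothetical.

```python
def truncated_product(x: int, y: int, n: int, h: int) -> int:
    """Unsigned n x n product scaled by 2**-n, with constant compensation."""
    cut = n - h  # columns 0 .. cut-1 are never computed
    kept = sum((((x >> i) & 1) & ((y >> j) & 1)) << (i + j)
               for i in range(n) for j in range(n) if i + j >= cut)
    quantum = 1 << cut  # weight of the least significant kept column
    # Column k holds k + 1 partial products of weight 2**k, each with mean 1/4.
    mean_dropped = sum((k + 1) << k for k in range(cut)) / 4
    correction = round(mean_dropped / quantum) * quantum
    # Add the compensation and a rounding constant, then discard n bits.
    return (kept + correction + (1 << (n - 1))) >> n


# Exhaustive error evaluation for a small multiplier.
n, h = 6, 2
errors = [x * y - (truncated_product(x, y, n, h) << n)
          for x in range(1 << n) for y in range(1 << n)]
mean_error = sum(errors) / len(errors)
mse = sum(e * e for e in errors) / len(errors)
print(f"mean error = {mean_error:.2f}, mean square error = {mse:.1f}")
```

Exhaustive evaluation of this kind is feasible only for small n, which is the
limitation of search-based correction techniques that the analytical approach
of this thesis is meant to remove.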
The correction techniques proposed in the literature are obtained through
exhaustive search. This means that the results are only available for small n
values and that the proposed approaches cannot be extended to greater bit
widths. Furthermore, an analytical characterization of the error is not
possible.
In this dissertation an innovative solution for the design and characteriza-
tion of truncated multipliers is presented.
The proposed circuits are based on the analytical calculation of the error of
the truncated multiplier. This approach yields the description of a multiplier
characterized by a minimum mean square error, which gives a fast and low-power
VLSI implementation. Furthermore, the analytical approach leads to closed-form
expressions of the mean square error and maximum absolute error for the
proposed truncated multipliers. In this way, a priori knowledge of the output
error is available. The errors are known for every bit width of the
multiplier, and it is also possible to decide, for a given bit width, which
correction circuit has to be used in order to obtain a certain error. This
analytical relation between the error and the parameters of the hardware
implementation is extremely important for the digital designer, since it makes
it possible to select the suitable implementation as a function of the desired
accuracy.
The proposed truncated multipliers outperform the previously proposed ones,
since they provide lower error, lower power dissipation, lower area occupation
and a higher working frequency. The circuits are also easily implemented and
allow an automatic HDL description as a function of bit width and desired
error. The complete description of the errors for the truncated multipliers
allows the use of these circuits as building blocks for more complex systems.
It will be shown how the proposed multiplier can be used to design low area
occupation FIR filters and an efficient PI temperature controller.
Dissertation outline
The outline of the thesis is the following:
Chapter 1 summarizes the state of the art in full-width and truncated
multipliers. The two existing methods for the implementation of truncated
multipliers, constant correction and variable correction, are described and
analyzed.
Chapter 2 describes the analytical characterization of the proposed truncated
multiplier. First, the optimal compensation function $f_{opt}(IC)$, which
minimizes the mean square error of the truncated multiplier, is analytically
calculated and characterized. Since $f_{opt}(IC)$ is a quadratic form, the
developed theory is then used to derive a sub-optimal compensation function,
$f_{lin}(IC)$ (a first-degree approximation). The chapter shows that the
linear compensation function yields error performance very similar to that
achievable with $f_{opt}(IC)$ and has the additional advantage of an easy
hardware implementation. Finally, the aspects related to the quantization of
the coefficients of $f_{lin}(IC)$ are investigated. Closed-form expressions of
the intrinsic error, mean square error and maximum absolute error of the
multiplier are given.
Chapter 3 deals with the practical implementation of the quantized linear
compensation function proposed in Chapter 2. The performance of the new
truncated multipliers is extensively compared with that of previously proposed
circuits in terms of maximum absolute error, mean square error and hardware
performance.
Chapter 4 extends the methodology described in Chapter 2 for truncated
multipliers to the truncated squarer. After a brief introduction to
state-of-the-art squarers, the optimal compensation function, its linear
approximation and its quantized version are shown. Finally, the results of the
simulations and of the hardware implementation are presented and compared with
the existing techniques.
Chapter 5 is focused on a typical DSP application, FIR filter. In this
chapter the analysis of the impact of the use of truncated multiplier in FIR
filter is presented. The parameters of the truncated multipliers that are more
significant for the optimization of a digital filter are defined. In order to
evaluate the performances of the proposed LMS multiplier in this application,
a comparison with the state of art multiplier is carried out in terms of error,
power, frequency and area occupation.
Chapter 6 describes another application where is possible to use the
truncated multipliers: temperature control. In this chapter the implementation
of a PI temperature controller for a gas sensor using the LMS truncated
multiplier is described. This implementation is compared with the traditional
PI controller, the OnOff controller, the mixed controller. Finally two chips are
presented.
Publications
• V. Garofalo, E. Napoli, A.G.M. Strollo, “Code compression for ARM7
embedded systems”, European Conference on Circuit Theory and Design,
26-30 August 2007
• E. Della Sala, E. Sciagura, D. De Caro, A. Caravella, P. Longobardi,
P. Zicari, V. Garofalo, G. Chapuano, P. Corsonello, E. Napoli, A.G.M.
Strollo, “High Rate Data Down Link”, Proc. of 18th ESA Symposium
for Sounding Rockets and Balloons, June 2007
• V. Garofalo, N. Petra, D. De Caro, A.G.M. Strollo, E. Napoli, “Low
error Truncated Multipliers for DSP applications”, IEEE International
Conference on Electronics, Circuits, and Systems, 1-3 September 2008
• A.G.M. Strollo, D. De Caro, N. Petra, E. Napoli, V. Garofalo,
“Constrained Piecewise Polynomial Approximation for Hardware
Implementation of Elementary Functions”, IEEE International Conference on
Electronics, Circuits, and Systems, 1-3 September 2008
• V. Garofalo, “Fixed-width multipliers for the implementation of efficient
digital FIR filters”, Microelectronics Journal, vol. 39, pp. 1491-1498,
ISSN: 0959-8324 (doi:10.1016/j.physletb.2003.10.071)
• N. Petra, D. De Caro, V. Garofalo, E. Napoli, A.G.M. Strollo,
“Truncated Binary Multipliers with Optimal Compensation Function”,
accepted to IEEE Transactions on Circuits and Systems I
• V. Garofalo, P.K. Guha, S.Z. Ali, S. Santra, E. Napoli, F. Udrea, “Mixed
Signal Temperature Control Circuit for On-Chip CMOS Gas Sensor”,
International Semiconductor Conference 2009, October 2009, Sinaia,
Romania
• S.Z. Ali, S. Santra, P.K. Guha, I. Haneef, V. Garofalo, C. Schwandt, J.A.
Covington, R.V. Kumar, J.W. Gardner, W.I. Milne, F. Udrea, “Nanowire
Hydrogen Gas Sensor Employing CMOS Micro-Hotplate”, IEEE SENSORS
2009 Conference, 25-28 October 2009, New Zealand
• V. Garofalo, N. Petra, E. Napoli, “Analytical calculation of the
maximum error for a family of truncated multipliers providing minimum
mean square error”, submitted to IEEE Transactions on Computers
• V. Garofalo, M. Coppola, D. De Caro, E. Napoli, N. Petra, A.G.M.
Strollo, “A Novel Truncated Squarer with Linear Compensation
Function”, submitted to IEEE International Symposium on Circuits and
Systems (ISCAS) 2010
• N. Petra, D. De Caro, A.G.M. Strollo, V. Garofalo, E. Napoli, M.
Coppola, P. Todisco, “Fixed-Width CSD Multipliers With Minimum Mean
Square Error”, submitted to IEEE ISCAS 2010
• A.G.M. Strollo, D. De Caro, N. Petra, E. Napoli, M. Coppola, V.
Garofalo, “Non Uniform Piecewise-Linear Approximation For High
Performance Direct Digital Frequency Synthesizers”, submitted to IEEE
ISCAS 2010
• D. De Caro, M. Coppola, N. Petra, E. Napoli, A.G.M. Strollo, V.
Garofalo, “High Speed Differential Resistor Ladder for A/D Converters”,
submitted to IEEE ISCAS 2010

Chapter 1
Binary Multiplication
Multiplication is one of the most area-consuming arithmetic operations
in high-performance circuits. As a consequence, many research works
deal with the low-power design of high-speed multipliers. Multiplication
involves two basic operations, the generation of the partial products and
their sum, performed using two kinds of multiplication algorithms, serial and
parallel. Serial multiplication algorithms use sequential circuits with
feedback: inner products are sequentially produced and accumulated. Parallel
multiplication algorithms often use combinational circuits and do not contain
feedback structures.
A full-width digital $n \times n$ multiplier computes the $2n$-bit output as a
weighted sum of partial products. When an application requires a
multiplication output with $n$ bits, it is replaced by a truncated multiplier,
which is an $n \times n$ multiplier with an $n$-bit output. After a short
description of the generation of the partial products in Sec. 1.1, Sec. 1.2
describes the common techniques used to implement full-width multipliers.
Finally, Sec. 1.3 summarizes the state of the art for truncated multipliers.
1.1 Partial-Product Generation
The following notation is used in our discussion of multiplication algorithms:
$X$: Multiplicand, $(x_1 x_2 \ldots x_{n-1} x_n)$
$Y$: Multiplier, $(y_1 y_2 \ldots y_{n-1} y_n)$
$P$: Product of the full-width multiplier, $X \cdot Y$, $(p_1 p_2 \ldots p_{2n-1} p_{2n})$
$P_t$: Product of the truncated multiplier, $(X \cdot Y)_t$, $(p_{t,1} p_{t,2} \ldots p_{t,n-1} p_{t,n})$
Figure 1.1: Unsigned Partial Products matrix, n = 8. The rows of the figure
report the column weights ($2^{-1}$ to $2^{-16}$), the multiplicand bits, the
multiplier bits, the partial products $x_i y_j$ and the product bits
$p_1 \ldots p_{16}$.
Parallel multipliers first generate partial products, then add them together to
produce the product. The following sections describe the generation of the
partial products for unsigned, two’s complement, and mixed-operand
multipliers.
1.1.1 Unsigned Multiplication
Without loss of generality, let us assume that the inputs of the multiplier
represent $n$-bit unsigned fractional values in $[0, 1)$:
\[ X = \sum_{i=1}^{n} x_i \cdot 2^{-i} \qquad (1.1) \]
\[ Y = \sum_{i=1}^{n} y_i \cdot 2^{-i} \qquad (1.2) \]
The output of the multiplier $P = X \cdot Y$ is computed as:
\[ P = \sum_{i=1}^{2n} p_i \cdot 2^{-i} = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i y_j \cdot 2^{-i-j} \qquad (1.3) \]
Fig. 1.1 shows the matrix of the Partial Products (PPs) $x_i y_j$ for an
$8 \times 8$ multiplier. Each row of the matrix corresponds to the product of
the multiplicand $X$ and a single bit of the multiplier $Y$. Generation of the
partial products for unsigned multiplication can be implemented with $n^2$ AND
gates.
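As an illustration of eq. (1.3), the following minimal sketch builds the matrix of AND terms and checks that their weighted sum equals the product. The function names (`to_bits`, `partial_products`, `weighted_sum`) are hypothetical, and all values are scaled by $2^{2n}$ so that only integers are used.

```python
from itertools import product

def to_bits(value, n):
    """Bits [v_1, ..., v_n] of an n-bit integer; v_1 is the MSB (weight 2^-1)."""
    return [(value >> (n - i)) & 1 for i in range(1, n + 1)]

def partial_products(x, y, n):
    """n x n matrix of AND terms; entry [j-1][i-1] holds x_i AND y_j."""
    xb, yb = to_bits(x, n), to_bits(y, n)
    return [[xb[i] & yb[j] for i in range(n)] for j in range(n)]

def weighted_sum(pp, n):
    """Sum of x_i * y_j * 2^(-i-j), scaled by 2^(2n) to stay an integer."""
    return sum(pp[j - 1][i - 1] << (2 * n - i - j)
               for i in range(1, n + 1) for j in range(1, n + 1))

n = 4
for x, y in product(range(2 ** n), repeat=2):
    # X = x / 2^n and Y = y / 2^n, so P * 2^(2n) = x * y
    assert weighted_sum(partial_products(x, y, n), n) == x * y
```

Each row of the returned matrix corresponds to one bit of the multiplier, and the matrix holds $n^2$ AND terms, one per gate.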
Figure 1.2: Two’s complement Partial Products matrix, n = 8.
1.1.2 Two’s Complement Multiplication
The unsigned multiplication matrix can be modified to operate on two’s
complement operands using the technique presented by Baugh and Wooley
[1]. If the inputs of the multiplier represent $n$-bit two’s complement
fractional values in $[-1/2, 1/2)$:
\[ X = -x_1 \cdot 2^{-1} + \sum_{i=2}^{n} x_i \cdot 2^{-i} \qquad (1.4) \]
\[ Y = -y_1 \cdot 2^{-1} + \sum_{i=2}^{n} y_i \cdot 2^{-i} \qquad (1.5) \]
the output of the full-width multiplier $P = X \cdot Y$ is computed as:
\[ P = -p_1 \cdot 2^{-1} + \sum_{i=2}^{2n} p_i \cdot 2^{-i} = x_1 y_1 2^{-2} + \sum_{i=2}^{n} \sum_{j=2}^{n} x_i y_j \cdot 2^{-i-j} - \sum_{i=2}^{n} x_1 y_i \cdot 2^{-i-1} - \sum_{i=2}^{n} x_i y_1 \cdot 2^{-i-1} \qquad (1.6) \]
The first two terms of eq. (1.6) are non-negative, while the last two terms are
either zero or negative. In order to calculate the product, instead of
subtracting the last two terms it is possible to add their opposites. Since the
representation is in two’s complement, the opposite is easily calculated by
complementing all the bits and adding 1 in the least significant column:
\[ -\sum_{i=2}^{n} x_1 y_i \cdot 2^{-i-1} = \sum_{i=2}^{n} \overline{(x_1 y_i)} \cdot 2^{-i-1} + 2^{-2} + 2^{-n-1} \qquad (1.7) \]
\[ -\sum_{i=2}^{n} x_i y_1 \cdot 2^{-i-1} = \sum_{i=2}^{n} \overline{(x_i y_1)} \cdot 2^{-i-1} + 2^{-2} + 2^{-n-1} \qquad (1.8) \]
Note that eqs. (1.7)-(1.8) also include the 1 (whose weight is $2^{-2}$) due to
the sign extension. Taken individually, the two identities omit the 1 in the
sign column; once they are summed, the result is exact modulo 1, that is, up to
the carry out of the sign column, which is discarded. Finally one obtains:
\[ P = 2^{-1} + 2^{-n} + x_1 y_1 2^{-2} + \sum_{i=2}^{n} \sum_{j=2}^{n} x_i y_j \cdot 2^{-i-j} + \sum_{i=2}^{n} \left( \overline{x_i y_1} + \overline{x_1 y_i} \right) \cdot 2^{-i-1} \qquad (1.9) \]
Fig. 1.2 shows the matrix of the Partial Products (PPs) $x_i y_j$ for an
$8 \times 8$ multiplier. Generation of the partial products for two’s
complement multiplication can be implemented with $(n-1)^2 + 1$ AND gates and
$(2n-2)$ NAND gates.
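The Baugh-Wooley form of eq. (1.9) can be checked exhaustively for a small word length. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, values are scaled by $2^{2n}$, and the comparison is made modulo 1 (modulo $2^{2n}$ after scaling), since a carry out of the sign column is discarded in two’s complement arithmetic.

```python
from itertools import product

def value_scaled(bits):
    """Two's complement fractional value of eq. (1.4), scaled by 2^n.
    bits[0] is the sign bit x_1 (weight -2^-1)."""
    n = len(bits)
    magnitude = sum(b << (n - i) for i, b in enumerate(bits[1:], start=2))
    return magnitude - bits[0] * 2 ** (n - 1)

def bw_product_scaled(xb, yb):
    """Right-hand side of eq. (1.9), scaled by 2^(2n)."""
    n = len(xb)
    s = 2 * n
    total = (1 << (s - 1)) + (1 << (s - n))            # constants 2^-1 and 2^-n
    total += (xb[0] & yb[0]) << (s - 2)                # x_1 y_1 2^-2
    for i in range(2, n + 1):
        for j in range(2, n + 1):
            total += (xb[i - 1] & yb[j - 1]) << (s - i - j)   # AND terms
        nands = (1 - (xb[i - 1] & yb[0])) + (1 - (xb[0] & yb[i - 1]))
        total += nands << (s - i - 1)                  # complemented terms
    return total

n = 4
for xb in product((0, 1), repeat=n):
    for yb in product((0, 1), repeat=n):
        exact = value_scaled(xb) * value_scaled(yb)    # P scaled by 2^(2n)
        assert (bw_product_scaled(xb, yb) - exact) % (1 << (2 * n)) == 0
```

The inner loop uses $(n-1)^2$ AND terms plus the single $x_1 y_1$ term, and $2n-2$ complemented terms, which matches the gate counts quoted in the text.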
1.1.3 Mixed-Operand Multiplication
If $X$ is an $n$-bit unsigned fractional number and $Y$ is an $n$-bit two’s
complement fractional number:
\[ X = \sum_{i=1}^{n} x_i \cdot 2^{-i} \qquad (1.10) \]
\[ Y = -y_1 \cdot 2^{-1} + \sum_{i=2}^{n} y_i \cdot 2^{-i} \qquad (1.11) \]
following the same reasoning as in Sec. 1.1.2, the final product can be
computed as:
\[ P = 2^{-1} + 2^{-n-1} + \sum_{i=1}^{n} \sum_{j=2}^{n} x_i y_j \cdot 2^{-i-j} + \sum_{i=1}^{n} \overline{x_i y_1} \cdot 2^{-i-1} \qquad (1.12) \]
Fig. 1.3 shows the matrix of the Partial Products (PPs) $x_i y_j$ for an
$8 \times 8$ multiplier. Generation of the partial products for mixed-operand
multiplication can be implemented with $n(n-1)$ AND gates and $n$ NAND gates.
Figure 1.3: Mixed-operand Partial Products matrix, n = 8. $Y$ is an $n$-bit
two’s complement fractional number, hence $y_1$ has weight equal to $-2^{-1}$.
1.2 Full-width Multiplier
A full-width digital $n \times n$ multiplier computes the $2n$-bit output as a
weighted sum of partial products. A parallel-tree multiplier reduces the matrix
of partial products to two rows using a combination of full-adders and
half-adders. The remaining two rows are then added by a carry-propagate adder
to produce the final result. This Section first describes the most commonly
used techniques, the Wallace and Dadda trees; then it gives details on an
efficient VLSI implementation, the array multiplier. Finally, Sec. 1.2.4
describes the Three Dimensional Minimization (TDM) method, which generates a
parallel multiplier optimized for speed and will be used in the implementation
of the proposed truncated multipliers.
1.2.1 Wallace-Tree Multiplier
The reduction scheme published by Wallace [2] begins by grouping the
partial-product matrix into sets of three rows. Each set is reduced to two rows
using half-adders on sets of two bits and full-adders on sets of three bits.
Excess rows that do not belong to a set of three are passed to the next
reduction stage unmodified. Each reduction stage is processed in a similar way
until only two rows remain. Then a final CPA (Carry Propagate Adder) is used.
Fig. 1.4 shows the dot diagram illustrating Wallace reduction for an
$8 \times 8$ multiplier. Dot diagrams, developed by Dadda, are a convenient
means for visualizing the placement of full-adders and half-adders. In such
diagrams, dots represent bits, given by partial products. For example, the
upper-right dot in the multiplication matrix is $x_8 y_8$. The circle with
three dots represents a Full-Adder (FA). In the next stage the circle is
replaced by its outputs: a sum bit, a dot in the same column, and a carry bit,
a dot in the column on the left. The same reasoning is valid for the
half-adder, represented by a circle with a dashed line.
Figure 1.4: Wallace reduction for an 8 × 8 multiplier. Full circle: Full
Adder. Dashed circle: Half Adder. The panels show the input matrix of Partial
Products (h = 8), the matrices after the successive reduction stages (h = 6,
h = 4, h = 3) and the final CPA.
Summarizing, in the Wallace tree the number of operands is reduced at the
earliest opportunity: if there are $m$ dots in a column, $\lfloor m/3 \rfloor$
full adders are immediately applied to that column. This tends to minimize the
overall delay by making the final CPA as short as possible.
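The row grouping described above fixes the sequence of matrix heights. As a minimal sketch (the function name `wallace_heights` is hypothetical), each complete group of three rows becomes two rows and the leftover rows pass through unchanged:

```python
def wallace_heights(h):
    """Heights of the partial-product matrix after each Wallace stage."""
    heights = [h]
    while h > 2:
        # every complete group of 3 rows is reduced to 2 rows;
        # the h % 3 leftover rows are passed on unmodified
        h = 2 * (h // 3) + h % 3
        heights.append(h)
    return heights

print(wallace_heights(8))
```

For an 8-row matrix this gives the heights 8, 6, 4, 3 and finally 2, i.e. four reduction stages before the CPA.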
Figure 1.5: Dadda reduction for an 8 × 8 multiplier. Full circle: Full
Adder. Dashed circle: Half Adder.
1.2.2 Dadda-Tree Multiplier
Wallace’s method is refined by Dadda [3] into a technique that is optimum
in the sense that it uses a minimum number of full-adders. At each stage
of reduction, the height $h$ of the matrix of Partial Products is reduced
following Dadda’s series
$\{d_1, d_2, \ldots, d_j, \ldots\} = \{2, 3, 4, 6, 9, \ldots\}$, where:
\[ d_1 = 2, \qquad d_{j+1} = \left\lfloor \frac{3}{2} d_j \right\rfloor \qquad (1.13) \]
Figure 1.6: Unsigned array multiplier, n = 8.
Dadda’s reduction uses full-adders and half-adders only where they are needed
to bring the height of the matrix down to the next value of the Dadda series,
until 2 rows are reached. Fig. 1.5 shows the Dadda reduction for an
$8 \times 8$ multiplier. The partial-product matrix has a maximum height of 8,
so the first reduction step leads to $h = 6$, then 4, 3, 2; four reduction
steps are necessary.
Dadda reduction of an $n \times n$ multiplier requires $n^2 - 4n + 3$
full-adders, $(n-1)$ half-adders, and a carry-propagate adder with a length of
$(2n-2)$ [4]. Dadda reduction uses fewer full-adders and half-adders than
Wallace reduction, but requires a longer carry-propagate adder.
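The recurrence (1.13) and the resulting stage heights can be sketched as follows. The function names are hypothetical, and the code only reproduces the target heights, not the placement of the adders.

```python
def dadda_series(limit):
    """Dadda heights d_1 = 2, d_{j+1} = floor(3/2 * d_j), kept below `limit`."""
    series = [2]
    while (series[-1] * 3) // 2 < limit:
        series.append((series[-1] * 3) // 2)
    return series

def dadda_stage_heights(h):
    """Matrix heights stage by stage, starting from height h and ending at 2."""
    return [h] + [d for d in reversed(dadda_series(h)) if d < h]

print(dadda_stage_heights(8))
```

Starting from a height of 8, the targets are 6, 4, 3 and 2, which agrees with the four reduction steps mentioned in the text.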
1.2.3 Array Multiplier
The array multiplier is a very common type of parallel multiplier. Array
multipliers are not as fast as tree multipliers, but their regular structure
and local interconnect are advantages for very-large-scale-integration (VLSI)
implementation.
Unsigned Array Multipliers. Fig. 1.6 shows an $8 \times 8$ unsigned array
multiplier. Each cell in the array receives a bit from the multiplicand, $x_i$,
and a bit from the multiplier, $y_j$. Cells comprising the top row and left
diagonal are AND gates, which generate the unsigned partial-product bits as
described in Sec. 1.1.1 and pass them to modified half-adders (MHA) or modified
full-adders (MFA). An MHA is a half-adder with an additional AND gate that
generates $x_i y_j$ and adds it to a single bit generated by another block. An
MFA is a full-adder with an additional AND gate that generates $x_i y_j$ and
adds it to the sum bit and the carry bit generated by two MHAs or MFAs above
it. The least-significant bits of the product are produced directly by cells in
the array. The most-significant bits of the product are produced by a
carry-propagate adder that adds the sum bits and carry bits from the bottom row
of cells in the array. The critical path of an array multiplier, shown in the
lower right of Fig. 1.6, is through the highest column in the array plus the
carry-propagate adder. Fig. 1.7 shows the array multiplier redrawn as it
typically appears in the literature. In this figure, a ripple-carry adder is
used as the final carry-propagate adder (CPA), although any CPA could be used.
Figure 1.7: Unsigned array multiplier, n = 8, rectangular shape.
Figure 1.8: Signed array multiplier, n = 8.
Two’s-Complement Array Multipliers. Fig. 1.8 shows an $8 \times 8$ array
multiplier modified for two’s-complement operation. As described in Sec. 1.1.2,
some of the bits in the partial-product matrix must be inverted, and constant
1s must be added. Half of the bit inversions are performed by replacing AND
gates with NAND gates. The remaining bit inversions are accomplished by
replacing MFAs with NMFAs. An NMFA is a negating modified full-adder, which
is simply an MFA with a NAND gate generating $\overline{x_i y_j}$ instead of an
AND gate generating $x_i y_j$. Since both operands are $n$ bits long, a 1 must
be added to the $2^{2n-1}$ and $2^{n}$ columns (integer column weights,
corresponding to the $2^{-1}$ and $2^{-n}$ columns of eq. (1.9)). Adding 1 to
the $2^{2n-1}$ column is accomplished by inverting the $p_{2n-1}$ bit. Adding 1
to the $2^{n}$ column is accomplished by setting the carry into the
carry-propagate adder to 1. If a ripple-carry adder is used, as in Fig. 1.8,
then the half-adder can be replaced by a specialized half-adder (SHA).
Specialized half-adders add two bits plus a constant 1.
1.2.4 Three Dimensional Minimization (TDM) method
In [5] Oklobdzija et al. suggested a new approach, the Three Dimensional
Method (TDM), for the Partial Product Reduction Tree (PPRT). The authors took
the approach of generating compressors of the maximum possible size (i.e.,
the size of the multiplier): they abandoned the notion of levels and undertook
the design of an optimized one-level compressor, which evolved into an
optimization process involving the entire array. The method is based on the
fact that not all the inputs and outputs of a compressor contribute equally to
the delay. Therefore, Oklobdzija et al. sort them in a way which favors the
use of fast inputs and outputs in the paths that are critical to the speed,
while they assign slow inputs to the signal paths where an increase in the
delay is tolerable. The algorithm for the automatic generation of the Partial
Product Array can be found in [5]. Even if the number of cells used by the TDM
approach and by Dadda’s is the same, the two approaches are very different,
since in the TDM algorithm all the partial products are compressed in a single
step; hence no intermediate partial products are considered. However, TDM
still produces a carry-save number, to be translated to a conventional form
with a fast carry-propagate adder (CPA).
Oklobdzija et al. suggested a good heuristic for finding the optimal PPRT,
but no proof of the performance of this heuristic is given. Stelling et al.
[6] provide a formal characterization of optimal PPRT circuits and prove a
number of properties about them. Furthermore, the authors present an algorithm
that produces a minimum-delay circuit in time linear in the size of the
inputs, providing tight lower bounds on multiplier circuit delays. These
results are combined to create a program that finds optimal TDM multiplier
designs. Using the program, Stelling et al. show that, while the heuristic
used in [5] does not always find the optimal TDM circuit, it performs very
well in terms of overall PPRT circuit delay; with their new search algorithms
they also found better PPRT circuits for reducing the delay of the entire
multiplier.
In [7] the authors propose a new algorithm for synthesizing fast arithmetic
circuits. Analyzing the Wallace tree compressor (i.e. a bit-level carry-save
addition array), the authors note that the scheme has been applied only in a
rather restrictive way, i.e. for implementing fast multipliers and for
generating fixed structures without considering the characteristics of the
input signals. Since the Wallace algorithm is implemented in two stages, the
carry-save additions of the partial products (implemented with full-adders
(FAs)) and a carry-propagate addition (CPA) that sums the two addends produced
by the first stage, in [7] the authors focus on the optimization of the first
stage. They look for a solution to the problem of reducing the addend matrix
to a matrix with at most two addends in each column, by allocating FAs. They
propose a simple constructive procedure for allocating FAs to reduce a single
column with $m$ addends to one with at most two addends. The algorithm can be
found in [7].
1.3 Truncated Multiplier
A truncated multiplier is an $n \times n$ multiplier with an $n$-bit output. As
shown in Fig. 1.9, the partial products can be divided into two subsets. The
least significant part (LSP) includes the $n$ least significant columns of the
partial product matrix, while the most significant part (MSP) includes the
remaining columns. The full-width multiplier output, $P$, is given by
\[ P = S_{MSP} + S_{LSP} \qquad (1.14) \]
where $S_{MSP}$ and $S_{LSP}$ represent the weighted sums of the elements of
MSP and LSP respectively.
When an $n$-bit output is needed, the most accurate choice is the full
rounded multiplier: it computes the whole matrix of partial products, adds a
constant to the $2n$-bit result and keeps only the first $n$ bits of the sum.
The error introduced by the full rounded multiplier is calculated in Sec.
1.3.1. Unfortunately the full rounded multiplier is the solution with the
highest area occupation and power dissipation.
A second possibility is a truncated multiplier in which the partial
products of the LSP are discarded, assuming that their contribution to the $n$
most significant bits of the output is negligible. This solution is very
advantageous in terms of hardware performance. For example, Fig. 1.10 shows
the implementation of the array truncated multiplier with $n = 8$. The cells
for the LSP matrix are not present and the final circuit halves the number of
cells compared to the full-width one (Fig. 1.7). However, a straightforward
analysis shows that the direct elimination of the partial products of the LSP
causes a very large error, bounded by $(n/2 - 1)\,lsb$, where $lsb$ is the
weight of the least significant bit of the result.
Figure 1.9: Subdivision of the matrix of Partial Products, for unsigned
multiplier, n = 8. MSP (Most Significant Part) of the PPs Matrix is
constituted by the first n columns. LSP (Least Significant Part) of the
PPs Matrix is constituted by the last n columns. IC (Input Correction)
vector is the (n + h + 1)th column.
Between these two extreme cases, it is possible to devise numerous circuit
alternatives: only some of the partial products in the LSP are removed, to
trade off accuracy against hardware cost. Several techniques that follow this
idea have been proposed in the literature [8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19]. These techniques introduce suitable compensation circuits that
partly compensate for the dropped terms, thereby reducing the approximation
error. The proposed approaches differ from one another in the choice and the
implementation of the compensation circuit, but they can be classified into
two groups, constant correction methods and variable correction methods. To
discuss these techniques, let us denote the $h$ most significant columns of
the LSP as LSPmajor, while the remaining $n_{eq} = n - h$ columns are named
LSPminor (see Fig. 1.9). The value of $h$ is a design parameter that can range
from $h = 0$ to $h = n$. In addition, the leftmost column of LSPminor
(highlighted in yellow in Fig. 1.9) is named Input Correction (IC). The IC,
composed of $n_{eq}$ partial products, is used to estimate the weighted sum of
the elements of the LSPminor. For unsigned multipliers and signed multipliers
with $h \neq 0$, IC is given by:
\[ IC = \{ x_{h+i}\, y_{n+1-i} \}, \qquad i = 1, \ldots, n_{eq} \qquad (1.15) \]
while the first and the last elements of IC are complemented if a signed
multiplier with $h = 0$ is considered ($\overline{x_1 y_n}$,
$\overline{x_n y_1}$ instead of $x_1 y_n$, $x_n y_1$).
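As a small illustration of the IC vector for the unsigned case, the sketch below collects the partial products lying in column $n + h + 1$, reading eq. (1.15) as the terms $x_{h+i}\, y_{n+1-i}$ whose indices sum to $n + h + 1$. The helper names are hypothetical.

```python
def to_bits(value, n):
    """Bits [v_1, ..., v_n] of an n-bit integer; v_1 is the MSB (weight 2^-1)."""
    return [(value >> (n - i)) & 1 for i in range(1, n + 1)]

def input_correction(x, y, n, h):
    """Unsigned IC vector: x_{h+i} AND y_{n+1-i} for i = 1..n-h."""
    xb, yb = to_bits(x, n), to_bits(y, n)
    # lists are 0-based: x_{h+i} is xb[h+i-1], y_{n+1-i} is yb[n-i]
    return [xb[h + i - 1] & yb[n - i] for i in range(1, n - h + 1)]

# n = 8, h = 2: only x_3 is set, every y bit is set
print(input_correction(0b00100000, 0b11111111, 8, 2))
```

The vector has $n_{eq} = n - h$ elements, so it becomes empty for $h = n$, where no column is left to estimate.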
Figure 1.10: Unsigned array truncated multiplier, n = 8. Dashed lines
mark the cells that are not computed.
1.3.1 Full Rounded (Round-to-Nearest) Multiplier
The output of the full-rounded multiplier is:
\[ P_{round} = trunc_n(S_{MSP} + S_{LSP} + K_{round}) \qquad (1.16) \]
where $trunc_j$ indicates the truncation of the result to $j$ bits, and
$K_{round}$ is the rounding constant, whose value is $lsb/2$. Fig. 1.11 shows
the partial product matrix of an $8 \times 8$ unsigned rounded multiplier. The
full-rounded multiplier gives the smallest possible error $e = P - P_t$,
bounded by $lsb/2$. Under the hypothesis of independent and identically
distributed input bits, the mean and mean square value of the error are:
\[ \mu_{round} = E[e_{round}] = 0 \qquad (1.17) \]
\[ \varepsilon^2_{round} = E[e^2_{round}] = \frac{1}{12}\, lsb^2 \qquad (1.18) \]
Figure 1.11: Full-rounded multiplier, n = 8. $K_{round}$ is the rounding
constant. The n least significant bits of the result are discarded.
In the literature it is common to refer to this multiplier as the baseline for
comparisons in terms of area, delay, power and error. The comparison is
meaningful for hardware performance, but it is worth noticing that
$\varepsilon^2_{round}$ is a value which cannot be reached by any truncated
multiplier. As will be shown in Sec. 2.3.1, the real lower bound, called
intrinsic error, is higher than $\varepsilon^2_{round}$.
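The error statistics of the full-rounded multiplier can be explored numerically for small word lengths. The sketch below is an illustration only: the function name is hypothetical, unsigned operands are assumed, and all input pairs are taken as equally likely (which corresponds to independent, uniformly distributed input bits). For small $n$ the exhaustive values may differ slightly from the closed forms (1.17)-(1.18).

```python
from itertools import product

def rounding_error_stats(n):
    """Mean, mean square and worst case of e = P - P_round, in units of lsb."""
    half = 1 << (n - 1)                       # K_round = lsb/2, scaled by 2^(2n)
    lsb = 1 << n                              # lsb = 2^-n, scaled by 2^(2n)
    errors = []
    for x, y in product(range(1 << n), repeat=2):
        p = x * y                             # full-width product, scaled
        p_round = ((p + half) >> n) << n      # add K_round, keep the n MSBs
        errors.append((p - p_round) / lsb)
    mean = sum(errors) / len(errors)
    mean_square = sum(e * e for e in errors) / len(errors)
    worst = max(abs(e) for e in errors)
    return mean, mean_square, worst

mean, mean_square, worst = rounding_error_stats(6)
print(mean, mean_square, worst)
```

The worst-case value returned never exceeds 0.5, in agreement with the $lsb/2$ bound stated above.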
1.3.2 Constant Correction Methods
The Constant Correction Methods (CCM) use a constant value, independent of
the actual values of the inputs, to estimate the LSPminor. The multiplier
output can be written as:
\[ P_{CCM} = trunc_n(S_{MSP} + S_{LSPmajor} + constant) \qquad (1.19) \]
where $S_{LSPmajor}$ is the weighted sum of the elements of the LSPmajor.
The simplest approach has been proposed by Kidambi et al. in [8]. In
this technique the LSP is eliminated and replaced by a constant term,
calculated considering only the lost carries. This approach reduces the area
of the full-width multiplier by up to 50%, but introduces a rather large
error, which rapidly increases with $n$, making it impractical in most
applications.
Figure 1.12: General scheme of a truncated multiplier. The compensation
function f(IC) is estimated from the value of IC. The LSPminor is not
computed. The result is reported on n bits with a truncation operation.
In [9] Lim proposes a technique to improve the accuracy by discarding only
the LSPminor, shown in Fig. 1.9. The author noticed that the major
contribution to the error is given by the discarded carries of LSPminor which
ripple into LSPmajor. Hence, in order to calculate the constant value, Lim
only estimates these carries. The value of $h$ is used as a parameter to trade
off accuracy against complexity. The approach of [9] is refined in [11], where
the rounding error is also taken into account in the computation of the
optimal value for the correction constant.
1.3.3 Variable Correction Methods
The accuracy of truncated multipliers can be significantly improved using
variable-correction truncated multipliers [18, 10, 11, 12, 13, 14, 15, 16, 17,
19], which compensate the effect of the dropped terms with a non-constant
compensation function. The terms in the IC are used to estimate the weighted
sum of the elements of the LSPminor and the multiplier output is computed as:
\[ P_{VCM} = trunc_n(S_{MSP} + S_{LSPmajor} + f(IC) + K_{round}) \qquad (1.20) \]
where $f(IC)$ is a suitable compensation function. Fig. 1.12 shows the general
scheme of a truncated multiplier designed using a variable-correction scheme:
$f(IC)$ and the rounding constant are added to MSP and LSPmajor to obtain the
result, which is finally reported on $n$ bits with a truncation operation.
For a given value of $h$, the hardware complexity and the approximation error
introduced by the scheme in Fig. 1.12 depend on the choice of the compensation
function $f(IC)$.
King, Stine and Park Methods
In [10] King et al. propose to add the partial products of the IC in the
rightmost column of the LSPmajor. This is a simple but very effective way to
correlate the correction value to the LSPminor. Combining [11], a constant
correction method, and [10], a variable correction method, in [12] Stine et
al. propose a so-called hybrid correction method, where only a subset of the
IC elements is summed in the rightmost column of the LSPmajor. In [20] Park et
al. improve the approach of [10] by summing in the rightmost column of the
LSPmajor all the IC terms and a further correcting bit.
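The variable-correction scheme of eq. (1.20) can be sketched behaviourally with a pluggable compensation function. The code below is an illustration under several assumptions: operands are unsigned, $K_{round}$ is taken equal to $lsb/2$ as in Sec. 1.3.1, the compensation is expressed in units of $2^{-(n+h)}$ (the weight of the rightmost LSPmajor column, or of the last MSP column when $h = 0$), and all function names are hypothetical. The function `king` follows the description of [10] given above; it is not taken from that paper.

```python
from itertools import product

def to_bits(value, n):
    """Bits [v_1, ..., v_n] of an n-bit integer; v_1 is the MSB (weight 2^-1)."""
    return [(value >> (n - i)) & 1 for i in range(1, n + 1)]

def truncated_product(x, y, n, h, f):
    """n-bit output of eq. (1.20), as an integer number of lsb = 2^-n."""
    xb, yb = to_bits(x, n), to_bits(y, n)
    s = 2 * n                                          # values scaled by 2^(2n)
    kept = sum((xb[i - 1] & yb[j - 1]) << (s - i - j)
               for i in range(1, n + 1) for j in range(1, n + 1)
               if i + j <= n + h)                      # MSP and LSPmajor only
    ic = [xb[h + i - 1] & yb[n - i] for i in range(1, n - h + 1)]
    compensation = f(ic) << (n - h)                    # units of 2^-(n+h)
    k_round = 1 << (n - 1)                             # assumed K_round = lsb/2
    return (kept + compensation + k_round) >> n        # truncation to n bits

def king(ic):
    """All IC terms added in the rightmost kept column."""
    return sum(ic)

def no_compensation(ic):
    """LSPminor simply dropped."""
    return 0

def mean_square_error(n, h, f):
    """Mean square of P - P_t over all input pairs, in lsb^2."""
    total = 0.0
    for x, y in product(range(1 << n), repeat=2):
        e = (x * y - (truncated_product(x, y, n, h, f) << n)) / (1 << n)
        total += e * e
    return total / (1 << (2 * n))

print(mean_square_error(6, 0, no_compensation), mean_square_error(6, 0, king))
```

With $h = n$ the IC is empty and the function reduces to the full-rounded multiplier of eq. (1.16); other compensation functions can be compared by passing a different `f`.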
Jou et al. Method
In [13] Jou et al. propose a correction function for truncated multipliers,
obtained by manipulating the partial products in IC with a circuit composed of
AND-OR gates. The authors propose two correction functions, depending on the
multiplier.
For signed multiplier with $h = 0$,
\[ JOU = \overline{\bigvee_{i=1}^{n} x_i y_{n+1-i}} = \bigwedge_{i=1}^{n} \overline{x_i y_{n+1-i}} \qquad (1.21) \]
\[ f_{JOU}(IC) = \left[ \sum_{i=2}^{n-1} x_i y_{n+1-i} + JOU \right] 2^{-n} \qquad (1.22) \]
where $\bigvee$ and $\bigwedge$ represent the Boolean OR and AND operators
respectively. Note that JOU is a constant added only when all the elements of
the IC are equal to zero, with the exception of the first and the last
elements, which should be equal to 1.
For signed ($h \neq 0$) and unsigned multiplier
\[ JOU_i = x_i y_{n+h+1-i} \wedge \left( \bigvee_{j=h+1}^{i-1} x_j y_{n+h+1-j} \right), \qquad i = h+2, \ldots, n \qquad (1.23) \]
\[ f_{JOU}(IC) = \left[ \sum_{i=h+2}^{n} JOU_i \right] 2^{-n-h} \qquad (1.24) \]
The technique in [13] is affected by a significant mean and mean square error.
The circuit that calculates $f(IC)$ is characterized by a slow ripple
architecture and the number of output bits is equal to $n + h$ instead of $n$.
Curticapean et al. Method
Curticapean et al. [14] use a modified version of the correction circuit of
[13], suitable for unsigned multipliers. The proposed correction function is:
$$ \mathrm{CCP}_i = x_i y_{n+h+1-i} \wedge \left( \bigvee_{j=h+1}^{i-1} x_j y_{n+h+1-j} \right), \qquad i = h+2, \ldots, n-1 \tag{1.25} $$
$$ \mathrm{CCP}_n = \bigvee_{i=h+1}^{n} x_i y_{n+h+1-i} \tag{1.26} $$
$$ f_{CCP}(IC) = \left[ \sum_{i=h+2}^{n} \mathrm{CCP}_i \right] 2^{-n-h} \tag{1.27} $$
Note that the function is proposed only for unsigned multipliers. This
technique provides good error performance, but the compensation function is
still based on a slow ripple architecture. Furthermore, in [14] no explanation
is given to justify the improved error performance.
Van et al. Method
In [15] Van et al. propose a truncated signed multiplier architecture in which
the correction function proposed in [13] is generalized by considering either
the IC terms or their complements. The optimal pattern of inversions applied
to the partial products in IC is obtained through exhaustive simulation. The
proposed correction function, for a signed multiplier with h = 0, is:
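$$ \mathrm{VANO} = \overline{x_1 y_n} \wedge \left( \bigwedge_{i=2}^{n-1} \overline{x_i y_{n+1-i}} \right) \wedge \overline{x_n y_1} \tag{1.28} $$
$$ \phantom{\mathrm{VANO}} = \overline{ x_1 y_n \vee \left( \bigvee_{i=2}^{n-1} x_i y_{n+1-i} \right) \vee x_n y_1 } \tag{1.29} $$
$$ f_{VANO}(IC) = \left[ \sum_{i=2}^{n-1} x_i y_{n+1-i} + \mathrm{VANO} \right] 2^{-n} \tag{1.30} $$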
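The technique has been extended to the case h > 0 in [16]:
$$ \mathrm{VAN} = \left( \bigvee_{i=h+1}^{n-1} x_i y_{n+h+1-i} \right) \wedge x_n y_{h+1} \tag{1.31} $$
$$ f_{VAN}(IC) = \left[ \sum_{i=h+1}^{n-1} x_i y_{n+h+1-i} + \mathrm{VAN} \right] 2^{-n-h} \tag{1.32} $$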
In [15] and [16] f(IC) is still implemented with a slow ripple architecture.
Furthermore, the optimal form of the function f(IC) is found through an
exhaustive search. Hence the effectiveness of these techniques is verified
only for $n \le 16$ and h = 0, 1, 2. The result obtained by Van is explained in
[17], whose authors investigate the dependency of the carry bits on the
partial products and the inputs, and propose three carry estimation schemes.
Since the error performance of [17] does not improve on [15], [16], the
multipliers proposed in [17] will not be considered in the following.
Kuang et al. Method
In [18] the authors propose a minimal modification of the technique presented
by Strollo et al. [21]. This correction, made in order to obtain a lower mean
square error or mean error, has been obtained through a brute force
computation. Hence it is valid only for the analyzed case, an unsigned
multiplier with h = 0. The authors do not explain how the technique can be
extended, nor what happens if a signed multiplier is considered.
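The correction proposed by Strollo et al. in [21] is
$$ c_1 = x_n y_1 \oplus x_{n-1} y_2 \oplus x_2 y_{n-1} $$
$$ c_2 = \left[ (x_n y_1 \wedge x_{n-1} y_2) \vee \left( x_2 y_{n-1} \wedge (x_n y_1 \oplus x_{n-1} y_2) \right) \right] \oplus x_1 y_n $$
$$ c_3 = \left[ (x_n y_1 \wedge x_{n-1} y_2) \vee \left( x_2 y_{n-1} \wedge (x_n y_1 \oplus x_{n-1} y_2) \right) \right] \wedge x_1 y_n $$
$$ \mathrm{STROLLO} = c_1 + c_2 + c_3 \tag{1.33} $$
$$ f_{STROLLO}(IC) = \left[ \sum_{i=3}^{n-2} x_i y_{n+1-i} + \mathrm{STROLLO} \right] 2^{-n} $$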
Kuang et al. notice that $c_1$, $c_2$, $c_3$ cannot be 1 simultaneously.
Therefore in [18] the standard addition is modified in order to generate only
two outputs, so that the retained adder cells can also be slightly simplified.
The authors propose two different configurable error-compensation circuits.
New I, which provides a lower variance but a higher mean error:
$$ c_1 = x_1 y_n \vee x_2 y_{n-1} \vee x_{n-1} y_2 \vee x_n y_1 $$
$$ c_2 = (x_1 y_n \wedge x_2 y_{n-1}) \vee (x_{n-1} y_2 \wedge x_n y_1) $$
$$ \mathrm{NEWI} = c_1 + c_2 \tag{1.34} $$
$$ f_{NEWI}(IC) = \left[ \sum_{i=3}^{n-2} x_i y_{n+1-i} + \mathrm{NEWI} \right] 2^{-n} $$
New II, which provides a lower mean error with a higher variance:
$$ c_1 = x_1 y_n \vee x_2 y_{n-1} \vee x_{n-1} y_2 \vee x_n y_1 $$
$$ c_2 = (x_1 y_n \wedge x_2 y_{n-1}) \vee (x_{n-1} y_2 \wedge x_n y_1) $$
$$ c_3 = x_3 y_{n-3} \wedge x_{n-3} y_3 \tag{1.35} $$
$$ \mathrm{NEWII} = c_1 + c_2 + c_3 $$
$$ f_{NEWII}(IC) = \left[ \sum_{i=3}^{n-2} x_i y_{n+1-i} + \mathrm{NEWII} \right] 2^{-n} $$
It is important to underline that the authors can obtain a lower variance only
at the cost of a higher mean error, and vice versa. In the following it will
be demonstrated that both parameters are important and should be kept as low
as possible together.
Michard et al. Method
The approach proposed in [19] is a noteworthy exception with respect to the
scheme of Fig. 1.12. The prediction-selection method of [19] is based on a
logical computation of the values to be added to $S_{MSP}$ and to
$S_{LSPmajor}$ (prediction), followed by a simplification process (selection).
Results are presented for truncated multipliers with n = 8, 12, and 16. The
approach of [19] is not applicable to higher n values, due to the fast-growing
computational cost of the prediction process.
1.4 Conclusion
From the literature survey it emerges that the constant correction techniques
[8, 9, 11] are the simplest to implement. Moreover, their error properties
can be described analytically rather easily, even though this will not be done
in this dissertation. Unfortunately, the constant correction techniques are
also the least effective in terms of approximation error.
On the other hand, the variable correction techniques proposed to date,
while significantly improving accuracy, show substantial limits. In
particular, they are not derived from an analytical theory but rather
heuristically or with the help of exhaustive searches. Thus, in many cases the
proposed techniques cannot be applied to multipliers with large bit widths
(say, 24 or 32 bits) and/or cannot be considered for a possible implementation
in automatic synthesis tools. In addition, the error performance is
customarily computed numerically through exhaustive simulations. This approach
can be pursued only for small n values since the simulation time increases as
$O(2^{2n})$, requiring an unreasonable amount of CPU time when n increases. As
an example, the technique proposed in [16] is only verified for $n \le 16$ and
h = 0, 1, 2.

Chapter 2
LMS Truncated Multiplier
The optimal compensation function $f_{opt}(IC)$, which minimizes the mean
square error of the truncated multiplier, is analytically calculated in
this chapter. In Sec. 2.3 it is shown that $f_{opt}(IC)$ is a quadratic form
of the elements in IC and that the corresponding minimum mean square error is
the lower error bound for any truncated multiplier.
With the help of the developed theory, the derivation of a sub-optimal
compensation function, $f_{lin}(IC)$, expressed as a linear combination of the
elements in IC, is described in Sec. 2.4. The linear compensation function
yields error performance very similar to that achievable using $f_{opt}(IC)$
and has the additional advantage of easy hardware implementation. The aspects
related to the quantization of the coefficients of $f_{lin}(IC)$ are
investigated in Sec. 2.5. The closed-form expression of the resulting
compensation function, the Linear Minimum mean Square error (LMS) compensation
function, is then given. Finally, in Sec. 2.8 it is shown how the results
obtained for an unsigned multiplier can be extended to signed and mixed
multipliers. The hardware implementation and the performance of the
sub-optimal linear compensation function are described in the next chapter.
2.1 Definitions and Assumptions
[Figure 2.1: partial products matrix with the inputs x1..x8 and y1..y8, the
output bits pt1..pt13, the regions MSP, LSPmajor, IC and LSPminor (not
formed), the correction function f(IC), the rounding constant, the truncated
bits, and the weights lsb, lsbIC and lsbq.]
Figure 2.1: Partial Products matrix for unsigned truncated multiplier,
n = 8 and h = 2, with variable-correction scheme.
In the following we will consider unsigned multipliers. Without loss of
generality, we will moreover assume that the inputs of the multiplier
represent fractional values in [0, 1):
$$ x = \sum_{i=1}^{n} x_i 2^{-i}, \qquad y = \sum_{i=1}^{n} y_i 2^{-i} \tag{2.1} $$
Each bit $x_i$ and $y_i$ is assumed to be independent and uniformly
distributed, with a probability 1/2 of being one.
We focus on variable correction methods whose configuration, for n = 8
and h = 2, is represented in Fig. 2.1. The LSPminor is not formed and is
estimated with the elements of IC. The result is given by the sum of the
compensation function f(IC), the rounding constant $K_{round}$, MSP and
LSPmajor, truncated to n bits. Note that:
$lsb = 2^{-n}$ is the least significant bit of the result; (2.2)
$lsb_{IC} = 2^{-n-h-1}$ is the weight of the IC column; (2.3)
$lsb_q = 2^{-n-m}$ is the least significant bit of f(IC) and $K_{round}$. (2.4)
The elements of the IC (see Fig. 2.1) will be indicated as follows:
$$ IC = \{ \eta_1, \eta_2, \ldots, \eta_{n_{eq}} \} \tag{2.5} $$
where:
$$ n_{eq} = n - h \tag{2.6} $$
and:
$$ \eta_i = x_{h+i}\, y_{n+1-i} \tag{2.7} $$
Let us define $\Gamma$ as the set of all the possible values of the vector IC
($2^{n_{eq}}$ elements) and $\Omega(A)$ as the subset of all couples (x, y)
that give IC = A. For instance, in the multiplier of Fig. 2.1, the subset
$\Omega([1,1,1,1,1,1])$ is composed of 16 possible couples (x, y):
$$ (x_{8..1};\ y_{8..1}) = (1,1,1,1,1,1,\times,\times;\ 1,1,1,1,1,1,\times,\times) \tag{2.8} $$
while the subset $\Omega([1,1,0,1,1,1])$ is composed of 48 (x, y) values:
$$ (x_{8..1};\ y_{8..1}) = (1,1,1,0,1,1,\times,\times;\ 1,1,0,1,1,1,\times,\times) $$
$$ (x_{8..1};\ y_{8..1}) = (1,1,1,1,1,1,\times,\times;\ 1,1,0,1,1,1,\times,\times) \tag{2.9} $$
$$ (x_{8..1};\ y_{8..1}) = (1,1,1,0,1,1,\times,\times;\ 1,1,1,1,1,1,\times,\times) $$
where the symbol "$\times$" indicates a don't care value (the variable can be
either 1 or 0).
As the elements of IC are partial products obtained with an AND operator,
the number $n_{xy}(A)$ of (x, y) values in a set $\Omega(A)$ depends on $n_z$,
the number of zeros in A: $n_{xy}(A) = 3^{n_z} \cdot 2^{2h}$. For the same
reason the probability that IC equals A depends on the number of zeros in A:
$P(IC = A) = n_{xy}(A) \cdot 2^{-2n} = 3^{n_z} \cdot 2^{-2 n_{eq}}$. For
instance, in the multiplier of Fig. 2.1:
$P([1,1,0,1,1,1]) = 48 / 2^{16} = 3 \cdot 2^{-12}$.
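The counting rule above can be checked by exhaustive enumeration. The following Python sketch is not part of the original derivation: the helper names (`bit`, `ic_vector`, `count_couples`) are hypothetical, and it assumes the indexing of (2.1) and (2.7), with bit 1 as the most significant bit of each operand.

```python
from collections import Counter


def bit(v, i, n):
    """Bit v_i of an n-bit integer v; i = 1 is the MSB (weight 2**-1)."""
    return (v >> (n - i)) & 1


def ic_vector(x, y, n, h):
    """IC elements x_{h+i} AND y_{n+1-i}, for i = 1..n-h (eq. 2.7)."""
    return tuple(bit(x, h + i, n) & bit(y, n + 1 - i, n)
                 for i in range(1, n - h + 1))


def count_couples(n, h):
    """Number of (x, y) couples that produce each IC pattern."""
    counts = Counter()
    for x in range(2 ** n):
        for y in range(2 ** n):
            counts[ic_vector(x, y, n, h)] += 1
    return counts


counts = count_couples(8, 2)  # the multiplier of Fig. 2.1
print(counts[(1, 1, 1, 1, 1, 1)], counts[(1, 1, 0, 1, 1, 1)])  # 16 48
```

Dividing each count by $2^{2n}$ gives the probability of the corresponding IC pattern.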
The weighted sum of the elements in LSPminor is given by:
$$ S_{LSPminor} = \left( 2^{-n-h-1} \sum_{i=1}^{n_{eq}} \eta_i \right) + \left( \sum_{i=h+2}^{n} \ \sum_{j=n+h+2-i}^{n} x_i y_j 2^{-i-j} \right) \tag{2.10} $$
The mean value of LSPminor in $\Omega(A)$ is:
$$ \mu_{LSP}(A) \triangleq E_{(x,y) \in \Omega(A)} \{ S_{LSPminor}(x, y) \} \tag{2.11} $$
while the variance of LSPminor in $\Omega(A)$ is:
$$ \sigma^2_{LSP}(A) \triangleq E_{(x,y) \in \Omega(A)} \left\{ \left( S_{LSPminor}(x, y) - \mu_{LSP}(A) \right)^2 \right\} \tag{2.12} $$
where $E_{(x,y) \in \Omega(A)}\{\cdot\}$ is the mean conditioned to all couples
$(x, y) \in \Omega(A)$. In order to simplify the notation, in the following
this mean will be indicated with:
$$ E_{IC=A} \{ \ldots \} \triangleq E_{(x,y) \in \Omega(A)} \{ \ldots \} \tag{2.13} $$
2.2 Error Analysis for Truncated Multiplier
The error $e_{total}$ introduced by the truncated multiplier of Fig. 2.1 is
computed with respect to the full-width multiplier output. If P is the output
of the full-width multiplier and $P_t$ the output of the truncated multiplier,
looking at Fig. 2.1 it is evident that:
$$ P = S_{MSP} + S_{LSPmajor} + S_{LSPminor} \tag{2.14} $$
$$ P_t = S_{MSP} + S_{LSPmajor} + f(IC) + K_{round} + e_{trunc} \tag{2.15} $$
where $S_{MSP}$, $S_{LSPmajor}$ and $S_{LSPminor}$ are the weighted sums of
MSP, LSPmajor and LSPminor respectively, f(IC) is the compensation function,
$K_{round}$ the rounding constant and $e_{trunc}$ the error due to the
truncation operation (the output is reduced from n + m bits to n bits). Hence
the error introduced by a truncated multiplier is given by:
$$ e_{total} = P_t - P = f(IC) - S_{LSPminor} + e_{trunc} + K_{round} \tag{2.16} $$
The aim of this work is to determine the optimal compensation function,
$f_{opt}(IC)$, that minimizes the mean square error:
$$ \varepsilon^2_{total} = E \{ e^2_{total} \} \tag{2.17} $$
Equation (2.16) can be rearranged as follows:
$$ e_{total} = e_{erasing} + e_{trunc} + K_{round} \tag{2.18} $$
where:
$$ e_{erasing} = f(IC) - S_{LSPminor} \tag{2.19} $$
Since $K_{round}$ is a constant chosen together with the compensation function
in order to minimize $\varepsilon^2_{total}$, two error sources affect the
truncated multiplier. The first error source, $e_{erasing}$, is due to the use
of the compensation function f(IC) in place of the sum of the elements in
LSPminor. The second error source, $e_{trunc}$, is due to the truncation of
the result.
Indicating with $\sigma^2_{total}$ and $\mu_{total}$ the variance and the mean
value of $e_{total}$, the mean square error is given by:
$$ \varepsilon^2_{total} = \sigma^2_{total} + (\mu_{total})^2 \tag{2.20} $$
where:
$$ \mu_{total} = \mu_{erasing} + \mu_{trunc} + K_{round} \tag{2.21} $$
$$ \sigma^2_{total} = \sigma^2_{erasing} + \sigma^2_{trunc} + 2\, \mathrm{COV}(e_{erasing}, e_{trunc}) \tag{2.22} $$
and COV(w, t) is the covariance between w and t. Equation (2.22) can
be simplified if we assume that $e_{trunc}$ is independent of $e_{erasing}$
(hence the covariance is zero):
$$ \sigma^2_{total} \simeq \sigma^2_{erasing} + \sigma^2_{trunc} \tag{2.23} $$
Note that (2.23) is not exact, since the bits discarded by the truncation
operation depend on the compensation function f(IC) (see Fig. 2.1). Hence, the
truncation error is not independent of the erasing error. Nevertheless we will
show in Sec. 2.5 that equation (2.23) represents a very good approximation
of $\sigma^2_{total}$.
2.2.1 Statistical Properties of the Truncation Error (etrunc)
The truncation error is given by the weighted sum of the truncated bits:
$$ e_{trunc} = - \sum_{i=n+1}^{n+m} pt_i \cdot 2^{-i} \tag{2.24} $$
Since the truncated bits are m, $e_{trunc} \in [-2^{-n} + 2^{-n-m},\ 0]$.
$e_{trunc} = 0$ when all the discarded bits are equal to zero, while
$e_{trunc} = -2^{-n} + 2^{-n-m}$ when all the discarded bits are equal to 1,
since $-\sum_{i=n+1}^{n+m} 1 \cdot 2^{-i} = -(2^{-n} - 2^{-n-m}) = -(lsb - lsb_q)$.
The mean truncation error ($\mu_{trunc}$) can be easily computed by assuming
that each discarded bit $pt_i$ has a probability 1/2 of being one:
$$ \mu_{trunc} = - \sum_{i=n+1}^{n+m} \frac{1}{2} 2^{-i} = - \frac{lsb}{2} \left( 1 - 2^{-m} \right) \tag{2.25} $$
Similarly, assuming the bits $pt_i$ independent and identically distributed,
we can compute the variance $\sigma^2_{trunc}$ of the truncation error:
$$ \sigma^2_{trunc} = \sum_{i=n+1}^{n+m} \frac{1}{4} 2^{-2i} = \frac{lsb^2}{12} \left( 1 - 2^{-2m} \right) \tag{2.26} $$
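As a numerical cross-check of (2.25) and (2.26), the mean and variance of the truncation error can be computed exactly by enumerating all the patterns of the m discarded bits. This is only a sketch under the same assumption used in the text (independent, uniformly distributed discarded bits); the function name `trunc_error_stats` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def trunc_error_stats(n, m):
    """Exact mean and variance of e_trunc (eq. 2.24) for uniform i.i.d. bits."""
    values = []
    for bits in product((0, 1), repeat=m):
        # bits[k] plays the role of pt_{n+1+k}, with weight 2**-(n+1+k)
        values.append(-sum(Fraction(b, 2 ** (n + 1 + k))
                           for k, b in enumerate(bits)))
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


n, m = 8, 3
lsb = Fraction(1, 2 ** n)
mean, var = trunc_error_stats(n, m)
closed_mean = -(lsb / 2) * (1 - Fraction(1, 2 ** m))           # eq. (2.25)
closed_var = (lsb ** 2 / 12) * (1 - Fraction(1, 2 ** (2 * m)))  # eq. (2.26)
print(mean == closed_mean, var == closed_var)
```

Exact rational arithmetic is used so that the comparison with the closed forms does not depend on floating-point rounding.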
2.2.2 Statistical Properties of the Erasing Error (eerasing)
The mean value of the erasing error is given by:
$$ \mu_{erasing} = E \{ e_{erasing} \} = \sum_{A \in \Gamma} \left[ E_{IC=A} \{ e_{erasing} \} \right] P(A) \tag{2.27} $$
Since for each fixed IC = A, f(IC) is a constant value, using (2.19) the term
within the summation in (2.27) can be written as:
$$ E_{IC=A} \{ e_{erasing} \} = f(A) - E_{IC=A} \{ S_{LSPminor}(x, y) \} \tag{2.28} $$
By using the definition (2.11) and by substituting (2.28) in (2.27) we have:
$$ \mu_{erasing} = \sum_{A \in \Gamma} \left[ f(A) - \mu_{LSP}(A) \right] \cdot P(A) \tag{2.29} $$
Before computing the variance of the erasing error, let us first calculate the
mean square erasing error, given by:
$$ \varepsilon^2_{erasing} = E \{ e^2_{erasing} \} \tag{2.30} $$
As done in the case of $\mu_{erasing}$, we can compute
$\varepsilon^2_{erasing}$ by performing the averaging on $\Omega(A)$, as
follows:
$$ \varepsilon^2_{erasing} = \sum_{A \in \Gamma} E_{IC=A} \{ e^2_{erasing} \} \cdot P(A) \tag{2.31} $$
where:
$$ E_{IC=A} \{ e^2_{erasing} \} = \mathrm{VAR}_{IC=A} \{ e_{erasing} \} + \left[ E_{IC=A} \{ e_{erasing} \} \right]^2 \tag{2.32} $$
Using (2.19), $\mathrm{VAR}_{IC=A}\{e_{erasing}\} = \mathrm{VAR}_{IC=A}\{f(A) - S_{LSPminor}\}$,
but for each fixed IC = A, f(A) is a constant value; as a consequence
$\mathrm{VAR}_{IC=A}\{e_{erasing}\} = \sigma^2_{LSP}(A)$, see (2.12). By using
(2.28) one has:
$$ E_{IC=A} \{ e^2_{erasing} \} = \sigma^2_{LSP}(A) + \left( f(A) - \mu_{LSP}(A) \right)^2 \tag{2.33} $$
By substituting (2.33) in (2.31) one finally obtains:
$$ \varepsilon^2_{erasing} = \varepsilon^2_{intrinsic} + \varepsilon^2_{comp} \tag{2.34} $$
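where:
$$ \varepsilon^2_{intrinsic} = \sum_{A \in \Gamma} \sigma^2_{LSP}(A) \cdot P(A) \tag{2.35} $$
$$ \varepsilon^2_{comp} = \sum_{A \in \Gamma} \left( f(A) - \mu_{LSP}(A) \right)^2 \cdot P(A) \tag{2.36} $$
If we define the compensation error $e_{comp} = f(A) - \mu_{LSP}(A)$,
$\varepsilon^2_{comp}$ is its mean square value. Furthermore the mean value
and the variance of $e_{comp}$ are given by:
$$ \mu_{comp} = E \{ f(A) - \mu_{LSP}(A) \} = \mu_{erasing} \tag{2.37} $$
$$ \sigma^2_{comp} = \sum_{A \in \Gamma} \left( f(A) - \mu_{LSP}(A) - \mu_{comp} \right)^2 \cdot P(A) = \sum_{A \in \Gamma} \left( f(A) - \mu_{LSP}(A) - \mu_{erasing} \right)^2 \cdot P(A) \tag{2.38} $$
As $\sigma^2_{erasing} = \varepsilon^2_{erasing} - \mu^2_{erasing}$, using
(2.34), (2.37), the variance of the erasing error ($\sigma^2_{erasing}$) is
given by:
$$ \sigma^2_{erasing} = \varepsilon^2_{intrinsic} + \sigma^2_{comp} \tag{2.39} $$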
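The decomposition (2.34) can be illustrated numerically on a small multiplier. The sketch below is an illustration only: it enumerates all input couples, groups $S_{LSPminor}$ by IC pattern and evaluates (2.31), (2.35) and (2.36) exactly. The compensation function `f_example`, which simply adds the IC elements with the weight of the IC column, is a hypothetical choice made for the example, and all helper names are assumptions.

```python
from collections import defaultdict
from fractions import Fraction


def bit(v, i, n):
    """Bit v_i of an n-bit integer v; i = 1 is the MSB (weight 2**-1)."""
    return (v >> (n - i)) & 1


def ic_vector(x, y, n, h):
    """IC elements x_{h+i} AND y_{n+1-i}, for i = 1..n-h (eq. 2.7)."""
    return tuple(bit(x, h + i, n) & bit(y, n + 1 - i, n)
                 for i in range(1, n - h + 1))


def s_lsp_minor(x, y, n, h):
    """Weighted sum of LSPminor, IC column included (eq. 2.10)."""
    return sum(Fraction(bit(x, i, n) & bit(y, j, n), 2 ** (i + j))
               for i in range(h + 1, n + 1)
               for j in range(n + h + 1 - i, n + 1))


def erasing_decomposition(n, h, f):
    """Return (eps2_erasing, eps2_intrinsic, eps2_comp) for a function f(A)."""
    groups = defaultdict(list)
    for x in range(2 ** n):
        for y in range(2 ** n):
            groups[ic_vector(x, y, n, h)].append(s_lsp_minor(x, y, n, h))
    total = Fraction(4 ** n)
    eps_erasing = eps_intrinsic = eps_comp = Fraction(0)
    for A, values in groups.items():
        p = len(values) / total                      # P(IC = A)
        mu = sum(values) / len(values)               # mu_LSP(A), eq. (2.11)
        var = sum((v - mu) ** 2 for v in values) / len(values)  # eq. (2.12)
        eps_intrinsic += var * p                     # eq. (2.35)
        eps_comp += (f(A) - mu) ** 2 * p             # eq. (2.36)
        eps_erasing += sum((f(A) - v) ** 2 for v in values) / total  # (2.31)
    return eps_erasing, eps_intrinsic, eps_comp


n, h = 4, 1


def f_example(A):
    # Hypothetical compensation: IC elements summed with weight 2**-(n+h+1)
    return sum(A) * Fraction(1, 2 ** (n + h + 1))


eps_erasing, eps_intrinsic, eps_comp = erasing_decomposition(n, h, f_example)
print(eps_erasing == eps_intrinsic + eps_comp)
```

Whatever function is passed as `f`, only `eps_comp` changes; `eps_intrinsic` depends on n and h alone.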
2.2.3 Optimal Compensation Function and Error Lower Bound
From (2.20), (2.21), (2.23), (2.39) one has:
$$ \varepsilon^2_{total} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} + \sigma^2_{comp} + \left( \mu_{erasing} + \mu_{trunc} + K_{round} \right)^2 \tag{2.40} $$
This equation highlights that the rounding constant $K_{round}$ should be
chosen in order to minimize $|\mu_{total}|$. In the ideal case in which
$K_{round}$ is represented on an infinite number of bits (i.e., $m \to \infty$
in the scheme of Fig. 2.1, so that f(IC) and $K_{round}$ are real numbers) we
can choose:
$$ K_{round} = K_{round,ideal} = - \mu_{erasing} - \mu_{trunc} \tag{2.41} $$
and we can compensate the mean values of $e_{erasing}$ and $e_{trunc}$
exactly. In practice, f(IC) and $K_{round}$ will be expressed on a finite
number of bits (see Fig. 2.1) and the mean value of $e_{total}$ cannot be
exactly zero. We will discuss this aspect in Sec. 2.5, together with the
effect of quantization. For now let us observe that in the ideal case in which
the compensation function f(IC) (and $K_{round}$) returns a real number,
$\mu_{total}$ can be nullified by choosing $K_{round}$ following (2.41), and
we have:
$$ \varepsilon^2_{total} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} + \sigma^2_{comp} \tag{2.42} $$
This equation shows that the mean square error of the truncated multiplier can
be reduced to the sum of three non-negative terms.
The first term, $\sigma^2_{trunc}$, is the variance of the truncation error.
This error component arises from using only n output bits from the multiplier.
It is (almost) independent of the technique used to realize the truncated
multiplier and is also present in the full-rounded multiplier (see Fig. 1.11).
In the ideal case in which f(IC) is real ($m \to \infty$), from (2.26),
$\sigma^2_{trunc}$ is given by $lsb^2/12 \simeq 0.083 \cdot lsb^2$.
The second term, $\varepsilon^2_{intrinsic}$, does not depend on the
compensation function. This error component arises since we use only the
elements in IC to estimate $S_{LSPminor}$.
The third term, $\sigma^2_{comp}$, depends on the compensation function and
can be minimized with a suitable selection of f(IC). It is worth noting that
$\sigma^2_{comp}$ does not change if we add a constant to the function f(IC).
As a consequence, there are infinitely many optimal compensation functions,
differing from one another by a constant value. However, it follows from the
above discussion that there is only one optimal value for the sum
f(IC) + $K_{round}$.
From (2.38) and (2.29) it can be seen that an optimal choice for the
compensation function is:
$$ f_{opt}(IC) = \mu_{LSP}(IC) \tag{2.43} $$
In fact, when (2.43) holds, all the terms summed in (2.38) are zero and
hence $\sigma^2_{comp} = 0$. Thus, an optimal compensation function is the one
that returns the mean value of $S_{LSPminor}$ for every value of the IC. The
compensation function in (2.43) also makes the mean erasing error
($\mu_{erasing}$) equal to zero. From (2.41), the corresponding optimal
rounding constant is $K_{round,ideal} = -\mu_{trunc}$. The mean square error
obtained when the optimal compensation function and the optimal rounding
constant are employed is a lower error bound for any variable-correction
truncated multiplier and is given by:
$$ \varepsilon^2_{low\,bound} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} \tag{2.44} $$
where, as previously discussed, $\sigma^2_{trunc} \simeq lsb^2/12$.
2.3 Optimal Compensation Function
The generic element $x_i y_j$ of LSPminor is correlated to two elements of IC:
$\eta_{i-h} = x_i y_{n+1+h-i}$ and $\eta_{n+1-j} = x_{n+1+h-j} y_j$. For
example, in Fig. 2.1 $x_6 y_7$ is correlated to $\eta_4 = x_6 y_5$ and
$\eta_2 = x_4 y_7$. Under the hypothesis of independent and identically
distributed input bits, since $x_i$ and $y_j$ are correlated to different
elements of the IC, the mean value of $x_i y_j$ is given by:
$$ E_{IC=A} \{ x_i y_j \} = E_{IC=A} \{ x_i \} \cdot E_{IC=A} \{ y_j \} \tag{2.45} $$
The generic $\eta_{i-h} = x_i y_{n+1+h-i}$ is given by an AND operation, so if
$\eta_{i-h} = 1$, $x_i$ is 1 with probability 1; otherwise $x_i$ is 1 with
probability 1/3 (one of the three remaining cases):
$$ E_{IC=A} \{ x_i \} = \frac{1}{3} \left( 1 + 2 \eta_{i-h} \right) \tag{2.46} $$
Analogous reasoning holds for the other factor of the partial product. Hence
the mean value of a generic element of LSPminor can be expressed as:
$$ \mu_{i,j}(IC) \triangleq E_{IC=A} \{ x_i y_j \} = \frac{1}{9} \left( 1 + 2 \eta_{i-h} \right) \left( 1 + 2 \eta_{n+1-j} \right), \qquad \mu_{i,j}(IC) \in \left\{ \frac{1}{9}, \frac{1}{3}, 1 \right\} \tag{2.47} $$
In order to express $\mu_{LSP}$ in explicit form, let us start by computing
the mean value of LSPminor when all the elements of the IC are zero. In this
case, using (2.47), the mean value of each partial product of LSPminor is 1/9.
This can also be seen by considering all the possible values of the input
bits, constrained by the condition IC = [0, 0, 0, 0, 0, 0]. For example,
Fig. 2.2(a) shows the computation of the mean value of the partial product
$x_6 y_7$, which turns out to be 1/9. Therefore, if we replace each partial
product with its mean (Fig. 2.2(b)), with simple algebra one obtains:
$$ \mu_{LSP}(IC = 0) = lsb \cdot 2^{-h-1} \cdot K \tag{2.48} $$
where
$$ K = \sum_{i=h+2}^{n} \ \sum_{j=n+h+2-i}^{n} \frac{1}{9}\, 2^{\,n+h+1-i-j} = \frac{2}{9} \left( \frac{n_{eq}}{2} + 2^{-n_{eq}} - 1 \right) \tag{2.49} $$
[Figure 2.2: a) table listing the nine combinations of x4, y7, x6, y5 allowed
by x4y7 = 0 and x6y5 = 0, together with the resulting value of x6y7 (equal to
1 in one case out of nine); b) LSPminor matrix in which the IC elements are 0
and every other partial product is replaced by its mean value 1/9.]
Figure 2.2: Computation of $\mu_{LSP}$ for an n = 8 unsigned multiplier with
IC = 0. a) Computation of the mean value of x6y7. b) Mean value of
the elements of LSPminor.
The next step is to consider the case in which only one element of the IC is
equal to 1. When a generic $\eta_i = x_{h+i} y_{n+1-i}$ is equal to 1,
$x_{h+i}$ and $y_{n+1-i}$ are both equal to 1. This changes the probability of
being 1 of the elements of LSPminor correlated to them, that is the elements
on the same diagonal, $x_{h+i} y_j$ ($j = n+2-i, \ldots, n$), and on the same
row, $x_j y_{n+1-i}$ ($j = h+i+1, \ldots, n$) (shown in grey in Fig. 2.3(a)).
Using (2.47) the mean value of these elements is 1/3. Thus for each $\eta_i$
equal to 1, the value of $\mu_{LSP}$ can be obtained by adding to (2.48) a
correction factor equal to:
$$ 2^{-n-h-1} f_i = 2^{-n-h-1} \left( 1 + \sum_{j=n+2-i}^{n} \frac{2}{9}\, 2^{\,n+1-i-j} + \sum_{j=h+i+1}^{n} \frac{2}{9}\, 2^{\,h+i-j} \right) \tag{2.50} $$
in which the first term accounts for the increment in the IC column due to
$\eta_i = 1$, the second term for the increment to (2.48) due to the change in
the elements on the same diagonal, and the third for the change due to the
elements on the same row. Considering all the possible cases in which only one
element of the IC ($\eta_i$, $i = 1, \ldots, n_{eq}$) is equal to 1 yields:
$$ \mu_{LSP} \big|_{\text{only one } \eta_i = 1} = \mu_{LSP}(IC = 0) + lsb \cdot 2^{-h-1} \sum_{i=1}^{n_{eq}} f_i \eta_i \tag{2.51} $$
where:
$$ f_i = \frac{13}{9} - \frac{2}{9} \left( 2^{-i+1} + 2^{\,i-n_{eq}} \right) \tag{2.52} $$
[Figure 2.3: LSPminor matrices in which every partial product is replaced by
its conditional mean (1/9, 1/3 or 1) for the two IC configurations described
in the caption; in panel b) the element x6y7, whose mean is 1, is circled.]
Figure 2.3: Mean values of the elements of the LSP when IC is non-zero for an
8 × 8 multiplier (h = 2). a) Only one element of the IC is
equal to 1. b) Two or more elements of the IC are equal to 1.
The above reasoning can be iterated to compute the value of $\mu_{LSP}$ when
two or more elements of the IC are equal to 1. In Fig. 2.3(b) two elements of
the IC, $x_4 y_7$ and $x_6 y_5$, are equal to 1. In this case the value of
$\mu_{LSP}$ differs from (2.51) because of the mean value of the partial
product $x_6 y_7$ (circled in Fig. 2.3(b)). In fact, since
$x_4 = y_7 = x_6 = y_5 = 1$, the mean value of $x_6 y_7$ is 1, whereas in
(2.51) it is computed as $\frac{1}{9} + \frac{2}{9} + \frac{2}{9}$. Hence the
mean value of $x_6 y_7$ should be increased by $\frac{4}{9}$ with respect to
(2.51). In the general case, when $\eta_i = x_{h+i} y_{n+1-i}$ and
$\eta_j = x_{h+j} y_{n+1-j}$ are both equal to 1, for the common element
$x_{h+j} y_{n+1-i}$ ($i < j$) an increment must be considered, equal to:
$$ f_{i,j} = \frac{4}{9}\, 2^{-|j-i|} \tag{2.53} $$
Finally $f_{opt}(IC) = \mu_{LSP}(IC)$ can be computed as:
$$ f_{opt}(IC) = lsb \cdot 2^{-h-1} \left[ K + \sum_{i=1}^{n_{eq}} f_i \eta_i + \sum_{i=1}^{n_{eq}} \ \sum_{j=i+1}^{n_{eq}} f_{i,j}\, \eta_i \eta_j \right] \tag{2.54} $$
where K, $f_i$ and $f_{i,j}$ are defined in (2.49), (2.52) and (2.53).
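The closed form (2.54) can be compared with a direct computation of the conditional mean of $S_{LSPminor}$. The following sketch does so for a small multiplier, using exact rational arithmetic. It is meant only as a sanity check of the formulas as written in this section; the function names are hypothetical and bit 1 is taken as the most significant bit, as in (2.1).

```python
from collections import defaultdict
from fractions import Fraction


def bit(v, i, n):
    """Bit v_i of an n-bit integer v; i = 1 is the MSB (weight 2**-1)."""
    return (v >> (n - i)) & 1


def ic_vector(x, y, n, h):
    """IC elements x_{h+i} AND y_{n+1-i}, for i = 1..n-h (eq. 2.7)."""
    return tuple(bit(x, h + i, n) & bit(y, n + 1 - i, n)
                 for i in range(1, n - h + 1))


def s_lsp_minor(x, y, n, h):
    """Weighted sum of LSPminor, IC column included (eq. 2.10)."""
    return sum(Fraction(bit(x, i, n) & bit(y, j, n), 2 ** (i + j))
               for i in range(h + 1, n + 1)
               for j in range(n + h + 1 - i, n + 1))


def f_opt(A, n, h):
    """Closed form (2.54) with K, f_i, f_{i,j} from (2.49), (2.52), (2.53)."""
    neq = n - h
    two = Fraction(2)
    total = Fraction(2, 9) * (Fraction(neq, 2) + two ** (-neq) - 1)  # K
    for i in range(1, neq + 1):
        f_i = Fraction(13, 9) - Fraction(2, 9) * (two ** (1 - i) + two ** (i - neq))
        total += f_i * A[i - 1]
        for j in range(i + 1, neq + 1):
            total += Fraction(4, 9) * two ** (-(j - i)) * A[i - 1] * A[j - 1]
    return Fraction(1, 2 ** (n + h + 1)) * total  # lsb * 2**(-h-1)


def conditional_means(n, h):
    """mu_LSP(A) of (2.11) for every IC pattern A, by enumeration."""
    groups = defaultdict(list)
    for x in range(2 ** n):
        for y in range(2 ** n):
            groups[ic_vector(x, y, n, h)].append(s_lsp_minor(x, y, n, h))
    return {A: sum(v) / len(v) for A, v in groups.items()}


n, h = 5, 1
means = conditional_means(n, h)
print(all(means[A] == f_opt(A, n, h) for A in means))
```

The enumeration cost grows as $2^{2n}$, so this kind of check is practical only for small n, whereas (2.54) can be evaluated for any word length.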
Discussion
The optimal compensation function in (2.54) is a quadratic form in the
variables $\eta_i$. The value of (2.54) decreases exponentially with h, as
expected, since a larger h means a lower weight of the terms in LSPminor.
[Figure 2.4: plot of the constant K, the linear coefficients $f_i$ and the
quadratic coefficients $f_{i,j}$ with one index fixed at 5, versus the
coefficient index, for n = 12 and h = 1; values between 0 and 2 in units of
lsbIC.]
Figure 2.4: Coefficients of the optimal error compensation function
$f_{opt}(IC)$ for n = 12, h = 1.
Fig. 2.4 shows the behavior of the coefficients K, $f_i$ and $f_{i,j}$ in
(2.54) for a truncated multiplier with n = 12 and h = 1. The linear terms
$f_i$ exhibit a maximum for $i = \frac{1}{2}(1 + n_{eq})$ and the maximum
$f_i$ value is
$f_{i,max} = \frac{13}{9} - \frac{4}{9} \cdot 2^{\frac{1}{2}(1 - n_{eq})}$.
The value of $f_{i,max}$ tends to $\frac{13}{9}$ as $n_{eq}$ increases.
As shown in Fig. 2.4, the maximum in the distribution of $f_i$ is not very
pronounced. The minimum of the $f_i$ is reached for both i = 1 and
$i = n_{eq}$, is given by
$f_{i,min} = \frac{11}{9} - \frac{4}{9}\, 2^{-n_{eq}}$ and tends to
$\frac{11}{9}$ as $n_{eq}$ increases.
The quadratic terms $f_{i,j}$ decrease exponentially with $|j - i|$. The
maximum of the quadratic coefficients, attained for $|j - i| = 1$, is 2/9.
Thus, the quadratic coefficients are significantly smaller than the linear
terms.
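For the case of Fig. 2.4 the coefficients can be tabulated directly from (2.49), (2.52) and (2.53). The short sketch below is only an illustration (the function name `optimal_coefficients` is hypothetical) and works in units of $lsb \cdot 2^{-h-1}$.

```python
from fractions import Fraction


def optimal_coefficients(neq):
    """K (2.49), f_i (2.52) and f_{i,j} (2.53), in units of lsb * 2**(-h-1)."""
    two = Fraction(2)
    K = Fraction(2, 9) * (Fraction(neq, 2) + two ** (-neq) - 1)
    f = {i: Fraction(13, 9) - Fraction(2, 9) * (two ** (1 - i) + two ** (i - neq))
         for i in range(1, neq + 1)}
    f2 = {(i, j): Fraction(4, 9) * two ** (-(j - i))
          for i in range(1, neq + 1) for j in range(i + 1, neq + 1)}
    return K, f, f2


K, f, f2 = optimal_coefficients(12 - 1)  # n = 12, h = 1, so neq = 11
print(float(K))                 # constant term
print(float(min(f.values())))   # smallest linear coefficient (i = 1, i = neq)
print(float(max(f.values())))   # largest linear coefficient (i = 6)
print(float(max(f2.values())))  # largest quadratic coefficient (|j - i| = 1)
```

With these values the linear coefficients span a narrow range between roughly 1.22 and 1.43, while no quadratic coefficient exceeds 2/9.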
The constant term K in (2.49) increases almost linearly with $n_{eq}$. In
practice, K is of the same order of magnitude as $f_i$ (in the example of
Fig. 2.4 K is actually smaller than $f_{i,min}$). Therefore, the optimal
compensation function (2.54) is far from being constant. This explains the
rather large approximation error of the constant-correction techniques
[8, 9, 11].
[Figure 2.5: plot of $f_{opt}(IC)$ and $S_{LSPminor}$ versus the IC value
(horizontal axis from 0 to 62), in units of lsbIC.]
Figure 2.5: Comparison between the optimal error compensation
function $f_{opt}(IC)$, red dashed line, and $S_{LSPminor}$, green full line,
for all the possible input values (n = 6, h = 0), collected according to
the IC value.
Fig. 2.5 compares the value of $f_{opt}(IC)$ with $S_{LSPminor}$ for all the
possible input values (n = 6, h = 0), collected according to the IC value. It
is clear that for each IC value $f_{opt}(IC)$ is the mean of all the possible
values assumed by $S_{LSPminor}$.
2.3.1 The Intrinsic Error
The intrinsic error is given by (2.35). This equation shows that the
intrinsic error is a weighted sum of the variances $\sigma^2_{LSP}(A)$ of
$S_{LSPminor}$ in the sets $\Omega(A)$. From (2.10), in turn, $S_{LSPminor}$
is a weighted sum of partial products. For each fixed IC = A the sum of the
elements of the IC is a constant value, so it does not contribute to the
variance of $S_{LSPminor}$. Hence, $\sigma^2_{LSP}(A)$ can be computed as the
sum of the variances $\sigma^2_{i,j}(A)$ of each partial product (multiplied
by the square of the partial product weight) plus twice the covariance
$\mathrm{COV}_{i,j,l,m}(A)$ between every couple of different partial products
(multiplied by the product of their weights), each couple taken once. Since
the variables are binary, $(x_i y_j)^2 = x_i y_j$ and the variances are
computed as:
$$ \sigma^2_{i,j}(A) = E_{IC=A} \left\{ (x_i y_j)^2 \right\} - \mu^2_{i,j}(A) = E_{IC=A} \{ x_i y_j \} - \mu^2_{i,j}(A) = \mu_{i,j}(A) - \mu^2_{i,j}(A) \tag{2.55} $$
where $\mu_{i,j}$ is given by (2.47).
The covariance terms are given by:
$$ \mathrm{COV}_{i,j,l,m}(A) = E_{IC=A} \{ x_i y_j \cdot x_l y_m \} - \mu_{i,j}(A)\, \mu_{l,m}(A) \tag{2.56} $$
App. A describes how to compute the mean of the product $x_i y_j \cdot x_l y_m$
in every set $\Omega(A)$ and gives the complete calculation of the explicit
form of the intrinsic error (2.35). The final expression is:
$$ \varepsilon^2_{intrinsic} = lsb^2 \cdot 2^{-2h} \left[ \frac{1}{24}\, 2^{-n_{eq}} - \frac{7}{324}\, 2^{-2 n_{eq}} + \left( \frac{13}{864} - \frac{1}{48}\, 2^{-n_{eq}} \right) n_{eq} - \frac{13}{648} \right] \tag{2.57} $$
The above exact equation can be simplified by neglecting the small terms
depending on $2^{-n_{eq}}$:
$$ \varepsilon^2_{intrinsic} \simeq lsb^2 \cdot 2^{-2h} \cdot \frac{13}{216} \left[ \frac{1}{4}\, n_{eq} - \frac{1}{3} \right] \tag{2.58} $$
The relative error of (2.58) with respect to (2.57) is less than 4% for
$n_{eq} = 5$ and rapidly vanishes for larger $n_{eq}$ values, as can be seen
in Fig. 2.6. In the figure the red full line is the exact expression of the
intrinsic error (2.57) and the dotted black line is its approximation (2.58).
The two lines are almost indistinguishable; a small difference is visible
only for small n values.
In the specific case in which h = 0 the intrinsic error is given by:
$$ \varepsilon^2_{intrinsic} \simeq lsb^2 \cdot \left[ 0.015 \cdot n - 0.02 \right] \tag{2.59} $$
Hence the intrinsic error increases linearly with n when h = 0.
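For very small multipliers the intrinsic error can also be obtained by brute force, directly from the definition (2.35), and compared with the closed form (2.57) and its approximation (2.58). The sketch below does this with exact fractions. The function names are hypothetical, and the enumeration is an independent check rather than the derivation of App. A.

```python
from collections import defaultdict
from fractions import Fraction


def bit(v, i, n):
    """Bit v_i of an n-bit integer v; i = 1 is the MSB (weight 2**-1)."""
    return (v >> (n - i)) & 1


def intrinsic_closed_form(n, h):
    """Eq. (2.57), in units of lsb**2."""
    neq = n - h
    q = Fraction(1, 2 ** neq)
    bracket = (q / 24 - Fraction(7, 324) * q ** 2
               + (Fraction(13, 864) - q / 48) * neq - Fraction(13, 648))
    return Fraction(1, 4 ** h) * bracket


def intrinsic_approx(n, h):
    """Eq. (2.58), in units of lsb**2."""
    neq = n - h
    return Fraction(1, 4 ** h) * Fraction(13, 216) * (Fraction(neq, 4) - Fraction(1, 3))


def intrinsic_brute_force(n, h):
    """Eq. (2.35) by exhaustive enumeration, in units of lsb**2."""
    groups = defaultdict(list)
    for x in range(2 ** n):
        for y in range(2 ** n):
            ic = tuple(bit(x, h + i, n) & bit(y, n + 1 - i, n)
                       for i in range(1, n - h + 1))
            s = sum(Fraction(bit(x, i, n) & bit(y, j, n), 2 ** (i + j))
                    for i in range(h + 1, n + 1)
                    for j in range(n + h + 1 - i, n + 1))
            groups[ic].append(s)
    eps2 = Fraction(0)
    for values in groups.values():
        mu = sum(values) / len(values)
        # squared deviations summed over the group, divided by 4**n,
        # equal sigma^2_LSP(A) * P(A)
        eps2 += sum((v - mu) ** 2 for v in values) / 4 ** n
    return eps2 * 4 ** n  # lsb**2 = 4**-n


for n, h in [(3, 0), (4, 1)]:
    print(n, h, intrinsic_brute_force(n, h) == intrinsic_closed_form(n, h))

rel_err = abs(intrinsic_approx(5, 0) / intrinsic_closed_form(5, 0) - 1)
print(float(rel_err))  # relative error of (2.58) for neq = 5
```

The brute-force routine visits $2^{2n}$ input couples, which is why the closed form is the only practical option for realistic word lengths.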
[Figure 2.6: plot of $\varepsilon^2_{intrinsic} / lsb^2$ versus n (from 8 to
64) for h = 0, 1, 2, showing the exact and the approximate curves, with a
zoomed inset.]
Figure 2.6: Comparison between the exact value of $\varepsilon^2_{intrinsic}$
(red full line) and the approximate value of $\varepsilon^2_{intrinsic}$
(black dashed line).
In general, for each given h value the intrinsic error increases linearly with
n (with $n_{eq} = n - h$), while for each given n value the intrinsic error
decreases exponentially with h, going to zero quickly (see Fig. 2.6).
The total mean square error introduced by the optimal compensation
function, $\varepsilon^2_{low\,bound}$, can be obtained in closed form using
(2.57) in (2.44). Its value, varying n and h, is shown in Fig. 2.7 (red full
lines) compared to the error of the full-rounded multiplier (green dotted
line). The figure shows that the lower bound for any truncated multiplier is
far from $\varepsilon^2_{round} = 0.083 \cdot lsb^2$, especially for high
values of n and low values of h. This is an important point. Until now it has
been thought that, using a suitable compensation function, the lowest error
that could be obtained was $\varepsilon^2_{round}$. This is impossible, since
it has been demonstrated that the lowest achievable error is
$\varepsilon^2_{low\,bound} > \varepsilon^2_{round}$, due to the presence of
$\varepsilon^2_{intrinsic}$, which is independent of f(IC). For example, if a
designer wants to implement a 16 bit truncated multiplier with a mean square
error lower than $0.2 \cdot lsb^2$, the only option is to look for a
compensation function of a truncated multiplier with at least h = 1.
[Figure 2.7: plot of $\varepsilon^2_{low\,bound} / lsb^2$ versus n (from 8 to
64) for h = 0, 1, 2, together with the constant level $\varepsilon^2_{round}$.]
Figure 2.7: $\varepsilon^2_{low\,bound}$ values varying n and h, compared with
$\varepsilon^2_{round}$.
Fig. 2.7 shows that the error increases almost linearly with n, following
the slope of $\varepsilon^2_{intrinsic}$. When h = 0 the overall error is much
larger than $\varepsilon^2_{round} = 0.083 \cdot lsb^2$. On the other hand,
since from (2.58) the intrinsic error decreases exponentially with h, Fig. 2.7
shows that the overall error approaches $\varepsilon^2_{round}$ as h
increases.
2.4 Linear compensation function.
The optimal compensation function obtained in Sec. 2.3, while being optimal
in terms of mean square error, can hardly be implemented in hardware.
Equation (2.54) requires in fact the sum of a large number of terms, in the
order of $n_{eq}^2$. Furthermore the coefficients K, $f_i$ and $f_{i,j}$ in
(2.54) should be quantized with a few bits of precision, to keep the area and
power advantages of truncated multipliers.
As observed before, the quadratic coefficients $f_{i,j}$ are significantly
smaller than the linear terms. This suggests that a good approximation of the
best possible error compensation function can be obtained as a simple linear
combination of the elements of the IC:
$$ f_{lin}(IC) = lsb \cdot 2^{-h-1} \left[ K_l + \sum_{i=1}^{n_{eq}} l_i \eta_i \right] \tag{2.60} $$
The use of the sub-optimal $f_{lin}(IC)$ in place of $f_{opt}(IC)$ will
increase the mean square error. On the other hand the implementation of
$f_{lin}(IC)$ requires only $n_{eq}$ sums, thus reducing the hardware
complexity of the multiplier. Furthermore, as we will show in the following,
the coefficients in (2.60) can be represented on just one or two bits with a
limited error increase.
With the linear function, the total mean square error increases from
$\varepsilon^2_{low\,bound} = \sigma^2_{trunc} + \varepsilon^2_{intrinsic}$ to
$\varepsilon^2_{total} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} + \sigma^2_{comp}$.
The component $\sigma^2_{comp} \neq 0$ arises from using a linear
approximation of $f_{opt}(IC)$ (the compensation error
$e_{comp} = f(IC) - \mu_{LSP} = f(IC) - f_{opt}(IC)$ is no longer equal to
zero). The optimal values of the coefficients $K_l$, $l_i$ in (2.60) can
be obtained by minimizing this new component,
$\sigma^2_{lin} = \sigma^2_{comp} \big|_{f(IC) = f_{lin}(IC)}$ (see (2.38)),
given by:
$$ \sigma^2_{lin} = \sum_{A \in \Gamma} \left( f_{lin}(A) - \mu_{LSP}(A) - \mu_{lin} \right)^2 P(A) \tag{2.61} $$
where $f_{lin}(IC)$ is given by (2.60), $\mu_{LSP}(A) = f_{opt}(A)$ by (2.54)
and $\mu_{lin}$ is given by (2.29) with $f(IC) = f_{lin}(IC)$. Since $K_l$ is
present both in $f_{lin}(IC)$ and in $\mu_{lin}$, $\sigma^2_{lin}$ does not
depend on the constant $K_l$ (see also App. B). Therefore the optimal $l_i$
coefficients are obtained by solving the following system:
$$ \frac{\partial \sigma^2_{lin}}{\partial l_m} = 0, \qquad m = 1, \ldots, n_{eq} \tag{2.62} $$
This is a linear system of $n_{eq}$ equations in the $n_{eq}$ unknowns $l_i$,
whose solution is detailed in App. B. The result is:
$$ l_i = \frac{5}{3} - \frac{1}{3} \left( 2^{-i+1} + 2^{\,i-n_{eq}} \right) \tag{2.63} $$
The value of $K_l$ is chosen by imposing $\mu_{lin} = 0$, that is:
$$ \mu_{lin} = \sum_{A \in \Gamma} \left( f_{lin}(A) - \mu_{LSP}(A) \right) \cdot P(A) = 0 \tag{2.64} $$
Note that by imposing $\mu_{lin} = 0$, from (2.61) and (2.36), we minimize not
only $\sigma^2_{comp}$ but also $\varepsilon^2_{comp}$. The solution of (2.64)
is again detailed in App. B. The result is:
$$ K_l = \frac{1}{6} \left( \frac{n_{eq}}{2} + 2^{-n_{eq}} - 1 \right) \tag{2.65} $$
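The coefficients (2.63) and (2.65) can be cross-checked numerically. The sketch below is an independent check, not the derivation of App. B. It relies on the fact that the IC elements are independent and equal to 1 with probability 1/4, so that the least-squares linear fit of a product of two distinct IC elements is one quarter of their sum minus 1/16. Under this assumption each $l_i$ should equal $f_i$ plus one quarter of the quadratic coefficients involving index i, and $K_l$ should equal K minus 1/16 of the sum of all quadratic coefficients. The function names are hypothetical and all values are in units of $lsb \cdot 2^{-h-1}$.

```python
from fractions import Fraction
from itertools import product

TWO = Fraction(2)


def optimal_coefficients(neq):
    """K (2.49), f_i (2.52) as a 0-based list, and f_{i,j} (2.53) as a function."""
    K = Fraction(2, 9) * (Fraction(neq, 2) + TWO ** (-neq) - 1)
    f = [Fraction(13, 9) - Fraction(2, 9) * (TWO ** (1 - i) + TWO ** (i - neq))
         for i in range(1, neq + 1)]

    def fij(i, j):
        return Fraction(4, 9) * TWO ** (-abs(j - i))

    return K, f, fij


def linear_coefficients(neq):
    """K_l (2.65) and l_i (2.63) as a 0-based list."""
    Kl = Fraction(1, 6) * (Fraction(neq, 2) + TWO ** (-neq) - 1)
    l = [Fraction(5, 3) - Fraction(1, 3) * (TWO ** (1 - i) + TWO ** (i - neq))
         for i in range(1, neq + 1)]
    return Kl, l


def mean_gap(neq):
    """mu_lin of (2.64), by enumerating all IC patterns with P(A) = 3**nz / 4**neq."""
    K, f, fij = optimal_coefficients(neq)
    Kl, l = linear_coefficients(neq)
    gap = Fraction(0)
    for A in product((0, 1), repeat=neq):
        p = Fraction(3 ** A.count(0), 4 ** neq)
        opt = K + sum(f[i] * A[i] for i in range(neq)) + sum(
            fij(i, j) * A[i] * A[j]
            for i in range(neq) for j in range(i + 1, neq))
        lin = Kl + sum(l[i] * A[i] for i in range(neq))
        gap += (lin - opt) * p
    return gap


neq = 6
K, f, fij = optimal_coefficients(neq)
Kl, l = linear_coefficients(neq)
projected_l = [f[i] + sum(fij(i, j) for j in range(neq) if j != i) / 4
               for i in range(neq)]
projected_Kl = K - sum(fij(i, j) for i in range(neq)
                       for j in range(i + 1, neq)) / 16
print(projected_l == l, projected_Kl == Kl, mean_gap(neq) == 0)
```

The same projection argument gives an intuition for the sign of the changes: every linear coefficient grows, while the constant term shrinks.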
Fig. 2.8 shows the behavior of the coefficients $K_l$ and $l_i$ for a truncated multiplier with n = 12 and h = 1. Also in this case, as for Fig. 2.8, the distribution of the linear terms $l_i$ is symmetric and exhibits a maximum for $i = (1 + n_{eq})/2$.
Figure 2.8: Comparison between the coefficients of the linear compensation function $f_{lin}(IC)$, black line with asterisks, and of the optimum compensation function $f_{opt}(IC)$, red line with circles, for n = 12, h = 1.
The maximum $l_i$ value is $l_{i,max} = \frac{5}{3} - \frac{2}{3}\, 2^{\frac{1}{2}(1-n_{eq})}$ and tends to 5/3 as $n_{eq}$ increases. The minimum of the $l_i$, reached for both $i = 1$ and $i = n_{eq}$, is $l_{i,min} = \frac{4}{3} - \frac{2}{3}\, 2^{-n_{eq}}$ and tends to 4/3 as $n_{eq}$ increases. The figure also shows the coefficients and the constant of the optimum compensation function $f_{opt}(IC)$ for comparison. Note that $l_i > f_i$, since the $l_i$ have to compensate for the $f_{ij}$ terms dropped in the linear function. The value of the constant $K_l$ is lower than $K$, but it is worth noticing that a suitable rounding constant still has to be added.
Fig. 2.9 shows the comparison between the value of fopt(IC) and the
flin(IC), for all the possible input values (n = 6, h = 0), collected according
to the IC value. As it is shown in the figure flin(IC) is a good approximation
of fopt(IC).
Figure 2.9: Comparison between the optimal error compensation function and the linear compensation function, for all the possible input values (n = 6, h = 0), collected according to the IC value. Red full line: $f_{opt}(IC)$. Green full line: $f_{lin}(IC)$.

The value of $\sigma^2_{lin}$ obtained with the coefficients (2.63) can be calculated by substituting (2.60) and (2.63) in (2.61). After some tedious algebraic manipulation, one has:
$$ \sigma^2_{lin} = lsb^2 \cdot \frac{1}{1296} \cdot 2^{-2h} \left( \frac{3}{4}\, n_{eq} + 2^{-2 n_{eq}} - 1 \right) \tag{2.66} $$
The above exact equation can be simplified by neglecting the small term in $2^{-2 n_{eq}}$:
$$ \sigma^2_{lin} \simeq lsb^2 \cdot \frac{3}{1296} \cdot 2^{-2h} \left( \frac{1}{4}\, n_{eq} - \frac{1}{3} \right) \tag{2.67} $$
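As a sanity check of (2.66), $\sigma^2_{lin}$ can also be evaluated by exhaustive enumeration for a very small multiplier. The sketch below assumes independent uniform input bits, $x_1$ and $y_1$ as MSBs, IC elements $x_{h+i}\,y_{n+1-i}$ with $n_{eq} = n - h$, and $lsb_{IC} = lsb \cdot 2^{-h-1}$. It computes $f_{opt}(IC)$ as the conditional mean of LSPminor given the IC and then the variance of $f_{lin}(IC) - f_{opt}(IC)$. Function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def sigma2_lin_closed_form(n, h):
    """Exact sigma_lin^2 from (2.66), in units of lsb^2."""
    neq = n - h  # assumption: the IC has n - h elements
    two = Fraction(2)
    return two ** (-2 * h) * (Fraction(3, 4) * neq + two ** (-2 * neq) - 1) / 1296


def sigma2_lin_enumerated(n, h):
    """Variance of f_lin(IC) - f_opt(IC) by enumeration, in units of lsb^2."""
    neq = n - h
    two = Fraction(2)
    l = [Fraction(5, 3) - (two ** (1 - i) + two ** (i - neq)) / 3
         for i in range(1, neq + 1)]                  # (2.63)
    k_l = (Fraction(neq, 2) + two ** (-neq) - 1) / 6  # (2.65)
    groups = defaultdict(list)  # IC pattern -> LSPminor sums (units of lsb_IC)
    for bits in product((0, 1), repeat=2 * n):
        x = (0,) + bits[:n]
        y = (0,) + bits[n:]
        ic = tuple(x[h + i] * y[n + 1 - i] for i in range(1, neq + 1))
        s = sum(Fraction(x[i] * y[j], 2 ** (i + j - n - h - 1))
                for i in range(1, n + 1) for j in range(1, n + 1)
                if i + j >= n + h + 1)
        groups[ic].append(s)
    total = 4 ** n
    mean = Fraction(0)
    mean_sq = Fraction(0)
    for ic, sums in groups.items():
        f_opt = sum(sums) / len(sums)            # E{LSPminor | IC}
        f_lin = k_l + sum(c * b for c, b in zip(l, ic))
        err = f_lin - f_opt
        weight = Fraction(len(sums), total)      # P(IC)
        mean += weight * err
        mean_sq += weight * err * err
    lsb_ic_sq = two ** (-2 * h - 2)              # (lsb_IC / lsb)^2
    return (mean_sq - mean ** 2) * lsb_ic_sq


print(sigma2_lin_enumerated(2, 0), sigma2_lin_closed_form(2, 0))
```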
The overall error obtained with the optimal linear compensation function can be written as:
$$ \varepsilon^2_{total\,lin} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} + \sigma^2_{lin} \tag{2.68} $$
Figure 2.10: Normalized total mean square error ($\varepsilon^2_{total}/lsb^2$) as a function of n and h. Red full line: error of the optimal (quadratic) compensation function. Blue dashed line: linear compensation function with optimal coefficients. Green dotted line: mean square error of the full-rounded multiplier.
By comparing (2.67) and (2.58) it turns out that, since $n_{eq} \gg 4/3$:
$$ \frac{\sigma^2_{lin}}{\varepsilon^2_{intrinsic}} \simeq \frac{1}{26} \simeq 0.038 \tag{2.69} $$
Therefore, the use of the linear compensation function (2.60), in place of the optimal function (2.54), introduces only a small increase in the mean square error, lower than 3.8%. This is also shown in Fig. 2.10, where the total error achieved by using the linear compensation function, $\varepsilon^2_{total\,lin}$, is reported as a blue line and compared with $\varepsilon^2_{low\,bound}$.
2.5 Linear Coefficients Quantization
In order to implement the linear compensation function efficiently, the coefficients given by (2.63), (2.65) and the rounding constant $K_{round}$ should be quantized on a reduced number of bits. Coefficient quantization introduces further error components, in addition to the terms reported in (2.68). As shown in Fig. 2.1, both $f(IC)$ and the rounding constant $K_{round}$ are represented on $n + m$ bits, that is, with a least significant bit equal to $lsb_q$ (2.4). Clearly, a larger $m$ leads to a lower error due to coefficient quantization, but the hardware complexity increases. In the following we will consider two cases: $lsb_q = lsb_{IC}$ and $lsb_q = \frac{1}{2}\, lsb_{IC}$. In the next chapter we will discuss efficient hardware implementations for both cases. The quantization levels are represented in Fig. 2.11. When $lsb_q = lsb_{IC}$ the $q_i$ are integer values (blue full lines), whereas the $q_i$ are multiples of 1/2 for $lsb_q = \frac{1}{2}\, lsb_{IC}$, so that the green dashed levels can also be used.

Figure 2.11: Levels of quantization for $l_i$ (n = 12, h = 1). When $lsb_q = lsb_{IC}$, only the blue full lines are admitted. When $lsb_q = \frac{1}{2}\, lsb_{IC}$, also the green dashed levels can be used.
The quantized-coefficients error compensation function $f_q(IC)$ can be expressed as:
$$ f_q(IC) = lsb_{IC} \cdot \left[ \sum_{i=1}^{n_{eq}} q_i \, IC_i \right] \tag{2.70} $$
where:
$$ q_i = l_i + \Delta l_i \tag{2.71} $$
are the quantized coefficients and the $\Delta l_i$ ($i = 1, \ldots, n_{eq}$) represent the effect of coefficient quantization. Please note that no constant is added in (2.70). As already discussed in Sec. 2.2.3, what actually matters is the sum of $f(IC)$ and the rounding constant. Thus, the effect of the constant $K_l$ in (2.60) will be taken into account when computing the rounding constant $K_{round}$.
In order to determine the effect of the quantization on the multiplier error, let us call $\Delta f_{lin}(IC)$ the difference between $f_q(IC)$ and $f_{lin}(IC)$:
$$ \Delta f_{lin}(IC) = lsb_{IC} \cdot \left( \sum_{i=1}^{n_{eq}} \Delta l_i \, IC_i - K_l \right) \tag{2.72} $$
Since $f_q(IC) = f_{lin}(IC) + \Delta f_{lin}(IC)$, the mean erasing error after coefficient quantization, $\mu_{quant}$, can be easily obtained from (2.29) as:
$$ \mu_{quant} = \sum_{A} \left[ f_{lin}(A) - LSP(A) \right] \cdot P(A) + \sum_{A} \Delta f_{lin}(A) \cdot P(A) \tag{2.73} $$
The first term in (2.73) is zero (see (2.64)) and therefore $\mu_{quant}$ can be written as:
$$ \mu_{quant} = \sum_{A} \Delta f_{lin}(A) \cdot P(A) = E\left\{ \Delta f_{lin} \right\} \tag{2.74} $$
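By using (2.72) and by observing that the mean value of $IC_i$ is 1/4, one obtains:
$$ \mu_{quant} = \sum_{A} lsb_{IC} \cdot \left( \sum_{i=1}^{n_{eq}} \Delta l_i \, IC_i - K_l \right) \cdot P(A) = lsb_{IC} \left( \frac{1}{4} \sum_{i=1}^{n_{eq}} \Delta l_i - K_l \right) \tag{2.75} $$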
In order to compute $\sigma^2_{comp}$ after coefficient quantization, $\sigma^2_{comp\,quant}$, we can start from (2.38) and, after a few algebraic manipulations, we obtain:
$$ \sigma^2_{comp\,quant} = \sigma^2_{lin} + \sigma^2_q \tag{2.76} $$
where:
$$ \sigma^2_q = \sum_{A} \left( \Delta f_{lin}(A) - \mu_{quant} \right)^2 \cdot P(A) = E\left\{ \left( \Delta f_{lin}(IC) - \mu_{quant} \right)^2 \right\} = VAR\left\{ \Delta f_{lin}(IC) \right\} \tag{2.77} $$
where the variance of $\Delta f_{lin}(IC)$ can be easily computed by using (2.72):
$$ \sigma^2_q = lsb^2_{IC} \cdot \frac{3}{16} \sum_{i=1}^{n_{eq}} \Delta l_i^2 \tag{2.78} $$
2.5.1 Optimal Quantized Coefficients
Let us consider the total multiplier error, given by (2.40). By using (2.76) we have:
$$ \varepsilon^2_{total} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} + \sigma^2_{lin} + \sigma^2_q + \left( \mu_{quant} + \mu_{trunc} + K_{round} \right)^2 \tag{2.79} $$
In this equation the coefficient quantization influences the fourth term $\sigma^2_q$ (see (2.78)) and the last term. In fact, $K_{round}$ is quantized and cannot assume the ideal value $K_{round\,ideal} = -\mu_{quant} - \mu_{trunc}$ (2.41) exactly.
The problem of finding the optimal quantized coefficients $q_i$ (that is, the $\Delta l_i$ values) and the optimal quantized constant $K_{round}$ can be written as follows:
$$ \sigma^2_q + \left( \mu_{quant} + \mu_{trunc} + K_{round} \right)^2 \rightarrow \min! \tag{2.80} $$
where "min!" is used in order to indicate that the quantity should be minimized.
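By using (2.78) and (2.75) the optimization problem can be reformulated as:
$$ \frac{3}{16} \sum_{i=1}^{n_{eq}} \Delta l_i^2 + \left( \frac{K_{round} + \mu_{trunc}}{lsb_{IC}} + \frac{1}{4} \sum_{i=1}^{n_{eq}} \Delta l_i - K_l \right)^2 \rightarrow \min! \tag{2.81} $$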
Since $\Delta l_i$ and $K_{round}$ are quantized, (2.81) is an integer quadratic problem. In order to simplify this problem we can observe that $K_{round}$ appears only in the second term of (2.81). Therefore, for given $q_i$ (and hence $\Delta l_i$) values, the best choice of $K_{round}$ is the one that minimizes the second term of (2.81). By using (2.75) and (2.25) one obtains:
$$ K_{round} = \frac{lsb}{2} + round_{lsb_q} \left[ lsb_{IC} \left( K_l - \frac{1}{4} \sum_{i=1}^{n_{eq}} \Delta l_i \right) - \frac{lsb_q}{2} \right] \tag{2.82} $$
where we have exploited the fact that $lsb/(2\, lsb_q)$ is an integer. Equation (2.82) gives in closed form the optimal rounding constant $K_{round}$, given the quantized coefficients $q_i$.
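A minimal sketch of how (2.82) can be evaluated for a given set of quantized coefficients. It assumes that $round_{lsb_q}[\cdot]$ denotes rounding to the nearest multiple of $lsb_q$, works in units of $lsb_{IC}$, and takes the $l_i$ and $K_l$ from (2.63) and (2.65); the function names are illustrative.

```python
from fractions import Fraction
from math import floor


def round_to_step(value, step):
    """Round value to the nearest multiple of step (assumed meaning of round_lsbq)."""
    return floor(value / step + Fraction(1, 2)) * step


def k_round_offset(q, lsb_q):
    """Second term of (2.82) in units of lsb_IC; K_round = lsb/2 + this offset.

    q     -- quantized coefficients q_1..q_neq (multiples of lsb_q)
    lsb_q -- quantization step in units of lsb_IC (1 or 1/2 in the text)
    """
    lsb_q = Fraction(lsb_q)
    neq = len(q)
    two = Fraction(2)
    l = [Fraction(5, 3) - (two ** (1 - i) + two ** (i - neq)) / 3
         for i in range(1, neq + 1)]                  # (2.63)
    k_l = (Fraction(neq, 2) + two ** (-neq) - 1) / 6  # (2.65)
    delta_sum = sum(Fraction(qi) - li for qi, li in zip(q, l))
    return round_to_step(k_l - delta_sum / 4 - lsb_q / 2, lsb_q)


# Example: integer coefficients 1, 1, 2, 2, 2, 2, 1, 1 with lsb_q = lsb_IC.
q_example = [1, 1] + [2] * 4 + [1, 1]
print(k_round_offset(q_example, 1))
```

Because $K_{round}$ only enters the second term of (2.81), this evaluation can be repeated for every candidate coefficient set when searching for the optimum.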
The analytical solutions of the integer quadratic optimization problem (2.81) in the two cases $lsb_q = lsb_{IC}$ and $lsb_q = \frac{1}{2}\, lsb_{IC}$ are reported in the following sections. Note that only the results are shown, since the proof requires long and tedious algebraic manipulations.
Using these optimal coefficients, (2.83)-(2.84) for $lsb_q = lsb_{IC}$ and (2.85)-(2.86) for $lsb_q = \frac{1}{2}\, lsb_{IC}$, one obtains two different Linear Minimum Mean Square error functions, called LMS1b and LMS2b respectively. Fig. 2.12 compares the values of $f_{opt}(IC)$, $f_{lin}(IC)$, $f_{LMS1b}(IC)$ and $f_{LMS2b}(IC)$ for all the possible input values (n = 6, h = 0), collected according to the IC value. Note that for some IC values $f_{LMS2b}(IC)$ is closer to $f_{lin}(IC)$ than $f_{LMS1b}(IC)$ (zoom A in Fig. 2.12), while for others $f_{LMS1b}(IC)$ is closer to $f_{lin}(IC)$ than $f_{LMS2b}(IC)$ (zoom B in Fig. 2.12). On average, however, as will be demonstrated in the following, $f_{LMS2b}(IC)$ gives the best approximation of $f_{lin}(IC)$ and hence of $f_{opt}(IC)$.
Quantization with 1 bit, $lsb_q = lsb_{IC}$ (LMS1b function)
The optimal $q_i$ values which solve the problem (2.81) are:
$$ \begin{aligned} q_1 = q_2 = q_{n_{eq}-1} = q_{n_{eq}} &= 1 \\ q_3 = q_4 = \ldots = q_{n_{eq}-2} &= 2 \end{aligned} \qquad (\text{for } lsb_q = lsb_{IC}) \tag{2.83} $$
By substituting these coefficients in (2.82) we find:
$$ K_{round} = \frac{lsb}{2} \qquad (\text{for } lsb_q = lsb_{IC}) \tag{2.84} $$
From (2.83), it can be observed that the quantized coefficients $q_i$ are equal to the coefficients $l_i$ of the linear function $f_{lin}(IC)$ rounded to the nearest integer¹ (see also Fig. 2.11).
¹ Please note that, from (2.63), $l_2 = l_{n_{eq}-1} = \frac{3}{2} - \frac{4}{3}\, 2^{-n_{eq}} < 1.5$.

Quantization with 2 bits, $lsb_q = \frac{1}{2}\, lsb_{IC}$ (LMS2b function)
The values of $K_{round}$ and $q_2, \ldots, q_{n_{eq}-1}$ which solve the problem (2.81) are:
$$ \begin{aligned} q_2 = q_3 = \ldots = q_{n_{eq}-1} &= 1.5 \\ K_{round} &= \frac{lsb}{2} + \frac{lsb_{IC}}{2} \left( \left\lfloor \frac{n_{eq}}{4} \right\rfloor - 1 \right) \end{aligned} \qquad (\text{for } lsb_q = \tfrac{1}{2}\, lsb_{IC}) \tag{2.85} $$
for every value of n and h that verify $n_{eq} > 3$. Multiple solutions exist for $q_1$ and $q_{n_{eq}}$. The solution that we will consider in the following (for $n_{eq} > 3$) is:
$$ \begin{cases} q_1 = 1.0,\; q_{n_{eq}} = 1.0 & \text{if } REM(n_{eq}, 4) = 0 \\ q_1 = 1.0,\; q_{n_{eq}} = 1.5 & \text{if } REM(n_{eq}, 4) = 1 \\ q_1 = 1.5,\; q_{n_{eq}} = 1.5 & \text{if } REM(n_{eq}, 4) = 2 \\ q_1 = 1.5,\; q_{n_{eq}} = 1.5 & \text{if } REM(n_{eq}, 4) = 3 \end{cases} \qquad (lsb_q = \tfrac{1}{2}\, lsb_{IC}) \tag{2.86} $$
where REM(a,b) indicates the remainder of the integer division a/b. From (2.86) we note that, in this case, the quantized coefficients $q_i$ correspond to the coefficients $l_i$ rounded to the nearest quantization level (a multiple of 1/2) only when REM($n_{eq}$,4) is equal to 2 or 3.

Figure 2.12: Comparison between the optimal error compensation function $f_{opt}(IC)$, red full line, $f_{lin}(IC)$, green full line, $f_{LMS1b}(IC)$, blue dashed-dotted line, and $f_{LMS2b}(IC)$, black dashed line, for all the possible input values collected according to the IC value. Zoom A and zoom B mark the two regions discussed in the text.
2.6 Mean Square Error
The expression of the total mean square error for the truncated multiplier is:
$$ \varepsilon^2_{total} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic} + \sigma^2_{lin} + \sigma^2_q + \left( \mu_{quant} + \mu_{trunc} + K_{round} \right)^2 \tag{2.87} $$
where:
$\sigma^2_{trunc}$ (2.26) is the variance of the truncation error, which arises from using only n output bits of the result;
$\varepsilon^2_{intrinsic}$ (2.57) is the intrinsic error, independent of the compensation function, which arises from using only the vector IC in order to estimate the matrix LSPminor;
$\sigma^2_{lin}$ (2.68) is the variance of the compensation error, which arises from using a linear approximation of $f_{opt}(IC)$;
$\sigma^2_q$ and $\mu_{quant}$ are the variance and the mean of $\Delta f_{lin}(IC)$, which arise from the quantization of the coefficients;
$K_{round}$ is the rounding constant.
All the components are known with the exception of $\sigma^2_q$ and $\mu_{quant}$, which can be computed using (2.71):
$$ \Delta l_i = q_i - l_i \tag{2.88} $$
where the $l_i$ are the coefficients of the linear function (2.63), while the $q_i$ are the coefficients of the LMS compensation function, which depend on the chosen quantization. Let us see the exact value of these errors, and hence of $\varepsilon^2_{total}$, for the two different quantizations.
2.6.1 Analytical calculation of $\varepsilon^2_{total}$ for LMS1b function
Using the expression of the coefficients $q_i$ (2.83) in (2.78) and (2.75), the two terms $\sigma^2_q$ and $\mu_{total}$ can be expressed as:
$$ \sigma^2_q = lsb^2 \cdot 2^{-2h-2} \cdot \left[ \frac{7}{72} - \frac{5}{3}\, 2^{-n_{eq}} - \frac{1}{18}\, 2^{-2 n_{eq}} + \frac{1}{12} \left( 2^{-n_{eq}} + \frac{1}{4} \right) n_{eq} \right] \tag{2.89} $$
$$ \mu_{total} = lsb_{IC} \cdot \left( -2^{-n_{eq}-1} \right) \tag{2.90} $$
Hence all the components of the error are known and the final expression of the error can be easily computed.
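The closed forms (2.89) and (2.90) follow from plugging the LMS1b coefficients (2.83) into (2.78) and (2.75). The sketch below repeats that substitution numerically with exact fractions. It assumes $n_{eq} \geq 4$, $lsb_{IC} = lsb \cdot 2^{-h-1}$, and $\mu_{trunc} = -(lsb - lsb_q)/2$; the last value is inferred here from the form of (2.82) and is not restated in this section. Function names are illustrative.

```python
from fractions import Fraction


def lms1b_terms(neq, h):
    """sigma_q^2 (units of lsb^2) and mu_total (units of lsb_IC) for LMS1b."""
    two = Fraction(2)
    l = [Fraction(5, 3) - (two ** (1 - i) + two ** (i - neq)) / 3
         for i in range(1, neq + 1)]                  # (2.63)
    k_l = (Fraction(neq, 2) + two ** (-neq) - 1) / 6  # (2.65)
    q = [1, 1] + [2] * (neq - 4) + [1, 1]             # (2.83)
    delta = [qi - li for qi, li in zip(q, l)]         # (2.88)
    sigma2_q = two ** (-2 * h - 2) * Fraction(3, 16) * sum(d * d for d in delta)  # (2.78)
    mu_quant = sum(delta) / 4 - k_l                   # (2.75)
    # Assumption: mu_trunc = -(lsb - lsb_q)/2 and K_round = lsb/2 (2.84),
    # so mu_trunc + K_round = lsb_q/2 = lsb_IC/2 for LMS1b.
    mu_total = mu_quant + Fraction(1, 2)
    return sigma2_q, mu_total


def lms1b_closed_form(neq, h):
    """Right-hand sides of (2.89) and (2.90) in the same units."""
    two = Fraction(2)
    u = two ** (-neq)
    sigma2_q = two ** (-2 * h - 2) * (Fraction(7, 72) - Fraction(5, 3) * u
                                      - u * u / 18 + (u + Fraction(1, 4)) * neq / 12)
    return sigma2_q, -u / 2


print(lms1b_terms(8, 1) == lms1b_closed_form(8, 1))
```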
Note that (2.87) is approximated since it does not consider the correlation between $e_{erasing}$ and $e_{trunc}$ (see Sec. 2.2). It is possible to improve the accuracy by considering part of the correlation existing between $e_{erasing}$ and $e_{trunc}$. This result will be reported only for the case $lsb_q = lsb_{IC}$, but it will be shown that the hypothesis of independence is quite good, since the simulation results agree with the analytical formula.
The major contribution to the correlation between $e_{erasing}$ and $e_{trunc}$ comes from the column with weight $lsb_q = lsb_{IC}$. In fact this column is the only one in which we have only terms belonging to $f(IC)$ (see Fig. 2.1). These terms are, of course, strongly correlated with the terms of the IC. In order to take this correlation into account we can split the truncation error $e_{trunc}$ into two contributions:
$$ e_{trunc} = e_{truncH} + e_{truncL} \tag{2.91} $$
where (consider that, for $lsb_q = lsb_{IC}$, $m = h + 1$):
$$ e_{truncH} = -\sum_{i=n+1}^{n+h} pt_i \cdot 2^{-i} \; ; \qquad e_{truncL} = -pt_{n+h+1} \cdot 2^{-n-h-1} \tag{2.92} $$
In the following the correlation between $e_{erasing}$ and $e_{truncL}$ will be considered, and the more complicated correlation between $e_{erasing}$ and $e_{truncH}$ will be neglected.
With the decomposition (2.91) the total mean square error can be written as:
$$ \varepsilon^2_{total} = E\left\{ \left( K_{round} + e_{erasing} + e_{truncL} + e_{truncH} \right)^2 \right\} \simeq E\left\{ \left( K_{round} + e_{erasing} + e_{truncL} + \mu_{truncH} \right)^2 \right\} + \sigma^2_{truncH} \tag{2.93} $$
where the hypothesis of independence between $e_{erasing}$ and $e_{truncH}$ has been exploited. The variance $\sigma^2_{truncH}$ can be computed by assuming the bits of $e_{truncH}$ independent and equally likely:
$$ \sigma^2_{truncH} = \sum_{i=n+1}^{n+h} 2^{-2i-2} = lsb^2\, \frac{1}{12} \left( 1 - 2^{-2h} \right) \tag{2.94} $$
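The closed form in (2.94) is a geometric sum. Written out, with $lsb = 2^{-n}$ (the weight assumed here for the least significant retained bit) and each truncated bit taken as an independent, equally likely binary variable of variance 1/4, the intermediate steps are:

```latex
\sigma^2_{truncH}
  = \sum_{i=n+1}^{n+h} \frac{1}{4}\, 2^{-2i}
  = 2^{-2n} \cdot \frac{1}{4} \sum_{k=1}^{h} 4^{-k}
  = lsb^2 \cdot \frac{1}{4} \cdot \frac{1 - 4^{-h}}{3}
  = lsb^2 \cdot \frac{1}{12} \left( 1 - 2^{-2h} \right)
```

For h = 0 the sum is empty and the variance vanishes, consistently with the absence of an $e_{truncH}$ contribution in that case.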
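The first term in (2.93) can be easily evaluated taking into account the correlation between $e_{erasing}$ and $e_{truncL}$:
$$ \begin{aligned} E\left\{ \left( K_{round} + e_{erasing} + e_{truncL} + \mu_{truncH} \right)^2 \right\} = {} & E\left\{ \left( K_{round} + e_{erasing} + \mu_{truncH} \right)^2 \right\} + \varepsilon^2_{truncL} \\ & + 2 \cdot E\left\{ \left( K_{round} + e_{erasing} + \mu_{truncH} \right) \cdot e_{truncL} \right\} \end{aligned} \tag{2.95} $$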
The first term in (2.95), following the theory presented in previous sections, can be written as:
$$ E\left\{ \left( K_{round} + e_{erasing} + \mu_{truncH} \right)^2 \right\} = \varepsilon^2_{intrinsic} + \sigma^2_{lin} + \sigma^2_q + \left( K_{round} + \mu_{erasing} + \mu_{truncH} \right)^2 \tag{2.96} $$
The last term in (2.95) is a correlation coefficient $L$:
$$ L = 2 \cdot E\left\{ \left( K_{round} + e_{erasing} + \mu_{truncH} \right) \cdot e_{truncL} \right\} \tag{2.97} $$
In conclusion we have:
$$ \varepsilon^2_{total} \simeq \varepsilon^2_{intrinsic} + \sigma^2_{lin} + \sigma^2_q + \left( K_{round} + \mu_{erasing} + \mu_{truncH} \right)^2 + \varepsilon^2_{truncL} + L + \sigma^2_{truncH} \tag{2.98} $$
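Some of these components have already been evaluated: $\varepsilon^2_{intrinsic}$, $\sigma^2_{lin}$, $\sigma^2_q$ and $\sigma^2_{truncH}$ are given by (2.57), (2.68), (2.89) and (2.94) respectively. The other components have been analytically evaluated. For brevity we report only the results:
$$ \left( K_{round} + \mu_{erasing} + \mu_{truncH} \right)^2 = lsb^2 \cdot 2^{-2h-2} \left( \frac{1}{4} - \frac{1}{2}\, 2^{-n_{eq}} + \frac{1}{4}\, 2^{-2 n_{eq}} \right) \tag{2.99} $$
$$ \varepsilon^2_{truncL} = \begin{cases} lsb^2 \cdot 2^{-2} \cdot \dfrac{17}{32} & h = 0 \\[2mm] lsb^2 \cdot 2^{-2h-2} \cdot \dfrac{15}{32} & h > 0 \end{cases} \tag{2.100} $$
$$ L = \begin{cases} lsb^2 \cdot 2^{-2h-2} \left( -\dfrac{19}{32} + 2^{-n_{eq}} \right) & h = 0 \\[2mm] lsb^2 \cdot 2^{-2h-2} \cdot \left( -\dfrac{13}{32} \right) & h > 0 \end{cases} \tag{2.101} $$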
It is worth highlighting that the approximations made in this new formula are those used for (2.93), where the contribution of $e_{truncH}$ is separated. For h = 0, $e_{truncH} = 0$ and the formula is exact.
Summing all error components, we have the following expression of $\varepsilon^2_{total}$:
$$ \frac{\varepsilon^2_{total}}{lsb^2} \simeq \begin{cases} \dfrac{1}{12} + \left( -\dfrac{19}{576} + \dfrac{1}{48}\, n_{eq} - \dfrac{1}{4}\, 2^{-n_{eq}} + \dfrac{1}{36}\, 2^{-2 n_{eq}} \right) & h = 0 \\[3mm] \dfrac{1}{12} + 2^{-2h} \cdot \left( -\dfrac{1}{576} + \dfrac{1}{48}\, n_{eq} - \dfrac{1}{2}\, 2^{-n_{eq}} + \dfrac{1}{36}\, 2^{-2 n_{eq}} \right) & h > 0 \end{cases} \tag{2.102} $$
Note that, as highlighted before, this expression gives exactly the mean square error $\varepsilon^2_{total}$ for h = 0. As we will show in Sec. 2.6.3, the expression (2.102) is a very good approximation for h > 0.
The expression (2.102) can be approximated by neglecting negative exponential terms:
$$ \frac{\varepsilon^2_{total}}{lsb^2} \simeq \begin{cases} \dfrac{1}{12} + \left( -\dfrac{19}{576} + \dfrac{1}{48}\, n_{eq} \right) & h = 0 \\[3mm] \dfrac{1}{12} + 2^{-2h} \cdot \left( -\dfrac{1}{576} + \dfrac{1}{48}\, n_{eq} \right) & h > 0 \end{cases} \tag{2.103} $$
Employing (2.103) in place of (2.102) introduces only a small increase in the mean square error, lower than 5.4% for $n_{eq} \geq 5$.
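2.6.2 Analytical calculation of $\varepsilon^2_{total}$ for LMS2b function
As done for the LMS1b implementation, the resulting $\sigma^2_q$ and $\mu_{total}$ are:
$$ \sigma^2_q = lsb^2 \cdot 2^{-2h-2} \cdot \left[ \frac{1}{288} - \frac{1}{6}\, 2^{-n_{eq}} - \frac{1}{18}\, 2^{-2 n_{eq}} + \frac{1}{12} \left( 2^{-n_{eq}} + \frac{1}{16} \right) n_{eq} + \left( -\frac{1}{64} + \frac{1}{8}\, 2^{-n_{eq}} \right) \phi(n_{eq}) \right] \tag{2.104} $$
$$ \mu_{total} = lsb_{IC} \cdot \left[ -\frac{1}{2}\, 2^{-n_{eq}} - \frac{REM(n_{eq}, 4) - \phi(n_{eq})}{8} \right] \tag{2.105} $$
where the function $\phi(n)$ is defined as:
$$ \phi(n) = \begin{cases} REM(n, 4) & \text{if } REM(n, 4) = 0, 1, 2 \\ 2 & \text{if } REM(n, 4) = 3 \end{cases} \tag{2.106} $$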
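The expressions (2.104)-(2.106) can be cross-checked by substituting the LMS2b coefficients (2.85)-(2.86) into (2.78) and (2.75). The sketch below does this with exact fractions. It assumes $n_{eq} > 3$, $lsb_{IC} = lsb \cdot 2^{-h-1}$, and $\mu_{trunc} = -(lsb - lsb_q)/2$ with $lsb_q = \frac{1}{2}\, lsb_{IC}$; the latter is inferred from the form of (2.82) rather than quoted from this section. The name `phi` stands for the function defined in (2.106), and the other names are illustrative.

```python
from fractions import Fraction

HALF3 = Fraction(3, 2)


def phi(n):
    """Function defined in (2.106)."""
    r = n % 4
    return r if r < 3 else 2


def lms2b_coefficients(neq):
    """Quantized coefficients from (2.85)-(2.86); requires neq > 3."""
    ends = {0: (1, 1), 1: (1, HALF3), 2: (HALF3, HALF3), 3: (HALF3, HALF3)}
    q_first, q_last = ends[neq % 4]
    return [q_first] + [HALF3] * (neq - 2) + [q_last]


def lms2b_terms(neq, h):
    """sigma_q^2 (units of lsb^2) and mu_total (units of lsb_IC) for LMS2b."""
    two = Fraction(2)
    l = [Fraction(5, 3) - (two ** (1 - i) + two ** (i - neq)) / 3
         for i in range(1, neq + 1)]                  # (2.63)
    k_l = (Fraction(neq, 2) + two ** (-neq) - 1) / 6  # (2.65)
    delta = [qi - li for qi, li in zip(lms2b_coefficients(neq), l)]
    sigma2_q = two ** (-2 * h - 2) * Fraction(3, 16) * sum(d * d for d in delta)  # (2.78)
    mu_quant = sum(delta) / 4 - k_l                   # (2.75)
    # Assumption: mu_trunc + K_round = lsb_q/2 + (lsb_IC/2)(floor(neq/4) - 1),
    # i.e. 1/4 + (neq // 4 - 1)/2 in units of lsb_IC, using (2.85).
    mu_total = mu_quant + Fraction(1, 4) + Fraction(neq // 4 - 1, 2)
    return sigma2_q, mu_total


def lms2b_closed_form(neq, h):
    """Right-hand sides of (2.104) and (2.105) in the same units."""
    two = Fraction(2)
    u = two ** (-neq)
    bracket = (Fraction(1, 288) - u / 6 - u * u / 18
               + (u + Fraction(1, 16)) * neq / 12
               + (u / 8 - Fraction(1, 64)) * phi(neq))
    mu_total = -u / 2 - Fraction(neq % 4 - phi(neq), 8)
    return two ** (-2 * h - 2) * bracket, mu_total


print(lms2b_terms(7, 0) == lms2b_closed_form(7, 0))
```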
By adding all error components we can compute the following expression of the total mean square error of the multiplier with $lsb_q = \frac{1}{2}\, lsb_{IC}$:
$$ \frac{\varepsilon^2_{total}}{lsb^2} \simeq \begin{cases} \dfrac{1}{12} + 2^{-2h} \cdot \left( -\dfrac{58 + 9\,\phi(n_{eq})}{2304} + \dfrac{13}{768}\, n_{eq} + \dfrac{\phi(n_{eq})}{32}\, 2^{-n_{eq}} + \dfrac{1}{36}\, 2^{-2 n_{eq}} \right) & REM(n_{eq}, 4) \leq 2 \\[3mm] \dfrac{1}{12} + 2^{-2h} \cdot \left( -\dfrac{67}{2304} + \dfrac{13}{768}\, n_{eq} + \dfrac{3}{32}\, 2^{-n_{eq}} + \dfrac{1}{36}\, 2^{-2 n_{eq}} \right) & REM(n_{eq}, 4) = 3 \end{cases} \tag{2.107} $$
The expression (2.107) can be approximated by neglecting negative exponential terms and assuming $\phi(n_{eq}) \simeq 1$:
$$ \frac{\varepsilon^2_{total}}{lsb^2} \simeq \frac{1}{12} + 2^{-2h} \cdot \left( -\frac{67}{2304} + \frac{13}{768}\, n_{eq} \right) \tag{2.108} $$
Employing (2.108) in place of (2.107) introduces only a small variation in the calculated mean square error, lower than 2.8% for $n_{eq} \geq 5$.
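The quality of the approximations (2.103) and (2.108) can be gauged by evaluating them side by side with the full expressions (2.102) and (2.107). The sketch below does so over a range of $n_{eq}$ and h values chosen purely for illustration; `phi` is the function of (2.106) and the other names are hypothetical helpers.

```python
from fractions import Fraction


def phi(n):
    """Function defined in (2.106)."""
    r = n % 4
    return r if r < 3 else 2


def lms1b_total(neq, h, exact=True):
    """Normalized total error: (2.102) if exact, otherwise (2.103)."""
    u = Fraction(2) ** (-neq) if exact else Fraction(0)
    if h == 0:
        bracket = Fraction(-19, 576) + Fraction(neq, 48) - u / 4 + u * u / 36
    else:
        bracket = Fraction(-1, 576) + Fraction(neq, 48) - u / 2 + u * u / 36
    return Fraction(1, 12) + Fraction(2) ** (-2 * h) * bracket


def lms2b_total(neq, h, exact=True):
    """Normalized total error: (2.107) if exact, otherwise (2.108)."""
    linear = Fraction(13, 768) * neq
    if not exact:
        bracket = Fraction(-67, 2304) + linear
    else:
        u = Fraction(2) ** (-neq)
        if neq % 4 == 3:
            const, coeff = Fraction(-67, 2304), Fraction(3, 32)
        else:
            const = -Fraction(58 + 9 * phi(neq), 2304)
            coeff = Fraction(phi(neq), 32)
        bracket = const + linear + coeff * u + u * u / 36
    return Fraction(1, 12) + Fraction(2) ** (-2 * h) * bracket


def worst_relative_gap(total, neq_values, h_values):
    """Largest |approx / exact - 1| over the given grid."""
    return max(abs(total(neq, h, False) / total(neq, h, True) - 1)
               for neq in neq_values for h in h_values)


print(float(worst_relative_gap(lms1b_total, range(5, 65), range(0, 4))))
print(float(worst_relative_gap(lms2b_total, range(5, 65), range(0, 4))))
```

On this grid the largest gaps occur for h = 0 and small $n_{eq}$, which is where the neglected exponential terms weigh most.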
2.6.3 Results
Having closed form expressions, it is possible to compute quickly the mean square error for every value of n and h. This makes it very simple to design the multiplier for a given n by choosing h on the basis of the required accuracy. This approach can hardly be employed in previously proposed variable correction truncated multipliers, where simulations are needed to compute the error.
If we compare the approximated expression of $\varepsilon^2_{intrinsic}$ (see (2.58)) with the approximated expressions of $\varepsilon^2_{total}$ for $lsb_q = lsb_{IC}$ (see (2.103)) and $lsb_q = \frac{1}{2}\, lsb_{IC}$ (see (2.108)), we can observe that the intrinsic error $\varepsilon^2_{intrinsic}$ (and also the error $\varepsilon^2_{low\,bound} \simeq \sigma^2_{trunc} + \varepsilon^2_{intrinsic}$) increases with $0.015 \cdot n_{eq} \cdot lsb^2 \cdot 2^{-2h}$, while the error (2.103) of the proposed multiplier with $lsb_q = lsb_{IC}$ increases with $0.021 \cdot n_{eq} \cdot lsb^2 \cdot 2^{-2h}$. From (2.108) the error $\varepsilon^2_{total}$ of the proposed multiplier with $lsb_q = \frac{1}{2}\, lsb_{IC}$ increases with $0.017 \cdot n_{eq} \cdot lsb^2 \cdot 2^{-2h}$. The error of the proposed multiplier with $lsb_q = \frac{1}{2}\, lsb_{IC}$, therefore, is very close to the lower bound $\varepsilon^2_{low\,bound}$.
Figure 2.13: Comparison between total mean square errors ($\varepsilon^2/lsb^2$ versus n, for h = 0, 1, 2). The plotted curves are $\varepsilon^2_{low\,bound}$, the full-rounded multiplier, and the theoretical and simulated values for $lsb_q = lsb_{IC}$ and $lsb_q = \frac{1}{2}\, lsb_{IC}$. Theoretical values are computed by using the formulae described in the chapter.
Fig. 2.13 shows the total mean square error ($\varepsilon^2_{total}$) obtained for $lsb_q = lsb_{IC}$ and $lsb_q = \frac{1}{2}\, lsb_{IC}$. The figure plots both theoretical values and simulation results. The simulation time needed to obtain the mean square error increases as $O(2^{2n})$, therefore simulation results are only presented for $n \leq 16$.
For comparison the figure also shows the minimum mean square error $\varepsilon^2_{low\,bound}$ achievable with a variable correction truncated multiplier (2.44) and the error of a full-rounded multiplier ($\sigma^2_{trunc} = lsb^2/12$). As can be seen, the theoretical values for $lsb_q = lsb_{IC}$ are almost exact for any h value, the maximum difference between theoretical and simulation values being lower than $10^{-3}\, lsb^2$. For $lsb_q = \frac{1}{2}\, lsb_{IC}$ the theoretical values are very close to simulations for $h \geq 1$. For h = 0 the maximum difference between theoretical and simulated values is $7 \cdot 10^{-3}\, lsb^2$. From Fig. 2.13, we note again that the mean square error obtained with $lsb_q = \frac{1}{2}\, lsb_{IC}$ is close to the error $\varepsilon^2_{low\,bound}$. Thus, from a practical point of view, the proposed truncated multiplier obtains the best performance achievable with a variable error correction approach.
Data in Fig. 2.13 demonstrate that the hypothesis of independence between truncation and compensation errors, although not exactly verified, represents a reasonable assumption.
2.7 Maximum Absolute Error
Using (2.16) the maximum absolute error $\varepsilon_{max}$ of the LMS1b truncated multiplier is:
$$ \varepsilon_{max} = \max_{x,y} \left( \left| f(IC) - S_{LSPminor} + e_{trunc} + K_{round} \right| \right) \tag{2.109} $$
where $e_{trunc}$ is the truncation error given by (2.24). Note that the maximum absolute error cannot be easily computed for a generic truncated multiplier, due to the dependency of the truncation error on the compensation function. In the particular case of the proposed LMS1b circuit, $\varepsilon_{max}$ can be analytically computed for every n and h value.
Defining AP as the matrix obtained by considering LSPminor without IC (Fig. 2.1), $S_{LSPminor}$ (2.10) can be divided into two parts:
$$ S_{IC} = 2^{-n-h-1} \sum_{i=1}^{n_{eq}} IC_i \tag{2.110} $$
$$ S_{AP} = \sum_{i=h+2}^{n} \; \sum_{j=n+h+2-i}^{n} x_i\, y_j\, 2^{-i-j} \tag{2.111} $$
where $S_{IC}$ is the weighted sum of the IC elements and $S_{AP}$ the weighted sum of AP. Hence the maximum absolute error is obtained by choosing the maximum absolute value between:
$$ \varepsilon^- = \min_{x,y} \left( \left( f(IC) - S_{IC} \right) - S_{AP} + e_{trunc} + K_{round} \right) \tag{2.112} $$
$$ \varepsilon^+ = \max_{x,y} \left( \left( f(IC) - S_{IC} \right) - S_{AP} + e_{trunc} + K_{round} \right) \tag{2.113} $$
Recalling that $e_{trunc} \in \{ -lsb + lsb_q ;\, 0 \}$, when $\varepsilon^-$ is considered $e_{trunc}$ must be chosen equal to the lowest value, $-lsb + lsb_q$, since we want to minimize (2.112). When we are computing $\varepsilon^+$, since we want to maximize (2.113), $e_{trunc}$ must be chosen equal to the highest value, 0.
Figure 2.14: Correlations between the terms of the IC and the AP (LSPminor without IC). The gray part of the AP is correlated with the blue part of the IC. As an example, the partial products shown in bold depend on $x_6$ and $y_7$. If $x_6 y_6 = 0$ this means that the bold row or the bold diagonal or both are identically equal to zero.
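Using the definition of the LMS1b function proposed in Sec. 2.5, the compensation function is:
$$ f_{LMS1b}(IC) = \left[ IC_1 + IC_2 + IC_{n_{eq}-1} + IC_{n_{eq}} + 2 \cdot \sum_{i=3}^{n_{eq}-2} IC_i \right] \cdot lsb_{IC} \tag{2.114} $$
hence:
$$ \left( f(IC) - S_{IC} \right) = lsb_{IC} \cdot \sum_{i=3}^{n_{eq}-2} IC_i \tag{2.115} $$
The rounding constant is given by:
$$ K_{round} = \frac{lsb}{2} \tag{2.116} $$
hence:
$$ \varepsilon^- = \min_{x,y} \left( lsb_{IC} \cdot \sum_{i=3}^{n_{eq}-2} IC_i - S_{AP} - \frac{lsb}{2} + lsb_{IC} \right) \tag{2.117} $$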
$$ \varepsilon^+ = \max_{x,y} \left( lsb_{IC} \cdot \sum_{i=3}^{n_{eq}-2} IC_i - S_{AP} + \frac{lsb}{2} \right) \tag{2.118} $$

Figure 2.15: LSPminor partial product matrix for a 10 × 10 bit multiplier with h = 2. The partial products of the AP that are uncorrelated with the central terms of the IC have been fixed to 1.
In order to compute the maximum absolute error, (2.117) and (2.118) should be expressed in closed form. These computations, however, are not trivial.
Fig. 2.14 shows the least significant part of the partial product matrix and the IC of a 10 × 10 bit multiplier, with h = 2. The grey partial products of the AP have at least one bit shared with the central part of the IC that is involved in (2.117) and (2.118). As a consequence the value of $S_{AP}$ and the value of $lsb_{IC} \cdot \sum_{i=3}^{n_{eq}-2} IC_i$ cannot be independently chosen to calculate (2.117) and (2.118). On the contrary, since the extreme terms of the IC, $IC_i$ ($i = \{1, 2, n_{eq}-1, n_{eq}\}$), are not involved in (2.117) and (2.118), they can be chosen in order to independently fix $e_{trunc}$.
Minimum value of the punctual error, $\varepsilon^-$
The minimum value $\varepsilon^-$ (2.117) is obtained by maximizing the number of AP partial products that are equal to one while maximizing the number of central partial products of the IC that are equal to zero. In the following we will refer to the case in which $n_{eq}$ is even. The extension to the odd case will be provided at the end of the Section.

Figure 2.16: LSPminor partial product matrix for a 10 × 10 bit multiplier with h = 2. The central terms of the IC are fixed to zero. Each $IC_i$ term fixed to zero fixes one row or one diagonal of the AP to zero, alternately.
Figures (2.15), (2.16), and (2.17) show the step by step modification of the AP and of the IC during the following algorithm, which maximizes the magnitude of the punctual error. They also show how the input bits that determine the minimum error condition are calculated.
Firstly, the value of every term of the AP that is not in the grey zone can be chosen equal to 1 in order to maximize $S_{AP}$. This means fixing to 1 the input bits $x_{h+2}$, $x_{n-1}$, $x_n$, $y_{h+2}$, $y_{n-1}$, and $y_n$. As a consequence the IC terms $x_{h+2}\, y_{n-1}$ and $y_{h+2}\, x_{n-1}$ are also 1. The partial product matrix for n = 10 and h = 2 is modified as shown in Fig. 2.15.
The next step in minimizing the punctual error is fixing to zero the central terms of the IC, that is: $IC_i = 0$, $i = \{3, \ldots, n_{eq}-2\}$. When the single term $IC_i = x_{h+i}\, y_{n+1-i}$ is fixed to zero, a whole row of AP (if $y_{n+1-i} = 0$) or a whole diagonal of AP (if $x_{h+i} = 0$) or both (if $y_{n+1-i} = x_{h+i} = 0$) are equal to zero. Since the target is having as many terms of the AP equal to one as possible, it is assumed that each zero $IC_i$ fixes to zero only one row or one diagonal of the AP.
It is worth demonstrating that setting the central $IC_i$ to zero minimizes the punctual error. This can be done by noting that setting $IC_i = 1$ increases (2.117) by $lsb_{IC}/2$. In order to compensate for this increment, the complete row or the complete diagonal of the AP can be set equal to 1. However, one whole row or diagonal of AP equal to 1 decreases (2.117) by $\frac{lsb_{IC}}{2} \sum_{j=1}^{n_{eq}-1} \frac{1}{2^j}$, which is lower than $\frac{lsb_{IC}}{2}$.

Figure 2.17: LSPminor partial product matrix for a 10 × 10 bit multiplier with h = 2. The remaining available bits of the AP shown in Fig. 2.16 are fixed to 1.
For each $IC_i$ it is therefore necessary to decide whether it is more convenient to fix to zero the correlated row or the correlated diagonal.
To that purpose truncated multipliers with different bitwidths have been designed, and the analysis has shown that, regardless of the bitwidth of the multiplier, (2.117) is always minimized according to the following conjecture:
Conjecture
when $n_{eq}$ is even (2.117) is minimized when
$x_{h+4}$, $x_{h+6}$, ..., and $x_{n-2}$ are equal to 0
$y_{h+4}$, $y_{h+6}$, ..., and $y_{n-2}$ are equal to 0
when $n_{eq}$ is odd (2.117) is minimized when
$x_{h+4}$, $x_{h+6}$, ..., $x_{n-3}$ and $x_{n-2}$ are equal to 0
$y_{h+5}$, $y_{h+7}$, ..., and $y_{n-2}$ are equal to 0
It is worth highlighting that the conjecture is easily verified for a particular $n_{eq}$ value, but it has not been proved for a general $n_{eq}$ bit multiplier. The truth of the conjecture is verified in the following through numerical simulations. For example it is valid for n = 10, h = 2 and the resulting matrix is shown in Fig. 2.16, where for $IC_3 = 0$ the correlated row has been fixed to zero ($y_8 = 0$), for $IC_4 = 0$ the correlated diagonal has been fixed to zero ($x_6 = 0$), and so on.

Figure 2.18: One of the possible configurations of the LSPminor for a 10 × 10 bit LMS1b truncated multiplier with h = 2 that maximizes the punctual error.
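The bit assignments described by the steps above and by the conjecture can be turned into a small generator. The sketch below is restricted to h = 0, where no LSPmajor bits need to be resolved, and it arbitrarily picks $x_1 = y_1 = 0$ among the admissible choices with $x_1 = y_1$. Under these assumptions it reproduces the worst-case vectors quoted at the end of this section for n = 10 and n = 9 (which appear to refer to h = 0). The function name is hypothetical.

```python
def conjectured_worst_case(n):
    """Input bits (MSB first) following the conjecture, for h = 0 only."""
    h = 0
    neq = n - h  # assumption: the IC has n - h elements
    # Index 0 is unused; every bit not forced to zero is set to 1.
    x = [1] * (n + 1)
    y = [1] * (n + 1)
    if neq % 2 == 0:
        x_zero = list(range(h + 4, n - 1, 2))            # x_{h+4}, x_{h+6}, ..., x_{n-2}
        y_zero = list(range(h + 4, n - 1, 2))            # y_{h+4}, y_{h+6}, ..., y_{n-2}
    else:
        x_zero = list(range(h + 4, n - 2, 2)) + [n - 2]  # ..., x_{n-3} and x_{n-2}
        y_zero = list(range(h + 5, n - 1, 2))            # y_{h+5}, y_{h+7}, ..., y_{n-2}
    for i in x_zero:
        x[i] = 0
    for j in y_zero:
        y[j] = 0
    # Assumption: for h = 0 only x_1 = y_1 is required; 0 is chosen here.
    x[1] = y[1] = 0
    return x[1:], y[1:]


print(conjectured_worst_case(10))
print(conjectured_worst_case(9))
```

For h > 0 the bits $x_1, \ldots, x_h$ and $y_1, \ldots, y_h$ also have to be fixed so that the truncated sum bits are all 1, which depends on the column carries and is not covered by this sketch.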
The remaining terms of the AP matrix are now independent of the IC and hence have to be fixed to 1 in order to maximize the magnitude of the punctual error. The resulting matrix is shown in Fig. 2.17.
Finally the $x_1, \ldots, x_{h+1}$ and $y_1, \ldots, y_{h+1}$ bits are chosen so as to minimize $e_{trunc}$, hence imposing an odd number of 1s in each column of LSPmajor and IC. Remember that in the $(n+h)$th column there is already a bit equal to 1 given by $K_{round}$.
$x_{h+1}$ and $y_{h+1}$ must be chosen in order to have an odd number of bits equal to 1 in the IC column. Until now the IC vector has 2 bits equal to 1. Hence when h = 0, due to the presence of $K_{round}$, $x_{h+1} = y_{h+1}$; when $h \neq 0$, the choice must be $x_{h+1} \neq y_{h+1}$ (see Fig. 2.17). Finally $x_1, \ldots, x_h$ and $y_1, \ldots, y_h$ must also be chosen in order to have all 1s in the bits of the sum (so that the truncation error is minimum). The configuration is neither simple nor unique, since one must also consider the carry of the sum of each column.
Figure 2.19: One of the possible configurations of the LSPminor for a 10 × 10 bit LMS1b truncated multiplier with h = 1 that maximizes the punctual error.
Note that the values of these bits are not required for the calculation of the closed form expression of the maximum absolute error (see (2.109)). Figure 2.18 shows the values of the partial products of the LSPminor that minimize the punctual error, for a 10 × 10 bit LMS1b truncated multiplier with h = 2, with one of the possible configurations of the inputs.
The next step is calculating the punctual error for this particular configuration of the input bits. The IC component of the minimum punctual error is zero. Let us calculate the component provided by the sum of the AP terms equal to 1.
The column whose weight is $lsb_{IC}/4$, composed of $n_{eq} - 1$ elements, has the first and the last terms equal to 1, while the other terms are an alternating sequence of 1 and 0. The contribution of this column to the punctual error is:
$$ \frac{lsb_{IC}}{4} \left( 2 + \left\lceil \frac{n_{eq} - 3}{2} \right\rceil \right) = \frac{lsb_{IC}}{8} \left( n_{eq} + 2 \right) \tag{2.119} $$
The subsequent column, composed of $n_{eq} - 2$ terms, whose weight is $lsb_{IC}/8$, has only the first and the last terms equal to 1. The contribution of this column to the punctual error is $lsb_{IC}/4$.
The next column, composed of $n_{eq} - 3$ terms, whose weight is $lsb_{IC}/16$, is again alternately composed of 0 and 1 elements. The contribution of this column to the punctual error is $\frac{3}{16}\, lsb_{IC}$. And so on.
In general, let us indicate with $C_{ODD}$ the contribution of the odd columns of LSPminor {IC; $(n+h+3)$th column; $(n+h+5)$th; ...; $(n+h+(2j-1))$th; $j = 1, \ldots, n_{eq}/2$} and with $C_{EVEN}$ the contribution of the even columns of LSPminor {$(n+h+2)$th column; $(n+h+4)$th; ...; $(n+h+2j)$th; $j = 1, \ldots, n_{eq}/2$}.
Since in the odd columns only two bits are equal to 1:
$$ C_{ODD} = lsb_{IC} \left[ \sum_{j=2}^{n_{eq}/2} 2 \cdot 2^{-2j+2} \right] \tag{2.120} $$
The even columns, except the first and the last, are alternately composed of 0 and 1 elements; the first column has a 1 in the first and last positions and 1s alternating with 0s in the others; the last column is formed by a single 1. Hence:
$$ C_{EVEN} = lsb_{IC} \left[ \left( 3 + \frac{n_{eq} - 4}{2} \right) \cdot 2^{-2} + 2^{-n_{eq}+1} + \sum_{j=2}^{n_{eq}/2 - 1} \frac{n_{eq} - 2j}{2}\, 2^{-2j+1} \right] \tag{2.121} $$
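Using (2.120) and (2.121), the final expression of $\varepsilon^-$ is:
$$ \begin{aligned} \varepsilon^-_{even} &= \min_{x,y} \left( lsb_{IC} \cdot \sum_{i=3}^{n_{eq}-2} IC_i - S_{AP} - \frac{lsb}{2} + lsb_{IC} \right) \\ &= -C_{EVEN} - C_{ODD} - \frac{lsb}{2} + lsb_{IC} \\ &= lsb \cdot \left[ -\frac{1}{18}\, 2^{-h} \left( 7 + 3 n_{eq} + 2^{1-n_{eq}} \right) + 2^{-h-1} - \frac{1}{2} \right] \end{aligned} \tag{2.122} $$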
If $n_{eq}$ is odd the partial product matrix that minimizes the punctual error is slightly modified, according to the conjecture. The configuration for the case n = 10 and h = 1 is shown in Fig. 2.19. There is always a number of partial products uncorrelated with the central terms of the IC that are set to 1. Further, the central terms of the IC are 0 and fix to 0 one row or one diagonal of the AP. The only difference is that for the odd case the most convenient choice is the one that, for the last two terms of the central part of the IC, always fixes the diagonal to zero. The calculation of the minimum error is very similar. The result for the odd $n_{eq}$ case is:
$$ \varepsilon^-_{odd} = lsb \cdot \left[ -\frac{1}{18}\, 2^{-h} \left( 7 + 3 n_{eq} + 2^{1-n_{eq}} \right) + 2^{-h-1} - \frac{1}{2} + \frac{1}{3}\, 2^{-h} \left( \frac{1}{8} + \frac{4}{3}\, 2^{1-n_{eq}} \right) \right] \tag{2.123} $$
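The two expressions of the minimum value of the punctual error can be grouped into a single one:
$$\varepsilon^{-}=lsb\cdot\left[-\frac{1}{18}\,2^{-h}\left(7+3n_{eq}+2^{1-n_{eq}}\right)+2^{-h-1}-\frac{1}{2}+\frac{1}{6}\,2^{-h}\left(\frac{1}{8}+\frac{4}{3}\,2^{1-n_{eq}}\right)-(-1)^{n_{eq}}\,\frac{1}{6}\,2^{-h}\left(\frac{1}{8}+\frac{4}{3}\,2^{1-n_{eq}}\right)\right] \tag{2.124}$$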
Maximum value of the punctual error, ε+
The calculation of the maximum value of the punctual error is similar to that of the previous section. The details are not reported here. The result is:
$$\varepsilon^{+}=lsb\cdot\left[\frac{1}{18}\left(1+3n_{eq}+(-1)^{n_{eq}}\cdot 8\cdot 2^{-n_{eq}}\right)\right] \tag{2.125}$$
Maximum absolute error
Since the absolute value of (2.125) is always lower than that of (2.124), we can state that the maximum absolute error, ε_max, for the LMS1b truncated multiplier is:
$$\varepsilon_{max}=lsb\cdot\left[-\frac{1}{18}\,2^{-h}\left(7+3n_{eq}+2^{1-n_{eq}}\right)+2^{-h-1}-\frac{1}{2}+\frac{1}{6}\,2^{-h}\left(\frac{1}{8}+\frac{4}{3}\,2^{1-n_{eq}}\right)-(-1)^{n_{eq}}\,\frac{1}{6}\,2^{-h}\left(\frac{1}{8}+\frac{4}{3}\,2^{1-n_{eq}}\right)\right] \tag{2.126}$$
Note that the maximum absolute error is not provided by a trivial configuration of the inputs, such as x = y = [1, 1, …, 1] or x = y = [0, 0, …, 0]. As an example, for n = 10 the worst-case x, y configuration is given by x = y = [0, 1, 1, 0, 1, 0, 1, 0, 1, 1], while for n = 9 it is given by x = [0, 1, 1, 0, 1, 0, 0, 1, 1], y = [0, 1, 1, 1, 0, 1, 0, 1, 1]. In fact, the worst-case error is provided by an x, y configuration that yields an AP whose sum is badly approximated by the compensation function. Since the compensation function is, on average, a good approximation of the sum of the AP, such a configuration is also the least probable to occur.

Figure 2.20: Probability distribution of the punctual error for an n = 12 bit LMS1b truncated multiplier, h = 0 (x axis: punctual error [lsb]; y axis: probability; maximum error 2.39 lsb, with probability of about 10^−7). The maximum absolute error occurs only twice among the 2^24 ≈ 16.8·10^6 different inputs.

Please note that the maximum absolute error of a truncated multiplier is only reached for a very small number of input values. As an example, Fig. 2.20 shows the probability distribution of the punctual error for a 12 bit LMS1b truncated multiplier with h = 0. As can be seen, the maximum absolute error has a very small probability (about 10^−7), whereas the error values having high probability are concentrated around zero. Hence non-exhaustive simulations are very unlikely to provide the actual maximum absolute error value, or even an error value near the maximum. It is also worthwhile to note that, for each fixed h, the probability of the maximum absolute error decreases with neq, hence with n. In fact, the number of possible input configurations that give the maximum absolute error is two if neq is even or four if neq is odd (these configurations are related to the fact that the same truncation error can be obtained by choosing the terms x_{h+1} and y_{h+1} in two different ways). Hence, for increasing neq, the probability of the maximum absolute error decreases as 2^(−2neq+1) and 2^(−2neq+2), respectively.
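The closed forms above can be evaluated numerically. The following is a minimal sketch, not part of the original derivation: the function names are illustrative, all values are expressed in units of lsb, and (2.125) is coded exactly as printed. For neq = 12 and h = 0 it gives a worst-case error of about 2.39 lsb and a probability of 2^−23 ≈ 1.2·10^−7, in line with the values reported for Fig. 2.20.

```python
from fractions import Fraction


def eps_minus(neq: int, h: int) -> Fraction:
    """Minimum punctual error of Eq. (2.124), in units of lsb."""
    scale = Fraction(1, 2 ** h)              # 2^-h
    tail = Fraction(2) ** (1 - neq)          # 2^(1-neq)
    base = (-Fraction(1, 18) * scale * (7 + 3 * neq + tail)
            + scale / 2 - Fraction(1, 2))
    odd_term = Fraction(1, 6) * scale * (Fraction(1, 8) + Fraction(4, 3) * tail)
    # The last two terms cancel for even neq and add up for odd neq.
    return base + odd_term - (-1) ** neq * odd_term


def eps_plus(neq: int) -> Fraction:
    """Maximum punctual error of Eq. (2.125) as printed, in units of lsb."""
    return Fraction(1, 18) * (1 + 3 * neq + (-1) ** neq * 8 * Fraction(2) ** (-neq))


def worst_case_probability(neq: int) -> Fraction:
    """Probability of the maximum absolute error: 2^(-2neq+1) or 2^(-2neq+2)."""
    return Fraction(2) ** (-2 * neq + (1 if neq % 2 == 0 else 2))


if __name__ == "__main__":
    print(float(eps_minus(12, 0)), float(eps_plus(12)))
    print(float(worst_case_probability(12)))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the even and odd branches can be compared without rounding noise.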
Figure 2.21: Partial products matrix for a signed multiplier with n = 8, h = 2 (neq = n − h = 6). The matrix is divided into MSP, LSPmajor, IC and LSPminor (truncated); the rounding constant Kr and the sign-extension constant are added to the matrix.
2.8 Signed and Mixed-Operand Multipliers
Fig. 2.21 shows the partial products matrix of a signed multiplier (see Sec. 1.1.2). As can be seen, when h > 0 the LSPminor part is equal to the LSPminor part of an unsigned multiplier. As a consequence, the expressions for the optimal compensation function (2.54), the intrinsic error (2.57), the linear compensation function (2.60), the quantized compensation function (2.70) and the total error statistics still hold. Therefore, in the remainder of this section we will focus on the case h = 0, where two elements of the IC are complemented.
For a signed multiplier the IC vector is defined as follows:
$$IC=[\beta_1,\beta_2,\ldots,\beta_n]=[\overline{x_1y_n},\;x_2y_{n-1},\;\ldots,\;x_{n-1}y_2,\;\overline{x_ny_1}] \tag{2.127}$$
Note that this definition of the IC assumes that the sign-extension prevention constant, shown in Fig. 2.21, is added to the partial products matrix.
Let us indicate as f_sq(IC) the linear compensation function with quantized coefficients qs_i:
$$f_{sq}(IC)=2^{-n-1}\left[\sum_{i=1}^{n}qs_i\,\beta_i\right] \tag{2.128}$$
The following rule can be employed to transform the optimal coefficients q_i and rounding constant K_round, given in the previous sections for the unsigned multiplier, into the constant Ks_round and coefficients qs_i of the optimal signed multiplier (with h = 0):
$$\begin{aligned}
qs_i &= 2-q_i, && i=1,n\\
qs_i &= q_i, && i\in\{2,\ldots,n-1\}\\
Ks_{round} &= K_{round}+lsb_{IC}\cdot(q_1+q_n-2)
\end{aligned} \tag{2.129}$$
Note that (2.129) is a valid transformation, since it maps quantized coefficients into quantized coefficients. By using (2.129), the total error statistics (the mean μ_total, the variance σ²_total and the mean square error ε²_total) remain the same for unsigned and signed multipliers. The error formulas and the analysis given in the previous sections therefore remain valid for signed multipliers as well.
The same reasoning can also be applied to mixed operand multipliers (signed×unsigned, see Sec. 1.1.3). Also in this case, for h > 0, the LSPminor corresponds to the LSPminor of an unsigned multiplier, so that coefficients and error statistics are the same as for the unsigned multiplier.
For h = 0 the IC is defined as:
$$IC=[\beta_1,\beta_2,\ldots,\beta_n]=[\overline{x_1y_n},\;x_2y_{n-1},\;\ldots,\;x_{n-1}y_2,\;x_ny_1] \tag{2.130}$$
For h = 0, the optimal quantized coefficients qm_i and rounding constant Km_round of the mixed operand multiplier can be obtained from the coefficients q_i and rounding constant K_round of the unsigned multiplier by using the following transformation:
$$\begin{aligned}
qm_1 &= 2-q_1\\
qm_i &= q_i, && i\in\{2,\ldots,n\}\\
Km_{round} &= K_{round}+lsb_{IC}\cdot(q_1-1)
\end{aligned} \tag{2.131}$$
Again, by using the transformation (2.131), the mixed operand multiplier achieves the same total error statistics as the unsigned multiplier.
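The two coefficient transformations for h = 0 are simple enough to be written down directly. The sketch below is only illustrative: the function names and the list-based representation of the coefficients are assumptions, and the rounding constant is handled as a plain number expressed in the same unit as lsbIC.

```python
from fractions import Fraction


def to_signed(q, k_round, lsb_ic):
    """Unsigned -> signed transformation for h = 0, as in (2.129)."""
    qs = list(q)
    qs[0] = 2 - q[0]          # first IC coefficient
    qs[-1] = 2 - q[-1]        # last IC coefficient
    return qs, k_round + lsb_ic * (q[0] + q[-1] - 2)


def to_mixed(q, k_round, lsb_ic):
    """Unsigned -> signed x unsigned transformation for h = 0, as in (2.131)."""
    qm = list(q)
    qm[0] = 2 - q[0]          # only the first IC coefficient changes
    return qm, k_round + lsb_ic * (q[0] - 1)


if __name__ == "__main__":
    half = Fraction(1, 2)
    q_unsigned = [Fraction(1), 3 * half, 3 * half, 3 * half, 3 * half]
    print(to_signed(q_unsigned, Fraction(0), Fraction(1)))
    print(to_mixed(q_unsigned, Fraction(0), Fraction(1)))
```

Coefficients equal to 1 are left unchanged by both rules, while a coefficient of 1.5 at either end of the IC becomes 0.5.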
2.9 Conclusions
In this chapter a theoretical analysis of truncated multipliers with variable correction has been presented.
A first result shows that the optimal compensation function, which minimizes the mean square error of a truncated multiplier, is a quadratic form of the partial products of the IC. A lower bound for the error of any truncated multiplier designed using a variable-correction method has also been evaluated in closed form.
The optimal compensation function, being a quadratic form, cannot be efficiently implemented in hardware. Therefore, the performance achievable by using a linear compensation function, better suited for hardware implementation, has been investigated. It is shown that the additional error component due to the use of a linear compensation function is negligible, which makes a linear compensation function the best choice from a practical point of view.
The effect of coefficient quantization is also treated, by providing the quantized optimal coefficients. Finally, the mean square error and the maximum absolute error of the LMS truncated multiplier have been computed in closed form. This is one of the most important characteristics of the proposed LMS multiplier. The LMS multiplier is the only architecture that can be designed, for every bit width, using an analytical approach that allows a priori knowledge of the error committed. When no analytical approach is feasible, the error can only be computed using slow exhaustive simulations, which are possible only for low values of n.
The results given in this chapter are detailed for unsigned, signed and signed×unsigned multipliers. All results can be applied for any value of n and h. In the next chapter the implementation details of the proposed truncated multipliers will be discussed.
Chapter 3
VLSI implementation and Performances
The practical implementation of the quantized linear compensation function proposed in Ch. 2 is discussed in this chapter. The performances of the new truncated multipliers are extensively compared with those of previously proposed circuits. More than 100 truncated multipliers, with 8 different architectures, have been synthesized in a 0.18 μm technology, with wordlengths ranging from 8 to 32 bits. Area, power and accuracy of the multipliers are investigated and compared. Experimental performances on a 0.18 μm test chip are also presented. In the following, Sec. 3.1 describes the hardware implementation of truncated multipliers with quantized linear compensation function. The performances of the truncated multipliers are compared with previously proposed architectures in Sec. 3.2. Sec. 3.3 reports the experimental results obtained on the test chip implemented in 0.18 μm CMOS technology.
3.1 Truncated Multipliers Implementations
As will be shown in the following, the truncated multipliers based on a linear quantized compensation function proposed in Ch. 2 are efficiently implemented by summing a Partial Products Matrix (PPM). The summation is performed by first using a carry-save reduction tree, followed by a fast carry-propagate adder (CPA) [22]. The Three Dimensional Minimization (TDM) method [5, 7, 6] will be exploited for the carry-save reduction tree; this technique minimizes the overall delay by compensating the delay asymmetries of full and half adders. Among the TDM techniques proposed in [5, 7, 6], the approaches of [5] and [7] will be employed in this section, because they provide performances very close to the optimal TDM of [6] while providing a much simpler implementation. Recalling what has been said in Sec. 1.2.4, the TDM of [7] is composed only of full adders, providing the minimum number of terms to be summed in the final CPA. The approach of [7] provides fast multipliers even if, in some rare cases, it results in ripple structures that compromise the delay of the whole multiplier. The TDM of [5] uses both full and half adders, reducing the matrix to a height of two for every column. This avoids the problem of the carry ripple structures but requires a slightly more complex CPA.

Table 3.1: Optimal quantized coefficients (Ch. 2). The symbol REM(x, y) indicates the remainder of the integer division x/y.

Signed multiplier
  LMS1b truncated multipliers (lsbq = lsbIC), any n, h values:
    q_1 = q_2 = q_{neq−1} = q_{neq} = 1
    q_3 = q_4 = … = q_{neq−2} = 2
    Kr = lsb/2
  LMS2b truncated multipliers (lsbq = ½ lsbIC), neq > 3, h = 0:
    q_1 = 1.0, q_{neq} = 1.0   if REM(neq, 4) = 0
    q_1 = 1.0, q_{neq} = 0.5   if REM(neq, 4) = 1
    q_1 = 0.5, q_{neq} = 0.5   if REM(neq, 4) = 2
    q_1 = 0.5, q_{neq} = 0.5   if REM(neq, 4) = 3
    q_2 = q_3 = … = q_{neq−1} = 1.5
    Kr = lsb/2 + lsbq · (⌊neq/4⌋ + 3 − 2q_1 − 2q_{neq})
  LMS2b truncated multipliers (lsbq = ½ lsbIC), neq > 3, h > 0:
    q_1 = 1.0, q_{neq} = 1.0   if REM(neq, 4) = 0
    q_1 = 1.0, q_{neq} = 1.5   if REM(neq, 4) = 1
    q_1 = 1.5, q_{neq} = 1.5   if REM(neq, 4) = 2
    q_1 = 1.5, q_{neq} = 1.5   if REM(neq, 4) = 3
    q_2 = q_3 = … = q_{neq−1} = 1.5
    Kr = lsb/2 + lsbq · (⌊neq/4⌋ − 1)

Unsigned multiplier
  LMS1b truncated multipliers (lsbq = lsbIC), any n, h values:
    q_1 = q_2 = q_{neq−1} = q_{neq} = 1
    q_3 = q_4 = … = q_{neq−2} = 2
    Kr = lsb/2
  LMS2b truncated multipliers (lsbq = ½ lsbIC), neq > 3, any h:
    q_1 = 1.0, q_{neq} = 1.0   if REM(neq, 4) = 0
    q_1 = 1.0, q_{neq} = 1.5   if REM(neq, 4) = 1
    q_1 = 1.5, q_{neq} = 1.5   if REM(neq, 4) = 2
    q_1 = 1.5, q_{neq} = 1.5   if REM(neq, 4) = 3
    q_2 = q_3 = … = q_{neq−1} = 1.5
    Kr = lsb/2 + lsbq · (⌊neq/4⌋ − 1)

Figure 3.1: Signed truncated multiplier with n = 8, h = 2. In this example lsbq = ½ lsbIC.
Tab. 3.1 summarizes the optimal coefficients q_i and the constant K_round computed in Ch. 2. As can be seen, the quantized coefficients in the case lsbq = lsbIC are equal to either 1 or 2, while for the case lsbq = ½ lsbIC the majority of the coefficients are equal to 1.5.
3.1.1 Implementation of LMS1b truncated multipliers
Let us consider, as an example, a signed multiplier with n = 8 and h = 2 (as in Fig. 3.1). By using the data in Tab. 3.1, the compensation function in this case can be written as:
$$f_q(IC)=lsb_{IC}\cdot[\beta_1+\beta_2+2\beta_3+2\beta_4+\beta_5+\beta_6] \tag{3.1}$$
where β_i denotes the i-th partial product of the IC. Thus, to sum fq(IC) we simply put the terms β_1, β_2, β_5, β_6 on the matrix column with weight lsbIC, while the terms β_3 and β_4 are aligned on the next matrix column (on the left), having a weight 2·lsbIC. The corresponding PPM is shown in Fig. 3.2. In general, the PPM of the LMS1b truncated multiplier leaves the two extreme couples of IC partial products (β_1, β_2 and β_{neq−1}, β_{neq}) on the IC column, while the remaining partial products of the IC, having q_i = 2, are placed in the column at the left of the IC. The summation of Fig. 3.2 is a “conventional” PPM that can be efficiently implemented using the TDM technique followed by a carry-propagate adder.

Figure 3.2: Implementation of the proposed (signed) LMS1b truncated multiplier, lsbq = lsbIC, n = 8 and h = 2.
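The placement rule for the LMS1b matrix can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the IC terms are indexed from 1 to neq as in the text, and the restriction neq ≥ 4 is an assumption made here so that the two extreme couples do not overlap.

```python
def lms1b_columns(neq):
    """Split the IC indices (1-based) between the IC column and the column on its left."""
    if neq < 4:
        raise ValueError("this sketch assumes neq >= 4")
    ends = {1, 2, neq - 1, neq}                     # extreme couples, q = 1
    ic_column = sorted(ends)
    left_column = [i for i in range(1, neq + 1) if i not in ends]   # q = 2
    return ic_column, left_column


def lms1b_fq(beta, lsb_ic=1):
    """Evaluate the LMS1b compensation function for the IC bits in `beta`."""
    ic_column, left_column = lms1b_columns(len(beta))
    return lsb_ic * (sum(beta[i - 1] for i in ic_column)
                     + 2 * sum(beta[i - 1] for i in left_column))


if __name__ == "__main__":
    print(lms1b_columns(6))          # IC column and left column for n = 8, h = 2
    print(lms1b_fq([1, 1, 1, 1, 1, 1]))
```

For neq = 6 the split reproduces the arrangement of the example with n = 8 and h = 2: four terms stay on the IC column and two move one column to the left.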
Tab. 3.2 presents the implementation results of the LMS1b truncated multiplier implemented in a 0.18 μm technology. For each row, the results refer to the TDM reduction method ([5] or [7]) giving the better performance. The LMS1b truncated multiplier provides a significant area and power improvement with respect to the full-rounded multiplier, with an equal or slightly lower delay. As an example, for the case n = 24, h = 1 the LMS1b truncated multiplier provides an area and power reduction of about 47% and 44%, respectively, while also being slightly faster than the full-rounded multiplier. The good performance is due to the considered compensation function, which is well suited to TDM implementation without the need for additional logic. As will be shown in Sec. 3.2, this is not true for other truncated multipliers proposed in the literature (see, for instance, [13]), which may require slow and power-hungry ripple-like logic to calculate the compensation function.

Table 3.2: Comparison of the performance of the proposed truncated (signed) multipliers. Circuits are implemented in TSMC 0.18 μm technology.

n, h    Multiplier               Reduction tree ref.   #FA   #HA   Area [10^3 μm²]   Power [μW/MHz]   Delay [ns]
8, -    full rounded             [5]                    37     5    4.51              8.24             2.10
8, 2    LMS1b truncated          [5]                    31     1    3.38              6.67             2.10
8, 2    LMS2b truncated          [5]                    35     1    3.62              7.12             2.10
8, 2    LMS2b truncated, IM1     [5]                    32     2    3.73              6.97             2.10
8, 2    LMS2b truncated, IM2     [7], [5]               33     1    3.62              7.12             2.10
16, -   full rounded             [5]                   197    13    19.6              37.9             2.75
16, 1   LMS1b truncated          [5]                   119     1    10.7              22.2             2.75
16, 1   LMS2b truncated          [7]                   135     0    11.8              24.5             2.75
16, 1   LMS2b truncated, IM1     [5]                   125     2    11.8              24.2             2.75
16, 1   LMS2b truncated, IM2     [7], [5]              122     2    11.3              23.8             2.75
24, -   full rounded             [5]                   485    21    45.2              87.1             3.13
24, 1   LMS1b truncated          [5]                   275     1    23.8              49.1             3.09
24, 1   LMS2b truncated          [7]                   299     0    26.0              51.9             3.10
24, 1   LMS2b truncated, IM1     [5]                   285     2    25.1              51.2             3.10
24, 1   LMS2b truncated, IM2     [7], [5]              279     2    24.1              50.4             3.10

Figure 3.3: Straightforward implementation of the proposed signed LMS2b truncated multiplier, lsbq = ½ lsbIC, n = 8 and h = 2.
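The relative figures quoted for Tab. 3.2 can be recomputed from the tabulated area and power values. The snippet below is a small sketch for that purpose: the dictionary layout and the function name are choices made here, while the numbers are copied from Tab. 3.2.

```python
# (area [10^3 um^2], power [uW/MHz]) copied from Tab. 3.2, keyed by (architecture, n).
TABLE_3_2 = {
    ("full rounded", 8): (4.51, 8.24),
    ("LMS1b", 8): (3.38, 6.67),
    ("LMS2b", 8): (3.62, 7.12),
    ("full rounded", 16): (19.6, 37.9),
    ("LMS1b", 16): (10.7, 22.2),
    ("LMS2b", 16): (11.8, 24.5),
    ("full rounded", 24): (45.2, 87.1),
    ("LMS1b", 24): (23.8, 49.1),
    ("LMS2b", 24): (26.0, 51.9),
}


def percent_change(arch, ref, n):
    """Percentage change of (area, power) of `arch` with respect to `ref`."""
    area, power = TABLE_3_2[(arch, n)]
    ref_area, ref_power = TABLE_3_2[(ref, n)]
    return (100.0 * (area - ref_area) / ref_area,
            100.0 * (power - ref_power) / ref_power)


if __name__ == "__main__":
    for n in (8, 16, 24):
        print(n, percent_change("LMS1b", "full rounded", n),
              percent_change("LMS2b", "LMS1b", n))
```

Negative values indicate a saving with respect to the reference architecture.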
3.1.2 Implementations of LMS2b truncated multipliers
Let us consider the same signed multiplier example with n = 8 and h = 2 (neq = n − h = 6). Since REM(neq, 4) = 2, according to Tab. 3.1 every q_i is equal to 1.5. Therefore, the compensation function can be written as:
$$f_q(IC)=\frac{1}{2}\,lsb_{IC}\cdot[3\beta_1+3\beta_2+3\beta_3+3\beta_4+3\beta_5+3\beta_6] \tag{3.2}$$
Thus, to sum fq(IC) we can replicate all the terms β_1, β_2, …, β_6 on the two matrix columns with weight lsbIC and lsbIC/2. This gives the PPM shown in Fig. 3.3. As can be seen in Fig. 3.3, each partial product of the IC is inserted twice in the multiplier matrix.
The implementation is slightly different for other configurations, for example n = 8, h = 3 (neq = n − h = 5), for which REM(neq, 4) = 1. In this case, from Tab. 3.1, the compensation function can be written as:
$$f_q(IC)=lsb_{IC}\cdot\beta_1+\frac{1}{2}\,lsb_{IC}\cdot[3\beta_2+3\beta_3+3\beta_4+3\beta_5] \tag{3.3}$$
Thus, to sum fq(IC), β_1 can be put on the matrix column with weight lsbIC, while the remaining terms β_2, …, β_5 are replicated on the two matrix columns with weight lsbIC and lsbIC/2. This gives the PPM shown in Fig. 3.4.
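The two cases above follow directly from the REM(neq, 4) rule of Tab. 3.1. As a minimal sketch, covering only the unsigned multiplier and the signed multiplier with h > 0, and with hypothetical function names, the coefficients and the number of matrix entries required by the straightforward implementation can be generated as follows.

```python
from fractions import Fraction

THREE_HALVES = Fraction(3, 2)

# (q_1, q_neq) as a function of REM(neq, 4), from Tab. 3.1
# (unsigned multiplier, or signed multiplier with h > 0).
END_COEFFICIENTS = {
    0: (Fraction(1), Fraction(1)),
    1: (Fraction(1), THREE_HALVES),
    2: (THREE_HALVES, THREE_HALVES),
    3: (THREE_HALVES, THREE_HALVES),
}


def lms2b_coefficients(neq):
    """Quantized LMS2b coefficients q_1 ... q_neq (requires neq > 3)."""
    if neq <= 3:
        raise ValueError("Tab. 3.1 covers only neq > 3")
    q_first, q_last = END_COEFFICIENTS[neq % 4]
    return [q_first] + [THREE_HALVES] * (neq - 2) + [q_last]


def straightforward_entries(q):
    """Matrix entries needed when each term with q = 1.5 is simply duplicated."""
    return sum(2 if qi == THREE_HALVES else 1 for qi in q)


if __name__ == "__main__":
    for neq in (5, 6):
        q = lms2b_coefficients(neq)
        print(neq, [float(qi) for qi in q], straightforward_entries(q))
```

For neq = 6 all six terms are duplicated, giving twelve entries; for neq = 5 the first term appears once and the other four twice.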
In general, from Tab. 3.1, in LMS2b truncated multipliers the partial products of the IC can be divided into two subsets. The terms with q_i = 0.5 or q_i = 1 are inserted only once in the truncated multiplier matrix (in the column at the right of the IC or in the IC column, respectively). The terms with q_i = 1.5 are duplicated and inserted both in the IC column and in the column at the right of the IC. The presence of partial products that are inserted twice in the matrix results in an increase in hardware complexity with respect to the LMS1b truncated multiplier. Moreover, the resulting PPM is not a conventional one, since some partial products are duplicated. It is therefore expected that standard TDM implementation strategies will not provide optimal performances for this circuit, since they do not exploit the redundancy of the PPM.
This is confirmed in Tab. 3.2, which shows the performances of the LMS2b truncated multiplier implemented using the standard TDM of [5] and [7] to reduce the PP matrix and a final Kogge-Stone adder [22] to compute the result. As can be seen, using the standard TDM results in a noticeable increase in circuit complexity and power dissipation with respect to LMS1b truncated multipliers. For the case n = 8, h = 2, the matrix of the LMS1b truncated multiplier (shown in Fig. 3.2) is reduced with 31 full adders (FA) and 1 half adder (HA). The matrix of the LMS2b truncated multiplier (shown in Fig. 3.3) needs 35 FA and 1 HA, with a 7% increase of both area and power dissipation with respect to the LMS1b truncated multiplier. An even worse behavior is observed for larger n values (16 and 24 bit multipliers in Tab. 3.2), where the area and power increase of LMS2b truncated multipliers with respect to LMS1b truncated multipliers is up to 10%.
In the following, two approaches will be proposed, Implementation Method 1 (IM1) and Implementation Method 2 (IM2), with which the complexity of the LMS2b truncated multipliers is reduced by exploiting the redundancy of the PPM.
Figure 3.4: Straightforward implementation of the proposed signed LMS2b truncated multiplier, lsbq = ½ lsbIC, n = 8 and h = 3.

Implementation Method 1 Let us name β'_i the PPs of the IC with q_i = 1.5. In the implementation approach IM1, the PPs β'_i are grouped in couples Γ_i = (β'_{2i−1}, β'_{2i}). The contribution of the couple Γ_i to the total summation is:
$$\gamma_i=\frac{1}{2}\,lsb_{IC}\cdot\left[3\beta'_{2i-1}+3\beta'_{2i}\right] \tag{3.4}$$
In the above equation the partial products β'_{2i−1} and β'_{2i} are either 0 or 1. Therefore γ_i ∈ {0, 1.5·lsbIC, 3·lsbIC} and can be expressed as follows:
$$\gamma_i=\frac{1}{2}\,lsb_{IC}\cdot\left(\eta_{2i}\cdot 2^{2}+\eta_{1i}\cdot 2^{1}+\eta_{0i}\cdot 2^{0}\right) \tag{3.5}$$
where η_{2i}, η_{1i} and η_{0i} are three binary values. By using (3.5), three terms are introduced in the PPM to account for γ_i, whereas four partial products are needed in the simple approach that replicates the partial products on the two matrix columns with weight lsbIC and lsbIC/2. The relation between (β'_{2i−1}, β'_{2i}) and (η_{2i}, η_{1i}, η_{0i}) is shown in Tab. 3.3 and is given by:
$$\begin{aligned}
\eta_{2i} &= \beta'_{2i-1}\wedge\beta'_{2i}\\
\eta_{1i} &= \beta'_{2i-1}\vee\beta'_{2i}\\
\eta_{0i} &= \beta'_{2i-1}\oplus\beta'_{2i}
\end{aligned} \tag{3.6}$$
Table 3.3: Truth table describing the Boolean relationship between (β'_{2i−1}, β'_{2i}) and (η_{2i}, η_{1i}, η_{0i}).

β'_{2i−1}   β'_{2i}   γ_i/lsbIC   η_{2i}   η_{1i}   η_{0i}
0           0         0           0        0        0
0           1         1.5         0        1        1
1           0         1.5         0        1        1
1           1         3           1        1        0
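The truth table can be checked exhaustively. The following sketch (with illustrative function names, and with the contribution of a couple expressed in units of lsbIC) computes the three generated bits with AND, OR and XOR and verifies that their weighted sum equals 1.5 times the number of ones in the couple.

```python
from fractions import Fraction
from itertools import product


def im1_terms(b_first, b_second):
    """Generated bits for one couple of IC partial products with q = 1.5."""
    eta2 = b_first & b_second      # AND
    eta1 = b_first | b_second      # OR
    eta0 = b_first ^ b_second      # XOR
    return eta2, eta1, eta0


def couple_contribution(eta2, eta1, eta0):
    """Contribution of the couple in units of lsbIC: (4*eta2 + 2*eta1 + eta0) / 2."""
    return Fraction(1, 2) * (4 * eta2 + 2 * eta1 + eta0)


if __name__ == "__main__":
    for b_first, b_second in product((0, 1), repeat=2):
        etas = im1_terms(b_first, b_second)
        print(b_first, b_second, float(couple_contribution(*etas)), etas)
```

Each couple therefore contributes three entries to the matrix instead of the four required when both partial products are duplicated.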
Fig. 3.5 shows the implementation of the LMS2b truncated multiplier with n = 8, h = 2 using the IM1 approach. The six terms of the IC (x3y8, x4y7, …, x8y3) are grouped in three couples (Γ_1, Γ_2, Γ_3) that feed the η_ij generation logic which, using (3.6), computes the η_ij terms. These terms are inserted in the PP matrix of the multiplier in order to compute fq(IC). Comparing the IM1 circuit of Fig. 3.5 with the standard TDM of Fig. 3.3, it is evident that the number of terms needed to compute fq(IC) reduces from 12 to 9, at the expense of a small complexity increase due to the η_ij generation logic.
A slightly different implementation is required in other cases. Fig. 3.6 shows the IM1 circuit for n = 8, h = 3. Since neq = 5, β_1 = x4y8 has q_1 = 1 and is directly inserted in the PP matrix. The remaining four terms of the IC (x5y7, …, x8y4) have q_i = 1.5. These PPs form two couples Γ_1, Γ_2, which produce (through the η_ij generation logic) six terms in the PP matrix.
Tab. 3.2 shows the implementation data of the IM1 LMS2b truncated multipliers. Note that the LMS2b truncated multiplier of Fig. 3.5 is reduced by a carry-save tree with 32 FA and 2 HA, that is, with only one FA and one HA more than the circuit of Fig. 3.2. Comparing the standard TDM and IM1 in Tab. 3.2, it is evident that the latter provides better performance as n increases. As an example, for n = 24 the IM1 multiplier is 4% smaller than the conventional implementation and only 5% larger than the LMS1b truncated multiplier. For n = 8 the IM1 yields a slightly larger circuit with more power consumption with respect to the standard TDM. In this case, the reduction in carry-save tree complexity is nullified by the additional logic required to compute the η_ij terms.
Figure 3.5: Implementation Method 1 (IM1) of the proposed (signed) LMS2b truncated multiplier, lsbq = ½ lsbIC, n = 8, h = 2. The couples Γ_1, Γ_2, Γ_3 feed the η_ij generation logic.
Implementation Method 2 This improved implementation strategy for LMS2b truncated multipliers fully exploits the hardware sharing between the columns of the PPM.
Fig. 3.7 presents the IM2 approach for the LMS2b truncated multiplier with n = 8, h = 2. In this case, as shown in (3.2), all six q_i are equal to 1.5. An auxiliary carry-save reduction tree sums the six IC terms. In the small example of Fig. 3.7, the auxiliary carry-save reduction tree is composed of two full adders, whose carry and sum outputs are indicated as c1, c0 and s1, s0 respectively. Let us indicate as S and C the results of the auxiliary carry-save tree: S = s1·2^1 + s0·2^0, C = c1·2^1 + c0·2^0. We have:
$$S+C=\beta_1+\beta_2+\beta_3+\beta_4+\beta_5+\beta_6 \tag{3.7}$$
Therefore, the value of fq(IC) in (3.2) can be expressed as:
$$f_q(IC)=\frac{1}{2}\,lsb_{IC}\cdot(S+C)+lsb_{IC}\cdot(S+C) \tag{3.8}$$
According to (3.8), the bits s_i and c_i are inserted twice in the final PPM of the truncated multiplier. Starting from the column with weight ½ lsbIC, each bit s_i and c_i is inserted once in the ith column and once in the (i + 1)th column of the matrix. This is shown in Fig. 3.7 for the example n = 8, h = 2. As can be seen, the IM2 approach in this example introduces 8 terms in the matrix to compute fq(IC). This compares favorably with the 9 terms needed by the IM1 approach (see Fig. 3.5) and the 12 terms needed by the standard TDM.

Figure 3.6: Implementation Method 1 (IM1) of the proposed (signed) LMS2b truncated multiplier, lsbq = ½ lsbIC, n = 8, h = 3. The couples Γ_1, Γ_2 feed the η_ij generation logic.

Figure 3.7: Implementation Method 2 (IM2) of the proposed (signed) LMS2b truncated multiplier, lsbq = ½ lsbIC, n = 8, h = 2. The IC terms feed the auxiliary carry-save tree, whose outputs s_i and c_i are inserted in the matrix.

Figure 3.8: Implementation Method 2 (IM2) of the proposed (signed) LMS2b truncated multiplier, lsbq = ½ lsbIC, n = 8, h = 3.
Fig. 3.8 shows the IM2 technique applied to the LMS2b truncated multiplier with n = 8, h = 3. In this case the four terms with q_i = 1.5 (see (3.3)) are compressed by the auxiliary carry-save reduction tree, which includes a single full adder. Note that c1 = 0 and is therefore not introduced in Fig. 3.8. The term β_1 = x4y8 has q_1 = 1 and is hence directly inserted in the matrix column with weight lsbIC of the truncated multiplier. In the case of Fig. 3.8, a total of seven terms are included in the PPM to compute fq(IC).
In general, the wordlengths at the output of the auxiliary carry-save tree (i.e. the wordlengths of the S and C signals) increase logarithmically with neq. As a consequence, the number of terms to be included in the PPM to compute fq(IC) also increases logarithmically with neq. On the contrary, the number of η_ij terms needed by the IM1 approach increases linearly with neq. Therefore the IM2 approach becomes more and more effective as neq increases. It is worthwhile to note that TDM is able to compensate input delay asymmetries. Therefore, the delays of the auxiliary carry-save reduction tree are partly absorbed by the TDM used to reduce the multiplier PPM, with a small delay penalty.
Implementation data of the IM2 LMS2b truncated multipliers are shown in Tab. 3.2. As can be observed, the best performance is obtained by using [7] for the auxiliary carry-save reduction tree, while the TDM approach of [5] always turned out to be the best technique to compress the final matrix. The IM2 LMS2b truncated multipliers provide the best performances in every considered case. For n = 16, h = 1, the standard TDM LMS2b truncated multiplier results in a 10% area increase with respect to the LMS1b truncated multiplier. Using the IM2 method, the area increase with respect to the LMS1b truncated multiplier is only 5%.
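The behaviour of the auxiliary carry-save tree can be illustrated with a small behavioural model. The sketch below is an assumption-laden illustration, not the circuit synthesized in this chapter: it compresses equal-weight bits with full adders only, in a simple greedy order, until two words S and C remain, and the naming of the individual bits of S and C need not coincide with that of Fig. 3.7. It checks that S + C equals the number of ones at the input, as in (3.7), and counts the matrix entries required when every bit of S and C is inserted twice, as in (3.8).

```python
from itertools import product


def carry_save_reduce(bits):
    """Compress equal-weight bits into two words (S, C) using full adders.

    Returns (S, C, n_bits), where n_bits is the number of single-bit signals
    forming S and C; it depends only on how many input bits there are.
    """
    columns = [list(bits)]                  # columns[w] holds bits of weight 2**w
    w = 0
    while w < len(columns):
        col = columns[w]
        while len(col) > 2:                 # full adder: 3 bits in, sum and carry out
            a, b, c = col.pop(), col.pop(), col.pop()
            col.append(a ^ b ^ c)           # sum stays in the same column
            if w + 1 == len(columns):
                columns.append([])
            columns[w + 1].append((a & b) | (a & c) | (b & c))   # carry moves left
        w += 1
    S = sum(col[0] << i for i, col in enumerate(columns) if len(col) > 0)
    C = sum(col[1] << i for i, col in enumerate(columns) if len(col) > 1)
    return S, C, sum(len(col) for col in columns)


def im2_entries(n_terms):
    """Matrix entries for n_terms IC products with q = 1.5 (each S, C bit twice)."""
    return 2 * carry_save_reduce([0] * n_terms)[2]


def im1_entries(n_terms):
    """Matrix entries with IM1: three generated bits per couple (n_terms even)."""
    return 3 * (n_terms // 2)


if __name__ == "__main__":
    for n_terms in (4, 6, 16, 32):
        print(n_terms, 2 * n_terms, im1_entries(n_terms), im2_entries(n_terms))
```

For six terms the model yields 8 entries against 9 for IM1 and 12 for plain duplication; for four terms it yields 6, which together with the single term having q = 1 gives the seven entries of the n = 8, h = 3 example.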
3.2 Truncated Multipliers Performances
This section is devoted to the comparison between the LMS1b and LMS2b truncated multipliers proposed in this chapter and previously proposed architectures. Following the results of the previous section, LMS2b truncated multipliers will be implemented by using the IM2 approach. In the comparison with literature results, the attention is restricted to variable-correction truncated multipliers [10, 11, 12, 13, 14, 15, 16, 17, 19], which provide much lower errors with respect to constant-correction circuits (see Ch. 1).
It is worth highlighting that not all the above cited papers discuss the implementation of truncated multipliers with h > 0 or deal with both signed and unsigned cases. A review of the variable-correction approaches found in the literature from this point of view is reported in Tab. 3.4.
In [13] Jou et al. consider both signed and unsigned multipliers. Truncated
multipliers with w additional columns in the matrix are also considered in
[13]; however, the number of output bits of the multiplier in [13] is then
not n but n + w. In this thesis (see Fig. 3.1), as in the rest of the
literature, the number of output bits is fixed¹ (equal to n) and h is a
design parameter that can be used to increase the accuracy without changing
the weight of the output lsb. Therefore the architecture of [13] can be
compared with the proposed multipliers only when h = 0. However, as shown in
Ch. 2, the results obtained for the unsigned multiplier with h = 0 can be
extended rather
¹ It is worth highlighting that the proposed approaches can easily be extended
to cases where the number of output bits is larger than n, still keeping h as
an accuracy parameter.
80 CHAPTER 3. VLSI IMPLEMENTATION AND PERFORMANCES
Table 3.4: Previously proposed truncated multipliers considered in the
comparison.

Truncated multiplier               Signed, h = 0    Unsigned, h = 0   Signed/Unsigned, h > 0
Jou et al. [13]                    original paper   original paper    extended
Van et al. [15]-[16]               original paper   extended          original paper
Curticapean et al. [14]            -                original paper    extended
Kuang et al. [18]                  -                original paper    extended
Liao et al. [17]                   original paper   -                 -
Swartzlander et al. [11, 10, 23]   -                original paper    original paper

original paper = proposed in the original paper; extended = can be extended
to this case.
straightforwardly to cover both signed and unsigned multipliers with h > 0.
Thus, as highlighted in Tab. 3.4, the approach of [13] has been extended to the
case h > 0 by introducing a suitable rounding constant, which has been
evaluated in order to minimize the total mean error (as discussed in Ch. 2, this
also minimizes the total mean square error).
The same considerations apply to the multipliers proposed by Kuang et
al. [18]. In this case, however, the original paper considers only unsigned
multipliers. Therefore the technique of [18] cannot be extended to signed
multipliers with h = 0. Please note that [18] proposes two approaches. Only
the first approach (Type I), which provides a lower mean square error, will
be considered.
The truncated multipliers proposed by Van et al. [15]-[16] include signed
multipliers for h ≥ 0. The error compensation function proposed for h > 0
can however be extended to the unsigned multiplier with h = 0 (similarly to
what has been done for [13], [18]).
Curticapean et al. [14] consider only the unsigned multiplier with h = 0.
The architecture proposed in [14] can hence be extended (again by adding a
suitable rounding constant) to signed and unsigned multipliers with h > 0.
The variable-correction multipliers proposed by Swartzlander et al. [11,
10, 23] consider the unsigned case for h ≥ 0. This technique, therefore, cannot
be extended to signed multipliers with h = 0.
Finally, the approach of Liao et al. [17] was originally developed for the
signed h = 0 case and cannot be extended to any other case, so it will not be
considered in the following.
In the following, the multipliers are compared in terms of mean square
error (Sec. 3.2.1), maximum absolute error (Sec. 3.2.2), and area,
power and propagation delay (Sec. 3.2.3).
3.2.1 Mean Square Error Performances
Tables 3.5-3.6 compare the mean square error obtained by the proposed LMS1b
and LMS2b truncated multipliers with that of the state-of-the-art
architectures. The first two columns in those tables report the error achieved
by the proposed truncated multipliers, as obtained with the analytical
formulas presented in Ch. 2. The remaining columns in Tabs. 3.5-3.6 show,
instead, the mean square error obtained by exhaustive simulation. Since the
simulation time increases as $O(2^{2n})$, an unreasonable amount of CPU time
is needed when n is larger than 16. For this reason, in Tabs. 3.5-3.6,
simulation data are not available for n = 32 and for n = 64.
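To make the cost of exhaustive simulation concrete, the following Python sketch enumerates all input pairs of a small multiplier and accumulates the squared error. It is only an illustration: it simulates a full-rounded unsigned multiplier (the reference circuit whose error is lsb^2/12), not the LMS compensation functions, and the function name `rounded_multiplier_mse` is hypothetical.

```python
def rounded_multiplier_mse(n):
    """Exhaustively simulate an n x n unsigned multiplier whose 2n-bit
    product is rounded to its n most significant bits.

    Returns (mean square error in lsb^2, number of simulated input pairs).
    """
    lsb = 1 << n          # weight of the output lsb, in units of the product lsb
    half = lsb >> 1       # rounding constant: half an output lsb
    total = 0
    pairs = 0
    for a in range(lsb):
        for b in range(lsb):
            exact = a * b
            rounded = ((exact + half) >> n) << n
            err = rounded - exact          # error in units of the product lsb
            total += err * err
            pairs += 1
    # Normalise the squared error to the output lsb.
    return total / (pairs * lsb * lsb), pairs


mse, pairs = rounded_multiplier_mse(8)
print(f"n = 8: {pairs} input pairs, MSE = {mse:.4f} lsb^2")
```

The number of simulated pairs is 2^(2n), so each additional input bit multiplies the run time by four; this is the growth that makes n = 32 and n = 64 impractical.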
The data in Tabs. 3.5-3.6 highlight a very good agreement between theory
and simulations for the proposed multipliers. The theoretical values of the
LMS1b truncated multipliers are exact for h = 0; the remaining data are
almost exact.
LMS1b truncated multipliers exhibit good error performance. When h = 0,
only the multiplier of Kuang et al. [18] obtains a slightly lower error
than the LMS1b truncated multiplier. On the other hand, the multiplier of
Kuang et al. [18] yields a larger error than the LMS1b truncated multiplier for
h > 0. In a few cases, when h = 1, the LMS1b truncated multiplier also presents
a slightly larger error than the multipliers of Van et al. [15]-[16] and
Jou et al. [13]. This happens for n = 8, 10, 12, 14. Note that the LMS1b
truncated multiplier presents a slightly larger error only in these few cases;
moreover, the better results of [18], [15]-[16] and [13] are obtained through
exhaustive search, so those approaches do not provide a single methodology
that is always valid.
The LMS2b truncated multiplier provides the lowest error for any n and h value.
As an example, in the case n = 16, h = 1 the LMS2b truncated multiplier
results in a reduction of mean square error of 13%, 13%, 39%, 31%, and 16%
Table 3.5: Theoretical and simulated mean square errors of proposed and
previously proposed truncated multipliers (S = signed multiplier;
U = unsigned multiplier), n = 8, 10, 12 and h = 0, 1, 2, 3. All values are
$\varepsilon^2_{total}$ in lsb^2. Entries written as S|U give the signed and
unsigned values separately; all other entries hold for both cases (S/U).

          Theoretical     Simulated
 n   h    LMS1b  LMS2b    LMS1b  LMS2b  Jou    Van        Curticapean  Kuang I  Swartzlander
                                        [13]   [15]-[16]  [14]         [18]     [11, 10, 23]
 8   0    0.216  0.190    0.216  0.190  0.598  0.263|-    -|0.234      -|0.213  -|0.263
 8   1    0.118  0.105    0.118  0.105  0.104  0.104      0.187        0.159    0.123
 8   2    0.091  0.087    0.090  0.087  0.087  0.087      0.108        0.101    0.092
 8   3    0.085  0.084    0.085  0.084  0.084  0.084      0.089        0.085    0.085
10   0    0.258  0.227    0.258  0.227  0.700  0.305|-    -|0.285      -|0.255  -|0.305
10   1    0.130  0.114    0.129  0.114  0.119  0.119      0.196        0.170    0.134
10   2    0.094  0.090    0.094  0.090  0.090  0.090      0.110        0.104    0.095
10   3    0.086  0.085    0.086  0.085  0.085  0.085      0.090        0.088    0.086
12   0    0.300  0.257    0.300  0.257  0.780  0.347|-    -|0.333      -|0.296  -|0.347
12   1    0.140  0.121    0.140  0.121  0.134  0.134      0.206        0.180    0.144
12   2    0.096  0.091    0.096  0.091  0.094  0.094      0.113        0.106    0.097
12   3    0.086  0.085    0.086  0.085  0.086  0.086      0.090        0.089    0.086
Table 3.6: Theoretical and simulated mean square errors of proposed and
previously proposed truncated multipliers (S = signed multiplier;
U = unsigned multiplier), n = 14, 16, 32, 64 and h = 0, 1, 2, 3. All values
are $\varepsilon^2_{total}$ in lsb^2. Entries written as S|U give the signed
and unsigned values separately; all other entries hold for both cases (S/U).

          Theoretical     Simulated
 n   h    LMS1b  LMS2b    LMS1b  LMS2b  Jou    Van        Curticapean  Kuang I  Swartzlander
                                        [13]   [15]-[16]  [14]         [18]     [11, 10, 23]
14   0    0.342  0.290    0.342  0.290  0.847  0.389|-    -|0.380      -|0.338  -|0.389
14   1    0.151  0.131    0.151  0.131  0.147  0.147      0.216        0.191    0.155
14   2    0.099  0.095    0.099  0.095  0.098  0.098      0.115        0.109    0.100
14   3    0.087  0.086    0.087  0.086  0.087  0.087      0.091        0.089    0.087
16   0    0.384  0.326    0.384  0.326  0.905  0.431|-    -|0.425      -|0.380  -|0.431
16   1    0.161  0.139    0.160  0.139  0.160  0.160      0.227        0.201    0.165
16   2    0.101  0.096    0.102  0.096  0.101  0.101      0.118        0.112    0.102
16   3    0.088  0.086    0.088  0.086  0.087  0.087      0.092        0.090    0.088
32   0    0.717  0.600    N.A.   N.A.   N.A.   N.A.|-     -|N.A.       -|N.A.   -|N.A.
32   2    0.122  0.113    N.A.   N.A.   N.A.   N.A.       N.A.         N.A.     N.A.
32   4    0.086  0.085    N.A.   N.A.   N.A.   N.A.       N.A.         N.A.     N.A.
64   0    1.384  1.141    N.A.   N.A.   N.A.   N.A.|-     -|N.A.       -|N.A.   -|N.A.
64   2    0.164  0.147    N.A.   N.A.   N.A.   N.A.       N.A.         N.A.     N.A.
64   4    0.088  0.087    N.A.   N.A.   N.A.   N.A.       N.A.         N.A.     N.A.
with respect to [13], [15]-[16], [14], [18], and [11, 10, 23], respectively. Even
larger improvements are achieved for h = 0.
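The percentages quoted for n = 16, h = 1 can be recomputed from the simulated values of Tab. 3.6. The short Python sketch below does so; the dictionary layout and variable names are illustrative only, while the numbers are copied from the table.

```python
# Simulated mean square errors (lsb^2) for n = 16, h = 1, from Tab. 3.6.
lms2b = 0.139
others = {
    "Jou [13]": 0.160,
    "Van [15]-[16]": 0.160,
    "Curticapean [14]": 0.227,
    "Kuang I [18]": 0.201,
    "Swartzlander [11, 10, 23]": 0.165,
}

# Relative MSE reduction of LMS2b with respect to each architecture, in percent.
reductions = {name: round(100 * (1 - lms2b / mse)) for name, mse in others.items()}

for name, pct in reductions.items():
    print(f"{name}: {pct}%")
```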
The availability of analytical formulas allows us to evaluate the mean
square error of the proposed multipliers also for large n values, while this is
not possible for previously proposed architectures. Data in Tabs. 3.5-3.6 show
that, by using h = 4, the proposed truncated multipliers yield an error very
close to that of full-rounded multipliers ($lsb^2/12 = 0.083 \cdot lsb^2$),
also for very large n values (e.g. n = 64).
3.2.2 Maximum Absolute Error Performances
The proposed analytical formula for the maximum error is compared against
numerical simulations of the LMS1b truncated multipliers with different bit
widths. Furthermore, the maximum absolute error of the LMS1b truncated
multiplier is compared with that of state-of-the-art truncated multipliers
and of the LMS2b multiplier. The results are shown in Table 3.7.
The comparison of the first two columns shows that the analytical formula,
eq. (2.126), is in perfect agreement with the simulated results, demonstrating
the correctness of the analytical formula devised in the previous chapter.
The comparison highlights that the LMS1b truncated multiplier provides good
maximum absolute error performance compared with the multipliers presented in
the literature. For h = 0 the only reference that outperforms the LMS1b
truncated multiplier is [14]. For h ≠ 0, [13] and [15]-[16] also give a
slightly lower maximum absolute error.
The important point, however, is that the LMS1b multiplier is the only
multiplier with an analytical expression of the maximum absolute error. Hence,
when the number of bits is higher than 16, the only available results are those
obtained with the analytical formula derived in Ch. 2. In fact, obtaining the
same results with a brute-force simulation of the truncated multiplier is
impractical due to the huge amount of CPU time needed. As recalled in Ch. 2,
the maximum absolute error of a truncated multiplier is only reached
for a very small number of input values; hence non-exhaustive simulations are
very unlikely to provide the actual maximum absolute error value, or even an
error value close to the maximum.
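The rarity of the worst case can be illustrated with a toy experiment. The Python sketch below is not one of the multipliers compared in this chapter: it assumes the simplest possible truncated multiplier, which drops the n least significant columns of the partial product matrix without any compensation, so that the error is just the value of the dropped columns. All function names are hypothetical.

```python
import random


def lsp_error(a, b, n):
    """Error (in output lsb) of an n x n unsigned multiplier that simply
    drops the n least significant columns of the partial product matrix."""
    dropped = 0
    for i in range(n):
        if (a >> i) & 1:
            # Row i contributes b << i; only the bits of b below position
            # n - i fall into the discarded columns.
            dropped += (b & ((1 << (n - i)) - 1)) << i
    return dropped / (1 << n)


def exhaustive_max(n):
    """Maximum error over all input pairs and the number of pairs reaching it."""
    best, hits = 0.0, 0
    for a in range(1 << n):
        for b in range(1 << n):
            e = lsp_error(a, b, n)
            if e > best:
                best, hits = e, 1
            elif e == best:
                hits += 1
    return best, hits


def sampled_max(n, samples, seed=0):
    """Largest error seen over a random subset of the input pairs."""
    rng = random.Random(seed)
    return max(
        lsp_error(rng.randrange(1 << n), rng.randrange(1 << n), n)
        for _ in range(samples)
    )


best, hits = exhaustive_max(8)
print(f"exhaustive: max error = {best:.4f} lsb, reached by {hits} input pair(s)")
print(f"1000 random samples: max error = {sampled_max(8, 1000):.4f} lsb")
```

For this toy circuit the maximum is reached only when both operands have all bits set, i.e. for one pair out of 2^(2n), which is why a random subset of the inputs is very unlikely to hit it.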
Table 3.7: Theoretical and simulated maximum absolute error of proposed and
previously proposed truncated multipliers (S = signed multiplier;
U = unsigned multiplier), n = 10, 12, 14, 20, 24, 32, 64 and h = 0, 1, 2, 3.
All values are $\varepsilon_{max}$ in lsb. Entries written as S|U give the
signed and unsigned values separately; all other entries hold for both
cases (S/U).

          Theoretical  Simulated
 n   h    LMS1b        LMS1b  LMS2b  Jou    Van        Curticapean  Kuang I  Swartzlander
                                     [13]   [15]-[16]  [14]         [18]     [11, 10, 23]
10   0    2.055        2.055  2.000  2.346  2.055|-    -|1.888      -|2.641  -|2.055
10   1    1.172        1.172  1.125  1.090  1.090      1.361        1.277    1.194
10   2    0.805        0.805  0.750  0.752  0.752      0.888        0.847    0.805
10   3    0.625        0.625  0.581  0.606  0.606      0.673        0.652    0.631
12   0    2.388        2.388  2.500  2.680  2.388|-    -|2.222      -|2.972  -|1.361
12   1    1.340        1.340  1.313  1.257  1.257      1.527        1.444    1.361
12   2    0.888        0.888  0.875  0.836  0.836      0.972        0.930    0.888
12   3    0.668        0.668  0.656  0.647  0.647      0.715        0.694    0.673
14   0    2.722        2.722  3.000  3.013  2.722|-    -|2.555      -|3.305  -|2.722
14   1    1.506        1.506  1.625  1.423  1.423      1.694        1.611    1.527
14   2    0.972        0.972  1.000  0.920  0.920      1.055        1.013    0.972
14   3    0.710        0.710  0.703  0.689  0.689      0.756        0.736    0.715
20   0    3.72         N.A.   N.A.   N.A.   N.A.|-     -|N.A.       -|N.A.   -|N.A.
24   0    4.39         N.A.   N.A.   N.A.   N.A.|-     -|N.A.       -|N.A.   -|N.A.
32   0    5.72         N.A.   N.A.   N.A.   N.A.|-     -|N.A.       -|N.A.   -|N.A.
64   0    11.06        N.A.   N.A.   N.A.   N.A.|-     -|N.A.       -|N.A.   -|N.A.
3.2.3 Electrical Performances (Area Occupation, Power Dissipation,
Propagation Delay)
As observed before, the proposed truncated multipliers are well suited for
state-of-the-art tree-based implementations (including a TDM carry-save tree
and a final carry-prefix propagate adder) [24, 5, 7], since the circuit output
can be written as a summation matrix of partial products. This remains true for
many previously proposed truncated multipliers ([18], [11, 10, 23]). The
papers of Jou et al. [13], Van et al. [15]-[16] and Curticapean et al. [14], on
the other hand, consider array multiplier implementations which, because of
their ripple architecture, are very slow and power hungry. These approaches can
still be implemented with a carry-save TDM reduction method. However, it is
worth highlighting that the terms added to the PPM (related to the computation
of the compensation function) are still obtained with a ripple AND/OR network,
which increases circuit power dissipation and propagation delay.
In order to have a fair comparison among all approaches, the multipliers
previously proposed in the literature have been implemented with a carry-save
TDM reduction tree followed by a fast carry-propagate adder. All the circuits
have been implemented in a 0.18 µm technology. Figs. 3.9-3.10 show the
results of the implementations obtained by varying the delay constraint during
synthesis, in the case n = 16, h = 1.
The reported data show that the proposed LMS1b truncated multiplier and the
Kuang et al. multiplier exhibit very similar performance in terms of both area
and power. A second set of multipliers with very similar performance is
composed of the LMS2b multiplier and the Van et al. and Swartzlander et al.
multipliers. The multipliers proposed by Jou et al. and Curticapean et al. are
less effective than the other architectures. As observed before, this is due to
the ripple structure used in the network which implements the compensation
function. This not only limits the minimum delay for which the multiplier can
be synthesized but also results in considerable glitching, which increases the
power dissipation. Since, in addition, the Jou et al. and Curticapean et al.
multipliers do not exhibit good error performance (see Tabs. 3.5-3.6), these
two architectures will not be considered any further in the following.
It is interesting to observe that the Van et al. multiplier also includes a
ripple error compensation network. In this case, however, this network reduces
to a single multiple-input gate² which can be implemented with a tree
structure. This explains why the Van et al. multiplier shows better
performance than the Jou et al. and Curticapean et al. approaches.
As shown in Figs. 3.9-3.10, the multiplier of Kuang et al. has electrical
performance similar to the LMS1b truncated multiplier. However, the accuracy
of the Kuang et al. multiplier is comparable to the LMS1b truncated multiplier
only for h = 0, while for h > 0 the Kuang et al. multiplier yields a larger
error. As a consequence, the multiplier of Kuang et al. has also been excluded
from the subsequent analysis.
² Please note that the error compensation network proposed by Van et al.
reduces to a single multiple-input gate only for h > 0. For h = 0 the network
of the Van et al. multiplier is similar to the networks of the Jou et al. and
Curticapean et al. multipliers.

[Figure 3.9: plot of area (10^3 µm^2) versus delay (ns, 2.40 to 3.40) for the
LMS1b, LMS2b, Jou, Van, Curticapean, Kuang I and Swartzlander multipliers.]
Figure 3.9: Signed multiplier performances for n = 16 and h = 1 by
varying the delay constraint imposed during circuit synthesis (0.18 µm
technology).

[Figure 3.10: plot of power (µW/MHz) versus delay (ns, 2.40 to 3.40) for the
LMS1b, LMS2b, Jou, Van, Curticapean, Kuang I and Swartzlander multipliers.]
Figure 3.10: Signed multiplier performances for n = 16 and h = 1 by
varying the delay constraint imposed during circuit synthesis (0.18 µm
technology).
3.2.4 Area versus Accuracy Trade-off
Previous sections analyzed separately the accuracy and the complexity (that is,
the silicon area) of the different multipliers. In a real application the
accuracy is generally constrained by system-level considerations, and the goal
is to achieve the lowest complexity for a required accuracy.
Parameter h is a trade-off parameter between accuracy and complexity that
can be used to tune the multiplier accuracy to the requirement imposed by
the specific application. A comparison of the different truncated multipliers
considering this trade-off between accuracy and complexity is therefore
necessary. To that purpose, LMS1b and LMS2b truncated multipliers and the
multipliers proposed by Van et al., Liao et al. and Swartzlander et al. have
been simulated and implemented. Figs. 3.11-3.14 show the results on an
area vs. mean square error diagram for n = 8, 12, 16 and 24 bit and
h = 0, 1, 2, 3. Each diagram also shows, for each value of h, the lowest
theoretical mean square error achievable with a variable-correction truncated
multiplier (the term $\varepsilon^2_{low\_bound}$ discussed in Ch. 2).
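The use of h as a tuning knob can be sketched as a simple lookup: given an accuracy requirement, pick the smallest h that satisfies it, since smaller h means fewer columns and hence less hardware. The Python sketch below uses the theoretical mean square errors for n = 16 from Tab. 3.6; the function `smallest_h` and the example thresholds are illustrative assumptions, not part of the original design flow.

```python
# Theoretical mean square error (lsb^2) for n = 16, copied from Tab. 3.6.
MSE_N16 = {
    "LMS1b": {0: 0.384, 1: 0.161, 2: 0.101, 3: 0.088},
    "LMS2b": {0: 0.326, 1: 0.139, 2: 0.096, 3: 0.086},
}


def smallest_h(arch, max_mse):
    """Return the smallest h whose MSE meets the requirement, or None."""
    for h in sorted(MSE_N16[arch]):
        if MSE_N16[arch][h] <= max_mse:
            return h
    return None


for arch in MSE_N16:
    print(arch, "needs h =", smallest_h(arch, 0.15), "for MSE <= 0.15 lsb^2")
```

With a requirement of 0.15 lsb^2 the lookup returns h = 2 for LMS1b and h = 1 for LMS2b; which of the two is cheaper in silicon then has to be read from the area data, which this sketch does not model.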
The data of Figs. 3.11-3.14 show that for h = 0 the proposed
LMS1b and LMS2b truncated multipliers exhibit the best trade-off between
area and accuracy. The LMS1b truncated multiplier results in the lowest area,
while the LMS2b truncated multiplier achieves an error very close to the
theoretical limit. The multiplier of Van et al. [15]-[16] appears ineffective
for h = 0 because of its large error in comparison to the other multipliers.
The data for h = 1 in Figs. 3.11-3.14 show a picture similar to the case
h = 0, with the difference that now the multiplier of Van et al. is more
effective and competes with the LMS2b truncated multiplier.
For h ≥ 2 the error of all multipliers is very close to the theoretical limit.
In turn, for the considered n values, the theoretical limit is very close to
the error of the full-rounded multiplier ($1/12 \cdot lsb^2$). In these cases
the best solution is to select the multiplier with the lowest area, which is
generally the proposed
[Figure 3.11: plot of $\varepsilon^2_{total}$ (lsb^2) versus area
(10^3 µm^2), n = 8, with point groups labelled h = 0, h = 1, h = 2; series:
LMS1b, LMS2b, Van, Swartz., $\varepsilon^2_{low\_bound}$.]
Figure 3.11: Trade off between Area Occupation and Error (mean
square total error - $\varepsilon^2_{total}$) for different Signed Truncated
Multipliers, n = 8. The mean square error is obtained through simulation.

[Figure 3.12: plot of $\varepsilon^2_{total}$ (lsb^2) versus area
(10^3 µm^2), n = 12, with point groups labelled h = 0, h = 1, h = 2, h = 3;
series: LMS1b, LMS2b, Van, Swartz., $\varepsilon^2_{low\_bound}$.]
Figure 3.12: Trade off between Area Occupation and Error (mean
square total error - $\varepsilon^2_{total}$) for different Signed Truncated
Multipliers, n = 12. The mean square error is obtained through simulation.

[Figure 3.13: plot of $\varepsilon^2_{total}$ (lsb^2) versus area
(10^3 µm^2), n = 16, with point groups labelled h = 0, h = 1, h = 2, h = 3;
series: LMS1b, LMS2b, Van, Swartz., $\varepsilon^2_{low\_bound}$.]
Figure 3.13: Trade off between Area Occupation and Error (mean
square total error - $\varepsilon^2_{total}$) for different Signed Truncated
Multipliers, n = 16. The mean square error is obtained through simulation.
LMS1b. Note that for n = 24 the mean square error cannot be simulated.
Therefore the data in Figs. 3.11-3.14 for n = 24 include only the LMS1b
and LMS2b truncated multipliers.
3.3 Experimental Verification
The performance of the proposed truncated multiplier has been experimentally
verified on a test chip [25]. The chip implements a truncated multiplier
designed with the LMS1b technique for n = 16, h = 0 and, as a comparison,
a full-rounded multiplier. The technology used is 0.18 µm
with 6 levels of metal. The photograph of the test chip is shown in Fig. 3.15.
The experimental performance is summarized in Tab. 3.8. The developed
LMS1b truncated multiplier halves the power dissipation and reduces the area
occupation by about 44%. In addition, the multiplier designed
with the proposed technique is 13% faster than the full-rounded multiplier.
These performance improvements (relative to the full-rounded multiplier) are
consistent with the data in Tab. 3.2. On the other hand, the actual
experimental values reported in Tab. 3.8 are significantly worse than the data
in Tab. 3.2.
[Figure 3.14: plot of $\varepsilon^2_{total}$ (lsb^2) versus area
(10^3 µm^2), n = 24, with point groups labelled h = 0, h = 1, h = 2, h = 3;
series: LMS1b, LMS2b, Van, Swartz., $\varepsilon^2_{low\_bound}$.]
Figure 3.14: Trade off between Area Occupation and Error (mean
square total error - $\varepsilon^2_{total}$) for different Signed Truncated
Multipliers, n = 24. The mean square error is obtained by the theoretical
formulas of Ch. 2.
Table 3.8: Experimental performance of the signed 16 bit multipliers
realized in the test chip shown in Fig. 3.15.

Multiplier       fclk [MHz]   Area [10^3 µm^2]   Power [µW/MHz]
LMS1b (h = 0)    385          19.1               36.6
full-rounded     340          33.7               73.2
This is due to two reasons. Firstly, an un-optimized standard-cell library (using
only Manhattan geometry in the layout) was employed to fabricate the chip,
while a more effective vendor-supplied library was considered for the simula-
tions reported in Tab. 3.2. Secondly, data in Tab. 3.2 are synthesis results that
do not take into account the overhead due to place and route.
[Figure 3.15: image with labelled blocks TRUNCATED MULTIPLIER (1b) and
FULL-ROUNDED MULTIPLIER.]
Figure 3.15: Trade off between Area Occupation and Error (mean
square total error - $\varepsilon^2_{total}$) for different Signed Truncated
Multipliers. For n = 16 the mean square error is simulated. For n = 24 the
mean square error is obtained by the theoretical formulas of Ch. 2.
3.4 Conclusions
The practical implementation of truncated multipliers with the quantized
linear compensation function proposed in Ch. 2 has been investigated in this
chapter. Efficient implementations, based on a carry-save reduction tree and a
final parallel-prefix carry-propagate adder, have been considered.
It has been shown that, when the coefficients are quantized with one bit
(LMS1b truncated multiplier), the implementation is straightforward. When
two bits are employed to quantize the coefficients of the compensation function
(LMS2b truncated multipliers), the optimal implementation is more challenging.
The best approach uses a small auxiliary carry-save tree to minimize
hardware.
The performance of the truncated multipliers developed in this thesis has
been extensively compared with previously proposed circuits. More than 100
truncated multipliers, with 8 different architectures, have been synthesized,
with wordlengths ranging from 8 to 32 bits. Area, power and accuracy of
the multipliers have been investigated and compared. The analysis shows that
the proposed LMS1b and LMS2b truncated multipliers very often represent the
best trade-off between complexity and accuracy.
Experimental results, obtained from a test chip in 0.18 µm technology,
have also been presented.
Chapter 4
LMS Truncated Squarer
Squaring is a fundamental arithmetic operation in many digital
signal processing applications such as adaptive filtering, vector
quantization, image compression and pattern recognition. A squarer is a
multiplier in which the two operands are equal. The resulting circuit can be
made simpler than a conventional multiplier, since the squarer actually has a
single input.
Using the symmetry of the Partial Products arising from the equality of the
two inputs, many techniques have been proposed in order to reduce the area
occupation of the squarer while still giving an exact output on 2n bits. These
techniques will be presented in Sec. 4.1. A further improvement can be obtained
in applications which only need a less precise n-bit output (see Sec. 4.2),
since it is possible to use a truncated squarer, that is, a squarer with n-bit
input and n-bit output. In this case the techniques described in the previous
chapter can be extended. In Sec. 4.4 the optimal compensation function, which
minimizes the mean square error, is presented together with its linear
approximation (LMS truncated squarer - Linear Minimum mean Square error
truncated squarer). In Sec. 4.5 the results of the simulation and hardware
implementation will be presented and compared with state-of-the-art squarers.
4.1 Folded squarer
The folding technique described in [26] uses the symmetry of the partial
product matrix of the squarer to achieve a 50% reduction in the number of
partial products compared with a standard multiplier. The partial product
rearrangement technique in [27, 28] can be used to reduce the depth of the
Partial Products (PPs) matrix. Refs. [29, 30] combine the Booth technique with
the folding technique and provide advanced squarer circuits with the design of
custom sub-circuits. In this section a description of the folding technique is
given.

[Figure 4.1: partial products matrices of the unsigned squarer for n = 8;
the columns p1 ... p16 carry the weights 2^-1 ... 2^-16.]
Figure 4.1: Partial Products Matrix of unsigned squarer, n even
(n = 8). The bold elements form the antidiagonal. a) Full original
matrix of the squarer. b) Reduced matrix after applying (4.3).
4.1.1 Unsigned folded squarer
If X is an n-bit unsigned fractional number, and $x_i$ are the bits that
represent X, we have:
    $X = \sum_{i=1}^{n} x_i \cdot 2^{-i}$    (4.1)
The square of X is:
    $X^2 = \left( \sum_{i=1}^{n} x_i \cdot 2^{-i} \right)^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j \cdot 2^{-i-j}$    (4.2)
Let us consider the case in which n is even. Eq. (4.2) is a weighted sum of
one-bit partial products ($x_i x_j$), shown graphically in Fig. 4.1(a). As
[Figure 4.2: partial products matrices of the unsigned squarer during
folding for n = 8; the columns p1 ... p16 carry the weights 2^-1 ... 2^-16.]
Figure 4.2: Partial Products Matrix of unsigned squarer during the
folding process, n even (n = 8). a) Partial Product Matrix to which
(4.4) can be applied; in each column the circled elements will be
grouped. b) Final folded matrix.
$x_i x_j = x_j x_i$, the matrix is symmetric with respect to the antidiagonal
[31], shown in Fig. 4.1(a) in boldface. This means that the parts of the
matrix below and above the antidiagonal can be combined using the
identities:
    $x_i x_j + x_j x_i = 2 \cdot x_i x_j$
    $x_i x_i = x_i$    (4.3)
The result is the PPs matrix shown in Fig. 4.1(b) (n even). Finally, the matrix
can be further reduced following the technique presented in [32, 27]. Using
the identity:
    $x_i + x_i x_{i+1} = 2 x_i x_{i+1} + x_i \bar{x}_{i+1}$    (4.4)
in each column the circled elements of Fig. 4.2(a) can be grouped, obtaining
the final folded matrix shown in Fig. 4.2(b). The result of the squaring using
this modified matrix is given by:
    $X^2 = x_n 2^{-2n} + \sum_{i=2}^{n} \left( x_i x_{i-1} 2^{-2i+3} + \bar{x}_i x_{i-1} 2^{-2i+2} \right) + \sum_{i=1}^{n-2} \sum_{j=i+2}^{n} x_i x_j \cdot 2^{-i-j+1}$    (4.5)
[Figure 4.3: folded partial products matrix of the unsigned squarer for
n = 9; the columns p1 ... p18 carry the weights 2^-1 ... 2^-18.]
Figure 4.3: Final folded matrix for unsigned squarer, n odd (n = 9).
This expression is valid for every n value. Note that, as described in [28],
the form of the matrix is slightly different when n is odd (see Fig. 4.3). For
every n value, the maximum height of the squarer matrix is
$\lceil n/2 \rceil$ and the number of bits in the squaring matrix is
$(n^2 + n)/2$. If the squaring is performed using a standard multiplier
X × X, the maximum height of the multiplication matrix is n and the number of
bits in the matrix is $n^2$.
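The folding can be checked by brute force for small n. The Python sketch below builds the list of terms obtained by applying the identities (4.3) and (4.4) to (4.2), with every doubled product written one column to the left, and then verifies three things: the sum equals X^2 for every input, the matrix contains (n^2 + n)/2 bits, and the tallest column has ceil(n/2) bits. The term encoding and the function names are illustrative assumptions made for this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import ceil


def folded_terms(n):
    """Terms of the folded matrix as (column, kind, i, j); column c has
    weight 2**-c. kind 'x' is the bit x_i, 'and' is x_i*x_j and 'andn' is
    x_i*not(x_j)."""
    terms = [(2 * n, "x", n, n)]                      # x_n has no partner
    for i in range(2, n + 1):                         # pairs folded by (4.4)
        terms.append((2 * i - 3, "and", i - 1, i))
        terms.append((2 * i - 2, "andn", i - 1, i))
    for i in range(1, n - 1):                         # remaining doubled products
        for j in range(i + 2, n + 1):
            terms.append((i + j - 1, "and", i, j))
    return terms


def evaluate(terms, x):
    """Sum the folded matrix for the bit assignment x (dict index -> bit)."""
    total = Fraction(0)
    for col, kind, i, j in terms:
        if kind == "x":
            bit = x[i]
        elif kind == "and":
            bit = x[i] & x[j]
        else:
            bit = x[i] & (1 - x[j])
        total += Fraction(bit, 2 ** col)
    return total


def folding_is_exact(n):
    """True if the folded matrix equals X**2 for all 2**n inputs."""
    terms = folded_terms(n)
    for bits in product((0, 1), repeat=n):
        x = dict(enumerate(bits, start=1))
        value = sum(Fraction(b, 2 ** i) for i, b in x.items())
        if evaluate(terms, x) != value * value:
            return False
    return True


for n in (8, 9):
    terms = folded_terms(n)
    height = max(Counter(col for col, *_ in terms).values())
    print(n, folding_is_exact(n), len(terms), height)
```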
4.1.2 Signed folded squarer
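If X is an n-bit two's complement fractional number:
    $X = -x_1 \cdot 2^{-1} + \sum_{i=2}^{n} x_i \cdot 2^{-i}$    (4.6)
the square of X is:
    $X^2 = 2^{-1} + 2^{-n} + x_1 \cdot 2^{-2} + \sum_{i=2}^{n} \sum_{j=2}^{n} x_i x_j \cdot 2^{-i-j} + \sum_{i=2}^{n} \left( \overline{x_i x_1} + \overline{x_1 x_i} \right) \cdot 2^{-i-1}$    (4.7)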
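The signed expression can be checked numerically in the same way. The Python sketch below assumes that the products in the last sum of (4.7) are complemented (the usual sign-extension-free form for two's complement operands) and that the result is taken on 2n fractional bits, so that any carry of weight 1 is discarded. Under these assumptions the right-hand side exceeds X^2 by exactly 1 for every input, i.e. the two agree modulo 1. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def signed_value(bits):
    """Two's complement fractional value, bits = (x_1, ..., x_n) as in (4.6)."""
    n = len(bits)
    x = dict(enumerate(bits, start=1))
    return -Fraction(x[1], 2) + sum(Fraction(x[i], 2 ** i) for i in range(2, n + 1))


def signed_square_formula(bits):
    """Right-hand side of (4.7), with complemented x_1 x_i products."""
    n = len(bits)
    x = dict(enumerate(bits, start=1))
    total = Fraction(1, 2) + Fraction(1, 2 ** n) + Fraction(x[1], 4)
    for i in range(2, n + 1):
        for j in range(2, n + 1):
            total += Fraction(x[i] * x[j], 2 ** (i + j))
        complemented = 1 - x[i] * x[1]
        # The two symmetric complemented products are added together.
        total += Fraction(2 * complemented, 2 ** (i + 1))
    return total


def carry_out_is_one(n):
    """True if formula - X**2 == 1 for all 2**n inputs."""
    return all(
        signed_square_formula(bits) - signed_value(bits) ** 2 == 1
        for bits in product((0, 1), repeat=n)
    )


print(all(carry_out_is_one(n) for n in range(1, 9)))
```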
Let us consider the case in which n is even. Following the same reasoning
as for the unsigned squarer and applying (4.3), one obtains the matrix of PPs
shown in Fig. 4.4(a) (n even). The matrix can be further reduced using the
technique presented in [32, 27]. The identity (4.4) can be applied to the black
circled elements in Fig. 4.4(a), as done for the unsigned squarer. Furthermore
the identity
    $1 + x_i + x_i x_{i+1} = \overline{x_i \bar{x}_{i+1}} + 2 x_i$    (4.8)
[Figure 4.4: partial products matrices of the signed squarer during folding
for n = 8; the columns p1 ... p16 carry the weights 2^-1 ... 2^-16.]
Figure 4.4: Partial Products Matrix of signed squarer during the
folding process, n even (n = 8). a) Partial Product Matrix to which (4.8)
can be applied; in each column the circled elements will be grouped.
b) Final folded matrix.
is used to modify the bit 1 and the terms $a_{\frac{n}{2}}$ and
$a_{\frac{n}{2}-1}$ in the nth column (red circles in Fig. 4.4(a)), and the
identities:
    $x_i + \overline{x_i x_{i+1}} = 2 x_i \bar{x}_{i+1} + \overline{x_i \bar{x}_{i+1}}$    (4.9)
    $1 + x_i \bar{x}_{i+1} = 2 x_i \bar{x}_{i+1} + \overline{x_i \bar{x}_{i+1}}$
are used to modify the blue circled elements of the two most significant
columns, obtaining the final folded matrix shown in Fig. 4.4(b). If n is odd an
identity different from (4.8) is used, and the final matrix is slightly
different. This has been described by Wires et al. [28].
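Since each folding identity involves only two bits, it can be checked over the four possible input combinations. In the Python sketch below the placement of the complements in (4.8) and (4.9) is an assumption: it is the placement for which both sides count the same number of ones for every input. The variable names are illustrative.

```python
from itertools import product


def folding_identities_hold():
    """Check (4.4), (4.8) and (4.9) for all values of x_i and x_{i+1}."""
    for xi, xj in product((0, 1), repeat=2):      # xj plays the role of x_{i+1}
        not_xj = 1 - xj
        comp = 1 - xi * not_xj                    # complement of x_i * not(x_{i+1})
        checks = (
            xi + xi * xj == 2 * xi * xj + xi * not_xj,        # (4.4)
            1 + xi + xi * xj == comp + 2 * xi,                # (4.8)
            xi + (1 - xi * xj) == 2 * xi * not_xj + comp,     # (4.9), first line
            1 + xi * not_xj == 2 * xi * not_xj + comp,        # (4.9), second line
        )
        if not all(checks):
            return False
    return True


print(folding_identities_hold())
```

In each identity the factor 2 on the right-hand side marks the term that moves one column to the left in the partial product matrix.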
4.2 Truncated squarer
An $n \times n$ truncated squarer is a squarer whose output is represented on
$n$ bits. Since the squarer can be seen as a multiplier with two identical
inputs, every consideration made for the truncated multiplier remains valid for
the truncated squarer. The squarer with $n$-bit output and minimum mean square
error is obtained with a rounding operation. The error is the same as the one
calculated in Sec. 1.3.1; obviously this solution still requires the maximum
silicon area, even if the folding technique roughly halves the area occupation
of a squarer with respect to a traditional multiplier. It is however possible
to remove some of the partial products in the less significant part of the
matrix of Partial Products (PPs) and to introduce a suitable compensation
circuit. In this way the approximation error increases but the hardware
requirement is reduced. In [33, 34] the authors propose a truncated squarer
with a variable correction method: the $(n+1)$th column is added to the $n$th
column and the last $n-1$ columns are discarded. In the following, the matrix
of partial products before applying the folding identities
((4.4)-(4.8)-(4.9)), that is the reduced matrix before the final step of the
folding, will be considered.
The matrix of partial products (see Fig. 4.5(a)) can be divided into three
subsets: MSP, composed of the first $n-1$ most significant columns; LSPminor,
the last $n-h$ columns; and LSPmajor, the columns from the $(n+1)$th to the
$(n+h)$th. In Ch. 2 it has been demonstrated that LSPminor can be estimated
with the Input Correction (IC) vector (the $(n+h+1)$th column). Due to the
presence of the single bits $x_i$ (highlighted in green in Fig. 4.5(a)) the IC
will be different depending on whether $n_{eq} = n - h$ is odd or even. Note
that these elements are the only ones with the original weight (all the others
are doubled), given by $2^{-2i}$. Hence they are positioned in the even columns
of the matrix of partial products (2nd, 4th, ...), and they have one bit in
common with one of the partial products in the same column and no bit in common
with any partial product of the column on the left. For example, in
Fig. 4.5(a) $x_5$ is present in the column whose weight is $2^{-10}$, as is
$x_5x_6$. In the column on the left, whose weight is $2^{-9}$, no partial
product is formed by $x_5$.
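The column structure described above can be checked numerically. The following is a minimal sketch, not the thesis implementation: it assumes bits $x_1$ (MSB, weight $2^{-1}$) to $x_n$ (LSB), places each doubled product $x_ix_j$ ($i < j$) in the column of weight $2^{-(i+j-1)}$ and each single bit $x_i$ in the column of weight $2^{-2i}$, and verifies that the matrix sums to $X^2$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def squarer_matrix(n):
    """Return {column k: [tuples of bit indices]}; column k has weight 2^-k."""
    cols = {k: [] for k in range(1, 2 * n + 1)}
    for i in range(1, n + 1):
        cols[2 * i].append((i,))            # single bit x_i, original weight 2^-2i
        for j in range(i + 1, n + 1):
            cols[i + j - 1].append((i, j))  # doubled product: 2 * x_i * x_j * 2^-(i+j)
    return cols


def evaluate(cols, bits):
    """Weighted sum of the matrix for a given input (bits[0] is x_1)."""
    total = Fraction(0)
    for k, terms in cols.items():
        for term in terms:
            if all(bits[i - 1] for i in term):
                total += Fraction(1, 2 ** k)
    return total


def matrix_equals_square(n):
    """Exhaustively compare the matrix sum with X^2 for all n-bit inputs."""
    cols = squarer_matrix(n)
    for bits in product((0, 1), repeat=n):
        x = sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))
        if evaluate(cols, bits) != x * x:
            return False
    return True
```

For n = 10 the sketch places both `(5,)` and `(5, 6)` in column 10 and no term containing index 5 in column 9, which is the situation described in the text for $x_5$.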
If $n_{eq}$ is even, the $(n+h+1)$th column (odd) does not contain any partial
product composed of a single bit: $x_{\frac{n+h+2}{2}}$ is present in LSPminor
but not in the $(n+h+1)$th column. Hence, when we try to estimate the
contribution to the final result using only the $(n+h+1)$th column, we lose
important information. Thus the IC vector is composed of the $(n+h+1)$th column
and the bit $x_{\frac{n+h+2}{2}}$ (see Fig. 4.5(b)).
If $n_{eq}$ is odd, the opposite situation occurs. The $(n+h+1)$th column
(even) contains the bit $x_{\frac{n+h+1}{2}}$, but also a partial product
$x_{\frac{n+h+1}{2}}\,x_{\frac{n+h+1}{2}+1}$ formed by it. Instead of taking
the same information twice, $x_{\frac{n+h+1}{2}}$ is one of the elements of the
IC while $x_{\frac{n+h+1}{2}}\,x_{\frac{n+h+1}{2}+1}$ is considered a part of
LSPmajor. In order not to lose the information regarding the other bit of the
partial product, $x_{\frac{n+h+1}{2}+1}$ is also considered one of the elements
of the IC (see Fig. 4.5(c)).
Figure 4.5: Partial products matrix of the unsigned squarer (n = 10).
a) Partial products matrix before the last folding operation. In green, the
matrix elements formed by a single bit. b) Subdivision of the partial products
matrix in MSP, LSPmajor and LSPminor when $n_{eq}$ is even (n = 10, h = 2).
c) Subdivision of the partial products matrix in MSP, LSPmajor and LSPminor
when $n_{eq}$ is odd (n = 10, h = 1).
In the following the unsigned squarer is considered but, as we are referring
to the matrix before the final step of folding, all the results are still valid
for the signed squarer. In fact LSPminor is the same in both cases, since the
columns from the $(n+1)$th to the $2n$th are the same for the signed and the
unsigned squarer.
4.3 Optimal compensation function
Considering the subdivision of the matrix, the result can be computed as:
$$X^2 = S_{MSP} + S_{LSP_{major}} + S_{LSP_{minor}} \tag{4.10}$$
where $S_{MSP}$, $S_{LSP_{major}}$ and $S_{LSP_{minor}}$ are the weighted sums
of the elements of the MSP, LSPmajor and LSPminor respectively.
In the truncated squarer the terms of the IC are employed to calculate the
compensation function $f(IC)$ and the output is computed as:
$$X^2_t = \mathrm{trunc}_n\left(S_{MSP} + f(IC) + K_{round}\right) \tag{4.11}$$
where $K_{round}$ is the rounding constant. Since the considerations made for
the multiplier remain valid for the squarer, for each value of IC the optimal
compensation function is $f_{opt}(IC) = \overline{LSP}$, where
$\overline{LSP}$ is the mean of LSPminor conditioned on the considered value of
the IC (see Ch. 2). The optimal compensation function will be computed
following the same technique used for the truncated multiplier. Note that even
if the reasoning is the same, the optimal $f(IC)$ will be different, not only
because the partial products have a different weight but, most of all, because
they have a different probability distribution. Let's see how it is specialized
in the two different cases.
4.3.1 Optimal compensation function $f_{opt}(IC)$ when $n_{eq}$ is even
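The elements of the IC (see Fig. 4.5(b)) will be indicated as follows:
$$IC = \left[\gamma_1, \gamma_2, \ldots, \gamma_{\frac{n_{eq}}{2}}\right] \tag{4.12}$$
where:
$$\gamma_i = x_{h+1+i}\,x_{n+1-i}, \quad i = 1, \ldots, \frac{n_{eq}}{2}-1; \qquad \gamma_{\frac{n_{eq}}{2}} = x_{\frac{n+h}{2}+1} \tag{4.13}$$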
In order to calculate the conditioned mean, let's analyze how the elements
of LSPminor are related to the elements of IC. For instance, in Fig. 4.6(a),
$x_4x_9$ depends on $\gamma_2 = x_3x_9$ and $\gamma_3 = x_4x_8$, $x_7x_9$
depends on $\gamma_2 = x_3x_9$ and $\gamma_4 = x_5x_7$, and $x_8$ depends on
$\gamma_3 = x_4x_8$.
In general $x_ix_j$ ($i \neq j$) depends on two elements of IC,
$\gamma_{n+1-j}$ and $\gamma_t$. If $i \leq \frac{n+h}{2}+1$, $t = i-h-1$;
otherwise $t = n+1-i$. Note that when $i = \frac{n+h}{2}+1$ the partial product
is correlated to the element of the IC with the single bit.
The generic $x_j$ depends only on $\gamma_{n+1-j}$.
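The index rule above can be written as a small helper. This is only an illustrative sketch for the case $n_{eq}$ even; the function name `ic_indices` is hypothetical, and the comparison is taken as "less than or equal" (at $i = \frac{n+h}{2}+1$ both branches give the same index).

```python
def ic_indices(i, j, n, h):
    """Indices (t, n+1-j) of the IC elements tied to the partial product x_i*x_j.

    Assumes i < j, n_eq = n - h even, and a product that lies in LSPminor.
    """
    if i <= (n + h) // 2 + 1:
        t = i - h - 1      # x_i is the first bit of gamma_t
    else:
        t = n + 1 - i      # x_i is the second bit of gamma_t
    return t, n + 1 - j
```

With n = 10 and h = 0, `ic_indices(4, 9, 10, 0)` gives `(3, 2)` and `ic_indices(7, 9, 10, 0)` gives `(4, 2)`, matching the two examples taken from Fig. 4.6(a).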
In order to simplify the analytical calculation, let's label each element of
LSPminor with the indices of the elements of the IC to which it is correlated:
instead of $x_ix_j$, let's use $p_{t,n+1-j}$ (see Fig. 4.6(b)).
LSPminor can be divided in 4 parts (see Fig. 4.6(b)):
A the diagonal correlated to the only element of IC composed of a single
bit ($\gamma_{\frac{n_{eq}}{2}}$);
B the partial products on the left side with respect to A;
C the partial products on the right side with respect to A;
D the partial products composed of only one bit.
Figure 4.6: LSPminor of the unsigned squarer (n = 10, h = 0).
a) LSPminor. In yellow, the elements of the IC vector; in green, the elements
of the matrix composed of a single bit. b) LSPminor in which the dependence on
the elements of IC is shown.
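Following this partition, the sum of the elements of LSPminor can be written
as:
$$S_{LSP_{minor}} = S_{IC} + S_A + S_B + S_C + S_D \tag{4.14}$$
where:
$$S_{IC} = 2^{-n-h-1} \sum_{i=1}^{\frac{n_{eq}}{2}} \gamma_i \tag{4.15}$$
$$S_A = 2^{-n-h-1} \sum_{i=1}^{\frac{n_{eq}}{2}-1} p^A_{\frac{n_{eq}}{2},i} \cdot 2^{-\frac{n_{eq}}{2}+i} \tag{4.16}$$
$$S_B = 2^{-n-h-1} \sum_{i=2}^{\frac{n_{eq}}{2}-1} \sum_{j=1}^{i-1} p^B_{i,j} \cdot 2^{-i+j} \tag{4.17}$$
$$S_C = 2^{-n-h-1} \sum_{i=2}^{\frac{n_{eq}}{2}-1} \sum_{j=1}^{i-1} p^C_{i,j} \cdot 2^{-n_{eq}+i+j} \tag{4.18}$$
$$S_D = 2^{-n-h-1} \sum_{i=1}^{\frac{n_{eq}}{2}-1} p^D_i \cdot 2^{-n_{eq}+2i-1} \tag{4.19}$$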
The last step is to calculate the mean of the generic element of LSPminor.
Under the hypothesis of independent and identically distributed input bits,
since the two bits of $x_ix_j$ are correlated to different elements of the IC,
its mean value is given by:
$$E_{IC=A}\{x_i x_j\} = E_{IC=A}\{x_i\} \cdot E_{IC=A}\{x_j\} \tag{4.20}$$
If we consider a generic element $p_{i,j}$ of B or C, it is correlated to two
different elements of IC, each given by an AND operation between its bits.
Hence if $\gamma_j = 1$, $x_{n+1-j}$ will be 1 with probability equal to 1,
otherwise with a probability equal to 1/3 (one of the three remaining cases):
$$E_{IC=A}\{x_{n+1-j}\} = \frac{1}{3}\left(1 + 2\gamma_j\right) \tag{4.21}$$
Analogous reasoning is valid for the other bit of the partial product. Hence
the mean value of a generic element of LSPminor can be expressed as
($X = B, C$):
$$E_{IC=A}\{p^X_{i,j}\} = \frac{1}{9}\left(1 + 2\gamma_i\right)\left(1 + 2\gamma_j\right) \in \left\{\frac{1}{9}, \frac{1}{3}, 1\right\} \tag{4.22}$$
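The conditional means (4.21)-(4.22) can be reproduced by direct enumeration. The sketch below assumes fair, independent bits and models two IC elements as $\gamma_i = a \cdot b$ and $\gamma_j = c \cdot d$, with the LSPminor element being $a \cdot c$; the names `a`, `b`, `c`, `d` and the function name are illustrative only.

```python
from fractions import Fraction
from itertools import product


def conditional_mean(gamma_i, gamma_j):
    """E[a*c | a*b = gamma_i, c*d = gamma_j] for fair independent bits a, b, c, d."""
    hits = 0
    total = 0
    for a, b, c, d in product((0, 1), repeat=4):
        if a * b == gamma_i and c * d == gamma_j:
            total += 1        # input combination compatible with the observed IC
            hits += a * c     # value of the partial product for this combination
    return Fraction(hits, total)
```

For every pair of IC values the result equals $\frac{1}{9}(1+2\gamma_i)(1+2\gamma_j)$, so the only possible values are 1/9, 1/3 and 1.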
When the elements of A are considered, the mean value changes. In fact one of
the bits of each partial product is the single-bit element of the IC,
$\gamma_{\frac{n-h}{2}}$; hence for each fixed value of IC its value is a
constant:
$$E_{IC=A}\{p^A_{i,j}\} = \gamma_i \cdot \frac{1}{3}\left(1 + 2\gamma_j\right) \in \left\{\frac{1}{3}\gamma_i, \gamma_i\right\} \tag{4.23}$$
Finally let's consider the elements of D. These elements are composed of a
single bit and are correlated to a single element of IC, hence:
$$E_{IC=A}\{p^D_i\} = \frac{1}{3}\left(1 + 2\gamma_i\right) \in \left\{\frac{1}{3}, 1\right\} \tag{4.24}$$
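Using (4.22)-(4.23)-(4.24) in (4.16)-(4.17)-(4.18)-(4.19) and then the obtained
values in (4.14), with some algebra one obtains:
$$f_{opt}(IC) = 2^{-n-h-1}\left(K + \sum_{i=1}^{\frac{n_{eq}}{2}} f_i\gamma_i + \sum_{i=1}^{\frac{n_{eq}}{2}-1}\sum_{j=i+1}^{\frac{n_{eq}}{2}} f_{ij}\gamma_i\gamma_j\right) \tag{4.25}$$
where:
$$\begin{aligned}
K &= \frac{1}{9}\left(\frac{n_{eq}}{2} + 2^{1-\frac{n_{eq}}{2}} + \frac{1}{3}\,2^{1-n_{eq}} - \frac{13}{6}\right) \\
f_1 &= \frac{1}{9}\left(11 - 2^{2-n_{eq}} - 2^{2-\frac{n_{eq}}{2}}\right) \\
f_i &= \frac{1}{9}\left(13 - 2^{2-i} - 2^{i-n_{eq}}\left(2^{1+\frac{n_{eq}}{2}} + 4 - 2^{i}\right)\right), \quad i = 2, \ldots, \frac{n_{eq}}{2}-2 \\
f_{\frac{n_{eq}}{2}-1} &= \frac{1}{9}\left(\frac{49}{4} - 5\cdot 2^{1-\frac{n_{eq}}{2}}\right) \\
f_{\frac{n_{eq}}{2}} &= \frac{1}{3}\left(\frac{5}{2} - 2^{1-\frac{n_{eq}}{2}}\right) \\
f_{ij} &= \frac{4}{9}\left(2^{i-j} + 2^{-n_{eq}+i+j}\right), \quad i, j = 1, \ldots, \frac{n_{eq}}{2}-1 \\
f_{i,\frac{n_{eq}}{2}} &= \frac{1}{3}\left(2^{1+i-\frac{n_{eq}}{2}}\right), \quad i = 1, \ldots, \frac{n_{eq}}{2}-1
\end{aligned} \tag{4.26}$$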
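As a sanity check on the algebra, the constant $K$ and the coefficient $f_1$ of (4.26) can be recomputed by summing the conditional means (4.22)-(4.24) with the weights of (4.16)-(4.19). The sketch below is an assumption-laden illustration, not the derivation used in the thesis: it works in units of $2^{-n-h-1}$, uses $m = n_{eq}/2$, adds the weight 1 that $\gamma_1$ carries in (4.15), and uses hypothetical function names.

```python
from fractions import Fraction

TWO = Fraction(2)


def _bit_mean(g):
    """Mean of a bit whose AND with another bit is known to be g, as in (4.21)."""
    return Fraction(1 + 2 * g, 3)


def mean_abcd(gamma, m):
    """Conditional mean of S_A + S_B + S_C + S_D; gamma[1..m], gamma[0] unused."""
    total = Fraction(0)
    for i in range(1, m):                      # part A: (4.16) with (4.23)
        total += gamma[m] * _bit_mean(gamma[i]) * TWO ** (-m + i)
    for i in range(2, m):                      # parts B and C: (4.17)-(4.18) with (4.22)
        for j in range(1, i):
            mean = _bit_mean(gamma[i]) * _bit_mean(gamma[j])
            total += mean * (TWO ** (-i + j) + TWO ** (-2 * m + i + j))
    for i in range(1, m):                      # part D: (4.19) with (4.24)
        total += _bit_mean(gamma[i]) * TWO ** (-2 * m + 2 * i - 1)
    return total


def closed_form_k(m):
    """K of (4.26) written with m = n_eq / 2."""
    return Fraction(1, 9) * (m + TWO ** (1 - m)
                             + Fraction(1, 3) * TWO ** (1 - 2 * m) - Fraction(13, 6))


def closed_form_f1(m):
    """f_1 of (4.26) written with m = n_eq / 2."""
    return Fraction(1, 9) * (11 - TWO ** (2 - 2 * m) - TWO ** (2 - m))


def check(m):
    """Compare the direct sums with the closed forms for a given m."""
    zeros = [0] * (m + 1)
    only_first = zeros[:]
    only_first[1] = 1
    k = mean_abcd(zeros, m)
    f1 = 1 + mean_abcd(only_first, m) - k      # the leading 1 is gamma_1 in S_IC
    return k == closed_form_k(m) and f1 == closed_form_f1(m)
```

Exact rational arithmetic is used so that the comparison is an equality and not a tolerance test.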
4.3.2 Optimal compensation function $f_{opt}(IC)$ when $n_{eq}$ is odd
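The elements of the IC (see Fig. 4.5(c)) will be indicated as follows:
$$IC = \left[\gamma_1, \gamma_2, \ldots, \gamma_w\right] \tag{4.27}$$
where:
$$\begin{aligned}
w &= \frac{n_{eq}+1}{2} \\
\gamma_i &= x_{h+1+i}\,x_{n+1-i}, \quad i = 1, \ldots, w-2 \\
\gamma_{w-1} &= x_{w+h} \\
\gamma_w &= x_{w+h+1}
\end{aligned} \tag{4.28}$$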
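The composition of the IC vector in the two cases, (4.12)-(4.13) and (4.27)-(4.28), can be listed with a short helper. This is a sketch under the indexing used in this chapter ($x_1$ is the MSB); the function name `ic_elements` is hypothetical, and each element is returned as a tuple of bit indices (one index for a single bit, two for a product).

```python
def ic_elements(n, h):
    """Bit indices of the IC elements gamma_1, gamma_2, ... for given n and h."""
    neq = n - h
    if neq % 2 == 0:
        m = neq // 2
        gammas = [(h + 1 + i, n + 1 - i) for i in range(1, m)]   # products
        gammas.append(((n + h) // 2 + 1,))                        # single bit, (4.13)
    else:
        w = (neq + 1) // 2
        gammas = [(h + 1 + i, n + 1 - i) for i in range(1, w - 1)]
        gammas.append((w + h,))                                   # gamma_{w-1}, (4.28)
        gammas.append((w + h + 1,))                               # gamma_w, (4.28)
    return gammas
```

For n = 10 this gives `[(4, 10), (5, 9), (6, 8), (7,)]` with h = 2 and `[(3, 10), (4, 9), (5, 8), (6,), (7,)]` with h = 1, which are the two subdivisions of Fig. 4.5(b) and 4.5(c).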
As done previously, let's label each element of LSPminor with the indices of
the elements of the IC to which it is correlated: instead of $x_ix_j$, let's
use $p_{t,n+1-j}$ (see Fig. 4.7(b)). LSPminor can be divided in 5 parts:
A the diagonal correlated to $\gamma_{w-1}$;
B the diagonal correlated to $\gamma_w$;
C the partial products on the left side with respect to A;
D the partial products on the right side with respect to B;
E the partial products composed of only one bit.
Figure 4.7: LSPminor of the unsigned squarer (n = 10, h = 1).
a) LSPminor. In yellow, the elements of the IC vector; in green, the elements
of the matrix formed by a single bit. b) LSPminor in which the dependence on
the elements of IC is shown.
Following this partition, the sum of the elements of LSPminor can be written
as:
$$S_{LSP_{minor}} = S_{IC} + S_A + S_B + S_C + S_D + S_E \tag{4.29}$$
where:
$$S_{IC} = 2^{-n-h-1} \sum_{i=1}^{w} \gamma_i \tag{4.30}$$
$$S_A = 2^{-n-h-1} \sum_{i=1}^{w-2} p^A_{w-1,i} \cdot 2^{-w+i+1} \tag{4.31}$$
$$S_B = 2^{-n-h-1} \sum_{i=1}^{w-2} p^B_{w,i} \cdot 2^{-w+i} \tag{4.32}$$
$$S_C = 2^{-n-h-1} \sum_{i=2}^{w-2} \sum_{j=1}^{i-1} p^C_{i,j} \cdot 2^{-i+j} \tag{4.33}$$
$$S_D = 2^{-n-h-1} \sum_{i=2}^{w-2} \sum_{j=1}^{i-1} p^D_{i,j} \cdot 2^{-2w+i+j+1} \tag{4.34}$$
$$S_E = 2^{-n-h-1} \sum_{i=1}^{w-2} p^E_i \cdot 2^{-2w+2i} \tag{4.35}$$
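The mean value of each element of LSPminor can be calculated in the same way
as before, obtaining:
$$E_{IC=A}\{p^{[C\text{-}D]}_{i,j}\} = \frac{1}{9}\left(1 + 2\gamma_i\right)\left(1 + 2\gamma_j\right) \in \left\{\frac{1}{9}, \frac{1}{3}, 1\right\} \tag{4.36}$$
$$E_{IC=A}\{p^{[A\text{-}B]}_{i,j}\} = \frac{1}{3}\left(1 + 2\gamma_j\right)\gamma_i \in \left\{\frac{1}{3}\gamma_i, \gamma_i\right\} \tag{4.37}$$
$$E_{IC=A}\{p^E_i\} = \frac{1}{3}\left(1 + 2\gamma_i\right) \in \left\{\frac{1}{3}, 1\right\} \tag{4.38}$$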
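Using (4.36)-(4.37)-(4.38) in (4.31)-(4.32)-(4.33)-(4.34)-(4.35) and then
the obtained values in (4.29), with some algebra one obtains:
$$f_{opt}(IC) = 2^{-n-h-1}\left(K + \sum_{i=1}^{w} f_i\gamma_i + \sum_{i=1}^{w-1}\sum_{j=i+1}^{w} f_{ij}\gamma_i\gamma_j\right) \tag{4.39}$$
where:
$$\begin{aligned}
K &= \frac{1}{9}\left(\frac{1}{12}\left(-43 + 9\cdot 2^{3-w} + 2^{4-2w}\right) + w\right) \\
f_1 &= \frac{1}{9}\left(11 - 8\cdot 2^{-2w} - 3\cdot 2^{2-w}\right) \\
f_i &= \frac{1}{9}\left(-2^{4-2w} - 2^{1+2i-2w} + 2^{3-w} - 3\cdot 2^{1+i-w} + 13 - 2^{2-i}\right) \\
f_{w-2} &= \frac{11}{8} - 2^{1-w} \\
f_{w-1} &= \frac{4}{3}\left(1 - 2^{-w}\right) \\
f_w &= \frac{5}{12} - \frac{1}{3}\,2^{1-w} \\
f_{ij} &= \frac{4}{9}\left(2^{i-j} + 2^{-2w+i+j+1}\right), \quad i, j = 1, \ldots, w-2 \\
f_{i,w-1} &= \frac{1}{3}\left(2^{1+i-w}\right), \quad i = 1, \ldots, w-2 \\
f_{i,w} &= \frac{1}{3}\left(2^{i-w}\right), \quad i = 1, \ldots, w-2
\end{aligned} \tag{4.40}$$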
4.4 Optimal Linear Compensation function
The optimal compensation function is hard to implement in hardware, since it
is a quadratic form of the elements of the IC. In Ch. 2 it has been
demonstrated that a linear combination of the elements of the IC alone is a
good approximation of the optimum function. Hence let's consider
$$f_{lin}(IC) = 2^{-n-h-1}\left(K_l + \sum_{i=1}^{z} l_i\gamma_i\right) \tag{4.41}$$
where $z = \frac{n-h}{2}$ if $n_{eq}$ is even, otherwise
$z = w = \frac{n_{eq}+1}{2}$. In order to calculate the optimal coefficients
and the optimal constant for $f_{lin}(IC)$ it is possible to carry out a
demonstration similar to the one presented for the truncated multiplier. Here
only the fundamental steps will be recalled, and the differences will be
highlighted.
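The optimal values of the coefficients $K_l$, $l_i$ in (4.41) are obtained by
minimizing $\sigma^2_{lin}$ (2.38), given by:
$$\sigma^2_{lin} = \sum_{A} \left(f_{lin}(A) - \overline{LSP}(A) - \mu_{lin}\right)^2 P(A) \tag{4.42}$$
where:
$$\mu_{lin} = \sum_{A} \left(f_{lin}(A) - \overline{LSP}(A)\right) \cdot P(A) \tag{4.43}$$
and the sums run over all the possible values $A$ of the IC vector.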
The problem can be solved as shown in Ch. 2 (see eq. 2.62), with one important
difference. In the truncated multiplier all the elements of the IC vector are
partial products obtained with an AND operation between two bits, hence their
probability of being 1 is 1/4. In the truncated squarer some of the elements
of the IC consist of only one bit; in this case the probability of being 1 is
1/2.
Solving the linear system equivalent to (2.62), the coefficients of the linear
function are computed in closed form; imposing $\mu_{lin} = 0$, the expression
of the constant is also obtained in closed form. The expressions are the
following.
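$n_{eq}$ even
$$\begin{aligned}
K_l &= \frac{1}{27}\left(3\,\frac{n_{eq}}{2} + 21\cdot 2^{-1-\frac{n_{eq}}{2}} + 2^{1-n_{eq}} - \frac{35}{4}\right) \\
l_1 &= \frac{4}{3}\left(1 - 2^{-n_{eq}}\right) \\
l_i &= \frac{1}{3}\left(5 - 2^{1-i} - 2^{1+i-n_{eq}}\right), \quad i = 2, \ldots, \frac{n_{eq}}{2}-2 \\
l_{\frac{n_{eq}}{2}-1} &= \frac{5}{3}\left(1 - 2^{-\frac{n_{eq}}{2}}\right) \\
l_{\frac{n_{eq}}{2}} &= 1 - 2^{-\frac{n_{eq}}{2}}
\end{aligned} \tag{4.44}$$
$n_{eq}$ odd
$$\begin{aligned}
K_l &= \frac{1}{48}\left(-17 + 9\cdot 2^{2-w} + 4w\right) \\
l_1 &= \frac{1}{9}\left(12 - 24\cdot 2^{-2w} - 9\cdot 2^{-w}\right) \\
l_i &= \frac{1}{9}\left(2^{-2w}\left(-24 + 2^{2+i} + 2^{2+2i}\right) + 2^{-w}\left(2^{3} - 9\cdot 2^{-1+i}\right) + 15 - 3\cdot 2^{1-i}\right) \\
l_{w-2} &= \frac{7}{4} - 3\cdot 2^{-w} \\
l_{w-1} &= \frac{3}{2} - 2^{1-w} \\
l_w &= \frac{1}{2} - 2^{-w}
\end{aligned} \tag{4.45}$$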
Figure 4.8: Optimal coefficients and constant of $f_{lin}(IC)$ when $n_{eq}$
is odd (red line) and even (blue line). The green lines represent the
possible quantization levels.
The last step is the quantization of the coefficients. In Ch. 2 it has been
demonstrated that this is a discrete optimization problem, not simple to solve.
Here only the results of the quantization on 1 bit are reported (see Fig. 4.8):
$n_{eq}$ even
$$\begin{aligned}
K_{round} &= lsb_{IC} \\
q_i &= 1, \quad i = 1, 2, \frac{n_{eq}}{2} \\
q_i &= 2, \quad i = 3, \ldots, \frac{n_{eq}}{2}-1
\end{aligned} \tag{4.46}$$
$n_{eq}$ odd
$$\begin{aligned}
K_{round} &= lsb_{IC} \\
q_i &= 1, \quad i = 1, 2, \frac{n_{eq}+1}{2}-1 \\
q_i &= 2, \quad i = 3, \ldots, \frac{n_{eq}+1}{2}-2 \\
q_{\frac{n_{eq}+1}{2}} &= 0
\end{aligned} \tag{4.47}$$
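To get a feeling for how the closed-form coefficients relate to the quantized ones, the even-case expressions (4.44) can be evaluated and compared with the levels 0, 1 and 2. The sketch below is only an observation aid: nearest-level rounding is an assumption made here for illustration (the thesis treats quantization as a discrete optimization problem), the set of levels is inferred from (4.46)-(4.47), and the function names are hypothetical. It assumes $n_{eq}$ even.

```python
from fractions import Fraction

TWO = Fraction(2)


def linear_coefficients_even(neq):
    """Coefficients l_1 ... l_m of (4.44), m = n_eq / 2, as exact fractions."""
    m = neq // 2
    coeffs = {1: Fraction(4, 3) * (1 - TWO ** (-neq))}
    for i in range(2, m - 1):
        coeffs[i] = Fraction(1, 3) * (5 - TWO ** (1 - i) - TWO ** (1 + i - neq))
    coeffs[m - 1] = Fraction(5, 3) * (1 - TWO ** (-m))
    coeffs[m] = 1 - TWO ** (-m)
    return coeffs


def nearest_level(value, levels=(0, 1, 2)):
    """Closest quantization level (assumed levels, plain rounding)."""
    return min(levels, key=lambda q: abs(value - q))


def rounded_coefficients_even(neq):
    """List of rounded coefficients, ordered by index i."""
    coeffs = linear_coefficients_even(neq)
    return [nearest_level(coeffs[i]) for i in sorted(coeffs)]
```

For $n_{eq} = 8$ and $n_{eq} = 10$ plain rounding gives `[1, 1, 2, 1]` and `[1, 1, 2, 2, 1]`, which coincides with (4.46) for these widths; this agreement should not be read as a general rule.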
Figure 4.9: Mean square error obtained using a full-width folded squarer with
final rounding (blue line), LMS truncated squarer (red line), Walters [33]
squarer (green line) and LMS1b truncated multiplier (violet line), varying n
for h = 0.
Given these coefficients the implementation turns out to be very simple.
As done for the LMS1b truncated multiplier, the proposed circuit eliminates the
LSPminor matrix and simply reorganizes the IC vector according to (4.46) or
(4.47).
4.5 VLSI Implementation
To verify the performance of the proposed circuits, various squarers have been
implemented in TSMC 0.18 µm technology. As the error strongly depends on the
bit width of the squarer (there are also some differences depending on whether
$n_{eq}$ is odd or even), n values ranging from 8 to 16 have been considered
for the implementation, with h = 0, 1.
Figs. 4.9-4.10 show the mean square error obtained by the full-width folded
squarer (output rounded to n bits), the proposed LMS squarer and the squarer
proposed by Walters et al. [33]. For comparison we have also analyzed the
results given by LMS1b truncated multipliers used as squarers.
The data reported in the figures show that using the LMS1b truncated
multiplier as a squarer does not provide any advantage with respect to the
folding technique [27]. Better results are provided by truncated squarers such
as the one of [33] and the proposed squarer. In particular the LMS truncated
squarer gives the best approximation and is also slightly better in power and
maximum frequency. Note that the result differs depending on whether $n_{eq}$
is odd or even. If $n_{eq}$ is even the proposed technique gives an error 20%
(on average) lower than [33]. In the odd case the error is only 8-10% (on
average) lower. This difference can be explained by considering the IC vector.
In [33] Walters et al. use $\lceil n_{eq}/2 \rceil$ PPs in order to estimate
the LSP part. The LMS truncated squarer always considers
$\lfloor n_{eq}/2 \rfloor$ PPs. So, if $n_{eq}$ is odd it estimates the LSP
with a lower number of PPs. For any considered value of n and h the proposed
truncated squarer always gives the best result.
Figure 4.10: Mean square error obtained using a full-width folded squarer with
final rounding (blue line), LMS truncated squarer (red line), Walters [33]
squarer (green line) and LMS1b truncated multiplier (violet line), varying n
for h = 1.
Finally, Tab. 4.1 shows the mean error, the mean square error
($lsb = 2^{-n}$), the maximum working frequency, the silicon area occupation
and the power dissipation. Compared with other truncated squarer circuits
proposed in the literature, the LMS truncated squarer reduces the mean square
error, the power dissipation and the silicon area occupation, and increases the
maximum working frequency.
Table 4.1: Comparison between squarers, h = 0

 n   Architecture                  Mean error  Mean square   Max Freq  Area         Power
                                   (lsb)       error (lsb^2) (MHz)     (10^3 µm^2)  (µW)
 9   Folded [27]                   0.028       0.082         481       3157         7.3
 9   LMS1b truncated multipliers   0.167       0.375         478       3100         9.5
 9   Walters [33]                  0.083       0.164         515       2112         5.1
 9   LMS squarer                   0.072       0.145         524       1986         4.9
 10  Folded [27]                   0.014       0.082         476       3875         9.3
 10  LMS1b truncated multipliers   0.333       0.529         439       3829         12.1
 10  Walters [33]                  0.166       0.226         498       2565         6.5
 10  LMS squarer                   0.166       0.181         500       2425         6.3
 11  Folded [27]                   0.015       0.082         435       4797         11.7
 11  LMS1b truncated multipliers   0.167       0.460         397       4627         15.1
 11  Walters [33]                  0.083       0.187         495       3057         7.9
 11  LMS squarer                   0.073       0.169         493       2877         7.8
 12  Folded [27]                   0.007       0.082         418       5715         14.3
 12  LMS1b truncated multipliers   0.333       0.607         398       5495         18.3
 12  Walters [33]                  0.167       0.246         469       3579         9.6
 12  LMS squarer                   0.167       0.200         476       3380         9.5
 13  Folded [27]                   0.007       0.082         403       6749         17.4
 13  LMS1b truncated multipliers   0.168       0.543         389       6427         21.7
 13  Walters [33]                  0.083       0.209         437       4135         11.4
 13  LMS squarer                   0.072       0.190         433       3895         11.1
 14  Folded [27]                   0.004       0.082         398       7780         20.5
 14  LMS1b truncated multipliers   0.333       0.693         376       7428         25.5
 14  Walters [33]                  0.167       0.266         422       4720         13.3
 14  LMS squarer                   0.167       0.220         427       4467         13.2
 15  Folded [27]                   0.004       0.082         385       8995         24.2
 15  LMS1b truncated multipliers   0.167       0.626         366       8499         29.4
 15  Walters [33]                  0.083       0.230         420       5346         15.4
 15  LMS squarer                   0.072       0.212         422       5053         15.1
 16  Folded [27]                   0.002       0.082         365       10205        27.9
 16  LMS1b truncated multipliers   0.333       0.776         356       9640         33.8
 16  Walters [33]                  0.167       0.287         413       6001         17.6
 16  LMS squarer                   0.167       0.240         417       5731         17.5

Chapter 5
FIR filter
Digital filters, which shape the frequency spectrum of a signal, are an
important part of DSP and are used for audio, speech, video, image processing,
etc.
Digital filters can be classified into FIR filters and IIR filters. Indicating
with $x[n]$ and $y[n]$ the input and the output of the filter at the $n$th
instant, the FIR filter computes the output as a function of the current and
previous inputs ($y_{FIR}[n] = f(x[0], \ldots, x[n])$), while the IIR filter
computes it as a function of the inputs and of the previous outputs
($y_{IIR}[n] = f(x[0], \ldots, x[n], y[1], \ldots, y[n-1])$). In this chapter
the attention is focused on FIR filters because, even if they sometimes
require more memory and calculations to achieve a given frequency response,
they present some advantages. FIR filters can easily be designed to be "linear
phase" (they delay the input signal but do not distort its phase); they are
suited to multi-rate applications, such as decimation and interpolation, since
in FIR filters some calculations can be omitted, while IIR filters require that
each output, even if discarded, be individually calculated (due to the
feedback). Finally, because of the feedback, in IIR filters a small error
influences every subsequent output, which can cause significant problems.
The basic operation of a digital filter is the MAC (Multiply and ACcumulate).
The MAC is the operation of multiplying a coefficient by the corresponding
delayed data sample and accumulating the result, hence the MAC operation needs
a multiplier and an accumulator. Since the multiplication is the basic
operation, in this chapter the impact of the use of truncated multipliers in
conventional filtering applications (such as a band-pass, a low-pass and a
high-pass filter) is investigated. The aims of the analysis are the definition
of the parameters of the truncated multipliers that are most significant for
the optimization of a digital filter (Sec. 5.1), the evaluation of the
performance of the proposed LMS multiplier when applied to a digital filter,
and the comparison of the performance in terms of error, power, frequency and
area occupation with the use of existing truncated multipliers (Sec. 5.2).
Figure 5.1: MAC with n-bit inputs and 2n-bit output. The error provided by
this configuration is zero.
5.1 FIR filter with truncated MAC
The output of a FIR filter is computed as:
$$y[n] = \sum_{i=0}^{N-1} h_i \cdot x[n-i] \tag{5.1}$$
The FIR filter multiplies an array of the most recent N data samples by an
array of constants (the tap coefficients, hi), and sums the resulting elements.
The number of FIR taps is related to the amount of calculations and memory
required to implement the filter, and to the amount of “filtering” that the FIR
provides. More taps means increased stopband attenuation, reduced ripple,
narrower filters, etc.
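Equation (5.1) maps directly onto a loop of MAC operations. The following is a minimal direct-form sketch in floating point or integer arithmetic, with no truncation; the function name `fir_filter` and the convention that samples before the first input are zero are assumptions made for illustration.

```python
def fir_filter(taps, samples):
    """Direct-form FIR: y[n] = sum_{i=0}^{N-1} h_i * x[n - i], with x[k] = 0 for k < 0."""
    output = []
    for n in range(len(samples)):
        acc = 0
        for i, h in enumerate(taps):
            if n - i >= 0:
                acc += h * samples[n - i]   # one multiply-accumulate per tap
        output.append(acc)
    return output
```

Feeding a unit impulse returns the tap coefficients themselves, which is a quick way to check the indexing.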
A description of FIR filter design techniques is beyond the scope of this
dissertation (many good references are available [35, 36, 37]); in this
chapter the attention is focused on the multiplier, since the multiplication is
the dominant computational requirement. The basic MAC operation is shown in
Fig. 5.1. It is composed of an $n \times n$ multiplier and an accumulator on 2n
bits. The final output is represented on 2n bits.
Due to the presence of the addition operation, overflow can occur. The
methodology used to deal with overflow should depend on the specifics of
the signal, the type of operation and the DSP architecture used. In general,
overflow handling methodologies can be classified in five categories:
saturation, input scaling, fixed scaling, dynamic scaling and system design
considerations. In this chapter the scaling solution will be used, in order to
consider a solution which is independent of the architecture and to avoid the
introduction of additional hardware which could hide the differences between
the implementations with various truncated multipliers.
Figure 5.2: MAC with n-bit inputs and n-bit output. It uses a full-width
multiplier and the output is rounded to n bits by adding a rounding constant
$K_{round}$ to the 2n-bit output. In this configuration a rounding error is
present.
Figure 5.3: MAC with n-bit inputs and w-bit output. It uses a truncated
multiplier with n-bit inputs and w < 2n bit output. This configuration
presents the rounding error as in Fig. 5.2 and the error introduced by the
truncated multiplier.
When a MAC operation with n-bit precision is required, the lowest error is
obtained by working with full precision until the final step, when the output
of the accumulator is reduced to n bits with a rounding operation (see
Fig. 5.2). In this configuration only a rounding error is present.
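The statistics of this final rounding can be illustrated by enumerating the discarded bits. The sketch below assumes that the n discarded bits are uniformly distributed and that rounding is done by adding a constant equal to half an output lsb and truncating; the function name is hypothetical and the errors are expressed in units of the output lsb.

```python
from fractions import Fraction


def rounding_error_stats(n):
    """Mean and mean square rounding error (in lsb) when n low bits are discarded."""
    levels = 2 ** n                 # number of patterns of the discarded bits
    k_round = levels // 2           # rounding constant: half an output lsb
    errors = []
    for r in range(levels):
        carry = (r + k_round) // levels           # 1 if rounding increments the kept part
        errors.append(Fraction(r, levels) - carry)  # exact value minus rounded value
    mean = sum(errors) / levels
    mean_square = sum(e * e for e in errors) / levels
    return mean, mean_square
```

For n = 8 the mean is $-2^{-9}$ lsb and the mean square error is within $10^{-3}$ of 1/12 lsb$^2$, consistent with the usual uniform-error model.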
Another possibility is to work with a lower precision in the MAC unit
itself. The full-width multiplier can be replaced by a truncated multiplier
whose output is left on $w = n + h$ bits (see Fig. 5.3). This solution reduces
the area occupation and power dissipation of the MAC unit, but introduces a
final error higher than the previous one, since the rounding error is added to
the error introduced by each multiplication. Indicating with:
$y_{2n}$ the output of the configuration in Fig. 5.1, full precision MAC,
$y_n$ the output of the configurations in Fig. 5.2-5.3, truncated MAC,
the output of a FIR filter implemented with a truncated MAC is affected by an
error:
$$e_{FIR} = y_{2n} - y_n \tag{5.2}$$
As error metrics, the mean square error $\varepsilon^2_{FIR}$ and the mean
error $\mu_{FIR}$ will be considered, normalized to the weight of the least
significant bit of the filter output ($lsb = 2^{-n}$):
$$\varepsilon^2_{FIR} = \frac{E[(e_{FIR})^2]}{lsb^2} = \frac{E[(e_{FIR})^2]}{2^{-2n}} \tag{5.3}$$
$$\mu_{FIR} = \frac{E[e_{FIR}]}{lsb} = \frac{E[e_{FIR}]}{2^{-n}} \tag{5.4}$$
Using the configuration of Fig. 5.2, only a rounding error is present:
$e_{FIR} = e_{round}$. The mean and the mean square error of $e_{round}$ have
been computed in Sec. 2.2.1, hence the error of the FIR filter has the
following statistical characterization:
$$\mu_{FIR} = \mu_{round} = 0 \tag{5.5}$$
$$\varepsilon^2_{FIR} = \varepsilon^2_{round} = \frac{1}{12}\,lsb^2 = \frac{1}{12} \cdot 2^{-2n} \tag{5.6}$$
These are the lower bounds for any FIR filter whose output has the same number
of bits as the input, and this configuration will be considered as the
reference circuit in the following.
Using a truncated multiplier, each multiplication introduces an error
$e_{M_i}$, which depends on the coefficient $f_i$ and on the applied input.
Finally a further error, $e_{round}$, is due to the final rounding (in order to
get an n-bit output from the w-bit output). The error of the filter is
therefore equal to:
$$e_{FIR} = \sum_{i=1}^{N} e_{M_i} + e_{round} \tag{5.7}$$
Assuming the independence of the $e_{M_i}$ and of $e_{round}$:
$$\mu_{FIR} = E\left[\sum_{i=1}^{N} e_{M_i} + e_{round}\right] = \sum_{i=1}^{N} E[e_{M_i}] + E[e_{round}] = \sum_{i=1}^{N} E[e_M \mid f_i] + \mu_{round} \tag{5.8}$$
$$\sigma^2_{FIR} = \mathrm{VAR}\left[\sum_{i=1}^{N} e_{M_i} + e_{round}\right] = \sum_{i=1}^{N} \mathrm{VAR}[e_{M_i}] + \mathrm{VAR}[e_{round}] = \sum_{i=1}^{N} \mathrm{VAR}[e_M \mid f_i] + \sigma^2_{round} \tag{5.9}$$
where $E[e_M \mid f_i]$ and $\mathrm{VAR}[e_M \mid f_i]$ are respectively the
mean and the variance of the error of the truncated multiplier when one input
is equal to $f_i$. Finally, given that
$\sigma^2_{round} = \varepsilon^2_{round} = \frac{1}{12}$ (since
$\mu_{round} = 0$), using (5.5)-(5.6) in (5.8)-(5.9), the statistical
characterization of the FIR filter is:
$$\mu_{FIR} = \sum_{i=1}^{N} E[e_M \mid f_i] \tag{5.10}$$
$$\varepsilon^2_{FIR} = \sigma^2_{FIR} + (\mu_{FIR})^2 = \frac{1}{12} + \sum_{i=1}^{N} \mathrm{VAR}[e_M \mid f_i] + \left(\sum_{i=1}^{N} E[e_M \mid f_i]\right)^2 \tag{5.11}$$
Note that (5.10)-(5.11) represent the theoretical FIR error obtained by
applying all possible inputs to the FIR filter, that is, applying a random
variable with a uniform distribution. Otherwise the probability distribution
function (pdf) of the signal can influence the result. In particular, the error
of the filter depends on the mean and the variance of the error of the
truncated multiplier conditioned on the input with the particular probability
distribution. In this case the error is equal to:
$$\mu_{FIR} = \sum_{i=1}^{N} E[e_M \mid f_i, x] \tag{5.12}$$
$$\varepsilon^2_{FIR} = \sigma^2_{FIR} + (\mu_{FIR})^2 = \frac{1}{12} + \sum_{i=1}^{N} \mathrm{VAR}[e_M \mid f_i, x] + \left(\sum_{i=1}^{N} E[e_M \mid f_i, x]\right)^2 \tag{5.13}$$
(5.10)-(5.11) show that a truncated multiplier can replace a full-width
multiplier (saving area and power) if
$$\sum_{i=1}^{N} \mathrm{VAR}[e_M \mid f_i] + \left(\sum_{i=1}^{N} E[e_M \mid f_i]\right)^2 \ll \frac{1}{12} = \varepsilon^2_{FIR_{2n}} \tag{5.14}$$
where $\varepsilon^2_{FIR_{2n}}$ is the error committed in the configuration of
Fig. 5.2. In fact, if the condition of (5.14) is verified, the error introduced
by the truncated multiplier is negligible with respect to the error introduced
by the final rounding.

Table 5.1: Characteristics of the considered FIR filters.

                        LOW-PASS   BAND-PASS           HIGH-PASS
Sampling frequency      48 kHz     48 kHz              48 kHz
Passband frequency      1 kHz      8.5 kHz - 12 kHz    10 kHz
Stopband frequency      2 kHz      6 kHz - 14.5 kHz    9.5 kHz
Passband ripple         1 dB       1 dB                1 dB
Stopband attenuation    60 dB      80 dB - 80 dB       80 dB
Filter order (N)        100        49                  265

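The error budget of (5.10)-(5.11) and the condition (5.14) can be evaluated once the per-tap mean and variance of the multiplier error are known. The sketch below is illustrative only: the per-tap statistics are inputs that must come from the characterization of the multiplier, all quantities are in lsb units, and the `factor` used to decide when the extra error counts as negligible is an arbitrary assumption, not a threshold taken from the thesis.

```python
def fir_error(means, variances, rounding_ms=1 / 12):
    """Mean error (5.10) and mean square error (5.11) of the filter, in lsb units."""
    mu = sum(means)
    mse = rounding_ms + sum(variances) + mu ** 2
    return mu, mse


def truncation_is_negligible(means, variances, factor=10.0, rounding_ms=1 / 12):
    """Check (5.14): the multiplier contribution must be much smaller than 1/12.

    'Much smaller' is interpreted as at least `factor` times smaller (assumption).
    """
    excess = sum(variances) + sum(means) ** 2
    return excess * factor <= rounding_ms
```

Note that the means add up before being squared, so a small systematic error per tap can dominate the budget of a filter with many taps even when the variances are small.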
The first point that emerges from equations (5.10)-(5.11), and from the more
general equations (5.12)-(5.13), is that in this kind of application it is very
important to have truncated multipliers characterized not only by a low
variance but also by a low mean error.
Furthermore, the condition (5.14) is very useful in the design of the optimal
compensation function of a truncated multiplier employed in the MAC unit. In
fact in Ch. 2 the optimal compensation function has been calculated under the
hypothesis of independent and identically distributed bits. Using a different
pdf (which depends on the values that the coefficients $h_i$ of the filter can
assume), the mean of the discarded matrix will change according to the new
probability distribution of the bits, hence a new optimal compensation function
can be computed.
5.2 VLSI implementation
5.2.1 FIR synthesis
In order to analyze the impact of the use of fixed-width multipliers on the
realization of FIR filters, three filters, implemented with a different number
N of taps (see Tab. 5.1), have been considered.
As the error strongly depends on the bit width of the multiplier, different n
values have been considered for the implementation of the filter:
n = 12, 16, 20. The input is assumed to be a signed fractional number
represented in 2's complement, with the weights of the MSB and the LSB
respectively equal to $2^0$ and $2^{-(n-1)}$ (indicated as Q0.(n-1)); filter
coefficients have been quantized on n bits in Q0.(n-1) representation. The
output is also a signed fractional number represented on n+h bits
(Q1.(n-2+h)). For n = 16, each of the low-pass, band-pass and high-pass filters
has been implemented in TSMC 0.18 µm technology using a full-width multiplier
(output rounded to w = n+h bits), the Jou et al. [13] multiplier [JOU], the
Curticapean et al. [14] multiplier [CCP], the Van et al. [15, 16] multiplier
[VAN] and the proposed LMS1b multiplier. This allows the comparison of the
performance of the various multipliers in terms of error, area occupation,
maximum working frequency and power dissipation. For n = 12 and n = 20 the same
implementations have been done for the low-pass filter.
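The Q0.(n-1) coefficient format described above can be sketched as follows. This is an illustrative helper, not the quantization flow used for the synthesized filters: rounding to nearest with Python's built-in `round` and saturation at the ends of the range are assumptions, and the function names are hypothetical.

```python
def quantize_q0(value, n):
    """Integer code of `value` on n bits, 2's complement, weights 2^0 ... 2^-(n-1)."""
    scale = 1 << (n - 1)
    code = round(value * scale)                 # nearest representable code
    return max(-scale, min(scale - 1, code))    # saturate to [-1, 1 - 2^-(n-1)]


def to_real(code, n):
    """Real value represented by an n-bit Q0.(n-1) code."""
    return code / (1 << (n - 1))
```

With n = 16, the value 0.5 maps to the code 16384, while +1.0 is not representable and saturates to 32767, that is $1 - 2^{-15}$.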
The implementations have been repeated for h = 0, 2, 4, 6, 8, 10. It is
not useful to consider a truncated multiplier with h > 10, because its area
occupation is very similar to that of the full-width one.
5.2.2 Comparison with the analytical results
Applying $2^{17}$ samples of an input signal with uniform distribution to the
low-pass filter (n = 16) provides the results shown in Fig. 5.4. In the figure
the simulation values are indicated with symbols, while the theoretical values,
computed with (5.11), are indicated with lines. The comparison confirms the
correctness of the analytical calculation of Sec. 5.1.
The same samples are also applied to the high-pass filter and to the band-pass
filter. The results, averaged over the three considered filters, are shown in
Tab. 5.2. The first and second columns report the mean error and the mean
square error. As an example, for the LMS1b multiplier with h = 2 the mean
square error is 1.93 lsb² for the low-pass filter, 1.10 lsb² for the band-pass
filter and 4.54 lsb² for the high-pass filter, and the average value has been
calculated as 2.52 lsb². The remaining columns show the power dissipation, the
silicon area occupation and the maximum working frequency of the filter. For
each value the percentage variation with respect to the value obtained using a
full-width multiplier is also shown.
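The averaging and the percentage variation described above can be reproduced
with a short sketch. The numbers are those quoted in the text and in Tab. 5.2;
the helper name `percent_variation` is hypothetical. Because the tabulated
values are rounded, a recomputed percentage may differ from the printed one in
the last digit.

```python
from statistics import mean


def percent_variation(value: float, reference: float) -> float:
    """Percentage variation of `value` with respect to the full-width `reference`."""
    return 100.0 * (value - reference) / reference


# Mean square error (lsb^2) of the LMS1b multiplier with h = 2, per filter.
mse_lms1b_h2 = {"low-pass": 1.93, "band-pass": 1.10, "high-pass": 4.54}
average_mse = mean(list(mse_lms1b_h2.values()))
print(f"{average_mse:.2f}")           # 2.52

# Power of LMS1b (57.28) against full-width (74.58) for w = 18, from Tab. 5.2.
power_variation = percent_variation(57.28, 74.58)
print(f"{power_variation:.1f}")
```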
Tab. 5.2 demonstrates that the use of a truncated multiplier whose output has
the same number of bits as the input (h = 0) causes a very large mean error
(nearly 100 lsb in magnitude on average for JOU [13]), even if the silicon area
can be reduced by up to 31% while the working frequency increases by up to 30%.
The best result in terms of mean square error is always provided by the LMS1b
truncated multiplier. In order to get an error lower than 1/2 lsb on the
output, it is necessary to use a truncated multiplier with h = 4. For the LMS1b
multiplier this solution provides satisfactory results in terms of area
occupation (16% saving) and maximum working frequency (22% faster). The power
dissipation reduction is 18%.

[Figure 5.4 plot: mean square error ε²ms (lsb², logarithmic scale from 0.01 to
10^4) versus the number of output bits w (16 to 26). Curves: FULLWIDTH, LMS1b,
CURTICAPEAN, JOU, VAN and FULL PRECISION MAC; lines: theoretical, symbols:
simulation.]
Figure 5.4: Mean square error of the low-pass filter implemented with
different truncated multipliers, varying the number of output bits
w = n + h (n = 16; h = 0, 2, 4, 6, 8, 10). The applied input is a signal
with uniform distribution. The lines represent the theoretical mean
square error, obtained with (5.11), the symbols the simulation results.
The theoretical value is experimentally verified. The orange line is the
mean square error of the full precision MAC with a final rounding.
Only for h = 0 is the LMS1b truncated multiplier characterized by a slightly
higher area and power dissipation (only 1%) than the other truncated
multipliers, but it also has the lowest mean square error. Finally, the table
shows that for h ≥ 6 (5.14) is verified for the LMS1b and CCP [14] multipliers:
their errors are equivalent to that of the reference full-width circuit. As a
consequence these truncated multipliers (16 bits input and 20-22 bits outputs)
can perfectly substitute a full-width multiplier, with a significant
improvement in power, speed and area occupation.
As the reader can see in Tabs. 5.3 and 5.4, this trend is still valid for
n = 12 and n = 20. The best result in terms of mean square error is always
provided by the LMS1b truncated multiplier. Using a truncated multiplier with
h = 4 guarantees an error lower than 1/2 lsb on the output, with 12% (n = 12)
and 17% (n = 20) saving in area occupation, 13% (n = 12) and 23% (n = 20)
power dissipation reduction and a gain in frequency of 24% (n = 12) and
16% (n = 20).

Table 5.2: Performances of the FIR Filters implemented using various
Truncated Multipliers, with n = 16. The results have been obtained
with a uniformly distributed signal. Bold numbers indicate the best
performing circuit for each w(h) value.

w (h)    Architecture     ε_FIR    ε_msFIR    Power        %     Area        %  Freq.        %
                          (lsb)     (lsb²) (µW/MHz)          (10³ µm²)           (MHz)
16 (0)   Full-width        0.00      11.58    67.33             33.73               255
         Jou [13]        -94.09   12702.40    45.94   -31.77    23.41   -30.58    332    30.23
         Ccp [14]          9.13     166.07    48.76   -27.57    23.24   -31.08    303    18.79
         Van [15, 16]     35.74    1856.16    45.92   -31.79    23.40   -30.62    326    27.69
         LMS1b            -1.18      40.61    46.23   -31.33    23.46   -30.43    328    28.52
18 (2)   Full-width        0.00       0.80    74.58             39.41               253
         Jou [13]        -23.48     794.04    58.48   -21.59    31.49   -20.08    284    12.22
         Ccp [14]          0.15       2.99    59.17   -20.65    31.47   -20.14    284    12.22
         Van [15, 16]     11.02     179.92    57.80   -22.50    31.26   -20.67    312    23.05
         LMS1b            -0.20       2.52    57.28   -23.19    31.24   -20.73    312    23.05
20 (4)   Full-width        0.00       0.13    77.08             40.11               252
         Jou [13]         -5.77      48.61    65.30   -15.29    33.96   -15.33    296    17.46
         Ccp [14]          0.07       0.28    65.21   -15.40    33.93   -15.39    297    17.80
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b            -0.06       0.23    63.18   -18.03    33.67   -16.04    307    21.78
22 (6)   Full-width        0.00       0.09    79.36             40.97               251
         Jou [13]         -1.42       3.06    70.45   -11.23    36.11   -11.87    299    19.46
         Ccp [14]          0.00       0.09    70.45   -11.23    36.09   -11.90    299    19.46
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.00       0.09    69.02   -13.03    35.82   -12.58    303    20.91
24 (8)   Full-width        0.00       0.08    81.54             41.69               250
         Jou [13]         -0.34       0.26    73.79    -9.51    37.99    -8.87    292    16.96
         Ccp [14]          0.01       0.08    73.93    -9.33    37.97    -8.91    292    16.96
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.01       0.08    74.55    -8.57    37.66    -9.66    302    20.85
26 (10)  Full-width        0.00       0.08    83.09             42.34               250
         Jou [13]         -0.08       0.09    77.83    -6.33    39.54    -6.61    292    16.96
         Ccp [14]          0.01       0.08    77.93    -6.20    39.53    -6.64    292    16.96
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.00       0.08    77.91    -6.23    39.19    -7.42    299    19.76

Table 5.3: Performances of the Low-Pass Filters implemented using
various Fixed-width Multipliers, for n = 12. The results have been
obtained with a uniformly distributed signal. Bold numbers indicate the
best performing circuit for each w(h) value.

w (h)    Architecture     ε_FIR    ε_msFIR    Power        %     Area        %  Freq.        %
                          (lsb)     (lsb²) (µW/MHz)          (10³ µm²)           (MHz)
12 (0)   Full-width        0.00       7.33    38.34             20.72               324
         Jou [13]         56.59    3216.08    28.94   -24.51    15.63   -24.55    379    17.05
         Ccp [14]         -7.47      68.76    30.24   -21.11    15.81   -23.70    353     9.19
         Van [15, 16]    -25.16     647.14    28.93   -24.54    15.62   -24.60    376    16.17
         LMS1b            -0.76      21.28    29.49   -23.07    15.66   -24.39    380    17.49
14 (2)   Full-width        0.00       0.56    47.32             26.49               281
         Jou [13]         13.80     191.35    39.46   -16.60    22.31   -15.78    346    23.18
         Ccp [14]         -0.42       1.84    39.31   -16.92    22.30   -15.83    346    23.18
         Van [15, 16]     11.02     126.35    38.75   -18.11    22.16   -16.34    348    24.04
         LMS1b            -0.49       1.58    38.16   -19.35    22.13   -16.45    356    26.69
16 (4)   Full-width        0.00       0.12    49.30             27.12               279
         Jou [13]          3.23      10.59    42.55   -13.71    24.07   -11.26    342    22.60
         Ccp [14]          0.07       0.28    42.85   -13.09    24.05   -11.32    342    22.60
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b            -0.01       0.16    43.00   -12.79    23.90   -11.88    347    24.31
18 (6)   Full-width        0.00       0.09    51.12             27.82               275
         Jou [13]          0.72       0.60    46.00   -10.02    25.62    -7.90    330    19.80
         Ccp [14]         -0.03       0.09    46.32    -9.40    25.61    -7.95    330    19.80
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b            -0.01       0.09    46.20    -9.64    25.43    -8.60    332    20.60
20 (8)   Full-width        0.00       0.08    52.88             28.54               273
         Jou [13]          0.14       0.10    49.05    -7.24    26.90    -5.75    326    19.22
         Ccp [14]         -0.02       0.08    49.41    -6.56    26.90    -5.77    326    19.22
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.00       0.08    49.47    -6.44    26.69    -6.49    331    21.19
22 (10)  Full-width        0.00       0.08    54.28             29.26               272
         Jou [13]          0.02       0.08    51.55    -5.03    27.90    -4.66    323    18.71
         Ccp [14]          0.02       0.08    51.55    -5.03    27.90    -4.66    323    18.71
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.00       0.08    51.57    -4.73    27.89    -4.70    325    19.48

[Figure 5.5 plot: mean square error ε²ms (lsb², logarithmic scale from 0.01 to
10^4) versus the number of output bits w (16 to 26). Curves: FULLWIDTH,
CURTICAPEAN, JOU, VAN, LMS1b and FULL PRECISION MAC; lines: theoretical,
symbols: simulation.]
Figure 5.5: Mean square error of the FIR filter implemented with different
truncated multipliers, varying the number of output bits w = n + h
(n = 16; h = 0, 2, 4, 6, 8, 10). The applied input is a sinusoid. The
orange line is the mean square error of the full precision MAC with a
final rounding.
5.2.3 Effect of the probability distribution of the input on the error
In Sec. 5.1 it has been demonstrated that the error depends on the pdf of the
input, (5.12)-(5.13). In order to verify these equations and to analyze the
behavior of the filter as the input varies, every FIR filter considered in the
previous section has been tested by applying $2^{17}$ samples of a sinusoid and
of two signals with different pdf: Gaussian (a signal with a raised cosine
frequency spectrum) and exponential. The results, averaged over the three
considered filters, are shown in Figs. 5.5, 5.6 and 5.7, and are in good
agreement with those obtained using (5.12)-(5.13). The figures show that the
LMS1b multiplier has a very stable behavior as a function of the pdf of the
input. The same holds for the VAN [15, 16] and JOU [13] multipliers. On the
contrary, the CCP [14] multiplier is more dependent on the applied input: it
provides a high error for the exponential distribution, for which the most
probable input is formed by many bits equal to 1, and a low error for the
Gaussian distribution, for which the most probable input is formed by many
bits equal to 0.

[Figure 5.6 plot: mean square error ε²ms (lsb², logarithmic scale from 0.01 to
10^4) versus the number of output bits w (16 to 26). Curves: FULLWIDTH,
CURTICAPEAN, JOU, VAN, LMS1b and FULL PRECISION MAC; lines: theoretical,
symbols: simulation.]
Figure 5.6: Mean square error of the FIR filter implemented with different
truncated multipliers, varying the number of output bits w = n + h
(n = 16; h = 0, 2, 4, 6, 8, 10). The applied input has a Gaussian
probability distribution (raised cosine frequency spectrum). The orange line
is the mean square error of the full precision MAC with a final rounding.
[Figure 5.7 plot: mean square error ε²ms (lsb², logarithmic scale from 0.01 to
10^4) versus the number of output bits w (16 to 26). Curves: FULLWIDTH,
CURTICAPEAN, JOU, VAN, LMS1b and FULL PRECISION MAC; lines: theoretical,
symbols: simulation.]
Figure 5.7: Mean square error of the FIR filter implemented with different
truncated multipliers, varying the number of output bits w = n + h
(n = 16; h = 0, 2, 4, 6, 8, 10). The applied input has an exponential
probability distribution. The orange line is the mean square error of
the full precision MAC with a final rounding.

5.2.4 Example of FIR filter
As an example, a 100-tap low-pass FIR filter has been implemented using the
architecture of Fig. 5.3 with w = n = 20, h = 2. In this example overflow is
prevented by using 4 guard bits, and the final rounding is realized by
initializing the accumulator with the rounding constant before each
accumulation. The sampling frequency is assumed to be 48 MHz. The filter
coefficients have been calculated with the equiripple algorithm by imposing a
1 MHz pass-band, a 1 MHz transition band and a 60 dB attenuation in the
stop-band. Fig. 5.8 shows the frequency response of the filter implemented with
the architecture of Fig. 5.3, by using an LMS1b truncated multiplier with
h = 0. The response is compared with the response of the floating-point filter
and with that of a filter using a full-width (non-rounded) multiplier and a
36 bit accumulator. As can be seen, the three frequency responses are almost
indistinguishable.
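The rounding scheme mentioned above (preloading the accumulator with the
rounding constant) can be illustrated with a minimal integer model. This is a
sketch under stated assumptions, not the circuit of Fig. 5.3: the function
name `mac_with_rounding` and its parameters are hypothetical, the products are
exact integers rather than truncated ones, and the accumulator width is only
checked, not derived from the guard bits.

```python
def mac_with_rounding(samples, coeffs, drop_bits, acc_bits):
    """Accumulate sample*coefficient products, then drop `drop_bits` LSBs.

    Preloading the accumulator with half an output LSB turns the final
    truncation into a round-half-up operation at no extra adder cost.
    """
    rounding_constant = 1 << (drop_bits - 1)    # half of one output LSB
    acc = rounding_constant                     # preload instead of a final add
    limit = 1 << (acc_bits - 1)                 # two's complement range of the accumulator
    for x, c in zip(samples, coeffs):
        acc += x * c
        if not -limit <= acc < limit:           # guard bits are meant to make this impossible
            raise OverflowError("accumulator overflow: more guard bits needed")
    return acc >> drop_bits                     # arithmetic shift, i.e. floor division


# Sum of products is 6 - 20 + 7 = -7, i.e. -1.75 output LSBs, which rounds to -2.
print(mac_with_rounding([3, -5, 7], [2, 4, 1], drop_bits=2, acc_bits=12))
```

Preloading the constant is a common design choice because it reuses the
accumulator reset cycle and avoids a dedicated rounding adder at the output.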
[Figure 5.8 plot: magnitude (dB, from -70 to 10) versus frequency (MHz, from 0
to 3.5). Curves: FIR with floating point arithmetic, FIR with full width
multiplier, FIR with LMS1b truncated multiplier.]
Figure 5.8: Frequency Response of the FIR filter implemented using
the architecture of Fig. 5.2 and Fig. 5.3 with w = n = 20, h = 2.

5.3 Conclusion
The performances provided by different architectures of truncated multipliers,
when used to implement FIR filters using a single Multiply and Accumulate
unit, have been investigated in this chapter. The error introduced by the
truncated multiplier on the output of the FIR filter is analytically
calculated. The analytical equations show that in this kind of application it
is important to use a truncated multiplier characterized by a small mean and
mean square error, for the given coefficients. In fact each result is given by
the accumulation of a series of products, and hence positive and negative
truncation errors have a high probability of compensating each other. Various
FIR filters have been implemented in TSMC 0.18 µm technology, considering
different truncated multipliers and varying the number of output bits. The
analysis shows that truncated multipliers are a suitable replacement for
full-width multipliers and that the best performance is provided by the
proposed LMS1b truncated multiplier, which is able, for the proposed FIR
filters, to keep the mean square error below 1/4 lsb² while providing, for
n = 16, a 16% and 18% reduction of the silicon area occupation and power
dissipation with a 22% increase of the working frequency, and, for n = 20, a
17% and 23% reduction of the silicon area occupation and power dissipation
with a 16% increase of the working frequency. Furthermore, the LMS1b
multiplier is weakly sensitive to the pdf of the input, so the filter
implemented with this multiplier can be used with various inputs while keeping
the same mean square error.
Table 5.4: Performances of the Low-Pass Filters implemented using
various Fixed-width Multipliers, for n = 20. The results have been
obtained with a uniformly distributed signal. Bold numbers indicate the
best performing circuit for each w(h) value.

w (h)    Architecture     ε_FIR    ε_msFIR    Power        %     Area        %  Freq.        %
                          (lsb)     (lsb²) (µW/MHz)          (10³ µm²)           (MHz)
20 (0)   Full-width        0.00       7.33    79.70             53.59               183
         Jou [13]         72.59    5299.94    45.56   -42.84    37.38   -30.26    225    23.20
         Ccp [14]          4.32      56.38    51.63   -35.23    37.70   -29.65    207    12.02
         Van [15, 16]    -26.17     715.50    49.04   -38.47    37.38   -30.25    220    20.22
         LMS1b            -0.48      30.39    47.16   -40.83    37.39   -30.23    220    20.48
22 (2)   Full-width        0.00       0.56    90.61             66.65               182
         Jou [13]         17.85     320.73    63.47   -29.96    53.36   -19.94    205    12.70
         Ccp [14]          0.79       3.05    63.37   -30.06    53.33   -19.99    208    14.35
         Van [15, 16]     -7.15      53.12    70.61   -22.07    53.04   -20.42    220    20.88
         LMS1b            -0.34       2.90    61.97   -31.61    52.99   -20.49    217    19.31
24 (4)   Full-width        0.00       0.11    91.99             67.37               182
         Jou [13]          4.29      18.62    72.81   -20.85    56.41   -16.27    206    13.17
         Ccp [14]         -0.25       0.32    69.98   -23.92    56.37   -16.33    207    13.64
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.27       0.30    71.17   -22.63    56.01   -16.87    212    16.53
26 (6)   Full-width        0.00       0.09    94.84             68.08               182
         Jou [13]          1.06       1.21    81.26   -11.66    59.12   -13.15    204    12.24
         Ccp [14]          0.05       0.09    81.33   -11.58    59.10   -13.19    204    12.24
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.05       0.09    77.62   -15.62    58.71   -13.75    211    15.79
28 (8)   Full-width        0.00       0.08    97.47             68.78               182
         Jou [13]          0.28       0.16    83.46   -14.37    61.56   -10.49    203    11.79
         Ccp [14]          0.01       0.08    84.04   -13.78    61.54   -10.53    203    11.79
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.01       0.08    84.04   -13.78    61.12   -11.13    209    15.06
30 (10)  Full-width        0.00       0.08    99.61             69.49               182
         Jou [13]          0.17       0.09    88.09   -11.57    63.68    -8.35    203    11.79
         Ccp [14]          0.01       0.08    88.19   -11.46    63.67    -8.37    203    11.79
         Van [15, 16]         -          -        -        -        -        -      -        -
         LMS1b             0.00       0.08    87.45   -12.21    63.24    -8.99    207    14.11

Chapter 6
Temperature Control for Gas Sensors
Gas sensors are becoming increasingly important in our everyday lives.
Their current applications include smoke alarms, laboratory analysis,
medicine, automotive and industrial safety. However, the range of applications
is hindered by the high cost (due to the semi-automated manufacturing methods)
and high power consumption (due to the power needed to heat the sensing
material and to reduce its response time) of currently available gas sensors
[38].
Silicon-based micro gas sensors can overcome these problems. The small
size helps in achieving low power consumption, while the use of existing
microelectronics technology can greatly reduce manufacturing costs.
New silicon-based gas sensors using micro-hotplate technology are in great
need of accurate temperature control. In fact, the accurate determination of
the working temperature makes it possible to identify the gas that is detected
and to monitor its concentration precisely. In this chapter the implementation
and the performance of two controllers, On/Off and PI, will be described.
Using the proposed truncated multiplier the PI controller will be optimized,
and it will also be compared with a new controller, based on a mixed
PI-On/Off approach.
6.1 Resistive gas sensors
Nowadays most resistive gas sensors use metal oxides as the sensing
material. Metal oxides (e.g. zinc oxide [39]) react with different gases at
fairly high temperatures (300 °C to 500 °C), and their resistance changes as a
function of the gas concentration. Therefore, from the change in resistance
and the optimum temperature of operation it is possible to determine the
concentration and nature of the gas. However, the high working temperature
demands very large power: e.g. the most widely available resistive gas sensor
(Taguchi type, sold by Figaro, Japan) has a typical operating temperature of
400 °C and a power consumption between 500 mW and 800 mW [40]. Hence,
researchers have been trying to reduce the power consumption by changing the
technology and the design of the gas sensor device.
One of these approaches is the membrane-based gas sensor [41]. This
approach is particularly useful when it is integrated with CMOS, since on-chip
read-out circuits can be used. The membrane structure contains a micro-heater
(to heat up the sensing material), interdigitated electrodes (to measure the
change in resistance of the sensing material) and a temperature sensor (to
measure accurately the temperature of the sensing material). With the membrane
approach it is possible to achieve lower power consumption by thermally
isolating the micro-heater from the rest of the chip through the removal of
the bulk silicon (by deep reactive ion etching or KOH etching). A further
reduction of the power dissipation is obtained by operating the micro-heater
in a fast pulse mode (instead of providing a constant supply all the time). In
this case an accurate temperature control of the micro-heater is particularly
important to maintain the desired temperature in the sensing layer.
Several designs of high temperature micro gas sensors can be found in the
literature [42, 43, 44]. No more details are given here, since this chapter
focuses on the temperature control.
The resistive gas sensor used in this chapter is presented in [45]. It is
a tungsten-based microhotplate fabricated in a commercial silicon on insulator
(SOI)-CMOS process. Tungsten, used as metallization in some commercial SOI
processes (for increasing the junction temperature of the CMOS), has good
mechanical strength and can operate reliably at high temperatures. Thanks to
the reduced power dissipation given by the SOI technology, it is possible to
heat the sensing material up to 450 °C in a short time. The structure is shown
in Fig. 6.1. During the fabrication of the interface circuitry, the tungsten
metal layers (used as interconnect in the CMOS circuitry) are used to form the
heater, a heat spreading plate, and the interdigitated electrodes for gas
sensing. Back etching by deep reactive ion etching (DRIE) is then used to
release the membrane, followed by the deposition of the gas sensing layer.
[Figure 6.1 cross-section labels: tungsten heater, metal heat spreading plate,
gas sensing material, passivation, silicon heat spreading plate, silicon
dioxide, PMOS and NMOS transistors (CMOS), sensor, substrate, buried silicon
dioxide.]
Figure 6.1: Tungsten SOI chip: gas sensor and integrated CMOS circuitry.

The layout of the micro-hotplate device is shown in Fig. 6.2. The heater
is rectangular ring shaped and heats an area of 185 µm × 185 µm.
Interdigitated electrodes have been designed within the heater area, where the
sensing material is deposited. There is also a bipolar transistor temperature
sensor (implemented as a diode) below the heater, which is used by the
controller to measure the temperature of the membrane. The membrane area used
for this particular sensor was 530 µm × 530 µm. The device has been fabricated
in a commercial CMOS foundry, followed by a separate wet etching process (to
form the membrane) in a MEMS foundry.
[Figure 6.2 layout labels: temperature sensor track, interdigitated
electrodes, heater, membrane, heater tracks.]
Figure 6.2: Layout of the microhotplate device.

6.1.1 Interface Circuitry
The interface circuitry consists of the driving circuits for the heater and
the temperature sensor, and of the On/Off circuitry that controls the
temperature of the heater. The micro-heater can be driven in a voltage or a
current control mode. In voltage mode, a current limiting resistor must be
placed in series with the micro-heater in order to increase the lifetime of
the heater [46]; consequently, current drive circuitry is preferred and has
been selected for the purpose of this work.
A cascode current mirror was designed to drive a maximum current of 20 mA
through the heater. The temperature sensor, a bipolar junction transistor with
base and collector terminals shorted together, was driven by a constant
current (of about 65 µA). The voltage across the temperature sensor decreases
linearly as the temperature increases (from 0.87 V at room temperature to
0.45 V at 375 °C, i.e. -1.2 mV/°C). A detailed discussion of the temperature
sensor performance was reported elsewhere [47]. The circuit schematic is shown
in Fig. 6.3, which also shows the A/D converter and the temperature
controller.
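The linear sensor characteristic quoted above can be inverted to estimate the
temperature from the measured voltage. The sketch below uses the two points
given in the text; the value of 25 °C for "room temperature" is an assumption
(the text gives no number), and the function name `sensor_temperature` is
hypothetical.

```python
V_ROOM = 0.87     # sensor voltage at room temperature, in volts (from the text)
V_HOT = 0.45      # sensor voltage at 375 degrees C, in volts (from the text)
T_HOT = 375.0     # degrees C (from the text)
T_ROOM = 25.0     # degrees C, ASSUMPTION: the text only says "room temperature"

# Slope of the linear characteristic, in volts per degree C (about -1.2 mV/C).
SLOPE = (V_HOT - V_ROOM) / (T_HOT - T_ROOM)


def sensor_temperature(voltage: float) -> float:
    """Temperature estimated from the sensor voltage with the linear model."""
    return T_ROOM + (voltage - V_ROOM) / SLOPE


print(f"slope = {SLOPE * 1e3:.2f} mV/C")
print(f"T(0.66 V) = {sensor_temperature(0.66):.1f} C")
```

With the assumed room temperature the slope evaluates to -1.2 mV/°C, which
agrees with the figure quoted in the text.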
6.2 Temperature Control
It is very important to accurately control the temperature of the sensing
layer. Variations in the ambient temperature, metal electromigration in the
heater, or simply the variability in the dimensions of the heater during
fabrication often lead to significant variations in the power vs. temperature
characteristic. As a consequence it is not possible to rely on a power
measurement to estimate the temperature of the sensing layer. The only way to
accurately control the temperature of the sensing layer is to use an accurate
temperature sensor embedded in the micro-hotplate, in conjunction with a CMOS
controller that detects and adjusts the sensing layer temperature.
Several classes of controllers have been implemented with digital CMOS
circuitry, because of the great advantages that digital circuits provide over
their analogue counterparts: they are less expensive, more reliable, easy to
manipulate and flexible. As shown in Fig. 6.3, a temperature sensor reads the
temperature of the microhotplate. This value y(t) is converted into digital
form and compared with the desired value of temperature $R(kT_S)$, where $T_S$
is the sampling period. The controller takes the error
$e(kT_S) = R(kT_S) - y(kT_S)$ and, depending on the implemented control
function, drives the microhotplate in order to reach the desired temperature.

[Figure 6.3 block diagram: membrane with heater and temperature sensor, A/D
converter (sampling period Ts) and temperature controller; signals y(t),
y(kTs), R(kTs), u(t) and data.]
Figure 6.3: Schematic of the circuit composed of the gas sensor
(membrane), the cascode current mirror driving the heater and the temperature
sensor, the A/D converter and a temperature controller. y(t) is the
measured temperature, $R(kT_S)$ the desired temperature, u(t) is the control
variable and $T_S$ is the sampling period.
6.2.1 On/Off control
An On/Off controller is the simplest form of a temperature control device. The
output from the device is either on or off, with no middle state. When the
microhotplate is cooler than the set-point temperature the heater is turned on
at maximum power, and once it is hotter than the set-point temperature the
heater is switched off completely. The On/Off control is designed to include
hysteresis: there is a deadband, a region around the setpoint value in which
no control action is taken. The width of the deadband is adjustable from
outside the chip (see Fig. 6.4).

[Figure 6.4 plot: control variable u(t) [Volt] versus the error e(kTs),
switching between OFF and ON with a deadband Δ around the setpoint.]
Figure 6.4: On/Off controller.
The working principle of the On/Off controller can be described with the
following equation. The output of the controller at the (k+1)-th instant is:

\[
u\big((k+1)T_S\big) =
\begin{cases}
0 & \text{if } u(kT_S) = 1 \text{ and } e(kT_S) < -\Delta \\
1 & \text{if } u(kT_S) = 0 \text{ and } e(kT_S) > \Delta \\
u(kT_S) & \text{otherwise}
\end{cases}
\tag{6.1}
\]

where:
1. $T_S$ is the sampling period;
2. $e(kT_S)$ is the difference between the setpoint $R(kT_S)$ (desired
temperature) and the measured temperature $y(kT_S)$;
3. $u(kT_S)$ is the output of the controller at the k-th instant;
4. $\Delta$ is the width of the deadband.
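Equation (6.1) translates directly into a small state update. The sketch
below is a behavioural illustration only, not the Verilog implementation; the
names `on_off_step` and `simulate` are hypothetical.

```python
def on_off_step(u_prev: int, error: float, delta: float) -> int:
    """Next heater state (0 = off, 1 = on) according to Eq. (6.1)."""
    if u_prev == 1 and error < -delta:
        return 0                      # too hot by more than the deadband: switch off
    if u_prev == 0 and error > delta:
        return 1                      # too cold by more than the deadband: switch on
    return u_prev                     # inside the deadband: keep the previous state


def simulate(errors, delta, u0=0):
    """Apply the update to a sequence of errors e = R - y, one per sampling period."""
    u, outputs = u0, []
    for e in errors:
        u = on_off_step(u, e, delta)
        outputs.append(u)
    return outputs


# The heater switches only when the error leaves the band [-2, 2].
print(simulate([5, 1, -1, -3, -1, 1, 3], delta=2))   # [1, 1, 1, 0, 0, 0, 1]
```

The run shows the hysteresis: errors of 1 and -1 leave the output unchanged,
whichever state the heater is in.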
On/Off control is very simple and cheap, but it causes a persistent
oscillation of the process variable, with continuous cycling of the controlled
variable and excess wear on the final control element. Hence it is usually
used where precise control is not necessary, in systems which cannot handle
frequent On/Off switching, or in those cases in which the system temperature
changes extremely slowly.
6.2.2 PI control
Proportional controls are designed to eliminate the cycling associated with
the On/Off control. A proportional controller decreases the average power
supplied to the heater as the temperature approaches the setpoint. This has
the effect of slowing down the heater so that it will not overshoot the
setpoint, but will approach the setpoint and maintain a stable temperature. In
order to reduce to zero the steady-state error produced by the proportional
action (rejecting the offset), an integral action is also applied. This second
term is proportional to the integral of the error. The control output is
therefore given by:

\[
u(t) = K_P \left[ e(t) + \frac{1}{T_i} \int_0^t e(\tau)\, d\tau \right]
\tag{6.2}
\]

where e(t) is the difference between the desired and the measured value of
temperature, $K_P$ is the proportional gain and $T_i$ the integral time. The
ideal PI controller has been digitized and its output is used by a PWM block
to set the duty cycle of a signal applied to the heater. In Pulse Width
Modulation (PWM) the duty cycle of a square wave is modulated to encode a
specific analog signal level: the voltage source is supplied to the analog
load by means of a repeating series of on and off pulses. The on-time is the
time during which the DC supply is applied to the load, the off-time is the
period during which that supply is switched off. Hence the heater will be
driven with a mean voltage equal to that needed to keep the desired value of
temperature.
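The relation between duty cycle and mean voltage described above can be
checked with a short model. It is a sketch under assumptions: the PWM period
is taken to be $2^8$ counts of a counter driven by an 8-bit duty code, which
the text does not specify, and the names `pwm_period` and `mean_voltage` are
hypothetical.

```python
def pwm_period(duty_code: int, bits: int = 8):
    """One PWM period as 0/1 samples: on for `duty_code` counts, then off."""
    period = 1 << bits                # ASSUMPTION: the period lasts 2^bits counts
    return [1 if count < duty_code else 0 for count in range(period)]


def mean_voltage(duty_code: int, supply: float, bits: int = 8) -> float:
    """Mean voltage seen by the load over one PWM period."""
    wave = pwm_period(duty_code, bits)
    return supply * sum(wave) / len(wave)


print(mean_voltage(128, 5.0))         # half duty cycle on a 5 V supply: 2.5
```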
In order to formulate a discrete PI controller, we apply the backward
difference method to the expression of the system in the Laplace domain,
obtaining:

\[
u(kT_S) = u(kT_S - T_S) + K_P \big( e(kT_S) - e(kT_S - T_S) \big)
        + \frac{K_P T_S}{T_i}\, e(kT_S)
\tag{6.3}
\]

Finally the output is given by:

\[
u(n) = u(n-1) + a \cdot e(n) + b \cdot e(n-1)
\tag{6.4}
\]

where n denotes the sampling instant $kT_S$ and, comparing with (6.3),
$a = K_P (1 + T_S/T_i)$ and $b = -K_P$.

Figure 6.5: Digital implementation of PI algorithm.
Figure 6.6: Digital implementation of an optimized PI algorithm.

This expression is also known as the velocity PI algorithm: the calculation of
the current control value uses the previous control value as a reference,
hence the control is calculated as a change. The digital implementation of the
PI controller is represented in Fig. 6.5. As shown in this figure, there are
two registers, a 9-bit register which stores the previous value of the error
and an 18-bit register which keeps the previous value of the control variable,
and a saturation block, which prevents the overflow of the control variable
(since at each clock cycle a positive quantity could be added to the previous
value). Finally, with a rounding operation, the output is converted from 18 to
8 bits, in order to obtain the signal to be applied to the PWM block.
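The velocity PI update of (6.4), with the two registers, the saturation block
and the final 18-to-8 bit rounding described above, can be modelled as
follows. This is a behavioural sketch, not the circuit of Fig. 6.5: the class
name `VelocityPI`, the unsigned range assumed for the control word, the
integer scaling of the coefficients and the round-half-up rule are all
assumptions made for illustration.

```python
U_BITS = 18                           # width of the control-variable register
OUT_BITS = 8                          # width of the word sent to the PWM block
U_MAX = (1 << U_BITS) - 1             # ASSUMPTION: the control word is unsigned


def pi_coefficients(kp: float, ts: float, ti: float):
    """Coefficients a and b of Eq. (6.4), obtained from Eq. (6.3)."""
    return kp * (1.0 + ts / ti), -kp


class VelocityPI:
    def __init__(self, a: int, b: int):
        self.a, self.b = a, b         # ASSUMPTION: coefficients pre-scaled to integers
        self.e_prev = 0               # register holding the previous error
        self.u_prev = 0               # register holding the previous control value

    def step(self, error: int) -> int:
        u = self.u_prev + self.a * error + self.b * self.e_prev
        u = max(0, min(U_MAX, u))     # saturation block: clamp instead of wrapping
        self.e_prev, self.u_prev = error, u
        shift = U_BITS - OUT_BITS
        rounded = (u + (1 << (shift - 1))) >> shift   # round half up to 8 bits
        return min(rounded, (1 << OUT_BITS) - 1)


pi = VelocityPI(a=1100, b=-1000)
print([pi.step(e) for e in (100, 100, -255, 255)])    # [107, 117, 0, 255]
```

The last two outputs show the saturation block at work: a large negative
correction clamps the control word at zero, and a large positive one clamps it
at full scale.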
In the circuit of Fig. 6.5 most of the area is occupied by the two
multipliers. Hence it can be useful to use the proposed LMS1b truncated
multiplier for the implementation of the PI controller. Using this truncated
multiplier an optimized PI controller has been implemented. The final circuit
is represented in Fig. 6.6. As shown in this figure, the output of the
multiplication has the same number of bits as the inputs. Therefore, as
already mentioned, the area occupation is reduced because of the use of this
fixed-width multiplier, and furthermore the second register is formed by only
9 bits, as will be shown in Sec. 6.3.

Figure 6.7: Mixed controller.
6.2.3 Mixed control
PI controllers allow a better adjustment of the system compared to the On/Off
controller, leading to a more accurate control of the variable. There are,
however, some advantages in using an On/Off controller instead of a PI
controller: the On/Off controller is cheap, and both its design and its
operating principle are simple. Nevertheless it is inefficient, it can
generate noise (because it can dramatically overshoot or undershoot the
set-point), and it gives low accuracy on the control. Hence a good solution
could be to use a mixed control.
The idea of the mixed control is to apply the best possible control depending
on the current value of the error. If the error is very high it is not
necessary to apply a PI control to the heater: we can simply switch it on or
off depending on the sign of the error. If the error is positive (the desired
value is higher than the measured value) the heater is switched on, otherwise
it is switched off. Conversely, when the value of the error becomes small, we
can apply a more accurate control, such as a PI control. The final circuit is
shown in Fig. 6.7. The error is split into two parts: the most significant
part (MSP), formed by the first 5 bits of the error, and the least significant
part (LSP), formed by the last 4 bits.
If all the bits of the MSP are equal to 0 or all equal to 1, the absolute
value of the error is smaller than a small quantity, so we can apply a PI
control. In this case we only need a 4-bit multiplier, because the other part
of the operand is known (it is formed by all 0s or all 1s). Otherwise we can
directly apply an On/Off control: depending on the sign of the error we switch
the heater on or off. Finally a multiplexer selects which control is applied.
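The selection rule described above can be sketched as follows, assuming that
the 9-bit error is encoded in two's complement (an assumption consistent with
the "all 0 or all 1" test on the MSP, but not stated explicitly in the text).
The function names `split_error` and `select_mode` are hypothetical.

```python
ERR_BITS = 9                          # width of the error word
LSP_BITS = 4                          # least significant part
MSP_BITS = ERR_BITS - LSP_BITS        # most significant part (5 bits)


def split_error(error: int):
    """Split a signed error into (MSP, LSP) fields of its two's complement code."""
    raw = error & ((1 << ERR_BITS) - 1)          # two's complement encoding on 9 bits
    return raw >> LSP_BITS, raw & ((1 << LSP_BITS) - 1)


def select_mode(error: int) -> str:
    """Return 'PI' for small errors, otherwise 'ON' or 'OFF' from the sign."""
    msp, _ = split_error(error)
    if msp == 0 or msp == (1 << MSP_BITS) - 1:   # MSP all zeros or all ones
        return "PI"
    return "ON" if error > 0 else "OFF"


for e in (0, 15, 16, -16, -17):
    print(e, select_mode(e))
```

Under this encoding the PI branch is selected for errors between -16 and 15,
and the On/Off branch for every larger error.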
6.3 Silicon Implementation
The proposed controllers were designed in Verilog and synthesized using RTL
Compiler; then the layout of the final chip (2 mm × 3 mm) was generated with
SoC Encounter. The design has been carried out in a 0.8 µm CMOS process
with three levels of metal. The chip layout is shown in Fig. 6.8. It is
possible to identify the Mixed controller, the PI controller, the Optimized PI
controller and the On/Off controller (implemented using two or three levels of
metal). The differences in terms of area occupation between the controllers
are visible. These differences are quantified in Tab. 6.1, where the other
hardware characteristics are also reported. The On/Off controller provides the
minimum area occupation and power dissipation and the maximum working
frequency. If, however, the most important requirement is the accuracy, one
should use a PI controller. The problem is that it occupies an area five times
larger than that of the On/Off controller.
By using a truncated multiplier the same accuracy is obtained, and the area
occupation and power dissipation of the PI controller can be reduced by as
much as 50%.
The best compromise between accuracy and physical implementation is
provided by the novel Mixed controller. It is quite rough if the error is very
large, but very precise when the error becomes small. Its area is only about
double that of the On/Off controller, and its power dissipation is also
relatively low.
Table 6.1: Comparison Between Controllers.

Controller                Frequency [MHz]   Power [µW/MHz]   Area [mm²]
On/Off Controller                      60              157         0.08
PI Controller                          22              825         0.41
Optimized PI Controller                25              340         0.26
Mixed Controller                       25              390         0.18

Since a precise temperature control is important for accurate gas
detection, it is very useful to have this control on-chip. Fig. 6.9 shows the
layout of the microhotplate integrated with one of the analyzed controllers,
the On/Off. In the figure it is possible to see the analog driver circuit, the
micro-heater, the On/Off controller and the A/D converter. The design allows
the deadband region to be changed from outside the chip. The ability to define
the frequency and the deadband provides a simple means to digitally control
the accuracy of the micro-hotplate temperature, and hence of the gas
detection.
[Figure 6.8 layout labels: Mixed controller, On/Off controller (2 metal and
3 metal versions), PI controller, Optimized PI controller.]
Figure 6.8: Digital chip with controllers.
[Figure 6.9 layout labels: analog driver circuit, microheater, On/Off
controller, A/D converter.]
Figure 6.9: Mixed signal chip.

Conclusions
This dissertation has presented methods for the design of truncated
multipliers and squarers that achieve area, delay and power benefits with the
minimum mean square error.
State-of-the-art truncated multipliers show substantial limits. In particular,
they are not derived from an analytical theory but rather heuristically or
with the help of exhaustive searches. Thus, in many cases the proposed
techniques cannot be applied to multipliers with large bit widths (say, 24 or
32 bits) and/or cannot be considered for a possible implementation in
automatic synthesis tools. In addition, the errors are computed numerically
through exhaustive simulations. This approach can be pursued only for small n
values since the simulation time increases as $O(2^{2n})$, requiring an
unreasonable amount of CPU time when n increases.
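To make the cost of the exhaustive approach concrete, the sketch below evaluates the mean square error of a plain, uncompensated truncated multiplier by visiting all input pairs. It is not the LMS architecture of this thesis: the function names, the 0-based bit indexing and the rule "keep the partial products of weight at least 2^(n-h)" are assumptions made only for illustration.

```python
def truncated_product(x, y, n, h):
    """Sum only the partial products x_i*y_j of weight >= 2**(n-h).

    Bits are indexed from 0 (LSB); no compensation term is added.
    """
    total = 0
    for i in range(n):
        for j in range(n):
            if i + j >= n - h and (x >> i) & 1 and (y >> j) & 1:
                total += 1 << (i + j)
    return total


def exhaustive_mse(n, h):
    """Visit all 2**(2n) input pairs; return (pairs visited, mean square error)."""
    pairs = 0
    squared_error = 0
    for x in range(1 << n):
        for y in range(1 << n):
            err = x * y - truncated_product(x, y, n, h)
            squared_error += err * err
            pairs += 1
    return pairs, squared_error / pairs


for n in (2, 4, 6):
    pairs, mse = exhaustive_mse(n, h=1)
    print(f"n={n}: {pairs} input pairs, MSE={mse:.3f}")
```

Each extra input bit multiplies the number of visited pairs by four, which is why this kind of loop becomes impractical for the 24- or 32-bit cases mentioned above.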
On the contrary, the approach proposed in this dissertation is analytical.
Looking at the error due to the approximation of some bits of the result with
a compensation function, the optimal compensation function, which minimizes
the mean square error, has been computed.
The first result is that the optimal compensation function is a quadratic
form of the partial products of the IC; this is due to the approximation of a
matrix with a vector.
A second important result is the evaluation, in closed form, of the intrinsic
error, i.e. the mean square error introduced by the optimal compensation
function. Until now it has been thought that, by using a suitable compensation
function, the lowest error that could be obtained was the rounding error;
instead, it has been demonstrated that the lowest error is represented by
$\varepsilon^2_{low\,bound}$, independent of the chosen compensation function.
The analytical relation between this error and the parameters of the hardware
implementation gives the designer the minimum characteristics that the
truncated multiplier must have in order to provide the desired accuracy.
The optimal compensation function, being a quadratic form, cannot be
efficiently implemented in hardware. Therefore the performance achievable by
using a linear compensation function, best suited for hardware implementation,
has been investigated. It is shown that the additional error due to the
linear compensation function is negligible, pointing out that this solution is
the best choice from a practical point of view. The effect of coefficient
quantization is also treated by providing the quantized optimal coefficients,
using two different levels of quantization, given by only one bit (LMS1b
truncated multiplier) and two bits (LMS2b truncated multiplier).
The mean square error and the maximum absolute error of the LMS truncated
multiplier have been computed in closed form. This is one of the most
important characteristics of the proposed LMS multiplier. It is the only
architecture that can be designed, for every bit width, using an analytical
approach that allows the a priori knowledge of the committed error. Without
the analytical solution, the error can only be computed using slow exhaustive
simulations, possible only for low n values.
The practical implementation of truncated multipliers with the quantized
linear compensation function has been analyzed, and it has been demonstrated
that the implementation of the LMS1b multiplier is straightforward, while the
optimal implementation of LMS2b multipliers is more challenging. The best
approach uses a small auxiliary carry-save tree to minimize the hardware. The
comparison of the performance of the truncated multipliers developed in this
thesis with previously proposed circuits has demonstrated that the proposed
solution is the one which provides the best trade-off between hardware and
accuracy.
All the results have been extended to the truncated binary squarer. Also
in this case the analytical solution for the optimal compensation function,
which minimizes the mean square error, and its linear approximation have been
obtained. The analytical technique results in a very simple circuit. Compared
with other truncated squarer circuits proposed in the literature, the LMS
squarer reduces the mean square error, the power dissipation and the silicon
area occupation, and increases the maximum working frequency.
The error provided by the truncated multiplier on the output of the FIR fil-
ter has been analytically calculated. The analytical equations show that in this
kind of application it is important to use a truncated multiplier characterized
by a small mean and mean-square error, for the given coefficients.
Various FIR filters have been implemented in TSMC 0.18 μm technology,
considering different truncated multipliers and varying the number of output
bits. The analysis shows that truncated multipliers are a suitable replacement
for the full-width multipliers and that the best performance is provided by
the proposed LMS1b truncated multiplier. Furthermore, the LMS1b multiplier is
only weakly sensitive to the pdf of the input, so the filter implemented with
this multiplier can be used with various inputs, keeping the same mean square
error.
Finally, it has been demonstrated that the use of truncated multipliers can
be useful also in other fields, for example in the implementation of the PI
temperature control. By using the truncated multiplier it is possible to save
area and power, even if the accuracy is slightly lower. Note that, in order to
obtain a better accuracy, the coefficients of the controller can be adjusted
(automatic tuning is also possible).

Appendix A
Intrinsic Error
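A.1 Computing $\sigma^2_{i,j}$
By using (2.47) in (2.55) one obtains:
$$
\begin{aligned}
\sigma^2_{ij}(A) &= \mu_{ij}(A) - \mu^2_{ij}(A) = \mu_{ij}(A)\,[1 - \mu_{ij}(A)] = \\
&= \frac{1}{9}(1 + 2\alpha_{i-h})(1 + 2\alpha_{n+1-j})
\left[1 - \frac{1}{9}(1 + 2\alpha_{i-h})(1 + 2\alpha_{n+1-j})\right]
\end{aligned}
\tag{A.1}
$$
Developing the product in (A.1) and remembering that $\alpha_i^2 = \alpha_i$
(the elements of IC are binary), one obtains:
$$
\sigma^2_{ij}(A) = \frac{1}{81}\left[8 + 10\alpha_{i-h} + 10\alpha_{n+1-j}
- 28\alpha_{i-h}\alpha_{n+1-j}\right] \tag{A.2}
$$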
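The expansion that leads to (A.2) can be checked by enumerating the four possible values of the two IC elements involved. The sketch assumes, as recalled in Sec. A.2, that the conditional mean of a bit is 1 when its IC element is 1 and 1/3 otherwise, so that the conditional mean of a partial product is (1+2a)(1+2b)/9; the names `a`, `b`, `mean` and `variance_expanded` are illustrative.

```python
from fractions import Fraction
from itertools import product


def mean(a, b):
    """Conditional mean of a partial product whose IC elements equal a and b."""
    return Fraction((1 + 2 * a) * (1 + 2 * b), 9)


def variance_expanded(a, b):
    """Right-hand side of (A.2)."""
    return Fraction(8 + 10 * a + 10 * b - 28 * a * b, 81)


for a, b in product((0, 1), repeat=2):
    mu = mean(a, b)
    # A binary variable with mean mu has variance mu * (1 - mu), as in (A.1).
    assert mu * (1 - mu) == variance_expanded(a, b)
    print(a, b, variance_expanded(a, b))
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.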
A.2 Computing the covariance $COV_{i,j,l,m}(A)$
In order to compute the covariance (2.56) in explicit form, we need to compute
the mean of the product $x_i y_j \cdot x_l y_m$ in every set $\Omega(A)$. The
partial products $x_i y_j$ and $x_l y_m$, correlated to the IC elements
$\alpha_{i-h}$, $\alpha_{n+1-j}$ and $\alpha_{l-h}$, $\alpha_{n+1-m}$
respectively, are conditionally independent for each fixed value of IC (which
fixes the values that the bits of the product can assume). Hence:
$$
E_{IC=A}\{x_i y_j \cdot x_l y_m\} =
E_{IC=A}\{x_i\}\, E_{IC=A}\{y_j\}\, E_{IC=A}\{x_l\}\, E_{IC=A}\{y_m\} \tag{A.3}
$$
where the conditioned mean is equal to 1 if the correlated element of IC is
equal to 1, and to 1/3 otherwise (see (2.47)). Five different cases must be
considered.
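The two values of the conditioned mean can be reproduced with a very small enumeration. The sketch assumes that the input bits are independent and uniformly distributed, which is not restated in this appendix; `conditional_mean_of_x` is an illustrative name.

```python
from fractions import Fraction
from itertools import product


def conditional_mean_of_x(ic_value):
    """E{x | x*y == ic_value} for two independent, uniformly distributed bits."""
    xs = [x for x, y in product((0, 1), repeat=2) if x * y == ic_value]
    return Fraction(sum(xs), len(xs))


print(conditional_mean_of_x(1))  # IC element equal to 1: both bits are forced to 1
print(conditional_mean_of_x(0))  # IC element equal to 0: three equally likely pairs
```

When the IC element is 0, only the pairs (0,0), (0,1) and (1,0) remain, and the bit is 1 in one of them.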
Case 1: j=m This condition happens when the two partial products $x_i y_j$ and
$x_l y_m$ belong to the same row of the LSPminor. Hence the product
$x_i y_j x_l y_j = x_i y_j x_l$ is correlated to three elements of IC:
$\alpha_{i-h}$, $\alpha_{n+1-j}$ and $\alpha_{l-h}$. Applying (A.3), the mean
of the product is:
$$
E_{IC=A}\{x_i y_j \cdot x_l y_j\} =
\frac{1}{27}(1 + 2\alpha_{i-h})(1 + 2\alpha_{n+1-j})(1 + 2\alpha_{l-h}) \tag{A.4}
$$
$$
E_{IC=A}\{x_i y_j x_l\} \in \left\{1, \frac{1}{3}, \frac{1}{9}, \frac{1}{27}\right\}
$$
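Case 1 can also be checked by brute force. In the sketch below each of the three bits of the product is paired with a hypothetical partner bit that completes its IC element; all six bits are assumed independent and uniform, and the names (`p_i`, `p_j`, `p_l`, `conditional_means`) are introduced only for this example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def conditional_means():
    """Map each IC value (a, b, c) to E{x_i * y_j * x_l | IC = (a, b, c)}."""
    count = defaultdict(int)
    total = defaultdict(int)
    # IC elements: a = x_i*p_i, b = y_j*p_j, c = x_l*p_l (p_* are partner bits).
    for x_i, p_i, y_j, p_j, x_l, p_l in product((0, 1), repeat=6):
        ic = (x_i & p_i, y_j & p_j, x_l & p_l)
        count[ic] += 1
        total[ic] += x_i & y_j & x_l
    return {ic: Fraction(total[ic], count[ic]) for ic in count}


means = conditional_means()
for (a, b, c), value in sorted(means.items()):
    # Compare with the product form (1+2a)(1+2b)(1+2c)/27 of (A.4).
    assert value == Fraction((1 + 2 * a) * (1 + 2 * b) * (1 + 2 * c), 27)
    print((a, b, c), value)
```

The enumeration only produces the values 1, 1/3, 1/9 and 1/27, in agreement with the set listed after (A.4).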
Case 2: i=l This is the dual of case 1, when the two partial products belong
to the same diagonal of LSPminor. The generic element
$x_i y_j x_i y_m = x_i y_j y_m$ is correlated to three elements of IC:
$\alpha_{i-h}$, $\alpha_{n+1-j}$ and $\alpha_{n+1-m}$. Applying (A.3), the
mean of the product is:
$$
E_{IC=A}\{x_i y_j \cdot x_i y_m\} =
\frac{1}{27}(1 + 2\alpha_{i-h})(1 + 2\alpha_{n+1-j})(1 + 2\alpha_{n+1-m}) \tag{A.5}
$$
$$
E_{IC=A}\{x_i y_j y_m\} \in \left\{1, \frac{1}{3}, \frac{1}{9}, \frac{1}{27}\right\}
$$
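Case 3: m+i=n+h+1 In this case $x_i y_m = x_i y_{n+h+1-i}$ is an element of the
IC. Hence it is fixed in each set $\Omega(A)$:
$$
\begin{aligned}
E_{IC=A}\{x_i y_j \cdot x_l y_m\} &= x_i y_m\, E_{IC=A}\{y_j \cdot x_l\}
= \alpha_{i-h} \cdot \mu_{l,j} = \\
&= \alpha_{i-h}\, \frac{1}{9}(1 + 2\alpha_{l-h})(1 + 2\alpha_{n+1-j})
\end{aligned}
\tag{A.6}
$$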
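Case 4: l+j=n+h+1 This is the dual of case 3:
$$
\begin{aligned}
E_{IC=A}\{x_i y_j \cdot x_l y_m\} &= x_{n+h+1-j}\, y_j\, E_{IC=A}\{x_i \cdot y_m\}
= \alpha_{n+1-j} \cdot \mu_{i,m} = \\
&= \alpha_{n+1-j}\, \frac{1}{9}(1 + 2\alpha_{i-h})(1 + 2\alpha_{n+1-m})
\end{aligned}
\tag{A.7}
$$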
Case 5: uncorrelated partial products When none of the previous cases applies,
the two partial products are uncorrelated and hence the mean of their product
is equal to the product of their means:
$$
E_{IC=A}\{x_i y_j \cdot x_l y_m\} = \mu_{i,j} \cdot \mu_{l,m} \tag{A.8}
$$
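As a worked step, the means of the five cases can be turned into covariances. The lines below assume the usual definition of the covariance referred to as (2.56), i.e. the mean of the product minus the product of the means, and use the shorthand a, b, c (introduced here, not in the thesis) for the binary IC elements tied to the bits involved, with $(1+2b)^2 = 1+8b$ because $b^2=b$.

```latex
% Case 1: shared bit y_j, with a = \alpha_{i-h}, b = \alpha_{n+1-j}, c = \alpha_{l-h}.
% Case 3: fixed IC element \alpha_p with p = i-h, b = \alpha_{n+1-j}, c = \alpha_{l-h}.
\begin{align*}
\text{Case 1:}\quad COV
  &= \tfrac{1}{27}(1+2a)(1+2b)(1+2c) - \tfrac{1}{81}(1+2a)(1+2b)^2(1+2c) \\
  &= \tfrac{2}{81}\,(1-b)\,(1+2a)(1+2c) \\
\text{Case 3:}\quad COV
  &= \tfrac{1}{9}\,\alpha_p\,(1+2b)(1+2c) - \tfrac{1}{81}(1+2\alpha_p)^2(1+2b)(1+2c) \\
  &= \tfrac{1}{81}\,(\alpha_p-1)\,(1+2b)(1+2c) \\
\text{Case 5:}\quad COV &= 0
\end{align*}
```

Cases 2 and 4 follow by duality. Expanding the two factorized forms gives the eight-term brackets, with prefactors 2/81 and 1/81, that appear in the terms $T_1$, $T_2$ and $T_3$ of Sec. A.3.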
A.3 Computing the intrinsic error
In order to simplify the reasoning, the second expression in (2.10) can be
rearranged as follows:
$$
\sum_{i=1}^{n-h-1}\ \sum_{j=2}^{n-h-i+1} x_{i+j-1+h}\; y_{n+1-i}\; 2^{-n-h-j} \tag{A.9}
$$
$\varepsilon^2_{intrinsic}$ is given by two contributions, $S_{ij}$ and
$SCOV_{ijlm}$.
As told in Sec. 2.3.1, $S_{ij}$ is the sum of the variances of each partial
product, $\sigma^2_{i,j}(A)$ (multiplied by the square of the partial product
weight, which from (A.9) is $2^{-n-h-j}$):
$$
S_{ij} = \sum_{i=1}^{n-h-1}\ \sum_{j=2}^{n-h-i+1}
\sigma^2_{i+j-1+h,\,n+1-i}(A)\; 2^{-2n-2h-2j} \tag{A.10}
$$
Substituting (A.2) in (A.10) one obtains:
$$
S_{ij} = \sum_{i=1}^{n-h-1}\ \sum_{j=2}^{n-h-i+1}
\frac{1}{81}\left[8 + 10\alpha_{i+j-1} + 10\alpha_i - 28\alpha_{i+j-1}\alpha_i\right]
2^{-2n-2h-2j} \tag{A.11}
$$
$SCOV_{ijlm}$ is the sum of twice the covariance between every couple of
different partial products, $COV_{i,j,l,m}(A)$ (multiplied by the product of
their weights). $SCOV_{ijlm}$ can be divided in three terms, in order to
highlight the single addends discussed previously.
The first term, $T_1$, is given by the covariance of the terms on the same
row, as computed in (A.4):
$$
\begin{aligned}
T_1 = 2^{-2h-2n+1} \sum_{i=1}^{n-h-2}\ \sum_{j=2}^{n-h-i}\ \sum_{m=j+1}^{n-h-i+1}
2^{-j-m}\, \frac{2}{81}\, \big(& 1 - \alpha_i + 2\alpha_{i+j-1} + 2\alpha_{i+m-1}
- 2\alpha_i\alpha_{i+j-1} - 2\alpha_i\alpha_{i+m-1} + \\
& + 4\alpha_{i+j-1}\alpha_{i+m-1} - 4\alpha_i\alpha_{i+j-1}\alpha_{i+m-1}\big)
\end{aligned}
\tag{A.12}
$$
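The second term, $T_2$, is given by the covariance of the terms discussed in
case 2 (A.5):
$$
\begin{aligned}
T_2 = 2^{-2h-2n+1} \sum_{i=1}^{n-h-2}\ \sum_{j=3}^{n-h+1-i}\ \sum_{m=i+1}^{i+j-2}
2^{-i-2j+m}\, \frac{2}{81}\, \big(& 1 - \alpha_{i+j-1} + 2\alpha_i + 2\alpha_m
- 2\alpha_{i+j-1}\alpha_i - 2\alpha_{i+j-1}\alpha_m + \\
& + 4\alpha_i\alpha_m - 4\alpha_i\alpha_{i+j-1}\alpha_m\big)
\end{aligned}
\tag{A.13}
$$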
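The final term, $T_3$, is given by both the terms discussed in case 3 and in
case 4, whose reference equations are (A.6) and (A.7):
$$
\begin{aligned}
T_3 = 2^{-2h-2n+1} \sum_{i=1}^{n-h-2}\ \sum_{j=2}^{n-h-i}\ \sum_{m=2}^{n-h-i-j+2}
\frac{2^{-j-m}}{81}\, \big(& -1 + \alpha_{i+j-1} - 2\alpha_i - 2\alpha_{i+j+m-2}
+ 2\alpha_{i+j-1}\alpha_i + 2\alpha_{i+j-1}\alpha_{i+j+m-2} - \\
& - 4\alpha_i\alpha_{i+j+m-2} + 4\alpha_i\alpha_{i+j+m-2}\alpha_{i+j-1}\big)
\end{aligned}
\tag{A.14}
$$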
Finally, $\varepsilon^2_{intrinsic}$ is given by:
$$
\varepsilon^2_{intrinsic} = \sum_{A} \left(S_{ij} + T_1 + T_2 + T_3\right) P(A) \tag{A.15}
$$
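In order to solve eq. (A.15) we can exchange the external sum over the IC
values $A$ with the internal sums on indexes i, j, and m. Furthermore, as
$$
\sum_{A} \alpha_i P(A) = \sum_{A} x_{h+i}\, y_{n+1-i}\, P(A)
= \sum_{A} x_{h+i} P(A) \sum_{A} y_{n+1-i} P(A)
= E\{x_{h+i}\}\, E\{y_{n+1-i}\}
$$
and the terms $\alpha_i$ ($i = 1, \ldots, n_{eq}$) are independent, we can
write, for distinct indexes i, j, m:
$$
\sum_{A} \alpha_i P(A) = \frac{1}{4}, \qquad
\sum_{A} \alpha_i\alpha_j P(A) = \frac{1}{16}, \qquad
\sum_{A} \alpha_i\alpha_j\alpha_m P(A) = \frac{1}{64} \tag{A.16}
$$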
Substituting (A.16) in eq. (A.15) one obtains:
$$
\varepsilon^2_{intrinsic} = lsb^2 \cdot 2^{-2h}
\left[\frac{1}{24}\, 2^{-n_{eq}} - \frac{7}{324}\, 2^{-2n_{eq}}
+ \left(\frac{13}{864} - \frac{1}{48}\, 2^{-n_{eq}}\right) n_{eq}
- \frac{13}{648}\right] \tag{A.17}
$$
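The closed form (A.17) can be cross-checked against the sums it comes from. The sketch below works in units of lsb^2 * 2^(-2h) and rests on three assumptions that are stated here rather than taken from the text: n_eq = n - h (read off the summation limits), the squared weight 2^(-2j) for each variance term of S_ij, and the expected values of the brackets of S_ij, T1, T2 and T3 obtained from (A.16). All names are illustrative.

```python
from fractions import Fraction as F

E_A = F(1, 4)  # mean of one IC element, from (A.16)
E_VAR = (8 + 10 * E_A + 10 * E_A - 28 * E_A**2) / 81   # mean of the bracket in (A.11)
E_SAME = F(2, 81) * (1 - E_A) * (1 + 2 * E_A) ** 2     # mean bracket of T1 and T2
E_IC = F(1, 81) * (E_A - 1) * (1 + 2 * E_A) ** 2       # mean bracket of T3


def summed(N):
    """S_ij + T1 + T2 + T3 averaged over A, for N = n_eq (assumed equal to n - h)."""
    s = sum(E_VAR * F(1, 4**j)
            for i in range(1, N) for j in range(2, N - i + 2))
    t1 = 2 * sum(E_SAME * F(1, 2 ** (j + m))
                 for i in range(1, N - 1) for j in range(2, N - i + 1)
                 for m in range(j + 1, N - i + 2))
    t2 = 2 * sum(E_SAME * F(1, 2 ** (i + 2 * j - m))
                 for i in range(1, N - 1) for j in range(3, N - i + 2)
                 for m in range(i + 1, i + j - 1))
    t3 = 2 * sum(E_IC * F(1, 2 ** (j + m))
                 for i in range(1, N - 1) for j in range(2, N - i + 1)
                 for m in range(2, N - i - j + 3))
    return s + t1 + t2 + t3


def closed_form(N):
    """Bracket of (A.17)."""
    return (F(1, 24) * F(1, 2**N) - F(7, 324) * F(1, 4**N)
            + (F(13, 864) - F(1, 48) * F(1, 2**N)) * N - F(13, 648))


for N in range(1, 5):
    print(N, summed(N), closed_form(N))
```

For n_eq = 1 the LSPminor is empty and both expressions vanish; for n_eq = 2 a single partial product remains and both give 5/576.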
Appendix B
Calculation of Linear function
In order to solve the system (2.62) we can first compute the mean error
$\mu_{lin}$ obtained when the linear compensation function (2.60) is employed.
By substituting (2.60) and (2.54) in (2.29) we have:
$$
\mu_{lin} = \sum_{A} \left( -H - \sum_{i=1}^{n_{eq}} \delta_i \alpha_i
- \sum_{i=1}^{n_{eq}} \sum_{\substack{j=1 \\ j \neq i}}^{n_{eq}}
\frac{f_{i,j}}{2}\, \alpha_i \alpha_j \right) \cdot P(A) \tag{B.1}
$$
where we have defined:
$$
H = K - K_l; \qquad \delta_i = f_i - l_i \tag{B.2}
$$
Exchanging the external sum over the IC values $A$ with the internal sums on
indexes i, j and using (A.16), equation (B.1) can be simplified as follows:
$$
\mu_{lin} = -H - \frac{1}{4} \sum_{i=1}^{n_{eq}} \delta_i
- \frac{1}{16} \sum_{i=1}^{n_{eq}} \sum_{\substack{j=1 \\ j \neq i}}^{n_{eq}}
\frac{f_{i,j}}{2} \tag{B.3}
$$
We can now compute $\sigma^2_{lin}$. By substituting (B.3), (2.60) and (2.54)
in (2.61) we have:
$$
\sigma^2_{lin} = \sum_{A} \left( \sum_{i=1}^{n_{eq}} \delta_i
\left(\alpha_i - \frac{1}{4}\right)
+ \sum_{i=1}^{n_{eq}} \sum_{\substack{j=1 \\ j \neq i}}^{n_{eq}}
\frac{f_{i,j}}{2} \left(\alpha_i \alpha_j - \frac{1}{16}\right) \right)^{2} P(A) \tag{B.4}
$$
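Note that $\sigma^2_{lin}$ does not depend on H, as observed in Sec. 2.4. In
order to find the optimal $l_i$ values we have to solve the system (2.62). By
substituting (B.4) in the system (2.62) one obtains:
$$
\sum_{A} \alpha_m \left( \sum_{i=1}^{n_{eq}} \delta_i
\left(\alpha_i - \frac{1}{4}\right)
+ \sum_{i=1}^{n_{eq}} \sum_{\substack{j=1 \\ j \neq i}}^{n_{eq}}
\frac{f_{i,j}}{2} \left(\alpha_i \alpha_j - \frac{1}{16}\right) \right) \cdot P(A) = 0
\qquad m = 1, \ldots, n_{eq} \tag{B.5}
$$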
This system can be further simplified with the help of (A.16). With simple
algebra the system reduces directly to diagonal form:
$$
\left\{\ \delta_m = -\frac{1}{2} \sum_{\substack{i=1 \\ i \neq m}}^{n_{eq}}
\frac{f_{i,m}}{2} \qquad m = 1, \ldots, n_{eq} \right. \tag{B.6}
$$
From (B.6) the result (2.63) is easily obtained.
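The "simple algebra" can be spelled out as follows. This is a sketch that assumes symmetric coefficients, $f_{i,j} = f_{j,i}$ (natural for a quadratic form, but not restated in this appendix), writes $\delta_m$ for the difference $f_m - l_m$ defined in (B.2) (the symbol name is chosen here), and uses only the moments (A.16) together with $\alpha_m^2 = \alpha_m$.

```latex
\begin{align*}
\sum_A \alpha_m \Big(\alpha_i - \tfrac14\Big) P(A) &=
  \begin{cases}
    \tfrac14 - \tfrac1{16} = \tfrac3{16}, & i = m \\[2pt]
    \tfrac1{16} - \tfrac1{16} = 0,        & i \neq m
  \end{cases} \\
\sum_A \alpha_m \Big(\alpha_i\alpha_j - \tfrac1{16}\Big) P(A) &=
  \begin{cases}
    \tfrac1{16} - \tfrac1{64} = \tfrac3{64}, & m \in \{i, j\} \\[2pt]
    \tfrac1{64} - \tfrac1{64} = 0,           & m \notin \{i, j\}
  \end{cases} \\
\frac{3}{16}\,\delta_m + \frac{3}{64} \cdot 2 \sum_{i \neq m} \frac{f_{i,m}}{2} &= 0
\quad\Longrightarrow\quad
\delta_m = -\frac12 \sum_{i \neq m} \frac{f_{i,m}}{2}
\end{align*}
```

Only the terms that contain the index m survive, which is why each equation of the system involves a single unknown and the system is already diagonal. The factor 2 in the last line counts the pairs (m, i) and (i, m) of the double sum.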
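As discussed in Sec. 2.4, $K_l$ (that is, H) can be fixed by imposing
$\mu_{lin} = 0$. By using the $\mu_{lin}$ expression (B.3), we have:
$$
-H - \frac{1}{4} \sum_{i=1}^{n_{eq}} \delta_i
- \frac{1}{16} \sum_{i=1}^{n_{eq}} \sum_{\substack{j=1 \\ j \neq i}}^{n_{eq}}
\frac{f_{i,j}}{2} = 0 \tag{B.7}
$$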
By substituting (B.6) in (B.7) and solving for H we find:
$$
H = \frac{1}{16} \sum_{i=1}^{n_{eq}} \sum_{\substack{j=1 \\ j \neq i}}^{n_{eq}}
\frac{f_{i,j}}{2} \tag{B.8}
$$
From this equation the result (2.65) is easily verified.
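The results (B.6) and (B.8) can be checked numerically on a small example by enumerating all IC values. The sketch assumes independent IC elements with probability 1/4 of being 1, as in (A.16), an error of the form -H - sum(delta_i*alpha_i) - sum(f_ij/2*alpha_i*alpha_j), and symmetric coefficients; the matrix `f` and all function names are hypothetical and serve only as a test case.

```python
from fractions import Fraction as F
from itertools import product

P1 = F(1, 4)  # probability that one IC element equals 1, from (A.16)


def prob(A):
    """Probability of the IC value A for independent elements."""
    p = F(1)
    for a in A:
        p *= P1 if a else 1 - P1
    return p


def error(A, H, delta, f):
    """Error for the IC value A (sign convention assumed, see lead-in)."""
    n = len(delta)
    lin = sum(delta[i] * A[i] for i in range(n))
    quad = sum(F(f[i][j], 2) * A[i] * A[j]
               for i in range(n) for j in range(n) if j != i)
    return -H - lin - quad


def moments(H, delta, f):
    """Mean and variance of the error over all IC values."""
    values = [(error(A, H, delta, f), prob(A))
              for A in product((0, 1), repeat=len(delta))]
    mean = sum(e * p for e, p in values)
    var = sum((e - mean) ** 2 * p for e, p in values)
    return mean, var


def optimal(f):
    """H from (B.8) and delta_m from (B.6)."""
    n = len(f)
    delta = [-F(1, 2) * sum(F(f[i][m], 2) for i in range(n) if i != m)
             for m in range(n)]
    H = F(1, 16) * sum(F(f[i][j], 2)
                       for i in range(n) for j in range(n) if j != i)
    return H, delta


f = [[0, 2, -1], [2, 0, 3], [-1, 3, 0]]  # hypothetical symmetric coefficients
H, delta = optimal(f)
mean, var = moments(H, delta, f)
print("mean error:", mean, " variance:", var)
```

With these coefficients the mean error vanishes, and moving any single `delta[m]` away from the value given by (B.6) increases the variance, as expected for a minimum of a convex quadratic function.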

Bibliography
[1] C. Baught and B. Wooley, “A two’s complement parallel array multipli-
cation algorithm,” IEEE Transactions on Computers, vol. C-23, no. 12,
pp. 1045–1047, Dec. 1974.
[2] C. Wallace, “A suggestion for fast multiplier,” IEEE Transaction on Elec-
tronic Computers, vol. EC-13, no. 1, pp. 14–17, Feb. 1964.
[3] L. Dadda, “Some schemes for parallel multipliers,” Alta Frequenza,
vol. 34, pp. 349–356, Jan. 1965.
[4] K. Bickerstaff, M. Schulte, and J. E.E. Swartzlander, “Parallel reduced
area multipliers,” Journal of VLSI Signal Processing Systems, vol. 9,
no. 3, pp. 181–191, Apr. 1995.
[5] V. Oklobdzija, D. Villeger, and S. Liu, “A method for speed optimized
partial product reduction and generation of fast parallel multipliers us-
ing an algorithmic approach,” IEEE Transactions on Computers, vol. 45,
no. 3, pp. 294–309, Mar. 1996.
[6] P. Stelling, C. Martel, V. Oklobdzija, and R. Ravi, “Optimal circuits for
parallel multipliers,” IEEE Transaction on Computers, vol. 47, no. 3, pp.
273–285, Mar. 1998.
[7] J. Um and T. Kim, “Optimal bit-level arithmetic optimization for high-
speed circuits,” Electronics Letters, vol. 36, no. 5, pp. 405–406, Mar.
2000.
[8] S. Kidambi, F. El-Guibaly, and A. Antonious, “Area-efficient multipliers
for digital signal processing applications,” IEEE Transaction on Circuits
and Systems II: Analog and digital signal processing, vol. 43, no. 2, pp.
90–95, Feb. 1996.
155
156 BIBLIOGRAPHY
[9] Y. Lim, “Single-precision multiplier with reduced circuit complexity
for signal processing applications,” IEEE Transactions on Computers,
vol. 41, no. 10, pp. 1333–1336, Oct. 1992.
[10] E. King and J. E.E. Swartzlander, “Data dependent truncated scheme for
parallel multiplication,” in Proc. of 31th Asilomar Conf. on signals, cir-
cuits and systems, 1998, pp. 1178–1182.
[11] M. Schulte and J. E.E. Swartzlander, “Truncated multiplication with cor-
rection constant [for DSP],” in VLSI Signal Processing, VI, 1993., [Work-
shop on], Veldhoven, Netherlands, Oct. 1993, pp. 388–396.
[12] J. Stine and O. Duverne, “Variations on truncated multiplication,” in Pro-
ceedings of the Euromicro Symposium on Digital Systems Design, DSD
’03, Sep. 2003, pp. 112– 119.
[13] J. Jou, S. Kuang, and R. Chen, “Design of low-error fixed-width multi-
pliers for DSP applications,” IEEE Transactions on Circuits and Systems
II: Analog and Digital Signal Processing, vol. 46, no. 6, pp. 836–842,
Jun. 1999.
[14] F. Curticapean and J. Niittylahti, “A hardware efficient direct digital fre-
quency synthesizer,” in The 8th IEEE International Conference on Elec-
tronics, Circuits and Systems, 2001. ICECS 2001, vol. 1, Malta, May
2001, pp. 51–54.
[15] L. Van, S. Wang, and W. Feng, “Design of the lower error fixed-width
multiplier and its application,” IEEE Transactions on Circuits and Sys-
tems II: Analog and Digital Signal Processing, vol. 47, no. 10, pp. 1112–
1118, Oct. 2000.
[16] L. Van and C. Yang, “Generalized low-error area-efficient fixed-width
multipliers,” IEEE Transactions on Circuits and Systems I: Regular Pa-
pers, vol. 52, no. 8, pp. 1608–1619, Aug. 2005.
[17] Y. Liao, H. Chang, and C. Liu, “Carry estimation for two’s complement
fixed-width multipliers,” in IEEE Workshop on Signal Processing Sys-
tems Design and Implementation, 2006. SIPS ’06., Banff, Alta., Oct.
2006, pp. 345–350.
[18] S. Kuang and J. Wang, “Low-error configurable truncated multipliers for
multiply-accumulate applications,” Electronics Letters, vol. 42, no. 16,
pp. 904–905, Aug. 2006.
BIBLIOGRAPHY 157
[19] R. Michard, A. Tisserand, and N. Charvillon, “Carry prediction and se-
lection for truncated multiplication,” in IEEE Workshop on Signal Pro-
cessing Systems Design and Implementation, 2006. SIPS ’06., Banff,
Alta., Oct. 2006, pp. 339–344.
[20] H. Park and J. E.E. Swartzlander, “Truncated multiplication with sym-
metric correction,” in Fortieth Asilomar Conference on Signals, Systems
and Computers, 2006. ACSSC ’06., Pacific Grove, CA, Nov. 2006, pp.
931–934.
[21] A. Strollo, N. Petra, and D. D. Caro, “Dual-tree error compensation for
high performance fixed-width multipliers,” IEEE Transactions on Cir-
cuits and Systems II: Express Briefs, vol. 52, no. 8, pp. 501–507, Aug.
2005.
[22] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs.
Oxford University Press, 1999.
[23] J. E.E. Swartzlander, “Truncated multiplication with approximate round-
ing,” in Conference Record of the Thirty-Third Asilomar Conference on
Signals, Systems, and Computers, 1999., vol. 2, Pacific Grove, CA, USA,
Oct. 1999, pp. 1480–1483.
[24] C. Piguet, Low-power electronics design. CRC Press, 2005.
[25] N. Petra, D. D. Caro, and A. Strollo, “Minimummean square error signed
and unsigned inputs fixed-width multipliers,” in Proceedings of IEEE Eu-
ropean Conference on Circuits Theory and Design (ECCTD07), Sevilla,
Spain, Aug. 2007, pp. 464–467.
[26] J. Pihl and E. Aas, “A multiplier and square generator for high perfor-
mance dsp applications,” in Proceedings of the 39th Midwest IEEE Sym-
posium on Circuit and Systems, vol. 1, Ames, IA, USA, Aug. 1996, pp.
109 – 112.
[27] R. Kologotla, W. Griescbach, and H. Scrinivas, “Vlsi implementation of
a 350mhz 0.35 um 8 bit merged squarer,” Electronics Letters, vol. 34, pp.
47 – 48, Jan. 1998.
[28] K. Wires, M. Schulte, L. Marquette, and P. Balzola, “Combined unsigned
and two’s complement squarers,” in 33rd Asilomar Conference on Sig-
nals, Systems and Computers, vol. 2, Pacific Grove, CA, USA, Oct. 1999,
pp. 1215 – 1219.
158 BIBLIOGRAPHY
[29] A. Strollo, E. Napoli, and D. D. Caro, “New design of squarer circuits
using booth encoding and folding techniques,” in The 8th IEEE Inter-
national Conference on Electronics, Circuits and Systems, ICECS 2001,
vol. 1, Sep. 2001, pp. 193 – 196.
[30] A. Strollo and D. D. Caro, “Booth folding encoding for high performance
squarer circuits,” IEEE Transaction on Circuits and Systems II: Analog
and Digital Signal Processing., vol. 50, no. 5, pp. 250 – 254, May 2003.
[31] T. Chen, “A binary multiplication scheme based on squaring,” IEEE
Transactions on Circuits and Systems I, vol. C, no. 20, pp. 678 – 680,
1971.
[32] R. Strandberg, L. Bustamante, V. Oklobdzija, M. Soderstrand, and
J. LeDuc, “Efficient realizations of squaring circuit and reciprocal used
in adaptive sample rate notch filters,” IEEE Journal of VLSI Signal Pro-
cessing, vol. 14, no. Dec., pp. 303 – 309, 1996.
[33] E. W. III and M. Schulte, “Efficient function approximation using trun-
cated multipliers and squarers,” in Proceedings of the 17th IEEE Sympo-
sium on Computer Arithmetic, Jun. 2005, pp. 232 – 239.
[34] K. Cho, Y. Kim, and J. Chung, “Power and area efficient squarer de-
sign,” in Fortieth Asilomar Conference on Signals, Systems and Comput-
ers, 2006. ACSSC ’06, Pacific Grove, CA, Nov. 2006, pp. 1721–1725.
[35] A. Oppenheim and R. Schafer, Discrete-Time Signal Processing. Pren-
tice Hall, 1999.
[36] S. Smith, The scientist and Engineers Guide to Digital Signal PRocess-
ing. California Technical Publishing, 1999.
[37] R. Lyons, Understanding Digital Signal Processing. Prentice Hall Pro-
fessional Technical Reference, 2004.
[38] I. Simon, N. Barsan, and U. Weimar, “Micromachined metal oxide gas
sensors: Opportunities to improve sensor performance,” Sensors and Ac-
tuators B: Chemical, vol. 73, no. 1, pp. 1–26, Feb. 2001.
[39] S. Santra, S. Ali, P. Guha, P. Hiralal, H. E. Unalan, S. Dalal, J. Covington,
J. Gardner, W. Milnea, and F. Udrea, “Cmos alcohol sensor employing
zno nanowire sensing films,” in Proceedings of the 13th International
Symposium on Olfaction and Electronic Nose. AIP Conference Proceed-
ings, vol. 1137, Brescia,Italy, Oct. 2009, pp. 119–122.
[40] J. Gardner, V. Varadan, and O. Awadelkarim, Microsensors MEMS and
Smart Devices. John Wiley & Sons, 2001.
[41] P. Guha, S. Ali, C. Lee, F. Udrea, W. Milne, T. Iwaki, J. Covington,
and J. Gardner, “Novel design and characterisation of soi cmos micro-
hotplates for high temperature gas sensors,” Sensors and Actuators B:
Chemical, vol. 127, no. 1, pp. 260–266, 2007.
[42] J. Suehle, R. Cavicchi, M. Gaitan, and S. Semancik, “Tin oxide gas sen-
sor fabricated using cmos micro-hotplates and in-situ processing,” IEEE
Electron Device Letters, vol. 14, no. 3, pp. 118–120, Mar. 1993.
[43] F. Udrea, J. Gardner, D. Setiadi, J. Covingtom, T. Dogaru, C. Lu, and
W. Milne, “Design and simulations of soi cmos micro-hotplate gas sen-
sors,” Sensors and Actuators B: Chemical, vol. 78, pp. 180–190, Aug.
2001.
[44] M. Afridi, J. Suehle, M. Zaghloul, D. Berning, A. Hefner, R. Cavac-
chi, S. Semancik, C. Montgomery, and C. Taylor, “A monolithic cmos
microhotplate-based gas sensor system,” IEEE Sensors Journal, vol. 2,
no. 6, pp. 644–655, Dec. 2002.
[45] S. Ali, F. Udrea, W. Milne, and J. Gardner, “Micromachined metal ox-
ide gas sensors: Opportunities to improve sensor performance,” Journal
of Microelectromechanical system, vol. 17, no. 6, pp. 1408–1414, Dec.
2008.
[46] M. Graf, D. Barrettino, H. Baltes, and A. Hierlemann, CMOS Hotplate
Chemical Microsensors. Springer, 2005.
[47] S. Santra, P. Guha, S. Ali, I. Haneef, F. Udrea, and J. Gardner, “Soi diode
temperature sensor operated at ultra high temperature - a critical anal-
ysis,” in Proc. of 13th IEEE Sensors Conference, Italy, Oct. 2008, pp.
78–81.
160 BIBLIOGRAPHY
