Design of hardware efficient FIR filter: A review of the state-of-the-art approaches  by Chandra, Abhijit & Chattopadhyay, Sudipta
Review
Design of hardware eﬃcient FIR ﬁlter: A review of the state-of-the-art
approaches
Abhijit Chandra a,*, Sudipta Chattopadhyay b
aDepartment of Instrumentation & Electronics Engineering, Jadavpur University, Kolkata, India
bDepartment of Electronics & Telecommunication Engineering, Jadavpur University, Kolkata, India
A R T I C L E I N F O
Article history:
Received 5 May 2015
Received in revised form
11 June 2015
Accepted 24 June 2015
Available online 1 September 2015
Keywords:
Common sub-expression elimination (CSE)
Differential coeﬃcient method (DCM)
Genetic algorithm (GA)
Minimal difference differential coeﬃcients
method (MDDCM)
Mixed integer linear programming (MILP)
Multiple constant multiplication
Multiplier-less ﬁlter
Pseudo ﬂoating point (PFP)
A B S T R A C T
Digital signal processing (DSP) is one of the most powerful technologies which will shape the science,
engineering and technology of the twenty-ﬁrst century. Since 1970, revolutionary changes took place
in the broad area of DSP which has made it an essential tool in many engineering applications. Digital
filter is considered to be one of the most important components of almost every DSP sub-system and
therefore extensive work has been carried out by researchers on the design of such filters.
In order to meet the stringent requirements of ﬁlter speciﬁcation, order of the designed ﬁlter is gener-
ally assumed to be very large and this leads to high power and area consumption during their
implementation. As a matter of fact, design of hardware eﬃcient digital ﬁlter has drawn enormous at-
tention which needs to be addressed by various useful means. One popular approach has been to encode
the tap coeﬃcients of such ﬁlter in the form of sum of signed powers-of-two and thus the operation of
multiplication is substituted by simple addition and shifting. This paper presents a detailed review of
the basic design approaches applicable for the synthesis of hardware eﬃcient ﬁnite duration impulse
response (FIR) ﬁlter. Both the traditional and heuristic search algorithms have been incorporated and
properly arranged in this review.
Copyright © 2015, The Authors. Production and hosting by Elsevier B.V. on behalf of Karabuk
University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/
licenses/by-nc-nd/4.0/).
1. Introduction
Digital filter design has attracted significant attention among researchers
over the last few decades. Digital filters may broadly be categorized into
finite duration impulse response (FIR) and infinite duration impulse response
(IIR) filters. FIR filters offer significant advantages over their IIR
counterparts, such as bounded-input-bounded-output (BIBO) stability, phase
linearity, and low coefficient sensitivity, which have made them well suited to
many applications [1–3]. A major drawback of FIR filters is the large number
of arithmetic operations involved in their implementation, which limits speed
and demands more power [4]. This has motivated researchers to focus on hardware
efficient low-power filter design, and the field has been enriched with a number
of valuable contributions from many scientists all over the world. The first
article was published in 1982 [5] and the concept is still studied extensively
today [6–8]; hence it remains an active area of research. FIR filters are
generally characterized by their impulse response coefficients, which act as the
constants multiplying the input samples. These multipliers consume considerable
power and area and thus make the filter ill-suited to portable wireless devices
such as mobile phones, tablets and laptops. One of the
most eﬃcient ways to reduce the complexity of digital ﬁlter is to
conﬁne the tap coeﬃcients to assume values in the form of sums of
signed-powers-of-two (SPT). As a matter of fact, multipliers can be
replaced by a small number of shifters and adders resulting in cor-
responding improvement in the area and power eﬃciency. Low area
and low power design of FIR ﬁlter can also be achieved with the aid
of parallel or block processing which is also found to be suitable to
increase the effective throughput. This has been achieved by various
means which include frequency spectrum characteristics [9], iter-
ated short convolution [10,11] and so on. These power eﬃcient digital
ﬁlters have found their application in modern digital communica-
tion systems [12,13], wireless sensor networks [14] and so on.
One of the simplest methods to realize ﬁnite word length FIR
ﬁlter is obtained by rounding the optimum inﬁnite precision coef-
ﬁcients to its B-bit representation. However, the performances of
such ﬁlters are signiﬁcantly degraded from those with optimum real
coeﬃcients. Some suboptimal algorithms [15–17] can be found in
* Corresponding author. Tel.: +91 33 2335-2587; Fax: +91 33 2335-7254.
E-mail address: abhijit922@yahoo.co.in (A. Chandra).
Peer review under responsibility of Karabuk University.
http://dx.doi.org/10.1016/j.jestch.2015.06.006
Engineering Science and Technology, an International Journal 19 (2016) 212–226
Press: Karabuk University, Press Unit
ISSN (Printed) : 1302-0056
ISSN (Online) : 2215-0986
ISSN (E-Mail) : 1308-2043
this context which may improve the coefficients obtained by rounding the
optimal floating-point solution, using global search, univariate search,
modified univariate search [15], and random-search optimization [17]. The
application of a branch and bound technique for nonlinear discrete optimization
to select the coefficients of a recursive digital filter with a given word
length, so as to meet an arbitrary response specification, has been shown in
Reference 16. These methods can only be applied to low-order filters
and the obtained solution is suboptimal in most cases.
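The effect of simple rounding can be examined numerically. The sketch below is only illustrative: the 11-tap truncated ideal low-pass prototype, the grid size and the helper names (`quantize`, `magnitude`, `max_deviation`) are assumptions introduced here and are not taken from References 15–17. It rounds the taps to B fractional bits and reports the largest change in the magnitude response.

```python
import cmath
import math

def quantize(coeffs, bits):
    """Round every coefficient to the nearest multiple of 2**-bits."""
    step = 2.0 ** -bits
    return [round(c / step) * step for c in coeffs]

def magnitude(coeffs, omega):
    """|H(e^{j*omega})| computed directly from the impulse response."""
    return abs(sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(coeffs)))

def max_deviation(reference, approx, points=256):
    """Largest magnitude-response difference on a uniform grid over [0, pi]."""
    grid = [math.pi * k / (points - 1) for k in range(points)]
    return max(abs(magnitude(reference, w) - magnitude(approx, w)) for w in grid)

# Hypothetical prototype: 11-tap truncated ideal low-pass, cut-off 0.4*pi.
N, M = 11, 5
h = [0.4 if n == M else math.sin(0.4 * math.pi * (n - M)) / (math.pi * (n - M))
     for n in range(N)]

for bits in (4, 6, 8, 10):
    print(bits, max_deviation(h, quantize(h, bits)))
```

Since each tap moves by at most half a quantization step, the deviation is bounded by N times 2^-(B+1); the search methods cited above aim to do better than this plain rounding for a given B.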
Design of FIR filters with SPT coefficients has largely been treated as an
optimization problem in a discrete space, with the aim of reducing the error
power between the desired and the realized frequency response. In this
connection, integer linear programming and integer quadratic programming are
particularly useful for designing FIR filters on a powers-of-two coefficient
grid [18]. However, integer programming has some serious disadvantages: the
finite word length solution it produces saves only a few bits of coefficient
word length in comparison with the solution obtained by rounding. Moreover, it
demands a huge amount of computer resources, which limits the maximum size of
the filter. The design problem of powers-of-two FIR filters has subsequently
been formulated as mixed integer linear programming (MILP) [19], integer
semi-infinite linear programming [20], semidefinite programming (SDP) [21],
discrete semi-infinite linear programming (DSILP) [22], and mixed integer
programming (MIP) [23] by various researchers over a number of years. Methods
using the branch and bound (B & B) technique based on linear programming are
most useful in MILP [20,24].
In the context of powers-of-two filter design, the multiple constant
multiplication (MCM) problem has been an active area of research over the last
two decades. As the coefficients are constants, the products can be realized
using shifts, additions and subtractions, eliminating the need for multipliers
in the filter structure. MCM is the problem of multiplying the same input by a
number of constant integers using a minimum number of multiplier-less
operations. The idea is to exploit redundancies between the coefficients so as
to minimize the required number of adders. Generally speaking, the MCM
algorithms available in the literature can be divided into three groups: adder
graph methods [25–27], common sub-expression elimination [28,29] and
difference method algorithms [30,31].
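The saving that MCM targets can be seen in a small example. The constants 5, 11 and 21 and both function names below are hypothetical choices made for illustration; the sketch does not implement any of the cited algorithms, it only contrasts independent shift-and-add realizations with one that reuses an intermediate product.

```python
def separate_products(x):
    """Each constant realized on its own from its binary form (5 adders in total)."""
    p5 = (x << 2) + x                # 5  = 4 + 1       -> 1 adder
    p11 = (x << 3) + (x << 1) + x    # 11 = 8 + 2 + 1   -> 2 adders
    p21 = (x << 4) + (x << 2) + x    # 21 = 16 + 4 + 1  -> 2 adders
    return p5, p11, p21

def shared_products(x):
    """Same three products, reusing the intermediate 5x (3 adders in total)."""
    p5 = (x << 2) + x                # 5x               -> 1 adder
    p11 = (p5 << 1) + x              # 11x = 2*(5x) + x -> 1 adder
    p21 = (p5 << 2) + x              # 21x = 4*(5x) + x -> 1 adder
    return p5, p11, p21

print(separate_products(3), shared_products(3))
```

Both functions return the same products; only the number of additions differs, which is the quantity the MCM algorithms try to minimize.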
Graph-based algorithms are bottom-up methods that itera-
tively construct the graph representing the multiplier block. The
graph construction is guided by a heuristic that determines the next
graph vertex to be added to the graph. Graph-based algorithms offer
more degrees of freedom by not being restricted to a particular rep-
resentation of the coeﬃcients and typically produce solutions with
the lowest number of operations. The first article in this regard was
published in 1995 [25] and proposed the n-dimensional reduced adder graph
(RAG-n) algorithm for reducing the number of adders in filter design. This was
considered the best approach for more than a decade, until the introduction of
the HCUB algorithm in 2007 [27]. More recently, algorithms such as the
difference based adder graph heuristic for MCM problems [26] and truncated MCM
using the pattern modification technique (PMT) [32] have been reported to
outperform the previously best approaches.
Complexity reduction of the multiplication coefficients has been carried out
by various means, amongst which the common sub-expression elimination (CSE)
algorithm is the most popular. A number of research articles in the literature
[29,33–35] deal with CSE in different aspects of multiplier-less FIR filter
design. A common feature of CSE algorithms is that they identify common bit
patterns in the coefficient set and share the identified sub-expressions to
minimize the adder cost. Hartley [29] pioneered sub-expression sharing in
filters using canonic signed digit (CSD) multipliers. Inspired by the
application of CSE in designing hardware efficient digital filters, algorithms
such as non recursive signed common sub-expression elimination (NR-SCSE) [36]
and heuristic common sub-expression elimination [37] have been introduced
successively, and their advantage over their predecessors has been
substantiated. However, the filter structure obtained using CSE is hard to
pipeline because it is highly irregular. In addition, since the coefficients of
programmable and reconfigurable filters are not fixed, it is not easy to find
the common sub-expressions for newly applied coefficients [38].
Reordering of filter coefficients has emerged as one of the efficient
techniques for reducing the hardware cost of digital filters. In this regard,
the differential coefficients method (DCM) [30] follows the intuition that
recasting the filter computation in terms of the differences between adjacent
coefficients can reduce the number of ones required to represent the
coefficients. DCM works well when consecutive coefficients are similar, e.g.
$h_k = (1011001111)$ and $h_{k-1} = (1001000111)$. However, it suffers when
large differences result in many 1's being needed to store the difference.
Moreover, if either of the two coefficients is zero, DCM offers no additional
advantage. To overcome this problem, Vinod et al. have proposed the minimal
difference differential coefficients method (MDDCM) [39,40] which, rather than
storing the differences of the FIR coefficients in order from $h_0$ to
$h_{N-1}$, sorts the coefficients in such a way that adjacent coefficients have
minimum differences in magnitude. Considering that the adder width can be
minimized by limiting the shifts of the operands to shorter lengths, an
efficient coefficient partitioning algorithm, called pseudo floating point
(PFP) representation, has been introduced in Reference 41. This has been
integrated with the vertical common sub-expression elimination (VCSE)
algorithm towards the design of low complexity channel filters.
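The intuition behind DCM and MDDCM can be checked with a few lines of arithmetic. In the sketch below the two 10-bit words are read as 1011001111 and 1001000111 (an assumed reading of the example above), the coefficient list `[3, 96, 5, 100, 7]` is hypothetical, and sorting by value is used as a simple stand-in for the MDDCM ordering rather than the procedure of References 39 and 40.

```python
def ones(value):
    """Number of nonzero bits in the binary form of |value| (a proxy for adder cost)."""
    return bin(abs(value)).count("1")

h_prev = 0b1001000111          # assumed reading of h_{k-1}
h_curr = 0b1011001111          # assumed reading of h_k
diff = h_curr - h_prev         # only this difference has to be multiplied

def product_via_difference(x, prev_product):
    """h_k * x rebuilt from the stored h_{k-1} * x plus the cheap difference term."""
    return prev_product + diff * x

def total_difference_ones(coeffs):
    """Ones needed for the first coefficient plus all successive differences."""
    return ones(coeffs[0]) + sum(ones(b - a) for a, b in zip(coeffs, coeffs[1:]))

print(ones(h_curr), ones(h_prev), ones(diff))
coeffs = [3, 96, 5, 100, 7]    # hypothetical integer coefficients
print(total_difference_ones(coeffs), total_difference_ones(sorted(coeffs)))
```

When neighbouring coefficients differ widely, as in the unsorted list, the differences need more ones than a reordered sequence does, which is the situation MDDCM is designed to avoid.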
Inspired by the genetic and social behaviour of animals, the last
quarter of the twentieth century has brought various intelligent op-
timization techniques into limelight. These techniques, classiﬁed into
evolutionary and swarm optimization, have subsequently been
applied in a number of research applications of proper relevance.
In regard to powers-of-two FIR ﬁlter design, methods like tabu search
(TS) [42], genetic algorithm (GA) [43–45], micro GA (μGA) [46], modi-
ﬁed μGA [47], orthogonal GA (OGA) [48], differential evolution (DE)
[49], self-organizing random immigrants genetic algorithm [50,51],
particle swarm optimization (PSO) [52], and artiﬁcial bee colony
(ABC) algorithm [53,54] have been successfully incorporated and
their supremacy over their predecessors has been ﬁrmly established.
Success of hardware eﬃcient powers-of-two one dimensional
(1D) FIR ﬁlter design has also been extended towards the synthe-
sis of two-dimensional (2D) multiplier-less image ﬁlter when Pei
and Jaw published their article in the year 1987. Since then, re-
searchers throughout the world have signiﬁcantly contributed in
this ﬁeld for improving the quality of digital image. In connection
to this, conventional methods like linear programming (LP) [55], semi
deﬁnite programming (SDP) and artiﬁcial methods like genetic al-
gorithm (GA) [56,57], gravitational search algorithm (GSA) [58],
differential evolution [59] have proven their effectiveness.
The field of hardware efficient FIR filter design has been enriched with
numerous valuable contributions from researchers for more than 30 years. It
therefore seems appropriate to summarize the concepts that have been adopted
to address this problem over the years. Motivated by this aim, this work
presents a detailed review of the evolution of the design flow for hardware
efficient FIR filters. The rest of the paper is organized as follows. Section 2
describes the growth of mathematical programming in the design of linear phase
powers-of-two FIR filters. CSE and its numerous advancements are presented in
Section 3. Section 4 collects the approaches focusing on the minimization of
adders in hardware efficient FIR filter design. In Section 5, an extensive
survey of coefficient representation schemes is carried out. The application of
intelligent optimization techniques to multiplier-less FIR filter design is
investigated in Section 6, followed by the design strategies for two
dimensional hardware efficient FIR filters in Section 7. Experimental
observations are listed in Section 8 and the paper is concluded in Section 9
with a possible scope of future research in this field.
2. Linear phase powers-of-two FIR ﬁlter design using
mathematical programming
Design of FIR filters over a discrete powers-of-two coefficient space has long
been an active research topic. The first article in this area of signal
processing was that of Lim et al. [5], in which the discrete coefficients are
selected by integer programming. The frequency response $H(\omega)$ of a linear
phase FIR filter of length N can always be expressed as a trigonometric
function of the frequency variable $\omega$ [18]:
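$$
H(\omega) =
\begin{cases}
h\!\left(\dfrac{N-1}{2}\right) + 2\displaystyle\sum_{n=0}^{(N-3)/2} h(n)\cos\!\left[\omega\left(\dfrac{N-1}{2} - n\right)\right], & \text{for odd } N \\[2ex]
2\displaystyle\sum_{n=0}^{N/2-1} h(n)\cos\!\left[\omega\left(\dfrac{N}{2} - n - \dfrac{1}{2}\right)\right], & \text{for even } N
\end{cases}
\tag{1}
$$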
Equation (1) gives the amplitude part of the response and assumes the symmetry
of the impulse response, i.e. $h(n) = h(N-1-n)$. The resulting phase response
for both odd and even N therefore has the form:
$$
\Theta(\omega) =
\begin{cases}
-\omega\left(\dfrac{N-1}{2}\right), & \text{if } H(\omega) > 0 \\[2ex]
-\omega\left(\dfrac{N-1}{2}\right) + \pi, & \text{if } H(\omega) < 0
\end{cases}
\tag{2}
$$
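Equations (1) and (2) can be checked numerically against the direct discrete-time Fourier transform. The sketch below is a minimal check under the symmetry assumption $h(n) = h(N-1-n)$; the function names and the two short test filters are illustrative choices, not part of the cited formulation.

```python
import cmath
import math

def amplitude_response(h, omega):
    """Real-valued H(omega) of a symmetric FIR filter, following equation (1)."""
    n_taps = len(h)
    centre = (n_taps - 1) / 2
    if n_taps % 2:  # odd length: centre tap plus paired cosine terms
        mid = (n_taps - 1) // 2
        return h[mid] + 2 * sum(h[n] * math.cos(omega * (centre - n)) for n in range(mid))
    # even length: paired cosine terms only
    return 2 * sum(h[n] * math.cos(omega * (centre - n)) for n in range(n_taps // 2))

def dtft(h, omega):
    """Direct evaluation of sum_n h(n) * exp(-j*omega*n)."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))

h_odd = [1.0, 2.0, 3.0, 2.0, 1.0]      # hypothetical symmetric taps, N = 5
w = 0.3 * math.pi
linear_phase = cmath.exp(-1j * w * (len(h_odd) - 1) / 2)
print(dtft(h_odd, w), linear_phase * amplitude_response(h_odd, w))
```

The two printed values agree, which reflects the fact that the full response is the real amplitude of equation (1) multiplied by the linear-phase factor of equation (2); a sign change of $H(\omega)$ accounts for the additional $\pi$.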
The filter design problem amounts to obtaining the set of coefficients $h(n)$
such that $H(\omega)$ is the best approximation to some desired function
$D(\omega)$ with respect to some optimality criterion. In a design using the
minimax strategy, the value of $H(\omega)$ is subject to the following
constraint:

$$D(\omega) - \delta_k(\omega) \le H(\omega) \le D(\omega) + \delta_k(\omega) \tag{3}$$

$\delta_k(\omega)$ in equation (3) stands for the ripple to be minimized. One
possible criterion for optimizing the filter coefficients $h(n)$ is to reduce
the output error power [60]. This may be expressed as:
$$|E(\omega)|^2 = |V(\omega)|^2 \, |D(\omega) - H(\omega)|^2 \tag{4}$$
$E(\omega)$ and $V(\omega)$ denote the frequency spectra of the error signal
and the input signal, respectively, in the above equation. According to
Parseval's theorem, reducing the output error power amounts to minimizing the
quantity J as follows [60]:

$$J = \int_{0}^{\pi} |V(\omega)|^2 \, |D(\omega) - H(\omega)|^2 \, d\omega \tag{5}$$
As can be seen from the above expression, optimum coefficient values may be
obtained by reducing the weighted average of the error signal, with different
weights assigned to the frequency variable in the pass-band and the stop-band.
For a feasible solution of equation (5), the integration can simply be replaced
by a summation such as $\sum_i |V(\omega_i)|^2 |D(\omega_i) - H(\omega_i)|^2$.
Minimization of J subject to linear constraints on the elements of $h(n)$ is a
quadratic programming problem, and therefore a general purpose integer
quadratic programming package can be used to solve it. However, a combination
of linear and quadratic programming packages with a branch-and-bound (B & B)
technique may also be useful for designing filters in discrete space, and has
been employed successfully by the authors of Reference 17. They made use of two
variants of the branch-and-bound algorithm, namely isocost and breadth-first
branch-and-bound search. The former always selects the best sub-problem for
further branching in order to reduce the search cost, while the latter
continues until a sub-optimum discrete solution is achieved.
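The discretized form of equation (5) is straightforward to evaluate for a candidate coefficient set. The following sketch assumes a symmetric filter and a hypothetical low-pass specification; the band edges, the weights and both tap vectors are illustrative and are not taken from Reference 60. It only evaluates J and does not perform the branch-and-bound search itself.

```python
import math

def amplitude_response(h, omega):
    """H(omega) of a symmetric (linear-phase) FIR filter with taps h."""
    centre = (len(h) - 1) / 2
    return sum(c * math.cos(omega * (centre - n)) for n, c in enumerate(h))

def weighted_error(h, desired, weight, grid):
    """Discretized J: sum over the grid of |V(w_i)|^2 * |D(w_i) - H(w_i)|^2."""
    return sum(weight(w) ** 2 * (desired(w) - amplitude_response(h, w)) ** 2 for w in grid)

def desired(w):
    """Hypothetical ideal low-pass response D(omega)."""
    return 1.0 if w <= 0.3 * math.pi else 0.0

def weight(w):
    """Hypothetical weighting V(omega): heavier in the stop-band, zero in the transition band."""
    if w <= 0.3 * math.pi:
        return 1.0
    if w >= 0.5 * math.pi:
        return 3.0
    return 0.0

grid = [math.pi * k / 199 for k in range(200)]
h_real = [0.05, 0.25, 0.40, 0.25, 0.05]       # hypothetical infinite-precision taps
h_spt = [0.0625, 0.25, 0.375, 0.25, 0.0625]   # each tap uses at most two SPT terms
print(weighted_error(h_real, desired, weight, grid),
      weighted_error(h_spt, desired, weight, grid))
```

A discrete search such as branch-and-bound would compare values of this kind across many candidate coefficient sets drawn from the permitted SPT grid.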
The method of synthesizing hardware eﬃcient FIR ﬁlters re-
quiring fewer arithmetic operations than the conventional one is
based on a cascade structure of a multiplier-less pre-ﬁlter and an
FIR equalizer. One such optimal method for designing multiplier-
less FIR and IIR ﬁlters has been demonstrated [61] with cascaded
preﬁlter-equalizer architecture. During the course of design, both
the preﬁlter and equalizer are simultaneously designed using MILP
that yields a resulting ﬁlter with minimal complexity, assuming that
FIR ﬁlter consists of a cyclotomic polynomial (CP) preﬁlter and in-
terpolated second order polynomial (ISOP) equalizer. As far as the
design of IIR ﬁlter is concerned, all pole IIR equalizers consisting
of inverse of interpolated ﬁrst order polynomials (IIFOPs) are in-
troduced and a CP-preﬁlter cascaded with this type of equalizer has
been designed.
The most convenient way of representing the coefficients of a hardware
friendly FIR filter is the signed-powers-of-two (SPT) representation. For a
fixed word length B of the DSP processor, an impulse response coefficient has
the general form:

$$h(n) = \sum_{i=1}^{B} s_i 2^{-i}, \quad \text{with } s_i \in \{-1, 0, 1\} \tag{6}$$
A number of such representations are available in the literature, with a
common emphasis on reducing the overall hardware complexity. Mixed integer
linear programming (MILP) has been used for this purpose, formulated to
minimize the number of SPT terms for a given filter specification. One
representation with the minimum number of SPT terms is the canonic signed digit
code (CSDC) representation, in which no two nonzero digits are adjacent. The
SPT representation in equation (6) has the alternative binary form:

$$h(n) = \sum_{i=1}^{B} \left(s_{n,i,+} - s_{n,i,-}\right) 2^{-i}, \quad \text{with } s_{n,i,+},\, s_{n,i,-} \in \{0, 1\} \tag{7}$$
The introduction of equation (7) makes the optimization goal function linear
and allows linear constraints to be placed on the number of SPT terms per
coefficient, a quantity that system designers are keen to limit. The authors of
Reference 19 (Gustafsson, 2001) have addressed this criterion in the
optimization problem through the constraint
$\sum_{i=1}^{B} \left(s_{n,i,+} + s_{n,i,-}\right) \le L_{max}$, which counts
the nonzero digits of a coefficient and where $L_{max}$ is the maximum number
of SPT terms per coefficient. The solution obtained by the optimization does
not automatically emerge in CSDC form; this is ensured by incorporating the
inequality
$s_{n,i,+} + s_{n,i,-} + s_{n,i+1,+} + s_{n,i+1,-} \le 1,\ \forall n \in \{0, 1, 2, \ldots, N-1\}$.
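The two constraints can be made concrete with a short script. The sketch below works on the integer $h(n) \cdot 2^B$ rather than on the fractional form, converts it to canonic signed digits with a standard recoding loop, splits the digits into the binary indicators of equation (7), and checks a term budget and the adjacency condition. The function names, the tap value 0.41 and the budget of four terms are assumptions made for illustration, and the budget is taken to count nonzero digits.

```python
def to_csd(value):
    """Canonic signed digit form of an integer, least significant digit first."""
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value & 3)   # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits

def split_digits(digits):
    """Binary indicator vectors (s_plus, s_minus) with digit = s_plus - s_minus."""
    return [int(d == 1) for d in digits], [int(d == -1) for d in digits]

def satisfies_constraints(s_plus, s_minus, l_max):
    """Check the term-count limit and the no-adjacent-nonzero-digit condition."""
    used = [p + m for p, m in zip(s_plus, s_minus)]
    within_budget = sum(used) <= l_max
    no_adjacent = all(a + b <= 1 for a, b in zip(used, used[1:]))
    return within_budget and no_adjacent

B = 8
coefficient = 0.41                      # hypothetical tap value
scaled = round(coefficient * 2 ** B)    # integer form of the B-bit coefficient
s_plus, s_minus = split_digits(to_csd(scaled))
print(scaled, to_csd(scaled), satisfies_constraints(s_plus, s_minus, l_max=4))
```

In an MILP formulation these checks become linear inequalities on the binary variables instead of being applied after the fact, but the quantities involved are the same.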
Design of FIR filters with sum-of-SPT coefficients by integer programming was
relaxed much later, when Lu [21] proposed a semidefinite programming (SDP)
formulation that is solvable in polynomial time using efficient SDP solvers.
SDP is a constrained optimization problem in which a linear objective function
is minimized subject to matrix constraints that depend on the variable vector
h. A typical SDP formulation has the form [21]:

$$\text{minimize } C^T h, \quad \text{where } C(\omega) = [1, \cos\omega, \cos 2\omega, \ldots, \cos (N-1)\omega]^T \tag{8a}$$

$$\text{subject to: } F(h) = F_0 + \sum_{i=1}^{r} h_i F_i \succeq 0 \tag{8b}$$

The matrices $F_i$ for $0 \le i \le r$ in the above equation are symmetric and
$\succeq 0$ denotes positive semidefiniteness.
In the following year, Ito et al. [62] proposed another design method for
linear phase SPT filters based on an SDP relaxation method. Their method
includes a linear programming (LP) relaxation and a relaxation obtained by
adding triangle inequalities. From the theoretical viewpoint, SDP relaxation
with triangle inequalities is stronger than simple SDP relaxation, LP
relaxation, or LP relaxation with triangle inequalities. In the same year, Yao
and Chien [63] proposed a three step algorithm for designing linear phase FIR
filters with SPT coefficients, in which MILP is applied in the last step to the
three least significant digits of the filter coefficients to reduce the number
of SPT terms.
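To make the matrix-constraint form of (8b) concrete, the following sketch shows how a single ripple bound at one grid frequency can be written as a linear matrix inequality. It is an illustration under stated assumptions and not the formulation used in Reference 21 or 62: $\omega_k$, $D(\omega_k)$ and $\delta$ are hypothetical symbols for a grid frequency, the desired response there and the allowed deviation, and the response is taken to be linear in $h$, i.e. $H(\omega_k) = C^T(\omega_k)\,h$.

```latex
\left| C^T(\omega_k)\,h - D(\omega_k) \right| \le \delta
\;\Longleftrightarrow\;
\begin{bmatrix}
\delta & C^T(\omega_k)\,h - D(\omega_k) \\
C^T(\omega_k)\,h - D(\omega_k) & \delta
\end{bmatrix} \succeq 0
```

The equivalence holds because the eigenvalues of the 2 x 2 matrix are $\delta \pm \left(C^T(\omega_k)\,h - D(\omega_k)\right)$. The matrix is affine in $h$, so it fits the pattern $F_0 + \sum_i h_i F_i$ with $F_0$ holding $\delta$ on the diagonal and $-D(\omega_k)$ off the diagonal, and each $F_i$ holding the i-th entry of $C(\omega_k)$ in the two off-diagonal positions.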
In order to ensure the optimality of the obtained solution, the
SPT FIR design problem has been formulated as a discrete semi-
inﬁnite linear programming problem (DSILP) and consequently been
solved by branch and bound (B & B) method [19]. Authors have
started solving the optimization problem by ignoring the fact that
each of the coeﬃcients is of SPT, i.e. relaxing DSILP to simple SILP.
As SILP is a continuous optimization problem, the achievable so-
lution may not always ensure the possibility of each coeﬃcient to
be of SPT and hence SILP is combined with the B & B method.
A new approach to low-power FIR filter design has been formulated in
Reference 64 as an MILP problem that minimizes the Chebyshev error and
synthesizes coefficients consisting of pre-specified alphabets. The number of
alphabets corresponding to the coefficients is reduced significantly, and the
near optimal coefficients still satisfy the filter characteristics. In the same
year, two-step and three-step schemes were proposed in Reference 65 for the
design of variable digital filters with sum-of-SPT coefficients using the
minimax or least-square criterion. For the least-square criterion, the B & B
method is applied effectively to this complex non-linear integer programming
problem through the introduction of a reduced search area and an efficient
cutting scheme. Through numerical examples, the authors also claim that the
resulting finite precision filters yield approximately the same performance as
the infinite precision solution with a small number of additions and
subtractions.
Recently, the design of discrete coefficient FIR filters has been formulated
using MILP and subsequently solved by the B & B technique [66]. The filter
design problem is posed as a minimization problem such as:

$$\text{minimize } \gamma \tag{9a}$$

$$\text{subject to: } |H(\omega) - 1| \le \delta_p \ \text{ for } \omega \in [0, \omega_p], \qquad |H(\omega)| \le \gamma \ \text{ for } \omega \in [\omega_s, \pi] \tag{9b}$$

$\delta_p$ and $\gamma$ denote the maximum allowable ripple of $H(\omega)$ in
the pass-band and stop-band regions of interest, respectively. The authors
treat this as a minimization problem with trigonometric semi-infinite
constraints (TSICs). According to the Markov-Lukacs theorem, linear TSICs in
the variable $h \in \mathbb{R}^{N+1}$ can always be expressed in terms of
non-negative trigonometric polynomials. The filter design problem may thus be
formulated as:
$$\underset{h,\,\gamma}{\text{minimize}} \ \gamma \quad \text{subject to } A_i h + d_i \in C_i^{*}, \quad i = 1, 2, 3, 4 \tag{10}$$

with $A_1 = -A_2 = A_3 = -A_4 = I$, $d_1 = (\delta_p - 1, 0, 0, \ldots, 0)^T$,
$d_2 = (\delta_p + 1, 0, 0, \ldots, 0)^T$ and
$d_3 = d_4 = (\delta_p, 0, 0, \ldots, 0)^T$. Equation (10) introduces a new
term $C_i^{*}$, derived from $C_i$, which describes the TSICs in terms of a
trigonometric curve and its polar. Their mathematical definitions are as
follows:
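$$C_{a,b} = \left\{ c(\cos\omega) \in \mathbb{R}^{N+1} : \cos\omega \in [a, b] \right\}, \quad \text{where } c(\cos\omega) = (1, \cos\omega, \cos 2\omega, \ldots, \cos N\omega)^T \tag{11}$$

$$C_{a,b}^{*} = \left\{ u : \langle u, v \rangle \ge 0 \ \ \forall v \in C_{a,b} \right\} \tag{12}$$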
Semi deﬁnite programming (SDP) of equation (10) has there-
fore been solved by using SeDuMi [67] and consequently the optimal
ﬁlter coeﬃcients can be synthesized very easily.
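The constraints in (9b) can be checked for a given coefficient set by sampling the two bands. The sketch below is a grid-based check only: it assumes a symmetric filter, the function names and the 3-tap example are hypothetical, and sampling verifies the inequalities at the grid points rather than over the whole continuous band as the semi-infinite formulation does.

```python
import math

def amplitude_response(h, omega):
    """H(omega) of a symmetric (linear-phase) FIR filter with taps h."""
    centre = (len(h) - 1) / 2
    return sum(c * math.cos(omega * (centre - n)) for n, c in enumerate(h))

def ripples(h, omega_p, omega_s, points=512):
    """Return (delta_p, gamma): peak pass-band deviation from 1 and peak stop-band magnitude."""
    pass_grid = [omega_p * k / (points - 1) for k in range(points)]
    stop_grid = [omega_s + (math.pi - omega_s) * k / (points - 1) for k in range(points)]
    delta_p = max(abs(amplitude_response(h, w) - 1.0) for w in pass_grid)
    gamma = max(abs(amplitude_response(h, w)) for w in stop_grid)
    return delta_p, gamma

h = [0.25, 0.5, 0.25]     # hypothetical taps, H(omega) = 0.5 + 0.5*cos(omega)
print(ripples(h, omega_p=0.2 * math.pi, omega_s=0.8 * math.pi))
```

A design satisfies (9b) on the grid when the first returned value does not exceed the prescribed $\delta_p$; the second value is the quantity $\gamma$ that the optimization drives down.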
Branch and bound (B & B) technique has later been utilized for
designing low power linear phase FIR ﬁlters in Reference 68 by ﬁxing
a coeﬃcient to a certain value which is determined by ﬁnding the
boundary values of the coefficient using linear programming. Although the
worst case run time of the algorithm is exponential, its ability to find good
solutions in a reasonable amount of time makes it a desirable CAD tool for
designing such filters. The superiority of the algorithm over existing methods
such as those in References 18, 62, 69, 70 and 71 in terms of SPT term count,
design time, hardware complexity and power performance has been demonstrated
with several design examples.
The same problem of discrete coeﬃcient FIR ﬁlter design using
mixed integer programming (MIP) was formulated in Reference 23
where MIP is transformed into an equivalent integer program-
ming problem on the basis of a transformation between two integer
spaces and the computation of the optimum scaling factor for a given
set of coeﬃcients. An eﬃcient algorithm based on a discrete ﬁlled
function has subsequently been developed for solving the equiva-
lent problem. Authors have proven the supremacy of their design
over [70–74] with the help of some numerical examples.
An integer linear programming (ILP) approach to design optimal
ﬁnite word length linear-phase FIR ﬁlters in the logarithmic number
system (LNS) domain has been recently proposed [75]. Authors have
optimized the ﬁlter directly in the LNS domain with ﬁnite word
length constraints in which several branch variable selection and
branching direction schemes were suggested and evaluated. By
means of different design examples, it has been shown that the re-
sultant ﬁlters are optimal in the minimax sense under ﬁnite word
length conditions.
3. Common sub-expression elimination for the design of
hardware eﬃcient FIR ﬁlter
Multiplication by a constant value can generally be carried out using a
sequence of shifters and adders. Subtractions are also used in order to exploit
the hardware efficiently. In most cases, the best results are obtained when the
multipliers are represented in CSD form, as reported in the literature. A
number of articles are available
in literature where researchers have developed the idea of opti-
mizing the design of CSD multipliers by eliminating the common
sub-expressions in ﬁlter coeﬃcients. Common sub-expression elim-
ination (CSE) has been extensively studied in literature and various
algorithms have been proposed in References 28, 29, 76 and 77 in
this regard. Basic feature of CSE method is to identify common bit
patterns in the set of coeﬃcients and to share those identiﬁed
common sub-expressions in order to reduce the number of addi-
tion operations.
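The pattern identification step common to CSE algorithms can be sketched as follows. This is not any of the algorithms of References 28, 29, 76 and 77; it is a minimal illustration in which the coefficients 5, 21 and 85 are hypothetical, each pair of nonzero CSD digits within a coefficient is treated as a candidate sub-expression, and candidates are keyed by digit distance and relative sign.

```python
from collections import Counter

def to_csd(value):
    """Canonic signed digit form of an integer, least significant digit first."""
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value & 3)   # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits

def pattern_counts(coeffs):
    """Count two-digit candidate sub-expressions as (distance, relative sign) pairs."""
    counts = Counter()
    for c in coeffs:
        nonzero = [(i, d) for i, d in enumerate(to_csd(c)) if d]
        for a in range(len(nonzero)):
            for b in range(a + 1, len(nonzero)):
                (i, di), (j, dj) = nonzero[a], nonzero[b]
                counts[(j - i, di * dj)] += 1
    return counts

def shared_products(x):
    """5x, 21x and 85x built from the most frequent pattern s = x + (x << 2)."""
    s = x + (x << 2)                    # shared sub-expression, 1 adder
    return s, s + (x << 4), s + (s << 4)  # 21x and 85x need 1 adder each

print(pattern_counts([5, 21, 85]).most_common(2))
print(shared_products(1))
```

The counts include overlapping candidates, so they indicate which pattern is worth sharing rather than how many adders are finally saved; an actual CSE algorithm must also decide which non-overlapping occurrences to replace.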
The ﬁrst available article in this respect has been published by
Hartley in the year 1996 which has reduced the number of adders
by approximately 50% [29]. The proposed algorithm considers sub-
expressions mixing terms in different versions of the input signal
and additionally it explicitly takes into account the number of delay
latches in the circuit and attempts to minimize the number of adders
and delays.
The algorithm is based on ﬁnding out several common sub-
expressions between coeﬃcients. The main idea may be illustrated
by a simple example of a 4-tap FIR ﬁlter whose output can be ex-
pressed as:
$$y(n) = h_0 x(n) + h_1 x(n-1) + h_2 x(n-2) + h_3 x(n-3) \tag{13}$$
with h h h0 2 1 2 21 01010 0 0 0 010 0 10 0 0 10 10 10 1= ( ) = ( ), ,
an20 10 010 0 0 0 010= ( ) d h2 210 0 0 0 010 10 0 0= ( ) .
It unambiguously identiﬁes ﬁve occurrences of the same common
sub-expressions between three different coeﬃcients as shown in
Fig. 1.
The ﬁrst sub-expression can be expressed as:
x x n x n x x2 1 1 11 2 1 1= ( )− −( ) = − −[ ] −( ) (14)
The term [−1] in equation (14) represents a unit delay, the sign ‘>> n’
corresponds to an n-step right shift, and the bar indicates a negative
expression. The method proposed by Hartley is then applied recursively to
identify further common sub-expressions. Fig. 2 shows the location of the
previous sub-expression in a new matrix and its recursive use.
From these ﬁgures, complete deﬁnition of the ﬁlter may be
written as:
y x x x x x x
x x
= − + + − −[ ] − −[ ] − −[ ]
+ −[ ] − −
1 3 2 3 2 2
1 1
2 10 1 5 1 11 2
1 3 6 3
   
  [ ] = + 8 23 2 1with x x x (15)
In case any sub-expression deﬁnition involves negative shift, it
is to be modiﬁed accordingly to remove the negative shift as shown
below:
$$x_2' = (x_1 \gg 1) - x_1[-1] \tag{16}$$

$$x_3' = x_2' + (x_1 \gg 3) \tag{17}$$
y x x x x x x x
x
= − + ′ + ′ − −[ ] − ′ −[ ] − ′ −[ ]+ −[ ]
− −
1 3 2 3 2 2 1
1
1 9 1 4 1 10 2 3
6
   
 3 8 1 12 2 3 3[ ] ′ = ′ =  with andx x x x (18)
In the same year, Potkonjak and his co-workers [28] have de-
scribed another common sub-expression based technique which
ﬁnds the maximum number of coincidences between two signed
digits. A new solution of the MCM problem is presented in Refer-
ence 77 that combines exhaustive search for multiple pattern
identiﬁcation with a steepest descent approach for pattern selec-
tion. Results have identiﬁed a signiﬁcant reduction in either
arithmetic operations or necessary hardware along with satisfac-
tory runtimes.
Towards the elimination of common sub-expression during the
design of multiplier-less ﬁlter, non recursive signed common sub-
expression elimination (NR-SCSE) algorithm has been proposed in
Reference 36 and consequently its several applications have been
discussed. The limitation resulting from the recursive utilization of
a common sub-expression is the high logic depth in the digital
circuit. This has been solved in Reference 36 by using each sub-expression
only once. This new array splitting algorithm combines the advantages of
previous methods in the sense that it reduces the logic depth compared with
the Hartley algorithm [29] and uses approximately the same number of logic
operators as the Bull-Horrocks modified (BHM) algorithm [25]. It searches for
the non recursive signed common sub-expressions which must be eliminated from
the original CSD array.
Heuristic common sub-expression elimination (CSE) and the coefficient
quantization by successive approximation algorithm have been
integrated in Reference 37 to precisely distribute a pre-defined
addition budget to the quantized coefficients. An improved
exploration algorithm with variable step-sizes has also been proposed
to find an optimum scale factor that collectively settles the filter
coefficients into the quantization space. The authors report a
reduction of approximately 30% in the addition budget for comparable
filter responses. The improved scale factor exploration helps to find
an identical or better quantization result with significantly less
run time, irrespective of the application of CSE.
Fig. 1. Five occurrences of the same common sub-expressions in four coefficients of FIR filter (rows: X(n), X(n-1), X(n-2), X(n-3)).
Fig. 2. Recursive use of the algorithm [29] over the array in Fig. 1.
As time progressed, researchers considered not only the SPT patterns
in the coefficients but also the length of the critical path in the
multiplier block. In this connection, Yao et al. proposed a novel CSE
algorithm [78] for the synthesis of fixed-point FIR filters which
trades off complexity against throughput rate. The number of adders
synthesized by this method is comparable to that required by the
algorithms in References 25, 28, 29 and 36. The authors have also
claimed that their method can synthesize higher order, complicated
FIR filters within a few seconds.
In the year 2005, Macleod and Dempster introduced a new CSE algorithm
which searches for a bounded number of minimal signed digit (MSD)
representations [79]. The proposed algorithm first finds all the
possible MSD representations of each different coefficient value by
utilizing the method described in Reference 80. The authors have
established the superiority of their algorithm by comparing its
performance with existing algorithms such as those in References
81–83. A genetic programming-based method for CSE in multiplier-less
digital filter realization has been introduced in Reference 84, which
searches for the common factors in higher order digital filters with
a few non-zero digits in their coefficients. The fitness measure for
this optimization technique involves the number of common
sub-expressions, which reduce interconnections and latency. The
authors have also established the efficiency of their approach by
experiments on 1D and 2D filters.
In order to implement low-complexity parallel multiplier-less digital
FIR filters using the concept of shift inclusive differential (SID)
coefficients and CSE, a new computation reduction method has been
proposed by Wang and Roy in Reference 33. The idea of SID
coefficients has been reformulated by introducing a new graph
representation and mapping the optimization problem into an
equivalent problem of determining a directed minimum spanning tree
(DMST) of a directed multi-graph, which has subsequently been solved
by an optimal graph theoretic algorithm. For further reduction of
design complexity, a novel CSE method has been proposed which
recursively eliminates 2-bit sub-expressions with a steepest descent
approach for sub-expression selection. As far as the efficiency of
the proposed method is concerned, up to 75% reduction has been
achieved in the number of additions as compared to other
multiplier-less architectures such as those in References 28, 29, 76
and 77. In comparison with one contemporary CSE algorithm, Wang and
Roy's algorithm [33] achieves an improvement of up to 19%.
Towards the synthesis of low-complexity powers-of-two FIR
ﬁlter, minimization of SPT terms has been considered as the opti-
mization goal. This problem statement has been reformulated in
Reference 35 to account for the sharable adders where the authors
address the optimization of the reusability of the adders for two
major types of common sub-expressions, together with the reduc-
tion of adders for spare SPT terms. By limiting the number of
common SPT (CSPT) terms to be no more than that of the rounded
CSD coeﬃcient set, ﬁrst stage of the algorithm freely allocates any
CSD coeﬃcient in the neighbourhood of the rounded coeﬃcient
set to enhance the occurrences of the two chosen common sub-
expressions while reducing the total number of spare SPT terms
in the minimal CSD coeﬃcient set. The unachievable normalized
peak ripple magnitude (NPRM) in the ﬁrst stage has been compen-
sated in the second stage by an eﬃcient word length dependent
adaptive neighbourhood search method. The algorithm uses a common
sub-expression-based Hamming weight pyramid to locate low-cost
candidate coefficients with preferential consideration of shared
common sub-expressions. The performance of the algorithm was compared
with a number of state-of-the-art multiplier-less algorithms such as
those in References 18, 70, 71, 85, 86 and 87. Experimental results
have demonstrated that this method is capable of synthesizing FIR
filters with the fewest CSPT terms in comparison with previous
approaches.
4. Approaches for the minimization of adders in hardware
eﬃcient FIR ﬁlter design
Design complexity resulting from the implementation of non-recursive
digital filters in custom or semi-custom integrated circuits without
any built-in multiplier is often measured in terms of the number of
addition operations used to realize the multiplication operation. To
reduce this complexity, circuit designers have long used the CSD
representation [88].
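The saving offered by the CSD representation can be illustrated with a short Python sketch. It is a generic textbook-style conversion written for this illustration, not code taken from Reference 88; the function names and the constant 119 are assumptions. The adder cost is taken as the number of non-zero digits minus one, which is the measure described above for a single constant multiplier built from shifts and additions or subtractions.

```python
def to_csd(n):
    """CSD digits of a positive integer, least significant digit first."""
    digits = []
    while n != 0:
        if n % 2 == 0:
            digits.append(0)
        else:
            d = 2 - (n % 4)      # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            digits.append(d)
            n -= d
        n //= 2
    return digits

def binary_digits(n):
    """Plain binary digits of a positive integer, least significant first."""
    return [int(b) for b in bin(n)[:1:-1]]

def adder_cost(digits):
    """Adders or subtractors needed to sum the shifted non-zero terms."""
    return max(sum(1 for d in digits if d != 0) - 1, 0)

constant = 119                   # 1110111 in binary, 128 - 8 - 1 in CSD
print(adder_cost(binary_digits(constant)), adder_cost(to_csd(constant)))
```

For the assumed constant 119, the binary form needs five adders whereas the CSD form 128 - 8 - 1 needs two.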
The year 1995 was marked by significant progress in the field of
circuits and systems, when many researchers came up with innovative
ideas towards the reduction of adder cost in filter design. Some of
them have shown that using multiplier blocks to exploit the
redundancy across the coefficients results in a considerable
reduction in complexity over the CSD representation, which in turn is
less complex than the standard binary representation. Three such new
algorithms have consequently been proposed in Reference 25, which
consist of an efficient modification of an existing algorithm, one
novel algorithm for better results and a hybrid of these two which
trades off performance against computational time. The authors have
investigated the shortcomings of the popular BH algorithm, proposed
by Bull and Horrocks [89], which had used multiplier blocks for
reducing the implementation cost of FIR filters. The BH algorithm
[89] yields the same result as the original design built from several
single-coefficient multipliers, but with fewer adders and
subtractors, by virtue of the fact that it allows all products of the
input sample to be produced simultaneously. The limitations and the
corresponding solutions are discussed in Reference 25, which has
significantly improved the results obtained.
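The benefit of a multiplier block, in which all products of one input sample are produced simultaneously and intermediate results are shared, may be sketched as follows. This hand-built example is only an assumption-laden illustration: the coefficient set {7, 21, 35} and the function name `multiplier_block` are hypothetical, and the structure is not the output of the BH, BHM or RAG-n algorithms.

```python
def multiplier_block(x):
    """Produce 7x, 21x and 35x from one input sample with three adders."""
    f7 = (x << 3) - x            # adder 1: 7x  = 8x - x
    f21 = f7 + (f7 << 1)         # adder 2: 21x = 7x + 14x
    f35 = f7 + (f7 << 2)         # adder 3: 35x = 7x + 28x
    return f7, f21, f35

for sample in (-3, 0, 5):
    print(sample, multiplier_block(sample))
```

Realizing the same three constants independently in CSD form (8 - 1, 16 + 4 + 1 and 32 + 4 - 1) would take five adders, so sharing the fundamental 7x saves two adders in this assumed example.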
As a part of their major contribution in reducing the adder cost,
Dempster and Macleod have introduced an n-dimensional reduced adder
graph (RAG-n) algorithm that is divided into two parts. The first
part is optimal in the sense that it ensures minimum adder cost
provided that the set of coefficients is completely synthesized by
this part of the algorithm. The second part of the algorithm is
heuristic and uses two look-up tables generated by the MAG algorithm
[81], covering a range from 1 to 4096. For each coefficient value in
the prescribed range, the cost look-up table contains the optimum
single-coefficient cost of multiplication and the fundamental look-up
table contains the different sets of fundamentals which can be used
to implement the multiplication at optimal cost. It has been well
established that for small set sizes, the BH [89] and modified BH
(BHM) [81] algorithms are significantly faster than the RAG-n
algorithm.
As far as their contribution in the relevant field is concerned, it
has been demonstrated that the heuristic RAG-n multi-coefficient
multiplier block design algorithm results in an average improvement
of about 20% over the popularly known BH algorithm for five
coefficients of 12-bit word length. The BHM algorithm, which is
identified as less efficient than RAG-n because of its higher cost
graph, still yields a 10% improvement over the BH algorithm of 12-bit
word length and the hybrid algorithm. However, the RAG-n algorithm is
slower than BHM for small coefficient sets but quicker for large
sets, in which case the computation time for BH and BHM grows with
the square of the set size, in comparison with linear growth for
RAG-n.
In the same year, towards reducing the complexity of fixed-point
multipliers with fixed or programmable multiplicands, one method was
presented by Li [90] which drew considerable attention in the
relevant field. The approach deals with finding the minimum number of
adders for implementing a multiplier of a given multiplicand. Before
the publication of this article, CSD expressions were normally used
for multiplicands; these have been strongly challenged by the
proposed minimum number of
shift-add operations (MNSAO) as far as the number of adders in the
structure is concerned.
In comparison with CSD expressions using no more than the same number
of shift-add operations (SAOs), the MNSAO expression significantly
increases the largest representable contiguous range and the number
of representable integers in a given range, and thus reduces the mean
approximation error. Therefore it has subsequently been applied for
the design of multiplier-less digital filters
subject to some pre-speciﬁed implementation cost determined by
the total number of adders in the entire ﬁlter. It has been shown
that the ﬁlters designed in Reference 70 are signiﬁcantly superior
to those designed by MILP programming [91] and simulated an-
nealing (SA) [92] which prescribes the number of SPT terms per
coeﬃcient to be no more than two. Li et al. have shown that the
designed ﬁlter can achieve up to 4.2 dB smaller normalized peak
ripple (NPR) over the technique in Reference 70, subject to the same
number of adders for the entire ﬁlter.
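The gap between a CSD expression and an expression with the minimum number of shift-add operations can be seen on a single constant. The following Python sketch is a small brute-force search written for illustration only; it is not the method of Reference 90, and the function names, the search limits (`max_cost`, `max_shift`) and the constant 45 are assumptions. Each step forms a new odd value from two already available values, one of which may be shifted, by one addition or subtraction.

```python
def odd_part(n):
    """Strip factors of two, since shifts are free in this cost model."""
    while n % 2 == 0:
        n //= 2
    return n

def successors(fundamentals, max_shift):
    """All new odd values reachable with one more adder or subtractor."""
    out = set()
    for u in fundamentals:
        for v in fundamentals:
            for s in range(max_shift + 1):
                for w in ((u << s) + v, abs((u << s) - v)):
                    if w > 0:
                        out.add(odd_part(w))
    return out - set(fundamentals)

def min_adders(target, max_cost=3, max_shift=8):
    """Fewest adders (up to max_cost) that realize target, else None."""
    target = odd_part(target)

    def search(fundamentals, budget):
        if target in fundamentals:
            return True
        if budget == 0:
            return False
        return any(search(fundamentals | {w}, budget - 1)
                   for w in successors(fundamentals, max_shift))

    for cost in range(max_cost + 1):
        if search(frozenset({1}), cost):
            return cost
    return None

print(min_adders(45))
```

For the assumed constant 45, the search finds two adders (5 = 4 + 1, then 45 = 5 · 8 + 5), whereas the CSD form 64 - 16 - 4 + 1 has four non-zero digits and therefore needs three.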
Another promising article [76] demonstrates the use of opti-
mizing transformations to diminish the number of additions and
subtractions for a given set of ﬁlter coeﬃcient values and coeﬃ-
cient representation schemes. For a direct form FIR ﬁlter structure,
the number of additions has been minimized by eliminating the
common sub-expressions in the binary representation of the co-
eﬃcients. Reduction of the adders in the transposed form i.e. MCM-
based form of FIR ﬁlter has also been taken care of by the authors
through some modification of their already proposed algorithm. It
has been demonstrated clearly that through the incorporation of the
CSE algorithm, the total number of addition and subtraction
operations has been reduced by as much as 35% for the direct
structure and 38% for the transposed architecture. In effect, the
total number of addition and subtraction operations has been reduced
by an average factor of 2.2 in comparison with 1.43, as achieved in
Reference 93.
Pearson and Parhi [94] introduced a novel approach towards the design
of low power FIR filters by means of parallel or block processing
with duplication of hardware. They have achieved a considerable
reduction in multiplier elements at the cost of doubling the number
of adder elements. However, the reduced multiplier implementation
yields lower hardware cost and less power consumption by virtue of
the fact that the area required to implement a multiplier element is
significantly larger than that of an adder element. In continuation
of this, an adjacent coefficient sharing based sub-structure sharing
technique along with a maximum absolute difference quantization
process has been introduced in References 95 and 96 and has
subsequently been employed to reduce the hardware cost of parallel
FIR filters. Based on the given examples, the authors have shown that
their proposition results in a 45% reduction in hardware cost as
compared to traditional parallel filtering methods.
Reduction of the total number of adders for synthesizing
multiplier-less FIR filters has been achieved through a number of
approaches, amongst which the systematic algorithm proposed by
Kaakinen and Saramaki [85] finds a suitable place in the literature.
During the optimization procedure, a linear programming algorithm is
initially used for determining the parameter space of the
infinite-precision coefficients as well as the feasible space where
the filter meets the given amplitude specifications. The second step
locates the filter parameters in this space such that the resulting
filter satisfies the criterion with the simplest coefficient
representation form. The main advantage of the approach in comparison
with other existing techniques is that it finds all the solutions
which can satisfy the given magnitude specifications.
Although the complexity of multiplier blocks was significantly
reduced by adopting techniques like decomposing multiplication into
simple operations of shifts and additions and sharing common
sub-expressions, reducing the delay of multiplier blocks remained an
unexplored area until Kang and Park [97] presented new algorithms to
minimize the complexity of multiplier blocks under given delay
constraints. The authors have combined three proposed methods with
the BHM [81] and RAG-n [25] algorithms to implement filters which can
satisfy the given specification of the number of adder-steps. A
trade-off between delay and hardware complexity is enabled by
changing the delay constraints. Experimental results have shown that
the algorithm in Reference 97 can reduce the delay of multiplier
blocks at the cost of a small increase in complexity.
Several years later, researchers aggressively reduced both the
coefficient word length and the number of non-zero bits in the filter
coefficients with the aim of minimizing the adder step [98]. The
authors have modified the representation of the filter coefficients
such that the number of full-adders resulting from the hardware
implementation is proportional only to the product of the signal word
length and the number of adders. In effect, this implies that the
number of full-adders is entirely independent of the coefficient word
length and the number of shifts between the non-zero bits in the
coefficient. This algorithm yields promising results for filters with
up to 500 taps. In terms of the number of multiplier block (MB)
adders and multiplier block full adders (FAs), the authors have
demonstrated the superiority of their proposed technique over some
existing ones. More explicitly, while the algorithm proposed in
Reference 99 achieves a 25% to 44% reduction in the number of MB
adders, the reduction achieved with Reference 98 is as high as 67%.
In terms of the number of FAs, the resulting reduction is around 71%
for Reference 98 as compared to a 25% to 54% reduction for Reference
99.
For a long time, RAG-n was considered to be probably the best
algorithm for solving MCM problems. However, a new algorithm called
HCUB has emerged as a way of improving on the results of RAG-n [27].
Both of them are adder graph algorithms, divided into two stages: an
optimal part and a heuristic part. The heuristic part can be viewed
as adding extra coefficients to be realized such that the basic
operation in the optimal part can continue. It is explicitly
mentioned that the HCUB algorithm finds solutions that require up to
20% fewer additions and subtractions than the solutions found by the
previously best known algorithms such as RAG-n [25] and BHM [81].
An adder graph type algorithm for solving the MCM problem has been
introduced in Reference 26 with a novel heuristic inspired by the
difference method class of algorithms. Unlike the previous
algorithms, it does not rely on look-up tables for its execution. It
has been shown that the proposed heuristic provides results better
than or comparable to those of RAG-n. Compared to HCUB, the algorithm
is slightly better on average for most of the conditions.
During the optimization of the coefficients of a multiplier-less
filter, common sub-expression sharing proves to be very fruitful, in
which the coefficient multipliers are represented as a multiplier
block (MB) with shared shifters and adders. As far as the power
consumption in MBs is concerned, not only the total number of adders
but also the adder depth of every coefficient contributes
significantly. A few years back, an MILP based technique [100] was
employed to optimize the filter coefficients subject to the
minimization of ripples in the frequency response of the filter along
with a constraint on the total number of adders and an allowable
maximum adder depth. The authors have established the superiority of
their algorithm by means of a design example which reveals that it
generates filters using fewer adders with minimum adder depth than
approaches such as those in References 25, 78 and 101.
Recently, truncated MCM using pattern modiﬁcation tech-
nique (PMT) has been developed for FIR ﬁlter implementation [32].
This algorithm truncates every node adder in DAG generated by dif-
ferent MCM algorithms with a common principle of ensuring
that every two inputs to the same node have the same weight.
Superiority of PMT has been established by virtue of the fact that
compared to non-truncated MCM algorithms, it reduces the area
cost by 35% without increasing quantization error.
5. Coeﬃcient representation schemes in multiplier-less ﬁlter
design
Tap coeﬃcients of multiplier-less FIR ﬁlter are encoded in dif-
ferent forms so as to yield hardware eﬃcient architecture. Many of
the approaches have tried to select common sub-expressions after
representing the constants in CSD form. Although CSD represen-
tation is effective for one constant, it is not the best for multiple
constants because the CSD representation of a constant is unique
and independent of the other constants. For the multiple constant
multiplications, it would have been more eﬃcient to use minimal
signed digit representation (MSD) that has the same number of non-
zero digits as CSD but provides multiple representations for a
constant [102,103]. An algorithm has been proposed in this regard
[83] to find all MSD representations of a constant and to synthesize
digital filters based on the MSD representation. It utilizes the
redundancy of the MSD representation to create as many common
sub-expressions as possible and thus leads to smaller filters. The
superiority of the proposition has been established by implementing
several filters and comparing the results with conventional ones
obtained from the CSD representation.
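The redundancy of the MSD representation can be made visible with a brute-force enumeration. The Python sketch below is an illustration only and is not the algorithm of Reference 83; the function names and the example constant 3 are assumptions, and the exhaustive search over all digit vectors is practical only for short word lengths. The chosen `width` must be large enough for the value to be representable.

```python
from itertools import product

def signed_digit_forms(value, width):
    """All digit tuples (LSB first, digits in {-1, 0, 1}) equal to value."""
    return [digits for digits in product((-1, 0, 1), repeat=width)
            if sum(d << i for i, d in enumerate(digits)) == value]

def msd_forms(value, width):
    """Keep only the forms with the minimum number of non-zero digits."""
    forms = signed_digit_forms(value, width)
    fewest = min(sum(d != 0 for d in f) for f in forms)
    return [f for f in forms if sum(d != 0 for d in f) == fewest]

print(msd_forms(3, 3))
```

For the constant 3 with three digits, two MSD forms exist, 4 - 1 and 2 + 1; only the first of them is the CSD form, which is why MSD offers extra freedom when matching sub-expressions across several constants.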
The CSE technique decomposes all the constants in terms of several
common bases. With a view to optimizing the storage of filter
coefficients, this algorithm effectively extracts the commonly
occurring sub-expressions. However, because of its highly irregular
structure, a filter model using CSE is hard to pipeline [104]. This
has seriously drawn the attention of several researchers towards the
low power, high speed realization of FIR filters. Sankarayya and his
co-workers are considered to be pioneers in this regard, having
proposed a new algorithm [30] for efficient representation of FIR
filter coefficients. Instead of the direct coefficients, this
algorithm uses various orders of differences between the coefficients
along with stored pre-computed results to compute the convolution
sum, and accordingly has been termed the differential coefficients
method (DCM) in the literature. As differential coefficients have
shorter word length than the originals, DCM can reduce the number of
ones required to represent the coefficients and hence reduce the
power consumption. An N-tap FIR filter with coefficients $h_k$, input
sequence $x_j$ and output sequence $y_j$ can be expressed as:
$$y_j = \sum_{k=0}^{N-1} h_k \cdot x_{j-k} \quad \forall j \tag{19}$$
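Equation (19) translates directly into code. The following minimal Python sketch is given only as a reference point for the differential schemes discussed next; the function name `fir_direct` and the convention that input samples before the start of the sequence are zero are assumptions of this illustration.

```python
def fir_direct(h, x):
    """Evaluate y[j] = sum_k h[k] * x[j - k], taking x[n] = 0 for n < 0."""
    N = len(h)
    return [sum(h[k] * x[j - k] for k in range(N) if j - k >= 0)
            for j in range(len(x))]

print(fir_direct([1, 2, 3], [1, 0, 0, 0]))   # impulse response of the filter
```

Each output sample costs N multiplications by the full-length coefficients, which is the cost the differential coefficient methods try to reduce.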
DCM technique, on the other hand, ﬁrst computes the partial
product with differential coeﬃcients and then computes the sum
of the stored partial product of previous computation to obtain the
result corresponding to the original coeﬃcient set. Two consecu-
tive outputs of the ﬁlter may readily be obtained by expanding
equation (19) as:
$$y_j = h_0 \cdot x_j + h_1 \cdot x_{j-1} + h_2 \cdot x_{j-2} + \cdots + h_{N-1} \cdot x_{j-N+1} \tag{20}$$
$$y_{j+1} = h_0 \cdot x_{j+1} + h_1 \cdot x_j + h_2 \cdot x_{j-1} + \cdots + h_{N-1} \cdot x_{j-N+2} \tag{21}$$
The term $y_{j+1}$ may be written in terms of first order difference
DCM as:
$$y_{j+1} = h_0 \cdot x_{j+1} + \left(dh^{1}_{1} \cdot x_j + h_0 \cdot x_j\right) + \cdots + \left(dh^{1}_{N-1} \cdot x_{j-N+2} + h_{N-2} \cdot x_{j-N+2}\right) \tag{22}$$
The variable $dh^{1}_{k} = h_k - h_{k-1}, \ \forall k = 1, 2, \ldots, N-1$
is termed as the first order difference between the adjacent
coefficients $h_k$ and $h_{k-1}$, and the terms like $h_0 \cdot x_j$
and $h_{N-2} \cdot x_{j-N+2}$ are the compensating terms. As can be
inspected from equation (22), DCM suffers from overheads since it
needs extra adders to compute the sums of stored partial products of
previous computation in order to compensate the effect of
differential coefficients [27]. Apart from considering differential
coefficients, differential inputs have also been taken care of in an
algorithm termed the differential coefficients and input method
(DCIM) [105]. For three consecutive outputs $y_{j-1}$, $y_j$ and
$y_{j+1}$, their first order differences may be defined as follows:
$$y^{1}_{j} = y_j - y_{j-1} = h_0 \cdot (x_j - x_{j-1}) + h_1 \cdot (x_{j-1} - x_{j-2}) + \cdots + h_{N-1} \cdot (x_{j-N+1} - x_{j-N}) \tag{23}$$
$$y^{1}_{j+1} = y_{j+1} - y_j = h_0 \cdot (x_{j+1} - x_j) + h_1 \cdot (x_j - x_{j-1}) + \cdots + h_{N-1} \cdot (x_{j-N+2} - x_{j-N+1}) \tag{24}$$
The sum of the first (N-1) partial products of $y^{1}_{j}$ may be
defined as [105]:
$$y_{j,(N-1)} = h_0 \cdot (x_j - x_{j-1}) + h_1 \cdot (x_{j-1} - x_{j-2}) + \cdots + h_{N-2} \cdot (x_{j-N+2} - x_{j-N+1}) \tag{25}$$
Now,
$$\begin{aligned}
y^{1}_{j+1} &= h_0 \cdot (x_{j+1} - x_j) + \{(h_1 - h_0) \cdot (x_j - x_{j-1}) + h_0 \cdot (x_j - x_{j-1})\} + \cdots \\
&\quad + \{(h_{N-1} - h_{N-2}) \cdot (x_{j-N+2} - x_{j-N+1}) + h_{N-2} \cdot (x_{j-N+2} - x_{j-N+1})\} \\
&= h_0 \cdot (x_{j+1} - x_j) + (h_1 - h_0) \cdot (x_j - x_{j-1}) + \cdots + (h_{N-1} - h_{N-2}) \cdot (x_{j-N+2} - x_{j-N+1}) \\
&\quad + \{h_0 \cdot (x_j - x_{j-1}) + h_1 \cdot (x_{j-1} - x_{j-2}) + \cdots + h_{N-2} \cdot (x_{j-N+2} - x_{j-N+1})\} \\
&= h_0 \cdot (x_{j+1} - x_j) + (h_1 - h_0) \cdot (x_j - x_{j-1}) + \cdots + (h_{N-1} - h_{N-2}) \cdot (x_{j-N+2} - x_{j-N+1}) + y_{j,(N-1)}
\end{aligned} \tag{26}$$
and
$$\begin{aligned}
y_{j+1} &= y^{1}_{j+1} + y_j \\
&= h_0 \cdot (x_{j+1} - x_j) + (h_1 - h_0) \cdot (x_j - x_{j-1}) + \cdots + (h_{N-1} - h_{N-2}) \cdot (x_{j-N+2} - x_{j-N+1}) + y_{j,(N-1)} + y_j
\end{aligned} \tag{27}$$
As can be inspected from equation (27), except for the first term,
the remaining (N-1) partial products are multiplications between
differential coefficients and differential inputs, and therefore a
shorter multiplier than that in DCM may be used in DCIM, which stores
the sum of compensated terms in $y_{j,(N-1)}$. This has the
consequence of avoiding (N-2) additional unnecessary memory accesses
and additions. But for each output $y_j$, two extra storage and
additions are required. However, since the basic technique used in
DCIM is the same as that of DCM, their overheads are also common. In
addition to this, DCIM suffers from input propagation delay since the
difference cannot be derived prior to the input arrival.
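The identities behind DCM and DCIM can be checked numerically against the direct convolution of equation (19). The following Python sketch is an illustration under stated assumptions rather than the implementation of References 30 or 105: the function names, the test coefficients and the zero initial conditions are hypothetical, and for simplicity the DCIM routine re-evaluates the stored sum of equation (25) directly instead of modelling the memory bookkeeping.

```python
def sample(x, n):
    """Input sample with the assumed convention x[n] = 0 for n < 0."""
    return x[n] if n >= 0 else 0

def fir_direct(h, x, j):
    """Equation (19) evaluated at output index j."""
    return sum(h[k] * sample(x, j - k) for k in range(len(h)))

def dcm_outputs(h, x):
    """First-order DCM: differential coefficients plus stored products."""
    N = len(h)
    dh = [0] + [h[k] - h[k - 1] for k in range(1, N)]
    stored = [0] * N             # partial products of the previous output
    y = []
    for j in range(len(x)):
        current = [h[0] * x[j]]
        for k in range(1, N):
            # dh[k] * x[j-k] plus the compensating term h[k-1] * x[j-k]
            current.append(dh[k] * sample(x, j - k) + stored[k - 1])
        y.append(sum(current))
        stored = current
    return y

def dcim_outputs(h, x):
    """DCIM: differential coefficients applied to differential inputs."""
    N = len(h)

    def dx(n):
        return sample(x, n) - sample(x, n - 1)

    y, y_prev = [], 0
    for j in range(len(x)):
        carried = sum(h[k] * dx(j - 1 - k) for k in range(N - 1))  # eq. (25)
        value = h[0] * dx(j)
        value += sum((h[k] - h[k - 1]) * dx(j - k) for k in range(1, N))
        value += carried + y_prev
        y.append(value)
        y_prev = value
    return y

h = [3, 5, 6, 4]
x = [1, -2, 4, 0, 7, 3]
reference = [fir_direct(h, x, j) for j in range(len(x))]
print(dcm_outputs(h, x) == reference, dcim_outputs(h, x) == reference)
```

Both routines reproduce the direct-form output exactly, while the multiplications in their inner sums involve the shorter differences $h_k - h_{k-1}$ rather than the full coefficients.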
Both the DCM and DCIM methods calculate the difference between
adjacent tap coefficients in order to minimize the resulting hardware
cost. This approach may not always lead to a shorter word length of
the difference signals when the adjacent coefficients differ by a
significantly large margin. This issue has been studied in the recent
past by a few researchers [39,40], who calculated the difference
between those coefficients which have the least difference between
their magnitude values; these minimal difference values have
subsequently been used to encode the differential coefficients. The
use of minimal difference coefficients reduces the effective word
length and in turn minimizes the number of full adders and the net
memory. This approach, known as the minimal difference differential
coefficients method (MDDCM) [40], first sorts the coefficients such
that adjacent coefficients have minimal differences in their
magnitudes before computing the difference representation.
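The effect of sorting before differencing may be sketched as follows. The example is only an illustration of the idea, not the procedure of References 39 and 40: the integer coefficient values, the function names and the choice of counting magnitude bits (with signs assumed to be handled separately) are all assumptions.

```python
def bits(value):
    """Magnitude bits needed for an integer (sign handled separately)."""
    return abs(value).bit_length()

def adjacent_differences(h):
    """DCM-style differences between neighbouring taps."""
    return [h[k] - h[k - 1] for k in range(1, len(h))]

def minimal_differences(h):
    """Sort taps by magnitude, then difference neighbouring magnitudes."""
    order = sorted(range(len(h)), key=lambda k: abs(h[k]))
    mags = [abs(h[k]) for k in order]
    return order, [mags[i] - mags[i - 1] for i in range(1, len(mags))]

h = [3, -41, 120, 118, -40, 5]
order, sorted_diffs = minimal_differences(h)
print(sum(bits(d) for d in adjacent_differences(h)),
      sum(bits(d) for d in sorted_diffs))
```

For the assumed coefficient set, the adjacent differences need 30 magnitude bits in total whereas the differences taken after sorting need 18; the permutation `order` has to be kept so that each difference can be associated with its original tap.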
Almost all the algorithms available in the literature for design-
ing multiplier-less ﬁlter have primarily focused on the minimization
of the total number of full adders in realizing the ﬁlter. There are
few reported articles which have judiciously represented the
powers-of-two tap coefficients in such a way that the full adder
count can be reduced. The pseudo floating point (PFP) representation
scheme is one such approach which has drawn considerable attention
amongst researchers. Any arbitrary coefficient $h_i$ of word length
B, represented in CSD form as
$h_i = \sum_{j=0}^{B-1} s_j 2^{-a_{ij}}$, can have its PFP
representation as [41]:
$$h_i = 2^{-a_{i0}} \sum_{j=0}^{B-1} s_j \cdot 2^{-(a_{ij} - a_{i0})} = 2^{-a_{i0}} \sum_{j=0}^{B-1} s_j \cdot 2^{-c_{ij}} \tag{28}$$
where $s_j \in \{-1, 0, 1\}$ and $c_{ij} = a_{ij} - a_{i0}$. The term
$a_{i0}$ is known as the 'shift' and the maximum of $c_{ij}$, i.e.
$a_{i(B-1)} - a_{i0}$, is termed as the 'span' part. As can be
inspected from equation (28), PFP representation makes it possible to
express any B-bit CSD coefficient as a (shift, span) pair using fewer
bits.
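The split into shift and span can be illustrated numerically. The Python sketch below follows the form of equation (28) but is not taken from Reference 41: the function names and the example coefficient $2^{-3} - 2^{-5} + 2^{-8}$ are assumptions, and the coefficient is supplied as a list of (sign, exponent) pairs for its non-zero terms, sorted by increasing exponent.

```python
from fractions import Fraction

def value(terms):
    """Exact value of sum(sign * 2**(-a)) over the non-zero terms."""
    return sum(Fraction(s, 2 ** a) for s, a in terms)

def pfp(terms):
    """Return (shift, span, offsets) with offsets c = a - shift."""
    shift = terms[0][1]                       # exponent of the leading term
    offsets = [(s, a - shift) for s, a in terms]
    span = offsets[-1][1]                     # largest offset
    return shift, span, offsets

def pfp_value(shift, offsets):
    """Rebuild the coefficient as 2**(-shift) times the span part."""
    return Fraction(1, 2 ** shift) * sum(Fraction(s, 2 ** c)
                                         for s, c in offsets)

terms = [(1, 3), (-1, 5), (1, 8)]             # 2**-3 - 2**-5 + 2**-8
shift, span, offsets = pfp(terms)
print(shift, span, pfp_value(shift, offsets) == value(terms))
```

For the assumed coefficient the shift is 3 and the span is 5, so the span part only has to cover offsets 0 to 5 instead of absolute exponents up to 8.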
Coefficient partitioning [41] is another well developed algorithm
which has been effective in reducing the range of the span part of
PFP by partitioning it into two sub-components. This method divides
the entire span part into two sub-components of length $M/2$ for even
M (or two sub-components of length $\lceil M/2 \rceil$ and
$\lfloor M/2 \rfloor$ for odd M), where M represents the span of the
PFP representation. The latter sub-component is further scaled by its
order to reduce its span. As a result, the partitioned and scaled
version of the PFP coefficients can be added with fewer full adders.
Moreover, attempts have been made to examine the adder complexity
reduction achieved by partitioning the coefficients into more than
two sub-components. The authors observed that, in that case, the
widths of the adders in the intermediate stages of the multiplier are
larger and thus call for more full adders. On the other hand, when
the coefficient is partitioned into two sub-components, only one
inner shift operation exists and the widths of the adders in the
preceding stages are smaller, while the final stage adder requires
the highest width. Therefore partitioning a coefficient into two
halves offers a better reduction of full adders than partitioning
into multiple parts.
Limitations of the PFP scheme have very recently been pointed out in
Reference 106 through the introduction of the minimum index floating
point (MIFP) representation for the powers-of-two coefficients of FIR
filters. The computational cost of the MIFP scheme has been measured
with respect to various performance metrics such as the number of
one-bit full adders, the number of one-bit shifters and the total
delay count. The superiority of the scheme has subsequently been
established in terms of those parameters. The resultant coefficient
representation under the MIFP scheme may be outlined as [106]:
h s where a
a a
i j
a
j
i i
i ii i
ji i
= ⋅ = +
−( )
−
−
=
∑2 2 21
1
1
μ μ
 
(29)
The term $\mu_i$ in the above equation identifies the overall shift
applied to the terms inside the span part and hence it is known as
the 'shift' part in MIFP. The variable $a_i^j$ in equation (29)
implies the index of a non-zero term relative to the position of the
term $\mu_i$ and hence it may assume both positive and negative
integer values including zero depending upon its position. The
collection of powers-of-two terms with positive (including zero)
$a_i^j$ constitutes the 'left span' and that with negative $a_i^j$
constitutes the 'right span' part in the MIFP scheme.
In order to reduce the power consumption of FIR filters, a novel
coefficient ordering algorithm has been described in Reference 107,
where the implementations are based on processing the coefficients in
a non-conventional order using both direct form (DF) and transposed
form (TF) FIR filters. An overall power reduction of up to 34% with
up to 56% area overhead is reported for the TF structure as compared
to the conventional filter implementation. However, the DF structure
results in a 19% power reduction without incurring any area overhead.
A new hardware eﬃcient reconﬁgurable FIR ﬁlter architecture
has been recently proposed in Reference 38 where ﬁlter coeﬃ-
cients have been partitioned into smaller sub-coeﬃcients based on
novel binary signed sub-coeﬃcients. Partial products of all possi-
ble sub-coeﬃcients and input data have been calculated in pre-
computer block and results are distributed on ﬁlter taps to compose
the coeﬃcient multiplication.
6. Intelligent optimization techniques in the ﬁeld of hardware
eﬃcient FIR ﬁlter design
Efficient design of multiplier-less powers-of-two FIR filters has
already been addressed as an optimization problem by several
researchers. As a matter of fact, a number of mathematical
optimization algorithms like MILP [18] and SDP [21] have been
judiciously employed for the purpose of solving the problem. The last
decade of the twentieth century is considered to have had a
significant impact on the field of signal processing because of its
resourceful amalgamation with artificial intelligence. In this
connection, Benvenuto and his co-researchers [92] presented a
simulated annealing (SA) algorithm for the design of linear phase
powers-of-two digital filters. Towards the reduction of computational
complexity, new features have also been added with respect to
traditional SA algorithms. As an attempt to combat the large
computation time of SA, the entropy directed deterministic annealing
(EDDA) optimization algorithm [108] has been presented for the design
of digital filters with discrete coefficients. It utilizes estimates
of conditional entropy to prune the problem during the optimization
and thereby reduces the computational time by 30 to 50%. The concept
of SA has recently been applied to the sum of powers-of-two
optimization problem by minimizing the total number of nonzero digits
of the FIR coefficients [109]. Apart from dealing with classical
filter specifications like in-band ripple and stop-band rejection, it
has also considered additional uncommon shape constraints, even in
the transition band.
Moreover, quite a few evolutionary and swarm optimization techniques
have proven their competency in substituting many of the traditional
optimization mechanisms, which occasionally fail to perform suitably
in many engineering problems. The design of multiplier-less FIR
filters has also been seriously influenced by the appropriate
application of evolutionary optimization techniques, amongst which
the genetic algorithm (GA) is the most common. In this connection,
Cemes and Ait-Boudaoud initiated the GA-based power-of-two FIR filter
design problem by using simple genetic operators like reproduction,
cross-over and mutation to search the discrete coefficient space of
predefined powers-of-two coefficients [43,110]. Their approach has
outperformed traditional techniques that restrict their coefficients
to be single power-of-two terms. Two years later, Gentili and his
co-workers [45] shed further light on the same problem by adopting a
specific filter coefficient coding scheme. The authors have claimed
that their approach is capable of attaining better or almost
comparable results than other methods of interest like MILP [18],
simulated annealing (SA) [92], the Parks-McClellan method [111], the
proportional relation preserve (PRP) method [112] and so on. Because
of its implicit parallel nature, a GA-based approach can explore many
possible solutions at each generation and hence can be easily
implemented on a parallel machine. Design of high-speed low-power FIR
filters has also been facilitated by GA in Reference 113, where the
required goal has been achieved by factorizing a long filter into
several cascaded subfilters, each with coefficient values constrained
to a sum of SPT terms. GA has made it possible to implement filters
in signed powers-of-two space with near global minimum and low
hardware cost. Very recently, a novel GA has been proposed for the
design of multiplier-less linear phase FIR filters both in single
stage and cascade forms [8]. The discrete search space is partitioned
into
220 A. Chandra, S. Chattopadhyay/Engineering Science and Technology, an International Journal 19 (2016) 212–226
smaller ones based on pass-band gains and the search eﬃciency has
been improved by adjusting the cross-over and mutation rate in an
adaptive way. Unlike the conventional GA, algorithm in Reference
8 uses the adder cost of the ﬁlter as the objective function and pen-
alties are applied when ripple requirements are not met. The
proposition proves to be greedy over [68,114] in terms of design
time and the hardware cost is saved in most of the cases.
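The penalized objective described for Reference 8 may be illustrated by a minimal sketch. It is not the algorithm of Reference 8: the chromosome encoding as (sign, exponent) pairs, the use of the SPT term count as a stand-in for adder cost, the frequency grid size and the penalty weight are all assumptions made here for illustration.

```python
import math

# Assumed encoding: each coefficient is a list of (sign, exponent) pairs,
# so that its value is sum(sign * 2**exponent).
def coefficient_value(terms):
    return sum(sign * 2.0 ** exponent for sign, exponent in terms)

def zero_phase_response(half, omega):
    """Amplitude of an odd-length symmetric FIR filter.
    `half` holds the centre tap followed by the taps on one side:
    H(w) = h0 + 2 * sum_k h_k * cos(k * w)."""
    return half[0] + 2.0 * sum(h * math.cos(k * omega)
                               for k, h in enumerate(half) if k > 0)

def peak_ripple(half, pass_edge, stop_edge, grid=64):
    """Largest deviation from 1 in the pass-band and from 0 in the stop-band."""
    pass_pts = [pass_edge * i / (grid - 1) for i in range(grid)]
    stop_pts = [stop_edge + (math.pi - stop_edge) * i / (grid - 1)
                for i in range(grid)]
    pass_err = max(abs(1.0 - zero_phase_response(half, w)) for w in pass_pts)
    stop_err = max(abs(zero_phase_response(half, w)) for w in stop_pts)
    return max(pass_err, stop_err)

def fitness(chromosome, pass_edge, stop_edge, max_ripple, penalty=1000.0):
    """Hardware cost proxy plus a penalty when the ripple limit is exceeded."""
    half = [coefficient_value(terms) for terms in chromosome]
    cost = sum(len(terms) for terms in chromosome)  # SPT terms as adder proxy
    violation = max(0.0, peak_ripple(half, pass_edge, stop_edge) - max_ripple)
    return cost + penalty * violation

# Three-tap example h = [1/4, 1/2, 1/4], i.e. H(w) = 0.5 + 0.5*cos(w).
example = [[(1, -1)], [(1, -2)]]
print(fitness(example, 0.1 * math.pi, 0.9 * math.pi, max_ripple=0.05))
```

A candidate that meets the ripple limit is scored by its hardware proxy alone, so the search is driven towards cheaper filters only among feasible designs.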
Optimization of FIR filter over the CSD coefficient space based
on GA has been developed in Reference 115. The proposed optimization
technique exploits the restoration of CSD numbers in
conjunction with the conventional cross-over and mutation operators
in addition to a new local mutation operator. Application of GA
for optimizing filters generated by the FRM technique has been
presented in Reference 116. It has been demonstrated that GA is capable
of producing a better discrete coefficient solution than that obtained
from linear optimization technique, and one that is very close to the
continuous solution obtained from non-linear optimization technique. Another
novel genetic algorithm for the design and discrete optimization of
FRM FIR digital ﬁlters over the conventional CSD as well as new
double base number system (DBNS) multiplier coeﬃcient spaces
has been introduced in Reference 117. Proposed genetic algo-
rithm was based on a pair of indexed look-up tables of permissible
CSD/DBNS numbers whose indices form a closed set under the op-
erations of cross-over and mutation. It automatically leads to
legitimate CSD/DBNS coeﬃcients without any recourse to gene repair
during optimization. Finally, it has been successfully applied to the
design of a pair of low-pass and band-pass FRM FIR digital ﬁlters.
Through proper design examples, authors have established that the
resulting optimized CSD/DBNS ﬁlters outperformed the correspond-
ing inﬁnite precision FRM FIR digital ﬁlters in some cases [117].
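The idea of operating on table indices rather than on digit strings can be sketched as follows. The CSD conversion is the standard non-adjacent-form recoding; the table construction and the two index operators are assumptions made for illustration and are not taken from Reference 117.

```python
import random

def to_csd(n):
    """Canonic signed-digit (non-adjacent form) digits of integer n,
    least significant digit first, each digit in {-1, 0, +1}."""
    digits = []
    while n != 0:
        if n % 2 == 0:
            digits.append(0)
        else:
            d = 2 - (n % 4)  # +1 when n % 4 == 1, -1 when n % 4 == 3
            digits.append(d)
            n -= d
        n //= 2
    return digits

def build_table(word_length, max_nonzero):
    """Sorted list of integers whose CSD form fits in `word_length` digits
    with at most `max_nonzero` non-zero digits (assumed admissibility rule)."""
    limit = 2 ** word_length
    return [n for n in range(-limit, limit + 1)
            if len(to_csd(n)) <= word_length
            and sum(1 for d in to_csd(n) if d) <= max_nonzero]

def crossover(i, j, rng):
    """Hypothetical index cross-over: pick an index between the parents."""
    return rng.randint(min(i, j), max(i, j))

def mutate(i, table_size, rng, step=2):
    """Hypothetical index mutation: small random walk clipped to the table."""
    return min(table_size - 1, max(0, i + rng.randint(-step, step)))

table = build_table(word_length=6, max_nonzero=2)
rng = random.Random(0)
child = mutate(crossover(3, 10, rng), len(table), rng)
print(table[child], to_csd(table[child]))
```

Because every offspring is an index into the table, each decoded coefficient is a legitimate CSD number by construction and no repair step is required.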
Although conventional GA (CGA) has proven itself to be a potential
search tool for the design of multiplier-less FIR filter, it requires
comparatively huge computational time since the repetitive evaluation
of a large population of candidate solutions is relatively slow.
This issue has later been addressed by Cen and Lian [46] who had
incorporated a new variant of GA, known as micro GA (μGA) in the
same design problem. μGA-based algorithm requires small popu-
lation size for its execution which had made the convergence speed
of μGA relatively faster than that of CGA. However, there is a like-
lihood that μGA may be trapped into a local optimum point due to
the presence of a small population. This issue has been addressed
through proper modiﬁcation of μGA by varying the probabilities of
cross-over and mutation during the evolution and consequently
termed as modiﬁed μGA [47]. Authors have claimed that com-
pared to CGA, modiﬁed μGA speeds up the optimization process
signiﬁcantly. This has been substantiated by experimental analy-
sis in the sense that modiﬁed μGA is about seven times faster than
CGA and yields a better solution than MILP-based design. A new
variant of GA, called orthogonal genetic algorithm (OGA), has been
incorporated in the design of cascade form multiplier-less FIR filter
[48] which has explored two objective functions based on a single
and multiple amplitude response criterion. Authors have claimed
that the OGA approach leads to improved amplitude response rel-
ative to that of an equivalent direct-form cascade ﬁlter obtained using
the Remez exchange algorithm.
Traferro et al. [42] have added a global constraint which ﬁxes
the total number of shift registers in such a way that each coeﬃ-
cient can be represented using different precisions. Optimization
of FIR ﬁlter coeﬃcient has been solved by a speciﬁc tabu search (TS)
method which is computationally lighter than other heuristics like
SA and GA. Supremacy of the design algorithm has been substan-
tiated by several experimental results and comparisons with
previously reported works like References 112 and 118. A hybrid
genetic algorithm (GST), composed of the main features of adaptive
GA (AGA), simulated annealing (SA) and tabu search (TS), has been
introduced in Reference 119 towards the design of powers-of-two
FIR ﬁlter. AGA with varying population size and varying probabili-
ties of genetic operations works as the basis of the hybrid algorithm.
Use of SA is to help AGA escape from the local optima and prevent
premature convergence. The concept of tabu has been introduced
to speed up convergence by reducing search space according to the
properties of FIR ﬁlter coeﬃcients. It has been established by means
of design examples that the normalized peak ripples of the de-
signed ﬁlters can largely be reduced with the help of GST. Unlike
the other GA, the method of GST improves the solution quality and
reduces the computational effort as well.
Powers-of-two design of FIR filter has been recently achieved with
the aid of some evolutionary computational algorithms which have
outperformed GA along with its different variants in many bench-
mark problems. In regard to this, differential evolution (DE) algorithm
was used to design multiplier-less FIR ﬁlter with powers-of-two co-
eﬃcients [49]. Impact of different mutation strategies of DE in the
design process has subsequently been studied in References 120–122
and a new self-adaptive DE algorithm has also been proposed for
the design purpose [123]. The same problem has later been tar-
geted by means of self-organizing random immigrants genetic
algorithm (SORIGA) [49,50] and its supremacy over the previous
design strategies had been established.
Design of a CSD based FRM ﬁlter with reduced computational
complexity has been accomplished by means of swarm optimiza-
tion technique like artiﬁcial bee colony (ABC) algorithm [53]. Reduced
computational complexity has been achieved due to fewer genera-
tions for convergence as well as the reduced dimension of the food
source along with its appropriate initial selection. Moreover, quality
of the solution has been ensured through eﬃcient exploration and
exploitation of the search space in the modiﬁed ABC algorithm.
Design of non-uniform ﬁlter bank trans-multiplexer has been
achieved in Reference 54 where the ﬁlter coeﬃcients are synthe-
sized in the CSD format and ABC algorithm has been employed for
the purpose of optimization. Simulation result has established that
the performance of the proposed algorithm is better than that ob-
tained by rounding the continuous coeﬃcients of the ﬁlter to the
nearest CSD number.
7. Design strategies of two-dimensional multiplier-less FIR filter
Design of two-dimensional multiplier-less filter has also gained
serious attention from researchers over the last few decades.
Considerable progress has taken place in this field since 1987,
when Pei and Jaw [124] took the pioneering initiative of designing
2D multiplier-less digital FIR filters using a special class
of multiplier-less 1D filter with coefficients as sums or differences
of powers-of-two. Authors have incorporated McClellan transformation
to map the one-dimensional filter into a two-dimensional
one. As far as hardware implementation is concerned, these filters
are attractive, efficient and reliable for high speed
computation. However, the structure proposed by Pei and Jaw is valid only
for original ﬁrst order McClellan transformation. In connection to
this, a new analytical approach for the determination of the coef-
ﬁcients of the ﬁrst order McClellan transformation has been
presented accordingly by Kwan and Chan [125]. On comparing the
results with those of the original ﬁrst order McClellan transforma-
tion, authors have established the improvement resulting from their
analytical approach over the original one.
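The mapping from a 1D prototype to a 2D filter can be illustrated by a short sketch that evaluates the 2D frequency response through the original first order McClellan transformation. The prototype coefficients used here are hypothetical powers-of-two values chosen for illustration; they are not taken from Reference 124.

```python
import math

def chebyshev(n, x):
    """Chebyshev polynomial T_n(x) via T_{k+1} = 2*x*T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def response_1d(a, omega):
    """Zero-phase 1D response H(w) = sum_n a[n] * cos(n * w)."""
    return sum(a_n * math.cos(n * omega) for n, a_n in enumerate(a))

def mcclellan_map(w1, w2):
    """Original first order McClellan transformation F(w1, w2),
    which takes the place of cos(w) in the 1D response."""
    c1, c2 = math.cos(w1), math.cos(w2)
    return -0.5 + 0.5 * c1 + 0.5 * c2 + 0.5 * c1 * c2

def response_2d(a, w1, w2):
    """2D response obtained by replacing cos(n*w) = T_n(cos w) with T_n(F)."""
    x = mcclellan_map(w1, w2)
    return sum(a_n * chebyshev(n, x) for n, a_n in enumerate(a))

# Hypothetical prototype with powers-of-two coefficients.
prototype = [0.5, 0.5, 0.125, -0.0625]
print(response_1d(prototype, 0.3), response_2d(prototype, 0.3, 0.0))
```

Along the axis w2 = 0 the transformation reduces to cos(w1), so the 2D response reproduces the 1D prototype there. The transformation coefficients are themselves of magnitude 1/2, so the mapping adds no general multiplier.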
Use of a generalized McClellan transformation with order more
than one for the design of 2D linear phase FIR digital ﬁlters has been
illustrated in Reference 55. The design problem is formulated as a
linear programming (LP) optimization problem to maximize the
transition width of 1D FIR filter subject to the inequality constraints in
the 2D frequency domain. A local search method has ﬁnally been
adopted for eﬃciently ﬁnding the appropriate powers-of-two
coefficients. The optimization algorithm eliminates the drawbacks
of high computational cost and huge memory storage in using
conventional LP based algorithms. Three simple and eﬃcient trans-
formations have been proposed in Reference 126 for designing
circularly symmetric wideband and multiple bands 2D FIR filter. The
ﬁrst transformation has been regarded as the kth order version of
the original McClellan transformation and other two transforma-
tions are developed on the basis of kth order McClellan
transformation. Effectiveness and ﬂexibility of the proposed trans-
formations have been fully depicted by the presented illustrations.
Authors have claimed that in comparison with other transforma-
tions, approach in Reference 126 has provided signiﬁcant savings
in the number of multiplications at the expense of a slightly larger number
of adders and delays.
An optimal minimax design of 2D FIR digital ﬁlters with ﬁnite
precision coeﬃcients and linear phase has been developed in Ref-
erence 127. This algorithm associates linear programming and a
branch and bound technique for which two strategies are com-
pared, namely depth-ﬁrst-search and hybrid strategy consisting of
depth-ﬁrst-search and breadth-ﬁrst-search. A large number of design
examples are presented to show the eﬃciency of the method for
the design of 2D ﬁlters with different speciﬁcations and sizes. One
simulated annealing (SA) based design technique has been pro-
posed for the minimax design of 2D multiplier-less FIR ﬁlters [128]
whose coeﬃcients have been written as the sum or difference of
two power-of-two terms. The algorithm proves to be intrinsically
very ﬂexible. Usefulness of the technique in the context of video
ﬁlters has been demonstrated by a number of ﬁlter design ex-
amples. Minimax design problem of two-dimensional linear phase
FIR ﬁlters with continuous and discrete coeﬃcients has later been
described in Reference 129. Authors have initially formulated the
minimax continuous-coeﬃcient design problem as an LP problem
with inequality constraints. Based on the obtained continuous co-
eﬃcients, an eﬃcient method was proposed for designing 2D ﬁlter
with powers-of-two coeﬃcients in the spatial domain.
A number of artiﬁcially intelligent optimization techniques have
found their suitable application in the design process of 2D
multiplier-less ﬁlter too. In connection to this, the very ﬁrst paper
has appeared in the year 1995 when Sriranganathan and his co-
workers have designed circularly symmetric and diamond shaped
low-pass linear phase powers-of-two FIR ﬁlters with the aid of GA
[56]. Authors have adopted minimax error criterion which leads to
a minimization of weighted ripple in both pass-band and stop-
band. Designed ﬁlter has been found to yield better or comparable
performance than those designed with the aid of LP and SA. Another
efficient design method of multiplier-less 2D state-space digital filters
(SSDF) based on GA has been proposed in Reference 130 which are
found to be attractive for high speed operation and simple imple-
mentation. The design problem is described by Roesser’s local state-
space model and formulated subject to the stability of the resultant
ﬁlter. Thamvichai et al. [131] have incorporated two different types
of GA, namely binary-GA and integer-GA, to ﬁnd the periodically
shift variant (PSV) coeﬃcients of 2D ﬁlter. The design involves ﬁnding
the impulse response of the 2D PSV ﬁlter in closed form and then
using GA to ﬁnd the ﬁlter coeﬃcients.
An effective GA-based approach has been proposed [57] for de-
signing two-dimensional FIR ﬁlters with complex-valued frequency
responses by extending the concept of 1D filter design. Through
minimization of a quadratic measure of error in the frequency band,
real-valued chromosomes are evolved to realize filter coefficients with
evolutionary algorithm. It has also been shown that some coefficients
of the designed filters are inherently zero, which results
in significant saving in design time. An advanced GA was developed
in Reference 132 to design 2D FIR filters which can adapt the
genetic operators during the genetic life while remaining simple and
easy to implement. Adaptive GA has produced ﬁlters with good re-
sponse characteristics while greatly reducing the error criteria and
CPU time. GA combined with singular value decomposition (SVD)
has been used to design 2D FIR ﬁlters in which the role of GA was
to optimize the design of 1D ﬁlter [133]. An improvement to SVD
was made by varying the order of 1D ﬁlter in each branch in ac-
cordance with its singular values. This improvement has resulted
in more eﬃcient design by reducing the number of coeﬃcients by
20% with acceptable error in pass-band and stop-band. Recently,
design of 2D multiplier-less linear phase FIR ﬁlter has been accom-
plished by designing multiplier-free 1D linear phase FRM FIR ﬁlter
followed by multiplier-less transformation [43,58]. Resulting 1D filter
is converted to the CSD space using a new discrete optimization
based on modiﬁed gravitational search algorithm (GSA) [58] and
modiﬁed harmony search algorithm (HSA) [134]. GSA and HSA have
been adapted in such a way that during the course of optimiza-
tion, candidate solutions turn out to be integers and eﬃcient
exploration and exploitation of the search space are done. Ap-
proaches in References 58 and 134 are bestowed with the features
of reduced computational complexity and time.
A new strategy of multiplier-less image ﬁlter design with the aid
of DE algorithm has been presented very recently [59]. Designed
ﬁlter has accordingly been used to reduce the effect of Gaussian noise
from standard test images and resulting performance has been
studied with respect to relevant parameters. Authors have claimed
the superiority of their design by comparing those parameters with
other design approaches. One comparative study of evolutionary al-
gorithms applied for the design of 2D FIR ﬁlters has been elaborated
in Reference 135. Several stochastic methodologies capable of han-
dling large spaces have also been explored. Finally, a new GA has
been proposed where some concepts are introduced to optimize the
trade-off between diversity and elitism in the genetic population.
8. Experimental results
This section makes an attempt to throw sufficient light on the progress
and impact of powers-of-two FIR filter design from various
perspectives. The objective of the design process is to achieve the
desired filter specification with as little hardware cost as possible.
Frequency characteristics of the designed filter are governed by a few
performance parameters like pass-band ripple, transition and stop-band
attenuation, width of the transition-band and so on. Similarly,
hardware efficiency of the multiplier-less filter can be calculated on
the basis of different indices like total number of powers-of-two
terms (TPT), total number of adders (TA) divided into multiplier
adders (MA) and structural adders (SA), total number of D flip flops
(TDF) divided into multiplier D flip flops (MDF) and structural D flip
flops (SDF) and total number of zero-valued filter coefficients (ZFC).
A detailed comparative study amongst various hardware efficient FIR
filters has been summarized in Tables 1 and 2 below, in which Table 1
demonstrates the behaviour of the filter in frequency domain while
Table 2 emphasizes the associated hardware cost.
Looking at the numerical entries in Table 1, supremacy of
DEMLFIR ﬁlter [49] can easily be established as it yields higher at-
tenuation value in the transition band of frequency response. On
the other hand, [136] outperforms the other design algorithms by
a large margin in terms of stop-band behaviour of the frequency
characteristics. However, except the design in Reference 85, the rest of the
powers-of-two FIR ﬁlters had produced an acceptable stop-band
behaviour in the sense that the minimum stop-band attenuation
value is always higher than 80 dB. In an attempt to compare the hard-
ware complexity of multiplier-less FIR ﬁlters, associated indices are
calculated per unit length of the ﬁlter as they are of a different order.
It is clearly seen from Table 2 that DEMLFIR [49] provides favourable
design for its implementation in terms of TPT, TDF and ZFC as
compared to other powers-of-two filters. Moreover, roughly a third of
its coefficients are zero (ZFC of 0.345 per unit length, the highest
in Table 2), which makes the structure more suitable for low power
design.
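The ranking drawn from Table 2 can be re-derived mechanically. The sketch below simply copies the per-unit-length entries of Table 2 and selects the best design for each index; the helper names are introduced here for illustration only.

```python
# Entries copied from Table 2: (filter length, TPT, TA, TDF, ZFC),
# where the four indices are expressed per unit length of the filter.
TABLE_2 = {
    "Samueli [69]": (25, 1.8, 1.76, 10.4, 0.0),
    "Chen and Willson [71]": (28, 2.143, 2.179, 15.893, 0.07),
    "Kaakinen and Saramaki [85]": (29, 1.69, 1.862, 12.207, 0.207),
    "Jheng, Jou and Wu [136]": (30, 1.733, 1.9, 12.367, 0.2),
    "Xu, Chang and Jong [35]": (28, 2.214, 2.393, 17.679, 0.214),
    "Feng and Teo [23]": (34, 2.176, 2.324, 17.441, 0.176),
    "Chandra and Chattopadhyay [49]": (29, 1.621, 1.931, 8.483, 0.345),
}
INDEX = {"TPT": 1, "TA": 2, "TDF": 3, "ZFC": 4}

def best_design(index_name):
    """Lowest value wins for TPT, TA and TDF; highest value wins for ZFC."""
    column = INDEX[index_name]
    pick = max if index_name == "ZFC" else min
    return pick(TABLE_2, key=lambda name: TABLE_2[name][column])

def zero_coefficients(name):
    """Approximate count of zero-valued taps: length times ZFC per unit length."""
    length, zfc = TABLE_2[name][0], TABLE_2[name][4]
    return round(length * zfc)

for index_name in INDEX:
    print(index_name, "->", best_design(index_name))
```

With these entries the design of Reference 49 is selected for TPT, TDF and ZFC, whereas the lowest TA per unit length belongs to the design of Reference 69.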
Hardware complexity of the designed ﬁlter may further be im-
proved by the incorporation of proper representation techniques.
Since the powers-of-two filters substitute multipliers by means of
adders and shifters only, hardware cost of such filters is generally
measured in terms of full adder (FA) count only. In order to make
a comparative analysis amongst different representation schemes,
all possible binary vectors of length 10, 12 and 14 have been
considered in the present study and subsequently the average FA
count has been calculated as listed in Tables 3–5.
Supremacy of MIFP scheme in minimizing the FA count has been
firmly established from the results in Tables 3–5. It can be
explicitly seen that irrespective of the coefficient word length and
number of non-zero bits, MIFP always requires fewer FAs as
compared to direct method or PFP. More specifically, with a total of 8
non-zero bits in the filter coefficient, MIFP requires 12.44%, 13.76%
and 14.88% fewer full adders than PFP for coefficient word
lengths of 10, 12 and 14 respectively. Corresponding improvement
with respect to direct method has been respectively found to be
13.92%, 16.45% and 18.58%.
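The percentage savings quoted for 8 non-zero bits can be recomputed directly from the corresponding rows of Tables 3–5. The short sketch below only copies those table entries; the dictionary layout and function name are chosen here for illustration.

```python
# Average full-adder counts for 8 non-zero bits, copied from Tables 3-5.
FA_COUNTS = {
    10: {"direct": 91.7778, "pfp": 90.2222, "mifp": 79.0},
    12: {"direct": 99.5556, "pfp": 96.4444, "mifp": 83.1778},
    14: {"direct": 107.3333, "pfp": 102.6667, "mifp": 87.3863},
}

def saving_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference

for word_length, row in FA_COUNTS.items():
    vs_pfp = saving_percent(row["pfp"], row["mifp"])
    vs_direct = saving_percent(row["direct"], row["mifp"])
    print(f"word length {word_length}: "
          f"{vs_pfp:.2f}% vs PFP, {vs_direct:.2f}% vs direct")
```

The three word lengths give 12.44%, 13.76% and 14.88% against PFP and 13.92%, 16.45% and 18.58% against the direct method.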
Computation of the total number of full adders has finally
been carried out by considering an arbitrary coefficient
$h = (0.0001001001010010)_2$ using an 8-bit quantized input signal.
The coefficient may be written as
$h = 2^{-4} + 2^{-7} + 2^{-10} + 2^{-12} + 2^{-15}$ whose
MIFP form is given by
$h = 2^{-10}\left(2^{6} + 2^{3} + 2^{0} + 2^{-2} + 2^{-5}\right)$.
Resultant multiplier structure has been shown in Fig. 3 below which
identifies that MIFP scheme is in need of 56 FAs only. On the other
hand, direct multiply and PFP scheme require 80 and 64 FAs
respectively. Hence, for the example coefficient at hand, MIFP
outperforms the other two representation strategies by 30% and 12.5%
respectively.
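The equivalence of the three ways of writing the example coefficient, and the two quoted savings, can be checked with exact rational arithmetic. The sketch below takes the FA counts (56, 64 and 80) from the text as given and does not model the multiplier structure itself.

```python
from fractions import Fraction

# Fractional bits of the example coefficient h = 0.0001001001010010 (binary).
BITS = "0001001001010010"

# Value read directly from the bit pattern: bit i carries weight 2^-(i+1).
direct = sum(Fraction(1, 2 ** (i + 1)) for i, b in enumerate(BITS) if b == "1")

# Sum of powers-of-two form.
spt = sum(Fraction(1, 2 ** k) for k in (4, 7, 10, 12, 15))

# MIFP form: a common factor 2^-10 multiplying a mixed integer/fraction part.
mifp = Fraction(1, 2 ** 10) * sum(Fraction(2) ** e for e in (6, 3, 0, -2, -5))

def saving_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference

print(direct, spt, mifp)
print(saving_percent(80, 56), saving_percent(64, 56))
```

All three expressions evaluate to 2345/32768, and the FA counts give savings of 30% over direct multiplication and 12.5% over PFP.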
9. Conclusion
Design of hardware eﬃcient multiplier-less FIR digital ﬁlter has
received signiﬁcant attention from researchers over the last few
decades. A number of promising algorithms have been developed
towards the eﬃcient design of such ﬁlters. These include conven-
tional techniques like integer quadratic programming, MILP, SDP and
so on. Similar bit pattern in such powers-of-two ﬁlter has been elimi-
nated by means of CSE and its improved variants. Contributory
algorithms have been developed towards the reduction of adders
in such ﬁlter circuits. In recent times, this ﬁeld has been properly
enriched with the amalgamation of powerful intelligent optimiza-
tion techniques. This paper attempted to provide an overall picture
of the state-of-the-art research carried out in this particular ﬁeld.
Table 1
Comparative analysis with respect to frequency response of hardware efficient FIR filters.
Attenuation values are in dB; frequency points are in rad/pi.
Method                          Length  Transition-band (dB) at   Stop-band (dB) at         Minimum stop-band
                                        0.35    0.4     0.45      0.65    0.75    0.85      attenuation (dB)
Samueli [69]                    25      4.279   15.03   34.02     98.19   87.92   112.2     84.66
Chen and Willson [71]           28      2.958   13.48   39.58     135.2   118.1   124.6     115.8
Kaakinen and Saramaki [85]      29      1.002   4.239   15.76     45.8    50.33   53.6      30.28
Jheng, Jou and Wu [136]         30      2.954   13.77   40.6      162.1   121     168.1     117.9
Xu, Chang and Jong [35]         28      3.732   13.8    38.3      150.2   116.7   89.06     80.25
Feng and Teo [23]               34      2.41    13.71   44.2      154.4   147     143       130.6
Chandra and Chattopadhyay [49]  29      22.28   40.93   70.05     120.5   125.2   120.5     110.7
Table 2
Comparative analysis with respect to hardware cost per unit length of multiplier-less FIR filters.
Method                          Length  Word length  TPT     TA      TDF      ZFC
Samueli [69]                    25      9            1.8     1.76    10.4     0
Chen and Willson [71]           28      12           2.143   2.179   15.893   0.07
Kaakinen and Saramaki [85]      29      11           1.69    1.862   12.207   0.207
Jheng, Jou and Wu [136]         30      11           1.733   1.9     12.367   0.2
Xu, Chang and Jong [35]         28      13           2.214   2.393   17.679   0.214
Feng and Teo [23]               34      13           2.176   2.324   17.441   0.176
Chandra and Chattopadhyay [49]  29      8            1.621   1.931   8.483    0.345
Table 3
Average number of full adders for a coefficient word length of 10.
Non-zero bits  Direct multiply  PFP        MIFP
4              40.8             37.2       33.6476
5              53.6667          50.3333    45.3095
6              66.4286          63.5714    56.719
7              79.125           76.875     67.925
8              91.7778          90.2222    79
Table 4
Average number of full adders for a coefficient word length of 12.
Non-zero bits  Direct multiply  PFP        MIFP
4              44.4             39.6       35.3354
5              58.3333          53.6667    47.6692
6              72.1429          67.8571    59.7175
7              85.875           82.125     71.5316
8              99.5556          96.4444    83.1778
9              113.2            110.8      94.7091
10             126.8182         125.1818   106.1667
Table 5
Average number of full adders for a coefficient word length of 14.
Non-zero bits  Direct multiply  PFP        MIFP
6              77.8571          72.1429    62.7253
7              92.625           87.375     75.153
8              107.3333         102.6667   87.3863
9              122              118        99.4805
10             136.6364         133.3636   111.4765
11             151.25           148.75     123.4038
12             165.8462         164.1538   135.2857
With a comprehensive introduction to the necessity of hardware
efficient digital filter design, this article has thrown sufficient light
on the evolution of such design process along with their associated
advantages and limitations. It has also provided a brief overview
on the design procedure of two-dimensional image filter whose mask
coeﬃcients are in the form of powers-of-two.
Implementation of such hardware eﬃcient ﬁlters deals with a
number of design objectives which include required area, con-
sumed power, speed or latency of the designed ﬁlter and associated
throughput. Literature review suggests that most of the design al-
gorithms aim to attain certain speciﬁc objective while keeping the
other objectives unattended. However, appropriate trade-off amongst
those objectives is essentially required in most of the practical ap-
plications. Intelligent optimization techniques of recent interest may
be employed to address this issue. More speciﬁcally, multi-objective
optimization algorithms could be applied to the same design
problem and the resultant impact over single objective optimiza-
tion may be studied in the future. Role of fuzzy logic and fuzzy
system towards the design of such powers-of-two ﬁlters may emerge
as an active way of research for the next generation researchers.
Design strategy of minimum phase multiplier-less digital ﬁlter may
be focused as a future extension of this area of research. Since ap-
plication of the designed multiplier-less FIR ﬁlters in various ﬁelds
of communication and signal processing has not yet been exam-
ined, it could also be studied extensively in the future.
References
[1] S.K. Mitra, Digital Signal Processing: A Computer-based Approach, 2nd ed.,
McGraw Hill, New York, 2001.
[2] J.G. Proakis, Digital Signal Processing: Principles, Algorithms, and Applications,
Prentice Hall of India, New Delhi, 1997.
[3] B. Somanathan Nair, Digital Signal Processing: Theory, Analysis and Digital-
ﬁlter Design, Prentice-Hall of India, New Delhi, 2004.
[4] L. Tan, Digital Signal Processing: Fundamentals and Applications, Academic
Press, New York, 2011.
[5] Y.C. Lim, S.R. Parker, A.G. Constantinides, Finite word length FIR ﬁlter design
using integer programming over a discrete coeﬃcient space, IEEE Trans. Acoust.
ASSP-30 (4) (1982) 661–664.
[6] J. Tian, G. Li, Q. Li, Hardware-eﬃcient parallel structures for linear-phase FIR
digital ﬁlter, in: Proceedings of 56th IEEE International Midwest Symposium
on Circuits and Systems (MWSCAS 2013), 2013, pp. 995–998.
[7] V. Pavlovic, M. Lutovac, M. Lutovac, Eﬃcient implementation of multiplierless
recursive low pass FIR ﬁlters using computer algebra system, in: Proceedings
of 11th IEEE International Conference on Telecommunication in Modern
Satellite, Cable and Broadcasting Services (TELSIKS 2013), Vol. 1, 2013, pp.
65–68.
[8] W.B. Ye, Y.J. Yu, Single-stage and cascade design of high order multiplierless
linear phase FIR ﬁlters using genetic algorithm, IEEE Trans. Circuits Syst. I
Regular Pap. 60 (11) (2013) 2987–2997.
[9] J.-G. Chung, K.K. Parhi, Frequency spectrum based low-area low-power
parallel FIR ﬁlter design, EURASIP J. Appl. Signal Processing 9 (2002) 944–
953.
[10] C. Cheng, K.K. Parhi, Hardware eﬃcient fast parallel FIR ﬁlter structures based
on iterated short convolution, IEEE Trans. Circuits Syst. I Regular Pap. 51 (8)
(2004) 1492–1500.
[11] C. Cheng, K.K. Parhi, Hardware eﬃcient fast parallel FIR ﬁlter structures based
on iterated short convolution, in: Proceedings of IEEE International Symposium
on Circuits and Systems (ISCAS 2004), 2004, pp. 361–364.
[12] K. Ichige, H. Munemasa, A. Hiroyuki, An eﬃcient signed-power-of-two term
allocation for ﬁlter coeﬃcients in digital communication system, IEICE Trans.
Commun. 89 (12) (2006) 3266–3268.
[13] A.F. Shalash, K.K. Parhi, Power eﬃcient FIR folding transformation for wireline
digital communications, in: Proceedings of 32nd Asilomar Conference on
Signals, Systems and Computers, vol. 2, 1998, pp. 1816–1820.
[14] C. Xu, S. Yin, Y. Qin, H. Zou, A novel hardware eﬃcient FIR ﬁlter for
wireless sensor networks, in: Proceedings of Fifth IEEE International
Conference on Ubiquitous and Future Networks (ICUFN 2013), 2013, pp.
197–201.
[15] E. Avenhaus, On the design of digital ﬁlters with coeﬃcients of limited word
length, IEEE Trans. Audio Electro Acoust. 20 (3) (1972) 206–212.
[16] C. Charalambous, M. Best, Optimization of recursive digital ﬁlters with ﬁnite
word lengths, IEEE Trans. Acoust. 22 (6) (1974) 424–431.
[17] M. Suk, S.K. Mitra, Computer-aided design of digital ﬁlters with ﬁnite word
lengths, IEEE Trans. Audio Electro Acoust. 20 (5) (1972) 356–363.
[18] Y.C. Lim, S.R. Parker, FIR ﬁlter design over a discrete powers-of-two coeﬃcient
space, IEEE Trans. Acoust. 31 (3) (1983) 583–591.
[19] O. Gustafsson, H. Johansson, L. Wanhammar, An MILP approach for the design
of linear-phase FIR ﬁlters with minimum number of signed-power-of-two
terms, in: Proceedings of European Conference on Circuit Theory Design
(ECCTD 2001), 2001, pp. 217–220.
[20] R. Ito, K. Suyama, R. Hirabayashi, Optimal design of FIR ﬁlter with discrete
coeﬃcients based on integer semi-inﬁnite linear programs, in: Proceedings
of IEEE International Symposium on Circuits and Systems (ISCAS 2001), vol.
2, 2001, pp. 629–632.
[21] W.S. Lu, Design of FIR ﬁlters with discrete coeﬃcients: a semi deﬁnite
programming relaxation approach, in: Proceedings of IEEE International
Symposium on Circuits and Systems (ISCAS 2001), vol. 2, 2001, pp. 297–
300.
[22] R. Ito, R. Hirabayashi, Optimal design of FIR ﬁlter with SP2 coeﬃcients based
on semi-infinite linear programming method, in: Proceedings of 14th European
Signal Processing Conference (EUSIPCO 2006), 2006.
[23] Z.G. Feng, K.L. Teo, A discrete ﬁlled function method for the design of FIR ﬁlters
with signed-powers-of-two coeﬃcients, IEEE Trans. Signal Processing 56 (1)
(2008) 134–139.
[24] N.I. Cho, S.U. Lee, Optimal design of ﬁnite precision FIR ﬁlters using linear
programming with reduced constraints, IEEE Trans. Signal Processing 46 (1)
(1998) 195–199.
[25] A.G. Dempster, M.D. Macleod, Use of minimum-adder multiplier blocks in FIR
digital ﬁlters, IEEE Trans. Circuits Syst. II Analog Digit. Signal Processing 42
(9) (1995) 569–577.
Fig. 3. Multiplier structure for the example coeﬃcient using MIFP.
[26] O. Gustafsson, A difference based adder graph heuristic for multiple constant
multiplication problems, in: Proceedings of IEEE International Symposium on
Circuits and Systems (ISCAS 2007), 2007, pp. 1097–1100.
[27] Y. Voronenko, M. Puschel, Multiplierless multiple constant multiplication, ACM
Trans. Algorithms 3 (2) (2007) 1–39.
[28] M. Potkonjak, M.B. Srivastava, A.P. Chandrakasan, Multiple constant
multiplications: eﬃcient and versatile framework and algorithms for exploring
common sub expression elimination, IEEE Trans. Comput. Aided Des. Integr.
Circuits Syst. 15 (2) (1996) 151–165.
[29] R. Hartley, Sub expression sharing in ﬁlters using canonic signed digit
multipliers, IEEE Trans. Circuits Syst. II Analog Digit. Signal Processing 43 (10)
(1996) 677–688.
[30] N. Sankarayya, K. Roy, D. Bhattacharya, Algorithms for low power and high
speed FIR ﬁlter realization using differential coeﬃcients, IEEE Trans. Circuits
Syst. II Analog Digit. Signal Processing 44 (6) (1997) 488–497.
[31] O. Gustafsson, H. Ohlsson, L. Wanhammar, Improved multiple constant
multiplication using minimum spanning trees, in: Proceedings of Thirty-Eighth
Asilomar Conference on Signals, Systems and Computers, vol. 1, 2004, pp.
63–66.
[32] R. Guo, L.S. DeBrunner, K. Johansson, Truncated MCM using pattern
modiﬁcation for FIR ﬁlter implementation, in: Proceedings of IEEE International
Symposium on Circuits and Systems (ISCAS 2010), 2010, pp. 3881–3884.
[33] Y. Wang, K. Roy, A novel low-complexity method for parallel multiplierless
implementation of digital FIR ﬁlters, in: Proceedings of IEEE International
Symposium on Circuits and Systems (ISCAS 2005), 2005, pp. 2020–2023.
[34] Y. Wang, K. Roy, CSDC: a new complexity reduction technique for multiplierless
implementation of digital FIR ﬁlters, IEEE Trans. Circuits Syst. I Regular Pap.
52 (9) (2005).
[35] F. Xu, C.H. Chang, C.C. Jong, Design of low-complexity FIR ﬁlters based on
signed powers-of-two coeﬃcients with reusable common subexpressions, IEEE
Trans. Comput. Aided Des. Integr. Circuits Syst. 26 (10) (2007).
[36] M. Peiro, E.I. Boemo, L. Wanhammar, Design of high-speed multiplierless filters
using a nonrecursive signed common subexpression algorithm, IEEE Trans.
Circuits Syst. II Analog Digit. Signal Processing 49 (3) (2002) 196–203.
[37] T.J. Lin, T.H. Yang, C.W. Jen, Area-effective FIR ﬁlter design for multiplier-less
implementation, in: Proceedings of IEEE International Symposium on Circuits
and Systems (ISCAS 2003), vol. 5, 2003, pp. 173–176.
[38] A. Abbaszadeh, K.D. Sadeghipour, A new hardware eﬃcient reconﬁgurable FIR
ﬁlter architecture suitable for FPGA applications, in: Proceedings of 17th IEEE
International Conference on Digital Signal Processing, 2011, pp. 1–4.
[39] A.P. Vinod, C.H. Chang, P.K. Meher, A. Singla, Low power FIR ﬁlter realization
using minimal difference coeﬃcients: part I-complexity analysis, in:
Proceedings of IEEE Asia Paciﬁc Conference on Circuits and Systems (APCCAS
2006), 2006, pp. 1547–1550.
[40] A.P. Vinod, C.H. Chang, P.K. Meher, A. Singla, Low power FIR ﬁlter realization
using minimal difference coeﬃcients: part II-Algorithm, in: Proceedings of
IEEE Asia Paciﬁc Conference on Circuits and Systems (APCCAS 2006), 2006,
pp. 1551–1554.
[41] A.P. Vinod, E.K. Lai, Optimizing vertical common subexpression elimination
using coeﬃcient partitioning for designing low complexity software radio
channelizers, in: Proceedings of IEEE International Symposium on Circuits and
Systems (ISCAS 2005), 2005, pp. 5429–5432.
[42] S. Traferro, F. Capparelli, F. Piazza, A. Uncini, Eﬃcient allocation of power of
two terms in FIR digital ﬁlter design using tabu search, in: Proceedings of IEEE
International Symposium on Circuits and Systems (ISCAS 1999), vol. 3, 1999,
pp. 411–414.
[43] R. Cemes, D. Ait-Boudaoud, Genetic approach to design of multiplierless FIR
ﬁlters, Electron. Lett. 29 (24) (1993) 2090–2091.
[44] G. Wade, A. Roberts, G. Williams, Multiplier-less FIR ﬁlter design using a genetic
algorithm, in: Proceedings of IEE on Vision, Image and Signal Processing, vol.
141, no. 3, 1994, pp. 175–180.
[45] P. Gentili, F. Piazza, A. Uncini, Eﬃcient genetic algorithm design for power-of-
two FIR ﬁlters, in: Proceedings of IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP 1995), vol. 2, 1995, pp. 1268–1271.
[46] L. Cen, Y. Lian, Complexity reduction of high-speed FIR ﬁlters using micro-
genetic algorithm, in: Proceedings of First IEEE International Symposium on
Control, Communications and Signal Processing, 2004, pp. 419–422.
[47] L. Cen, Y. Lian, A modiﬁed micro-genetic algorithm for the design of
multiplierless digital FIR ﬁlters, in: Proceedings of IEEE Region 10 Conference
(TENCON 2004), 2004, pp. 52–55.
[48] S.U. Ahmad, A. Antoniou, Cascade-form multiplierless FIR ﬁlter design using
orthogonal genetic algorithm, in: Proceedings of IEEE International Symposium
on Signal Processing and Information Technology, 2006, pp. 932–937.
[49] A. Chandra, S. Chattopadhyay, A novel approach for coeﬃcient quantization
of low-pass ﬁnite impulse response ﬁlter using differential evolution algorithm,
Signal Image Video Process. 8 (7) (2014) 1307–1321.
[50] A. Chandra, S. Chattopadhyay, Novel design strategy of multiplier-less low-pass
ﬁnite impulse response ﬁlter using self-organizing random immigrants genetic
algorithm, Signal Image Video Process. 8 (3) (2014) 507–522.
[51] A. Chandra, S. Chattopadhyay, Design optimization of powers-of-two FIR ﬁlter
using self-organizing random immigrants GA, Int. J. Electron. 102 (1) (2015)
127–140.
[52] V.J. Manoj, E. Elias, On the design of multiplier-less nonuniform ﬁlter bank
transmultiplexer using particle swarm optimization, in: Proceedings of World
Congress on Nature & Biologically Inspired Computing (NaBIC 2009), 2009,
pp. 55–60.
[53] M. Manuel, E. Elias, Design of multiplier-less FRM FIR ﬁlter using Artiﬁcial Bee
Colony Algorithm, in: Proceedings of 20th IEEE European Conference on Circuit
Theory and Design (ECCTD 2011), 2011, pp. 322–325.
[54] V.J. Manoj, E. Elias, Artiﬁcial bee colony algorithm for the design of multiplier-
less nonuniform ﬁlter bank transmultiplexer, Int. J. Inf. Sci. 192 (2012) 193–203.
[55] C. Chen, J. Lee, McClellan transform based design techniques for two-
dimensional linear phase FIR ﬁlters, IEEE Trans. Circuits Syst. I Fundam.
Theory Appl. 41 (8) (1994) 505–517.
[56] S. Sriranganathan, D.R. Bull, D.W. Redmill, Design of 2-D multiplierless FIR
ﬁlters using genetic algorithms, in: Proceedings of 1st International Conference
on Genetic Algorithms in Engineering Systems: Innovations and Applications,
(GALESIA 1995), 1995, pp. 282–286.
[57] S.T. Tzeng, Design of 2-D FIR digital ﬁlters with speciﬁed magnitude and group
delay responses by GA approach, Signal Processing 87 (2007) 2036–2044.
[58] M. Manuel, R. Krishnan, E. Elias, Design of multiplierless 2-D sharp wideband
ﬁlters using FRM and GSA, Glob. J. Res. Eng. Electr. Electron. Eng. 12 (2012).
[59] A. Chandra, S. Chattopadhyay, A new strategy of image denoising using
multiplier-less FIR ﬁlter designed with the aid of differential evolution
algorithm, Multimedia Tools Appl. (2014) doi:10.1007/s11042-014-2358-7.
[60] Y.C. Lim, S.R. Parker, Discrete coeﬃcient FIR digital ﬁlter design based upon
an LMS criteria, IEEE Trans. Circuits Syst. 30 (10) (1983) 723–739.
[61] H.J. Oh, Y.H. Lee, Design of discrete coeﬃcient FIR and IIR digital ﬁlters with
preﬁlter-equalizer structure using linear programming, IEEE Trans. Circuits
Syst. II Analog Digit. Signal Processing 47 (6) (2000) 562–565.
[62] R. Ito, T. Fujie, K. Suyama, R. Hirabayashi, New design methods of FIR ﬁlters
with signed power of two coeﬃcients based on a new linear programming
relaxation with triangle inequalities, in: Proceedings of IEEE International
Symposium on Circuits and Systems (ISCAS 2002), vol. 1, 2002, pp. 813–
816.
[63] C.Y. Yao, C.J. Chien, A partial MILP algorithm for the design of linear phase
FIR ﬁlters with SPT coeﬃcients, IEICE Trans. Fundam. E85-A (2002) 2302–2310.
[64] G. Karakonstantis, K. Roy, An optimal algorithm for low power multiplierless
FIR ﬁlter design using Chebyshev criterion, in: Proceedings of IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), vol. 2,
2007, pp. 49–52.
[65] H.H. Dam, A. Cantoni, K.L. Teo, S. Nordholm, FIR variable digital ﬁlter with
signed power-of-two coeﬃcients, IEEE Trans. Circuits Syst. I Regular Pap. 54
(6) (2007) 1348–1357.
[66] H.Q. Ta, T.L. Nhat, Design of FIR ﬁlter with discrete coeﬃcients based on mixed
integer linear programming, in: Proceedings of 9th IEEE International
Conference on Signal Processing (ICSP 2008), 2008, pp. 9–12.
[67] J.F. Sturm, Using SeDuMi 1.02: a MATLAB toolbox for optimization over
symmetric cones, Optim. Methods Softw. 11–12 (1999) 625–653.
[68] M. Aktan, A. Yurdakul, G. Dundar, An algorithm for the design of low-power
hardware eﬃcient FIR ﬁlters, IEEE Trans. Circuits Syst. I Regular Pap. 55 (6)
(2008) 1536–1545.
[69] H. Samueli, An improved search algorithm for the design of multiplierless FIR
ﬁlters with powers-of-two coeﬃcients, IEEE Trans. Circuits Syst. 36 (7) (1989)
1044–1047.
[70] D. Li, J. Song, Y.C. Lim, A polynomial-time algorithm for designing digital ﬁlters
with power-of-two coeﬃcients, in: Proceedings of IEEE International
Symposium on Circuits and Systems (ISCAS 1993), 1993, pp. 84–87.
[71] C.L. Chen, A.N. Willson Jr., A trellis search algorithm for the design of FIR ﬁlters
with signed-powers-of-two coeﬃcients, IEEE Trans. Circuits Syst. II Analog
Digit. Signal Processing 46 (1) (1999) 29–39.
[72] C.L. Chen, K.Y. Khoo, A.N. Willson Jr., An improved polynomial-time algorithm
for designing digital ﬁlters with power-of-two coeﬃcients, in: Proceedings
of IEEE International Symposium on Circuits and Systems (ISCAS 1995), vol.
1, 1995, pp. 223–226.
[73] T. Ciloglu, Design of FIR ﬁlters for low implementation complexity, Electron.
Lett. 35 (7) (1999) 529–530.
[74] D. Ait-Boudaoud, R. Cemes, Modiﬁed sensitivity criterion for the design of
powers-of-two FIR ﬁlters, Electron. Lett. 29 (16) (1993) 1467–1469.
[75] S.A. Alam, O. Gustafsson, Design of ﬁnite word length linear-phase FIR ﬁlters
in the logarithmic number system domain, VLSI Des. 2014 (2014) 1–14.
[76] M. Mehendale, S.D. Sherlekar, G. Venkatesh, Synthesis of multiplier-less FIR
ﬁlters with minimum number of additions, in: Proceedings of IEEE
International Conference on Computer-Aided Design (ICCAD-95), 1995, pp.
668–671.
[77] R. Pasko, P. Schaumont, V. Derudder, S. Vernalde, D. Durackova, A new
algorithm for elimination of common subexpressions, IEEE Trans. Comput.
Aided Des. Integr. Circuits Syst. 18 (1) (1999) 58–68.
[78] C.Y. Yao, H.H. Chen, T.F. Lin, C.J. Chien, C.T. Hsu, A novel common-
subexpression-elimination method for synthesizing ﬁxed-point FIR ﬁlters, IEEE
Trans. Circuits Syst. I Regular Pap. 51 (11) (2004) 2215–2221.
[79] M.D. Macleod, A.G. Dempster, Multiplierless FIR ﬁlter design algorithms, IEEE
Signal Processing Lett. 12 (3) (2005) 186–189.
[80] A.G. Dempster, M.D. Macleod, Generation of signed-digit representations for
integer multiplication, IEEE Signal Processing Lett. 11 (8) (2004) 663–665.
[81] A.G. Dempster, M.D. Macleod, Constant integer multiplication using minimum
adders, in: Proceedings of IEE on Circuits, Devices and Systems, vol. 141, no.
5, 1994, pp. 407–413.
[82] R. Hartley, Optimization of canonic signed digit multipliers for ﬁlter design,
in: Proceedings of IEEE International Symposium on Circuits and Systems
(ISCAS 1991), 1991, pp. 1992–1995.
[83] I.C. Park, H.J. Kang, Digital ﬁlter synthesis based on an algorithm to generate
all minimal signed digit representations, IEEE Trans. Comput. Aided Des. Integr.
Circuits Syst. 21 (12) (2002) 1525–1529.
[84] H. Saﬁri, M. Ahmadi, G.A. Jullien, W.C. Miller, A new algorithm for the
elimination of common subexpressions in hardware implementation of digital
ﬁlters by using genetic programming, J. VLSI Signal Processing 31 (2) (2002)
91–100.
[85] J.Y. Kaakinen, T. Saramaki, A systematic algorithm for the design of
multiplierless FIR ﬁlters, in: Proceedings of IEEE International Symposium on
Circuits and Systems (ISCAS 2001), vol. 2, 2001, pp. 185–188.
[86] C.Y. Yao, A study of SPT-term distribution of CSD numbers and its application
for designing ﬁxed-point linear phase FIR ﬁlters, in: Proceedings of
IEEE International Symposium on Circuits and Systems, no. 2, 2001, pp.
301–304.
[87] O. Gustafsson, L. Wanhammar, Design of linear-phase FIR ﬁlters combining
subexpression sharing with MILP, in: Proceedings of IEEE 45th Midwest
Symposium on Circuits and Systems, (MWSCAS 2002), vol. 3, 2002, pp. 9–12.
[88] A. Avizienis, Signed-digit number representations for fast parallel arithmetic,
IRE Trans. Electron. Comput. 3 (1961) 389–400.
[89] D.R. Bull, D.H. Horrocks, Primitive operator digital ﬁlters, in: Proceedings of
IEE Circuits, Devices and Systems, 1991, pp. 401–412.
[90] D. Li, Minimum number of adders for implementing a multiplier and its
application to the design of multiplierless digital ﬁlters, IEEE Trans. Circuits
Syst. II Analog Digit. Signal Processing 42 (7) (1995) 453–460.
[91] Y.C. Lim, Design of discrete-coeﬃcient-value linear phase FIR ﬁlters with
optimum normalized peak ripple magnitude, IEEE Trans. Circuits Syst. 37 (12)
(1990) 1480–1486.
[92] N. Benvenuto, M. Marchesi, A. Uncini, Applications of simulated annealing for
the design of special digital ﬁlters, IEEE Trans. Signal Processing 40 (2) (1992)
323–332.
[93] M. Potkonjak, M.B. Srivastava, A.P. Chandrakasan, Eﬃcient substitution of
multiple constant multiplications by shifts and additions using iterative
pair-wise matching, in: Proceedings of 31st Conference on Design Automation,
1994, pp. 189–194.
[94] D.N. Pearson, K.K. Parhi, Low-power FIR digital ﬁlter architectures, in:
Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS
1995), vol. 1, 1995, pp. 231–234.
[95] D.A. Parker, K.K. Parhi, Area-eﬃcient parallel FIR digital ﬁlter implementations,
in: Proceedings of International Conference on Application Speciﬁc Systems,
Architectures and Processors (ASAP 1996), 1996, pp. 93–111.
[96] D.A. Parker, K.K. Parhi, Low-area/power parallel FIR digital ﬁlter
implementations, J. VLSI Signal Processing 17 (1997) 75–92.
[97] H.J. Kang, I.C. Park, FIR ﬁlter synthesis algorithms for minimizing the delay
and the number of adders, IEEE Trans. Circuits Syst. II Analog Digit. Signal
Processing 48 (8) (2001) 770–777.
[98] D.L. Maskell, J. Leiwo, J.C. Patra, The design of multiplierless FIR ﬁlters with
a minimum adder step and reduced hardware complexity, in: Proceedings
of IEEE International Symposium on Circuits and Systems (ISCAS 2006), 2006,
pp. 605–608.
[99] A.P. Vinod, M.K. Lai, On the implementation of eﬃcient channel ﬁlters for
wideband receivers by optimizing common subexpression elimination
methods, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 24 (2005)
295–304.
[100] Y.J. Yu, Y.C. Lim, Optimization of FIR ﬁlters in subexpression space with
constrained adder depth, in: Proceedings of the 6th International Symposium
on Image and Signal Processing and Analysis (ISPA 2009), 2009, pp. 766–
769.
[101] Y.J. Yu, Y.C. Lim, Roundoff noise analysis of signals represented using signed
power-of-two terms, in: Proceedings of 14th European Signal Processing
Conference (EUSIPCO 2006), 2006.
[102] T. Chang, C. Kung, C. Jen, A simple processor core design for DCT/IDCT, IEEE
Trans. Circuits Syst. Video Technol. 10 (3) (2000) 439–447.
[103] J.T. Kim, Design and implementation of computationally eﬃcient FIR ﬁlters
and scalable VLSI architectures for discrete wavelet transform, PhD dissertation,
Advanced Institute of Science and Technology, Korea, 1998.
[104] A.P. Vinod, E. Lai, D.L. Maskell, P.K. Meher, An improved common subexpression
elimination method for reducing logic operators in FIR ﬁlter implementations
without increasing logic depth, Integr. VLSI J. 43 (2010) 124–135.
[105] T. Chang, Y. Chu, C. Jen, Low power FIR ﬁlter realization with differential
coeﬃcients and input, IEEE Trans. Circuits Syst. II 47 (2000) 137–145.
[106] A. Chandra, S. Chattopadhyay, Eﬃcient encoding of powers-of-two coeﬃcients
through minimum index ﬂoating point representation (MIFPR), in: Proceedings
of 2014 International Conference on Control, Instrumentation, Energy and
Communication (CIEC 2014), 2014, pp. 650–653.
[107] A.T. Erdogan, T. Arslan, Low power FIR ﬁlter implementations based on
coeﬃcient ordering algorithm, in: Proceedings of the IEEE Computer Society
Annual Symposium on VLSI Emerging Trends in VLSI Systems Design (ISVLSI
2004), 2004.
[108] P. Persson, S. Nordebo, I. Claesson, Design of discrete coeﬃcient FIR ﬁlters by
a fast entropy-directed deterministic annealing algorithm, IEEE Trans. Signal
Processing 53 (3) (2005) 1006–1014.
[109] R. Baudin, G. Lesthievent, Design of FIR ﬁlters with sum of power-of-two
representation using simulated annealing, in: Proceedings of 2014 7th
Advanced Satellite Multimedia Systems Conference and the 13th Signal
Processing for Space Communications Workshop (ASMS/SPSC), 2014, pp.
339–345.
[110] R. Cemes, D. Ait-Boudaoud, Multiplierless FIR ﬁlter design with power-of-two
coeﬃcients, Inst. Electr. Eng. 6 (1993) 1–4.
[111] T.W. Parks, C.S. Burrus, Digital Filter Design, Wiley, New York, 1989.
[112] Q. Zhao, Y. Tadokoro, A simple design of FIR ﬁlters with powers-of-two
coeﬃcients, IEEE Trans. Circuits Syst. 35 (5) (1988) 566–570.
[113] L. Cen, Y. Lian, High speed frequency response masking ﬁlter design using
genetic algorithm, in: Proceedings of IEEE International Conference on Neural
Networks & Signal Processing, 2003, pp. 735–738.
[114] D. Shi, Y.J. Yu, Design of discrete-valued linear phase FIR ﬁlters in cascade form,
IEEE Trans. Circuits Syst. I Regular Pap. 58 (7) (2011) 1627–1636.
[115] T.G. Fuller, B. Nowrouzian, F. Ashrafzadeh, Optimization of FIR digital ﬁlters
over the canonical signed-digit coeﬃcient space using genetic algorithms, in:
Proceedings of Midwest Symposium on Circuits and Systems, 1998, pp.
456–459.
[116] Y.J. Yu, Y.C. Lim, Genetic algorithm approach for the optimization of
multiplierless sub-ﬁlters generated by the frequency-response masking
technique, in: Proceedings of 9th IEEE International Conference on Electronics,
Circuits and Systems, vol. 3, 2002, pp. 1163–1166.
[117] P. Mercier, S.M. Kilambi, B. Nowrouzian, Optimization of FRM FIR digital ﬁlters
over CSD and CDBNS multiplier coeﬃcient spaces employing a novel genetic
algorithm, J. Comput. 2 (7) (2007) 20–31.
[118] T. Çiloglu, Y. Hoon Lee, Eﬃcient allocation of power-of-two terms in complex
FIR ﬁlter design, in: Proceedings of the 1999 IEEE International Symposium
on Circuits and Systems, (ISCAS’99), vol. 3, 1999, pp. 411–414.
[119] L. Cen, A hybrid genetic algorithm for the design of FIR ﬁlters with SPOT
coeﬃcients, Signal Processing 87 (2007) 528–540.
[120] A. Chandra, S. Chattopadhyay, Selection of computationally eﬃcient mutation
strategy of differential evolution algorithm for the design of multiplier-less
low-pass FIR ﬁlter, in: Proceedings of 14th International Conference on
Computer and Information Technology (ICCIT 2011), 2011, pp. 274–279.
[121] A. Chandra, S. Chattopadhyay, Role of mutation strategies of differential
evolution algorithm in designing hardware eﬃcient multiplier-less low-pass
FIR ﬁlter, J. Multimedia 7 (5) (2012) 353–363.
[122] A. Chandra, S. Chattopadhyay, Computationally eﬃcient design of multiplier-
less low-pass FIR ﬁlter using trigonometric mutation strategy of differential
evolution algorithm, in: Proceedings of Fourth International Conference on
Sustainable Energy and Intelligent System (SEISCON 2013), 2013, pp. 272–277.
[123] A. Chandra, S. Chattopadhyay, A novel self-adaptive differential evolution
algorithm for eﬃcient design of multiplier-less low-pass FIR ﬁlter, in:
Proceedings of Second International Conference on Sustainable Energy and
Intelligent System (SEISCON 2011), 2011, pp. 733–738.
[124] S.C. Pei, S.B. Jaw, Eﬃcient design of 2D multiplierless FIR ﬁlters by
transformation, in: Proceedings of IEEE International Conference of Acoustics,
Speech and Signal Processing (ICASSP 1987), vol. 12, 1987, pp. 1669–1672.
[125] H.K. Kwan, C.L. Chan, Circularly symmetric two-dimensional multiplierless
FIR digital ﬁlter design using an enhanced McClellan transformation, in:
Proceedings of IEE vol. 136, no. 3, 1989, pp. 129–134.
[126] J.C. Liu, Y.L. Tai, Design of 2-D wideband circularly symmetric FIR ﬁlters by
multiplierless high-order transformation, IEEE Trans. Circuits Syst. I Regular
Pap. 58 (4) (2011) 746–754.
[127] P. Siohan, A. Benslimane, Finite precision design of optimal linear phase 2-D
FIR digital ﬁlters, IEEE Trans. Circuits Syst. 36 (1) (1989) 11–22.
[128] L. Banzato, N. Benvenuto, G.M. Cortelazzo, A design technique for two-
dimensional multiplierless FIR ﬁlters for video applications, IEEE Trans. Circuits
Syst. Video Technol. 2 (3) (1992) 273–284.
[129] J. Lee, S. Yang, D. Tang, Minimax design of 2-D linear-phase FIR ﬁlters with
continuous and powers-of-two coeﬃcients, Signal Processing 80 (2000)
1435–1444.
[130] Y.H. Lee, M. Kawamata, T. Higuchi, Design of multiplierless 2-D state-space
digital ﬁlters over a powers-of-two coeﬃcient space, IEICE Trans. Fundam.
E79-A (3) (1996) 374–377.
[131] R. Thamvichai, T. Bose, R.L. Haupt, Design of 2-D multiplierless ﬁlters using
the genetic algorithm, in: Proceedings of 35th Asilomar Conference on Signals,
Systems and Computers, vol. 2, 2001, pp. 588–591.
[132] K. Boudjelaba, F. Ros, D. Chikouche, An advanced genetic algorithm for
designing 2-D FIR ﬁlters, in: Proceedings of Paciﬁc Rim Conference on
Communications, Computers and Signal Processing, 2011, pp. 60–65.
[133] B. Elkarami, M. Ahmadi, An eﬃcient design of 2-D FIR digital ﬁlters by using
singular value decomposition and genetic algorithm with canonical signed
digit (CSD) coeﬃcients, in: Proceedings of IEEE 54th International Midwest
Symposium on Circuits and Systems (MWSCAS), 2011, pp. 1–4.
[134] M. Manuel, E. Elias, Design of sharp 2D multiplier-less circularly symmetric
FIR ﬁlter using harmony search algorithm and frequency transformation, J.
Signal Inf. Processing 3 (2012) 344–351.
[135] K. Boudjelaba, D. Chikouche, F. Ros, Evolutionary techniques for the synthesis
of 2-D FIR ﬁlters, IEEE Stat. Signal Processing Workshop (2011) 601–604.
[136] K. Jheng, S. Jou, A. Wu, A design ﬂow for multiplierless linear-phase FIR
ﬁlters: from system speciﬁcation to Verilog code, in: Proceedings of IEEE
International Symposium on Circuits and Systems (ISCAS 2004), vol. 5, 2004,
pp. 293–296.
