Efficient Bit-Level Systolic Arrays for QMF Banks
Jia-Wen Lin, Yung-Chang Chen, and Chin-Liang Wang
Institute of Electrical Engineering, National Tsing Hua University, Hsin-Chu, Taiwan, R.O.C.
ABSTRACT
In this paper, several systolic arrays are proposed for quadrature mirror filter (QMF) banks. A word-level systolic
array is first presented to realize QMF banks. It is subsequently refined to a bit-level array with bit-parallel
arithmetic via the well-known two-level pipelining technique [1], and is then converted to bit-serial form by using the
bit-serial inner product array proposed by Wang et al. [2]. By applying the polyphase representation and fully utilizing
the special relations among the QMFs, the whole filter bank can be constructed, aside from the memory cost, with only
about one half of the hardware expense of a prototype filter. In comparison with the direct realization using the
polyphase representation, the number of systolic multiplier-accumulators (SMAs) required for our architecture is halved.
Thus, both the chip area and the transistor count are reduced. As a result, with today's commercial CMOS technology, the
whole filter bank can be implemented within a single chip for various video applications.
1. INTRODUCTION
Since its introduction, subband coding (SBC) has been one of the most effective coding approaches for video
applications. Fig. 1 shows a configuration for typical 1-D two-band subband coding systems. This coding approach
decomposes the source signal into separately encodable bands. Because the human visual system is less sensitive to
high-band errors and most energy is concentrated in the low-frequency bands, it is possible to encode these bands with
different levels of significance, thereby achieving coding gains and providing hierarchical video services. This makes
it especially promising for the future broadband ISDN environment.
Since SBC needs filter banks to split the input signal into many subbands, the filter bank is, aside from the coding
mechanism, the kernel of this coding scheme. In order to avoid errors arising from the resampling and imperfect filtering
processes, a number of filter banks possessing the perfect reconstruction (PR) property have been widely investigated [3].
Among them, the QMF banks are the best known. Since these filter banks usually deal with a large amount of data,
high-speed computing hardware is necessary to meet the real-time requirements. Recently, the development of VLSI
technology has made high-speed, parallel processing of large volumes of data practical and cost-effective. It is
therefore attractive to implement real-time multirate systems using VLSI circuits. Since their introduction by Kung and
Leiserson [4], systolic arrays have been widely used to design special-purpose signal processing systems. A systolic
architecture possesses modularity, regularity, local connection, massive parallelism, and linear-rate pipelinability; it
is thus suitable for VLSI implementation and has a very high throughput rate.
Many systolic arrays have been proposed for high-speed digital filtering [2], [5]-[9], of which only a few are dedicated
to subband filtering. It is not efficient to directly apply a general-purpose filtering array to subband coding systems,
since such arrays do not utilize the special properties of these systems. Recently, Pestel et al. [10] proposed a
dedicated VLSI architecture for subband filtering by using the polyphase representation of QMF banks. In their
architecture, however, the special relations among the coefficients of QMF banks are not fully utilized. In addition,
the highest pipelinability is not achieved in their design. In this paper, we focus on the design of systolic arrays for
QMF banks.
2. QUADRATURE MIRROR FILTER (QMF) BANKS
The most popular and widely used technique for the analysis/synthesis filter banks of subband coders is the well-known
quadrature mirror filter (QMF). By using QMF banks, the aliasing due to the imperfect filters and critical downsampling
can be exactly eliminated, so an alias-free reconstruction of the input signal is possible. In addition, the phase
1028 / SPIE Vol. 1818 Visual Communications and Image Processing '92 0-8194-1018-7/92/$4.00
Downloaded from SPIE Digital Library on 04 Feb 2012 to 140.114.195.186. Terms of Use:  http://spiedl.org/terms
distortion can be canceled. The amplitude distortion cannot be exactly eliminated, but it can be minimized. Though QMF
banks cannot exactly achieve perfect reconstruction, it will be shown that their implementation is very efficient.
Typically, the coefficients of QMF banks satisfy the following conditions [12]:
$$h_1(n) = (-1)^n h_0(n) \tag{1}$$
$$g_0(n) = h_0(n) \tag{2}$$
$$g_1(n) = -h_1(n) \tag{3}$$
$$h_0(n) = h_0(L-1-n), \quad 0 \le n \le \frac{L}{2}-1 \tag{4}$$
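The relations in Eqs. (1)-(3) can be checked numerically. The following Python sketch is illustrative only and is not part of the proposed design: it derives h1, g0, and g1 from a prototype h0 and confirms that the alias component G0(z)H0(-z) + G1(z)H1(-z) vanishes. The function names (`convolve`, `qmf_bank`, `alias_term`) and the integer example coefficients are assumptions chosen for the illustration.

```python
def convolve(p, q):
    """Polynomial product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def modulate(h):
    """Coefficients of H(-z): multiply h(n) by (-1)^n."""
    return [(-1) ** n * c for n, c in enumerate(h)]


def qmf_bank(h0):
    """Derive h1, g0, g1 from the prototype h0 using Eqs. (1)-(3)."""
    h1 = modulate(h0)          # Eq. (1)
    g0 = list(h0)              # Eq. (2)
    g1 = [-c for c in h1]      # Eq. (3)
    return h1, g0, g1


def alias_term(h0):
    """Coefficients of G0(z)H0(-z) + G1(z)H1(-z), the alias component."""
    h1, g0, g1 = qmf_bank(h0)
    first = convolve(g0, modulate(h0))
    second = convolve(g1, modulate(h1))
    return [u + v for u, v in zip(first, second)]


# Hypothetical symmetric prototype, chosen only for illustration.
assert all(c == 0 for c in alias_term([1, 3, 3, 1]))
```

The cancellation holds for any h0; the symmetry of Eq. (4) is what gives the linear phase and, later, the shared coefficient set.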
It is important to note that, if 2-D QMF banks are separable, i.e., $h_{ik}(n_1, n_2) = h_i(n_1)\,h_k(n_2)$, they can be
implemented by using the tree-structured configuration shown in Fig. 2 (which omits the synthesis part), which comprises
several stages of 1-D QMF banks. This configuration leads to a simpler and more hardware-economical realization of
separable 2-D QMF banks, since it reduces the 2-D QMF design problem to 1-D design problems.
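As an illustration of the tree structure, the sketch below applies a 1-D two-band analysis stage to the rows of an image and then to the columns of each result, yielding four subbands. It is a minimal model under stated assumptions: zero padding at the borders, decimation that keeps the even-indexed filter outputs, and a naming convention in which the first subband letter refers to the row stage. The names `analyze_1d` and `analyze_2d` are hypothetical.

```python
def analyze_1d(x, h0):
    """Two-band 1-D analysis: returns (lowpass, highpass), each decimated by 2."""
    h1 = [(-1) ** j * c for j, c in enumerate(h0)]  # Eq. (1)

    def branch(h):
        return [
            sum(h[j] * x[2 * n - j] for j in range(len(h)) if 0 <= 2 * n - j < len(x))
            for n in range(len(x) // 2)
        ]

    return branch(h0), branch(h1)


def transpose(m):
    return [list(row) for row in zip(*m)]


def analyze_2d(img, h0):
    """Separable 2-D analysis: a row stage followed by a column stage."""
    low_rows, high_rows = [], []
    for row in img:
        lp, hp = analyze_1d(row, h0)
        low_rows.append(lp)
        high_rows.append(hp)

    def column_stage(block):
        low_cols, high_cols = [], []
        for col in transpose(block):
            lp, hp = analyze_1d(col, h0)
            low_cols.append(lp)
            high_cols.append(hp)
        return transpose(low_cols), transpose(high_cols)

    ll, lh = column_stage(low_rows)
    hl, hh = column_stage(high_rows)
    return ll, lh, hl, hh
```

Because each stage is one-dimensional, the same 1-D array can serve every node of the tree.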
A more efficient representation scheme for QMF banks is the polyphase structure. Fig. 3 shows the polyphase
representation of two-band 1-D QMF analysis/synthesis banks. The coefficients of filter partition 1, $h_{p0}(n)$, comprise
the odd-numbered (first, third, ...) coefficients of the prototype half-band lowpass filter, i.e.,
$$h_{p0}(n) = h_0(2n), \quad 0 \le n \le \frac{N}{2}-1. \tag{5}$$
In contrast, the coefficients of filter partition 2, $h_{p1}(n)$, comprise the even-numbered coefficients of the prototype
filter, and they are in the reverse order of those of partition 1 because of the linear phase property of the prototype
filter. That is,
$$h_{p1}(n) = h_0(2n+1) = h_{p0}\!\left(\frac{N}{2}-1-n\right), \quad 0 \le n \le \frac{N}{2}-1. \tag{6}$$
Note that this structure takes only about one half of the hardware expense of the original filter banks.
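The reverse-order relation of Eq. (6) follows directly from the symmetry of Eq. (4), and a few lines of Python make it concrete. This is only a sketch: `polyphase_split` is a hypothetical helper, and the integer prototype is an arbitrary even-length symmetric example rather than a designed QMF.

```python
def polyphase_split(h0):
    """Split a symmetric, even-length prototype into its two polyphase partitions."""
    if len(h0) % 2 or h0 != h0[::-1]:
        raise ValueError("expected an even-length, symmetric (linear-phase) prototype")
    hp0 = h0[0::2]  # h0(2n),     Eq. (5)
    hp1 = h0[1::2]  # h0(2n + 1), Eq. (6)
    return hp0, hp1


h0 = [1, -2, 4, 9, 9, 4, -2, 1]   # illustrative coefficients only
hp0, hp1 = polyphase_split(h0)
assert hp1 == hp0[::-1]           # partition 2 is partition 1 reversed
```

Since one partition is the mirror image of the other, only one set of N/2 coefficients has to be stored.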
3. A WORD-LEVEL SYSTOLIC ARRAY FOR QMF BANKS
According to the polyphase structure shown in Fig. 3, we get
$$y_{LP}(n) = \sum_{m=0}^{N/2-1} a_0(m)\,x_{p0}(n-m) + \sum_{m=0}^{N/2-1} a_1(m)\,x_{p1}(n-m), \tag{7a}$$
$$y_{HP}(n) = \sum_{m=0}^{N/2-1} a_0(m)\,x_{p0}(n-m) - \sum_{m=0}^{N/2-1} a_1(m)\,x_{p1}(n-m), \tag{7b}$$
where $a_0(m) = h_{p0}(m)$ and $a_1(m) = h_{p1}(m)$ for $m = 0, 1, \ldots, N/2-1$. Substituting Eq. (6) into Eqs. (7a) and
(7b), the above equations can be rewritten as:
$$y_{LP}(n) = \sum_{m=0}^{N/2-1} a_0(m)\,x_{p0}(n-m) + \sum_{m=0}^{N/2-1} a_0\!\left(\frac{N}{2}-1-m\right) x_{p1}(n-m), \tag{8a}$$
$$y_{HP}(n) = \sum_{m=0}^{N/2-1} a_0(m)\,x_{p0}(n-m) - \sum_{m=0}^{N/2-1} a_0\!\left(\frac{N}{2}-1-m\right) x_{p1}(n-m). \tag{8b}$$
It can be seen that the overall operation of this system is to perform two convolution operations. The two convolutions
compute the inner products of the two input data vectors, namely
$\mathbf{x}_{p0} = [x_{p0}(n)\; x_{p0}(n-1)\; \cdots\; x_{p0}(n-N/2+1)]$ and
$\mathbf{x}_{p1} = [x_{p1}(n)\; x_{p1}(n-1)\; \cdots\; x_{p1}(n-N/2+1)]$, with their corresponding coefficient vectors
$\mathbf{a}_0 = [a_0(0)\; a_0(1)\; \cdots\; a_0(N/2-1)]$ and $\mathbf{a}_1 = [a_0(N/2-1)\; \cdots\; a_0(1)\; a_0(0)]$.
In short, the two corresponding inner products are $\mathbf{x}_{p0}\mathbf{a}_0^t$ and $\mathbf{x}_{p1}\mathbf{a}_1^t$,
respectively. Consequently, the lowpass- and highpass-filtered versions of the input data stream x(n) in a two-band QMF
analysis bank are $y_{LP} = \mathbf{x}_{p0}\mathbf{a}_0^t + \mathbf{x}_{p1}\mathbf{a}_1^t$ and
$y_{HP} = \mathbf{x}_{p0}\mathbf{a}_0^t - \mathbf{x}_{p1}\mathbf{a}_1^t$, respectively. Using the fact that the
coefficients of filter partition 2 are in the reverse order of those of filter partition 1, these can be rewritten as
$$y_{LP} = \mathbf{x}_{p0}\mathbf{a}_0^t + \mathbf{x}_{p1}^r\mathbf{a}_0^t, \tag{9}$$
$$y_{HP} = \mathbf{x}_{p0}\mathbf{a}_0^t - \mathbf{x}_{p1}^r\mathbf{a}_0^t, \tag{10}$$
where $\mathbf{x}_{p1}^r$ stands for the vector whose entries are those of $\mathbf{x}_{p1}$ in reverse order.
Accordingly, a word-level systolic architecture for two-band QMF banks is derived, as shown in Fig. 4.
The principal idea of the architecture shown in Fig. 4 is to reduce the two coefficient vectors to only one by
efficiently arranging the input data flow as shown. Such an arrangement halves the storage requirement for the filter
coefficients. Referring to Fig. 4, the input data are first demultiplexed into two polyphase components. One component
is fed into the array of multiplexer cells (for notational simplicity, we refer to such a cell as a "MUX cell") from top
to bottom via a linear chain of one-word shift registers, and the other enters the MUX cells from bottom to top. With
this arrangement, the two input streams are interleaved word by word via the MUX cells. Note that the clock rate for the
shift registers is one half of the overall clock rate. Thus, the two data versions entering each multiplexer cell can be
selected alternately without any loss. This also reduces the overall power consumption, especially for 2-D
applications. In order to perform pipelining operations properly, an appropriate number of delays are inserted between
each horizontally adjacent main cell and multiplexer cell, so that the data words arrive at the main array in a skewed
fashion. The linear chain of main cells, which performs the inner product operation, is composed of systolic
multiplier-accumulator (SMA) cells. It is used to compute the two inner products, namely
$\mathbf{x}_{p0}\mathbf{a}^t$ and $\mathbf{x}_{p1}^r\mathbf{a}^t$. Finally, the split-band outputs are obtained: the
lowpass output is the sum of the two inner products mentioned above, and the highpass output is their difference.
Separable 2-D QMF banks can be realized by using the structure proposed by Pestel et al. [10], shown in Fig. 5. Each
block of the filter banks can be replaced by our architecture to reduce the hardware expense.
4. BIT-LEVEL SYSTOLIC ARRAYS FOR QMF BANKS
In this section, we go on to refine the word-level architecture into bit-level implementations with bit-parallel and
bit-serial arithmetic, respectively. These are obtained by employing the concept of two-level pipelining. That is, each
component of the word-level architecture is replaced with a proper bit-level pipelined circuit. From the viewpoint of
processing, bit-level systolic arrays with bit-serial arithmetic can be regarded as time-sharing versions of those with
bit-parallel arithmetic. In comparison with the bit-parallel ones, bit-serial systolic arrays take much less hardware
and I/O bandwidth, but at the expense of the processing rate of the filter. Therefore, there are always some trade-offs
between the bit-parallel and bit-serial architectures when both hardware complexity and throughput are taken into
consideration.
As discussed above, the fundamental operation for these filter banks is the computation of the two inner products
$\{y_m,\ m = 0, 1\}$ of the two N/2-tuple input vectors
$\{\mathbf{x}_{pm}\} = \{[x_{pm,0}\; x_{pm,1}\; \cdots\; x_{pm,N/2-1}],\ m = 0, 1\}$ and the N/2-tuple coefficient vector
$\mathbf{a} = [a_0\; a_1\; \cdots\; a_{N/2-1}]$, where
$$y_m = a_0\,x_{pm,0} + a_1\,x_{pm,1} + \cdots + a_{N/2-1}\,x_{pm,N/2-1} \tag{11}$$
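with
$$a_i = \left(a_i^{W-1}\; a_i^{W-2}\; \cdots\; a_i^{1}\; a_i^{0}\right), \tag{12}$$
$$x_{pm,i} = \left(x_{pm,i}^{W-1}\; x_{pm,i}^{W-2}\; \cdots\; x_{pm,i}^{1}\; x_{pm,i}^{0}\right), \tag{13}$$
$$y_m = \left(y_m^{2W+G-1}\; y_m^{2W+G-2}\; \cdots\; y_m^{1}\; y_m^{0}\right), \tag{14}$$
and
$$G = \log_2(\ldots). \tag{15}$$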
The number G is the wordlength growth required for the inner product computations. For notational simplicity and
without loss of generality, we assume that each input data word and each coefficient word both have W bits.
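The sharing of one coefficient vector between the two inner products can be illustrated with a short word-level model. The sketch below is an assumption-laden illustration, not the systolic array itself: it takes x_p0(n) = x(2n) and x_p1(n) = x(2n-1), uses zero padding, and compares the shared-coefficient outputs with direct filtering followed by decimation by two. The function names `analysis_direct` and `analysis_shared` are hypothetical.

```python
def analysis_direct(x, h0):
    """Reference: filter with h0 and h1 = (-1)^n h0, then keep every second output."""
    h1 = [(-1) ** j * c for j, c in enumerate(h0)]

    def branch(h):
        return [
            sum(h[j] * x[2 * n - j] for j in range(len(h)) if 0 <= 2 * n - j < len(x))
            for n in range(len(x) // 2)
        ]

    return branch(h0), branch(h1)


def analysis_shared(x, h0):
    """Both outputs from one coefficient vector a = h0(0), h0(2), ..."""
    a = h0[0::2]
    half = len(a)

    def sample(k):
        return x[k] if 0 <= k < len(x) else 0  # zero padding outside the signal

    low, high = [], []
    for n in range(len(x) // 2):
        xp0 = [sample(2 * (n - m)) for m in range(half)]
        xp1 = [sample(2 * (n - m) - 1) for m in range(half)]
        xp1_reversed = xp1[::-1]                    # reversed data replaces reversed coefficients
        ip0 = sum(c * v for c, v in zip(a, xp0))
        ip1 = sum(c * v for c, v in zip(a, xp1_reversed))
        low.append(ip0 + ip1)                       # lowpass output: sum
        high.append(ip0 - ip1)                      # highpass output: difference
    return low, high
```

Reversing the second data vector instead of storing a second, reversed coefficient vector is what allows N/2 multiplier-accumulators to serve both subbands.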
4.1 Bit-parallel systolic array
Fig. 6 depicts the bit-parallel SMA with two's complement arithmetic. It is an improved version of the design proposed
by McCanny and McWhirter [13]. The circuit shown in Fig. 6 is composed of $W^2$ TYPE-P1 cells and $W(W+1) + GW$
TYPE-P2 cells. The TYPE-P1 cell performs the bit-level multiply-and-add operation, while the TYPE-P2 cell performs
only bit-level addition. The logic functions of these two types of cells are depicted in Fig. 7. It takes W+1
cycles for the bit $s^0(n)$ of the result $s(n) = a(n)x(n) + s'(n)$ to appear at the output after $a^0(n)$ is fed into
the system. It is noteworthy that, in order to perform two's complement arithmetic properly, each TYPE-P1 cell comprises
a built-in control bit d to manipulate the polarity of each bit of the operand x(n), as shown in Fig. 7. To perform
multiplication with two's complement arithmetic, instead of the sign-extension approach used in some of the literature,
we reduce the hardware expense by using the correction-term-compensation approach derived from the Baugh-Wooley
algorithm [14]. The correction term can be properly included in the arithmetic results by adding it to the operand
s'(n). That is, we replace the operand s'(n) with s'(n) + ct, where ct stands for the correction term. This is easily
achieved by sending ct as s'(n) to the first SMA in the filter array, since, in general, the accumulation term s'(n) in
the first SMA of a filtering array is initially zero. In comparison with McCanny's array, this improved array simplifies
$W(W+1) + GW$ TYPE-P1 cells to TYPE-P2 cells and thus reduces the hardware expense, since the TYPE-P2 cell is less
complex in hardware.
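The idea of folding a correction term into the accumulation input can be sketched in Python. The model below uses one common Baugh-Wooley-style formulation: partial products that involve exactly one sign bit are complemented, and a constant is added. The text does not give the exact value of ct or its cell-level placement, so the constant used here (2^W + 2^(2W-1), taken modulo 2^(2W)) and the name `baugh_wooley_mac` are assumptions for illustration.

```python
def bits(value, width):
    """Two's complement bits of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def baugh_wooley_mac(a, x, width, s=0):
    """Compute a*x + s for width-bit two's complement a and x, using bit products only."""
    a_bits, x_bits = bits(a, width), bits(x, width)
    total = 0
    for i in range(width):
        for j in range(width):
            partial = a_bits[i] & x_bits[j]
            # Terms with exactly one sign bit carry negative weight: complement them.
            if (i == width - 1) != (j == width - 1):
                partial ^= 1
            total += partial << (i + j)
    # Assumed correction term: compensates for the complemented partial products.
    correction = (1 << width) + (1 << (2 * width - 1))
    total = (total + correction + s) & ((1 << (2 * width)) - 1)
    if total >> (2 * width - 1):        # reinterpret as a signed 2W-bit word
        total -= 1 << (2 * width)
    return total
```

Because the correction is a constant, it can enter through the accumulation input, so no sign-extension cells are needed inside the array.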
By employing this SMA, a bit-level systolic array with bit-parallel arithmetic for QMF banks is obtained as shown in
Fig. 8. In this array, the coefficient vector $\mathbf{a} = [a_0\; a_1\; \cdots\; a_{N/2-1}]$ is stored in the latches,
denoted by the solid circles, in sequential order as shown. As discussed above, the input data are first decomposed into
two polyphase components via a demultiplexer. Subsequently, these two polyphase components flow in opposite directions
and enter the main array via a linear chain of bit-level multiplexer cells. The arrangement of the shift registers,
which are above the multiplexer cells, is organized according to the word-level realization. The main array is composed
of N/2 bit-parallel bit-level SMAs, where N is the filter length. It acts as a dual-input inner product array with a
single set of coefficients. The two resulting inner products enter the linear chain of TYPE-P3 cells alternately. As
mentioned previously, the lowpass-filtered output is obtained by summing the two inner products, while the
highpass-filtered version is their difference. The linear chain of TYPE-P3 cells is used to evaluate the sum and the
difference of the two inner products simultaneously. As shown in Fig. 9, each TYPE-P3 cell consists of two full adders
and several one-bit latches. In this linear chain, the overall operation of the upper full adders is to evaluate the sum
of two consecutive bit-parallel input data. A "0" is sent to the right input line above the top cell as the initial
carry for the upper full adders. In contrast, the overall operation of the lower full adders is to evaluate their
difference. Thus, one of the two data bits entering each TYPE-P3 cell is inverted and then sent to the lower full adder.
A "1" is fed as the initial carry for the lower full adders. The purpose of this arrangement is to perform subtraction
with two's complement arithmetic. It is noteworthy that, to accommodate the wordlength growth due to the two arithmetic
operations, a one-bit sign extension is appended at the bottom of the leftmost array as shown. The output results are
obtained at a rate of two outputs per two cycles. Since the resulting data are significant only in one out of every two
cycles, the timing for latching the data must be selected carefully.
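The two adder chains can be modeled at the bit level. In the sketch below, the sum chain starts with carry 0, while the difference chain inverts one operand bit by bit and starts with carry 1, which is the usual two's complement subtraction. The word is sign-extended by one bit to absorb the growth. This is a behavioral illustration only; the latches and timing of the TYPE-P3 cells are not modeled, and the names `full_adder` and `sum_and_difference` are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    return a ^ b ^ carry_in, (a & b) | (a & carry_in) | (b & carry_in)


def to_bits(value, width):
    """Two's complement bits, least significant first (sign-extends as needed)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bit_list):
    """Signed value of a two's complement bit list, least significant first."""
    value = sum(b << i for i, b in enumerate(bit_list))
    if bit_list[-1]:
        value -= 1 << len(bit_list)
    return value


def sum_and_difference(p, q, width):
    """Return (p + q, p - q) for width-bit inputs, computed by two ripple chains."""
    extended = width + 1                      # one-bit sign extension for the growth
    p_bits, q_bits = to_bits(p, extended), to_bits(q, extended)
    carry_sum, carry_diff = 0, 1              # initial carries: "0" for sum, "1" for difference
    sum_bits, diff_bits = [], []
    for a, b in zip(p_bits, q_bits):
        s, carry_sum = full_adder(a, b, carry_sum)
        d, carry_diff = full_adder(a, b ^ 1, carry_diff)   # inverted bit feeds the lower chain
        sum_bits.append(s)
        diff_bits.append(d)
    return from_bits(sum_bits), from_bits(diff_bits)
```

Both results come from the same pair of operands in a single pass, which is why the lowpass and highpass outputs appear together.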
4.2 Bit-serial systolic array
A bit-serial systolic array which can perform two distinct inner product computations was proposed by Wang et al. [2].
Based on this inner product array (IPA), we can construct our bit-serial realization for QMF banks. For this IPA, an
improvement is made by appending W+1 additional ones at the rightmost end of the linear chains of TYPE-S2 cells to
obtain a regular skewed output format, thereby facilitating the succeeding processing. Recall that, since the
coefficients of filter partition 2 are in the reverse order of those of filter partition 1, we need only one set of
coefficients, those with even indices of the half-band prototype filter, to perform the lowpass and highpass filtering
simultaneously. These coefficients are distributed over the internal latches of the TYPE-P1 cells as shown.
Referring to Fig. 10, the input data are first demultiplexed into two polyphase components. One of the two components
enters the first row in a bit-serial format and moves from top to bottom via a number of delays, while the other enters
the (N/2 - 1)th row and moves in the opposite direction. Before entering the first row and the (N/2 - 1)th row, the two
input data bits are repeated twice and then interleaved bit by bit via a linear chain of multiplexer cells (TYPE-S3
cells) controlled by an MS stream. In order to prevent an increase in the number of latches due to the repetition of the
input data bits, the clock rate for both sending the input data bits and shifting the data bits in the two rightmost
columns of delays (shift registers) is halved. For this purpose, the CTRL signal can be used as the half-speed clock
signal. The numbers of delays are selected to ensure that the data bits enter each row in a skewed parallelogram fashion
for pipelining operations. They are k1 = k3 = W, k2 = W + 1, and k4 = W - 1. The linear chain of TYPE-S4 cells is used
to evaluate the sum and difference of the two inner products. The logic functions of TYPE-S3 and TYPE-S4 cells are shown
in Fig. 12. Each TYPE-S4 cell comprises a one-bit internal register and operates under the control of the MS stream.
When MS = 0, the linear chain of TYPE-S4 cells stores the output data word of 2W+G bits emerging from the accumulator
cells. When MS = 1, it simultaneously evaluates the sums of the stored data word with the entering data word and with
its two's complement (i.e., the one's complement of the entering data word plus the value "1" at the carry input of the
rightmost TYPE-S4 cell). Thus, two filtered outputs are obtained. Consequently, the resulting lowpass and highpass
outputs come out from the bottom of this array in a bit-parallel fashion at a rate of two outputs per 2W cycles. To
further reduce the pin count, a parallel-to-serial circuit [2] can be used.
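The store-then-combine behavior controlled by MS can be summarized with a small word-level model. The sketch assumes that the two inner products arrive alternately, with MS = 0 for the first of each pair and MS = 1 for the second, and that `width` already includes the wordlength growth. It models only the arithmetic, not the bit-serial timing, and the name `ts4_chain` is hypothetical.

```python
def ts4_chain(words, width):
    """Pair up alternating inner products and return (sum, difference) tuples."""
    mask = (1 << width) - 1

    def signed(value):
        return value - (1 << width) if value >> (width - 1) else value

    stored = 0
    outputs = []
    for index, word in enumerate(words):
        ms = index % 2                     # assumed MS pattern: 0, 1, 0, 1, ...
        x = word & mask
        if ms == 0:
            stored = x                     # MS = 0: latch the entering word
        else:
            low = (stored + x) & mask                  # stored + entering word
            high = (stored + (~x & mask) + 1) & mask   # stored + one's complement + carry-in 1
            outputs.append((signed(low), signed(high)))
    return outputs


assert ts4_chain([5, 3, -4, 7], 8) == [(8, 2), (3, -11)]
```

One pair of outputs is produced for every two entering words, in line with the stated rate of two outputs per 2W cycles.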
5. CONCLUSIONS
In this paper, we have proposed two bit-level systolic arrays, with bit-parallel and bit-serial arithmetic, for QMF
banks. The former allows a faster data rate, while the latter requires less hardware and a lower pin count. Thus, they
can be applied to various video applications depending on the required trade-offs among data rate, chip area, and pin
count. By applying the polyphase representation and fully utilizing the special relations among the QMFs, the whole
filter bank can be constructed, aside from the memory cost, with only about one half of the hardware expense of a
prototype filter. In comparison with the direct realization using the polyphase representation, the number of SMAs
required for our architecture is halved. Table 1 briefly compares the proposed realization with the direct realization
[10]. It shows that, aside from the memory cost, the hardware expense of our architecture is about one half of
Pestel's. For 2-D applications, by using the tree structure shown in Fig. 5, the data rate for the SMAs is one half of
the original rate for our architecture, and one quarter for Pestel's. Since our designs achieve bit-level
pipelinability, with today's commercial 1.2 μm CMOS technology allowing data rates of more than 30 MHz, they can meet
the speed requirements of most video sources, even for HDTV applications. As a result, a single-chip realization of the
whole filter bank is achievable.
REFERENCES
[1] H. T. Kung and M. Lam, "Fault tolerance and two-level pipelining in VLSI systolic arrays," in Proc. MIT Conf.
Advanced Res. VLSI, (Cambridge, MA), pp. 74-83, May 1986.
[2] C.-L. Wang, C.-H. Wei, and S.-H. Chen, "Efficient bit-level systolic array implementation of FIR and IIR digital
filters," IEEE Journal on Selected Areas in Communications, vol. 6, pp. 484-493, Apr. 1988.
[3] P. P. Vaidyanathan, "Multirate digital filters, filter banks, polyphase networks, and applications: A tutorial,"
Proc. IEEE, vol. 78, pp. 56-93, Jan. 1990.
[4] H. T. Kung and C. E. Leiserson, "Systolic arrays (for VLSI)," in Proc. Symp. on Sparse Matrix Computation and their
Applications, pp. 256-282, 1978.
[5] R. B. Urquhart and D. Wood, "Systolic matrix and vector multiplication methods for signal processing," IEE Proc.,
vol. 131, pp. 623-631, Oct. 1984.
[6] M. A. Sid-Ahmed, "Serial architectures for the implementation of 2-D digital filters and for template matching in
digital images," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 853-857, May 1990.
[7] N. R. Shanbhag, "An improved systolic architecture for 2-D digital filters," IEEE Trans. Signal Processing, vol. 39,
pp. 1195-1202, May 1991.
[8] K. K. Parhi and D. G. Messerschmitt, "Pipeline interleaving and parallelism in recursive digital filters, Part I:
Pipelining using scattered look-ahead and decomposition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37,
pp. 1099-1117, July 1989.
[9] C.-W. Wu, "Bit-level pipelined 2-D digital filters for real-time image processing," IEEE Trans. Circuits Syst.
Video Technol., vol. 1, pp. 22-34, Mar. 1991.
[10] U. Pestel and K. Grüger, "Design of HDTV subband filterbanks considering VLSI implementation constraints," IEEE
Trans. Circuits Syst. Video Technol., vol. 1, pp. 14-21, Mar. 1991.
[11] P. Pirsch, … Winzker, "VLSI architecture of two-dimensional filterbanks for HDTV coding," in Proc. IEEE Int.
Symp. …, pp. 1648-1651, May 199….
[12] R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[13] J. V. McCanny and J. G. McWhirter, "Completely iterative, pipelined multiplier array suitable for VLSI," IEE
Proc., vol. 129, pp. 40-46, Apr. 1982.
[14] C. R. Baugh and B. A. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. Computers,
vol. C-22, pp. 1045-1047, Dec. 1973.
Fig. 1. A configuration for typical 1-D two-band subband coding systems
Fig. 2. A tree-structured configuration of separable 2-D QMF banks (outputs: YLL, YLH, YHL, YHH)
Fig. 3. Polyphase structures of 1-D two-band QMF banks (a) analysis filter bank, (b) synthesis filter bank
(panel labels: Filter Partition 1, odd index of coef.; even index of coef.)
Fig. 4. A word-level systolic array for QMF banks (signals: input signal, CTRL, YLP, YHP)
Fig. 5. The tree-structured realization of 2-D separable QMF proposed by Pestel et al.
Fig. 6. An improved bit-parallel systolic multiplier-accumulator
Fig. 7. The logical functions of TYPE-P1 and TYPE-P2 cells (the TYPE-P2 cell is a full adder)
Fig. 8. A bit-parallel bit-level systolic array for QMF banks, where N = 6
Fig. 9. The logic function of TYPE-P3 cell
Fig. 10. A bit-serial systolic array for QMF banks with N = 8 (control signals: CTRL, PTRL, MS; outputs: YLP, YHP)
Fig. 11. The logic functions of TYPE-S1 and TYPE-S2 cells
    X' ← X
    Y' ← Y ⊕ C ⊕ (a·X)
    C' ← Y·C + Y·(a·X) + C·(a·X)
    PTRL' ← PTRL

    S' ← (S ⊕ C ⊕ P)·PTRL
    C' ← S·C + S·P + P·C
Fig. 12. The logic functions of TYPE-S3 and TYPE-S4 cells
    TYPE-S3:
        SEL' ← SEL
        if SEL = 1 then Z ← Y
        else Z ← X
    TYPE-S4:
        if MS = 0 then S ← X
        else
            C1' ← (S·C1) + (X·C1) + (X·S)
            C2' ← (S·C2) + (X̄·C2) + (X̄·S)
            Y ← S ⊕ C1 ⊕ X
            Z ← S ⊕ C2 ⊕ X̄
Table 1. Comparison of our proposed array and Pestel's for 1-D QMF banks

                                  Ours                                      Pestel's
No. of SMAs                       N/2                                       N
Clock rate for SMA                input data rate                           1/2 input data rate
Clock rate for shift registers    1/2 input data rate                       1/2 input data rate
Throughput                        2 outputs per two input clock cycles      2 outputs per two input clock cycles
