Recent methods of synthesizing logic that is fully and robustly testable for dynamic faults, namely path delay, transistor stuck-open and gate delay faults, rely almost exclusively on flattening given logic expressions into sum-of-products form, minimizing the cover to obtain a fully dynamic-fault testable two-level representation of the functions, and performing structural transformations to resynthesize the circuit into a multilevel network, while also maintaining full dynamic-fault testability. While this technique will work well for random or control logic, it is not practical for many regular structures.
Introduction
Recent methods of synthesizing logic that is fully testable for dynamic faults, namely path delay, transistor stuck-open and gate delay faults (e.g. [2, 4]), rely almost exclusively on flattening given logic expressions into sum-of-products form, minimizing the cover to obtain a fully dynamic-fault testable two-level representation of the functions, and performing structural transformations to resynthesize the circuit into a multilevel network, while also maintaining full dynamic-fault testability. While this technique will work well for random or control logic, it is not practical for many regular structures.
There are two major problems with applying these synthesis techniques to regular structures. First, for many of these types of circuits, the number of product terms in the flattened structure becomes prohibitive. Consider a binary adder as an example. In an adder, the number of product terms grows exponentially with the number of bits. For an N-bit adder, the most significant bit of the sum output has $2^{N+2} - 4$ product terms in its flattened representation. Another example is a parity generator. An N-bit parity generator has $2^{N-1}$ product terms in the flattened representation of the circuit. As a result of this exponential growth in the number of product terms, it can quickly become prohibitive, in terms of both the CPU time and the memory required, to flatten even relatively small regular structures such as adders and parity generators.
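The parity count can be checked by brute force. The sketch below is an illustration only, with hypothetical function names that do not come from the paper. It enumerates the ON-set of an N-input odd-parity function and confirms that no two ON-set vectors differ in exactly one bit, so no two minterms can be merged and the flattened form needs one product term per minterm.

```python
from itertools import product

def parity_on_set(n):
    """All n-bit input vectors with an odd number of ones (ON-set of odd parity)."""
    return [bits for bits in product((0, 1), repeat=n) if sum(bits) % 2 == 1]

def hamming(u, v):
    """Number of bit positions in which two vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def flattened_parity_terms(n):
    """Product terms in the two-level form of n-input parity.

    Two ON-set vectors always differ in an even number of positions, so no
    pair is adjacent and every minterm must appear as its own product term.
    """
    on_set = parity_on_set(n)
    for i, u in enumerate(on_set):
        for v in on_set[i + 1:]:
            assert hamming(u, v) != 1  # no pair of minterms can be merged
    return len(on_set)

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, flattened_parity_terms(n), 2 ** (n - 1))
```

The two printed counts agree for every width tried, which is consistent with the $2^{N-1}$ figure quoted above.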
A second problem is that in flattening the original logic expressions, the structure that the designer has created in the overall architecture is lost. This can result in a number of problems if the optimization algorithms cannot synthesize an implementation which has similar area or performance characteristics, which is often the case for regular structures. For example, a bit-slice approach often works best for many data path structures. The basic building block can be optimized and laid out, and then the overall circuit constructed by simply replicating this one block many times. Once the structure is flattened, all of the information about the original structure is lost, and it may not be recoverable by synthesis procedures.
As a result of the problems identified with trying to flatten regular structures and then synthesize a dynamic-fault testable implementation, it is desirable to develop an alternative method for dealing with

Stuck-at-0 can be defined similarly. Since data path structures often form the critical path in a system, it is important that these paths be testable for delay. The adder structures which are analyzed in this section are the full adder, which can be cascaded to form a ripple adder of arbitrary size, a carry select adder, a carry lookahead adder, and a carry bypass adder.
Ripple Adder
An N-bit adder can be constructed by simply cascading N 1-bit full adders in series, connecting the carry output of the $i$th bit to the carry input of the $(i+1)$th bit.
If each one-bit full adder stage is implemented in such a way that it is fully robustly path-delay-fault testable, then an N-bit ripple adder constructed from these individual stages will also be fully robustly path-delay-fault testable [3].
This is summarized in the composition rule below.
Composition

Carry Select Adder
We have designed a custom 4-bit carry select adder that is fully path-delay-fault testable, and 4-bit stages can be cascaded to form a fully testable N-bit adder for arbitrary N. While the 4-bit adder has no area or performance penalty over a hand-designed version, larger bit-sections do have an area penalty. The carry bypass adder of Section 3.4 is superior in all respects.
Carry Lookahead Adder
The carry lookahead adder generates the carry information for each stage in parallel, instead of the serial manner in which the carry was generated for the ripple adder. The advantage is of course a decrease in the length of the longest path in the adder, and consequently an increase in the performance of the adder. This increase in performance is offset by an increase in the area required to implement the design, as well as by losing much of the efficient layout structure of the ripple adder, where the logic for each bit was identical.
The carry lookahead adder creates both the propagate (P) and generate (G) terms for each bit. The propagate term is asserted whenever a carry input would propagate through the adder section based on the values of the operands. Thus for a 1-bit section, $P = A \oplus B$. The generate term is asserted whenever a carry is generated by an adder section based on only the values of the operands (i.e. regardless of the value of the carry input). Thus for a 1-bit section, $G = A \cdot B$. The carry output which is created from the propagate and generate terms is defined as $C_O = G + P \cdot C_I$. In Figure 2(a), the standard logic for generating the propagate, generate, and sum outputs for each bit is shown. An alternate representation is shown in Figure 2(b). Figure 3 shows the logic for generating the carry signals for a 4-bit block.
In the alternate propagate and generate logic implementation shown in Figure 2(b), the propagate signal is defined to be $P = A + B$. This implementation recognizes that the carry signal will be passed through whenever either (or both) of the operands are asserted, even though the generate signal will also cause the carry output to be asserted when both of the operands are asserted. In the following text, it is shown that this alternate representation is fully testable for path delay faults, while the conventional implementation is not.
The observation can be made that for the 1-bit adder case, creating the carry by first forming the propagate and generate terms as shown in the alternate implementation is simply performing an algebraic factorization of the original expression for the carry output. The unfactored expression for the carry is simply $C_1 = A_0 \cdot B_0 + A_0 \cdot C_0 + B_0 \cdot C_0$.
Performing an algebraic factorization yields $C_1 = (A_0 \cdot B_0) + (A_0 + B_0) \cdot C_0 = G_0 + P_0 \cdot C_0$. Since it was shown in [2] that algebraic factorization preserves robust path-delay-fault testability, the 1-bit carry lookahead stage is therefore fully robustly path-delay-fault testable.
This observation regarding the testability of the carry lookahead circuitry can be extended beyond the one-bit case. For a 2-bit adder, the expression for the carry output is $C_2 = A_1 \cdot B_1 + A_1 \cdot C_1 + B_1 \cdot C_1$. This can be algebraically factored and expressed as $C_2 = (A_1 \cdot B_1) + (A_1 + B_1) \cdot C_1$, which when expanded by substituting in the expression for $C_1$ becomes $C_2 = G_1 + P_1 \cdot G_0 + P_1 \cdot P_0 \cdot C_0$, which is the equation for the second bit of the carry lookahead logic. Since only algebraic factorization is involved, each of the carry outputs of a carry lookahead adder will be fully testable for robust path-delay faults. Since the adder section is unchanged from that used in the ripple adder, the entire circuit will be fully robustly path-delay-fault testable.
While the carry lookahead implementation described above is fully testable, the conventional implementation in which the propagate signal is implemented as $P = A \oplus B$ is not fully testable.
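The factorization above can be checked exhaustively for small widths. The following sketch is an illustration only, and every function name in it is hypothetical. It compares the carries of the unfactored ripple form with the expanded lookahead form $C_k = G_{k-1} + P_{k-1} G_{k-2} + \dots + P_{k-1} \cdots P_0 C_0$, for both choices of propagate signal. It checks logical equivalence only and says nothing about path-delay-fault testability, which depends on circuit structure rather than on the function computed.

```python
from itertools import product

def ripple_carries(a, b, c0):
    """Carries C1..Cn from the unfactored form, one bit at a time (index 0 = LSB)."""
    carries, c = [], c0
    for ai, bi in zip(a, b):
        c = (ai & bi) | (ai & c) | (bi & c)
        carries.append(c)
    return carries

def lookahead_carries(a, b, c0, propagate):
    """Carries C1..Cn from the expanded sum-of-products lookahead equations."""
    g = [ai & bi for ai, bi in zip(a, b)]
    p = [propagate(ai, bi) for ai, bi in zip(a, b)]
    carries = []
    for k in range(1, len(a) + 1):
        terms = []
        for j in range(k - 1, -1, -1):
            term = g[j]                  # G_j gated by P_{j+1} ... P_{k-1}
            for m in range(j + 1, k):
                term &= p[m]
            terms.append(term)
        term = c0                        # C_0 gated by P_0 ... P_{k-1}
        for m in range(k):
            term &= p[m]
        terms.append(term)
        carries.append(int(any(terms)))
    return carries

def or_propagate(x, y):
    return x | y   # alternate form, P = A + B

def xor_propagate(x, y):
    return x ^ y   # conventional form, P = A xor B

def equivalent(n, propagate):
    """True if lookahead and ripple carries agree for every n-bit input."""
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            for c0 in (0, 1):
                if lookahead_carries(a, b, c0, propagate) != ripple_carries(a, b, c0):
                    return False
    return True

if __name__ == "__main__":
    print(equivalent(4, or_propagate), equivalent(4, xor_propagate))
```

Both propagate definitions produce the same carry function, so the difference between the two implementations discussed in the text is structural rather than functional.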
Carry Bypass Adder
The carry bypass adder is just an offshoot of the carry lookahead adder. It becomes very inefficient to extend the carry generation scheme described in the preceding section to very large adders. Typically the carry signals are only calculated in this manner for up to 4-bit sections. By creating a propagate and generate signal for each 4-bit section, the carry signal can be bypassed through each stage as shown in Figure 4. The logic for the cumulative propagate and generate signal for each 4-bit stage can be expressed in terms of the propagate and generate signals for each bit, where $P = P_0 \cdot P_1 \cdot P_2 \cdot P_3$ and $G = G_3 + G_2 \cdot P_3 + G_1 \cdot P_3 \cdot P_2 + G_0 \cdot P_3 \cdot P_2 \cdot P_1$. An implementation is shown in Figure 5. Using a carry bypass scheme allows each N-bit stage (4-bit in this case) to be identical, and thus makes layout simpler, since only one stage needs to be laid out and then an arbitrary number of these blocks can be interconnected to form a larger adder. The carry bypass generation is an algebraic factorization of the carry signal, just as the carry lookahead was. In fact, by combining the logic for the carry associated with each stage as shown in Figure 4 with the propagate and generate logic shown in Figure 5, the total circuit is identical to that for a fifth bit of a carry lookahead circuit. This circuit is thus fully testable, assuming that the individual propagate signals for each bit are implemented as $P = A + B$. As in the preceding section, if the propagate signal is implemented as $P = A \oplus B$, then the circuit will not be fully testable.
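As a sanity check on the cumulative propagate and generate expressions, the sketch below evaluates them for one 4-bit stage and compares the resulting stage carry, taken here to be $G + P \cdot C_{in}$ by analogy with the per-bit carry equation of the preceding section, against the carry of an ordinary 4-bit addition. The function names and the bit ordering (index 0 is the least significant bit) are assumptions made for this illustration, and the code models the logic function only, not the gate-level circuits of Figures 4 and 5.

```python
from itertools import product

def block_pg(a, b):
    """Cumulative P and G of one 4-bit stage, using per-bit P = A + B and G = A.B."""
    p = [x | y for x, y in zip(a, b)]
    g = [x & y for x, y in zip(a, b)]
    P = p[0] & p[1] & p[2] & p[3]
    G = g[3] | (g[2] & p[3]) | (g[1] & p[3] & p[2]) | (g[0] & p[3] & p[2] & p[1])
    return P, G

def stage_carry_out(a, b, cin):
    """Carry out of a stage: generated inside it, or bypassed from cin."""
    P, G = block_pg(a, b)
    return G | (P & cin)

def to_int(bits):
    """Integer value of a little-endian bit sequence."""
    return sum(bit << i for i, bit in enumerate(bits))

def reference_carry_out(a, b, cin):
    """Carry out of a 4-bit addition, computed arithmetically."""
    return (to_int(a) + to_int(b) + cin) >> 4

def stage_matches_reference():
    """True if the P/G form gives the right carry for every 4-bit input."""
    for a in product((0, 1), repeat=4):
        for b in product((0, 1), repeat=4):
            for cin in (0, 1):
                if stage_carry_out(a, b, cin) != reference_carry_out(a, b, cin):
                    return False
    return True

def chained_carry_out(a, b, cin, width=4):
    """Carry out of a larger adder built from identical stages in series."""
    c = cin
    for i in range(0, len(a), width):
        c = stage_carry_out(a[i:i + width], b[i:i + width], c)
    return c

if __name__ == "__main__":
    print(stage_matches_reference())
```

Because every stage is the same function of its own operand bits and incoming carry, `chained_carry_out` simply reuses one stage model for any multiple of four bits, mirroring the layout argument made in the text.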
For a 28-bit carry select adder composed of seven 4-bit stages, the longest path would be 10 stages of logic (each stage being two levels). For the carry bypass adder, the worst case delay for the carry output signal would be 8 stages of logic: 7 carry bypass chains and the propagate/generate logic in the first stage. The worst case sum output would go through 10 stages of logic: 6 up to the carry input of the last stage, and 4 in the last stage of the adder to generate the sums (assuming a ripple adder). By adjusting the stage size, the performance of the carry bypass adder can be made to exceed that of the carry select adder. However, the significant advantage of the carry bypass adder is that it does not need to generate two different sums at each stage and then multiplex the outputs, which creates a large savings in area.
Comparators
Binary magnitude comparators are another type of regular structure. It is desirable to have a comparator which is fully testable for dynamic faults and which can also be scaled to an arbitrary number of bits. Performance is also an important cost function when evaluating alternate comparator designs. In this section the testability of some typical comparator implementations and methods to cascade comparators are analyzed.
Ripple Comparator
An N-bit comparator can be constructed by simply connecting N 1-bit extensible comparators in series. An implementation of a 1-bit extensible comparator is shown in Figure 6. It has five inputs, two of which are the operands A and B, and the remaining three are the results of the comparison of the less significant bits. The three outputs indicate whether A is greater than, less than, or equal to B. The implementation shown in Figure 6 is fully testable for dynamic faults. It has 14 gates, 25 links, and 15 paths, all of which are robustly testable.
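The behaviour of such a stage and of the series connection can be modelled in a few lines. The sketch below is a behavioural illustration only: it does not reproduce the 14-gate netlist of Figure 6, and the function names, the least-significant-bit-first ordering, and the initial values fed to the first stage (greater = 0, less = 0, equal = 1) are assumptions chosen for the example.

```python
from itertools import product

def comparator_stage(a, b, gt_in, lt_in, eq_in):
    """Behavioural 1-bit extensible comparator.

    a and b are the operand bits of this stage; gt_in, lt_in and eq_in
    summarise the comparison of all less significant bits. The current
    bit decides the result unless the two operand bits are equal.
    """
    gt = int(a > b or (a == b and gt_in == 1))
    lt = int(a < b or (a == b and lt_in == 1))
    eq = int(a == b and eq_in == 1)
    return gt, lt, eq

def ripple_compare(a_bits, b_bits):
    """Cascade one stage per bit, least significant bit first."""
    gt, lt, eq = 0, 0, 1  # nothing compared yet, so the operands are "equal so far"
    for a, b in zip(a_bits, b_bits):
        gt, lt, eq = comparator_stage(a, b, gt, lt, eq)
    return gt, lt, eq

def to_int(bits):
    """Integer value of a little-endian bit sequence."""
    return sum(bit << i for i, bit in enumerate(bits))

def matches_integer_compare(n):
    """True if the cascade agrees with integer comparison for all n-bit operands."""
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            x, y = to_int(a), to_int(b)
            if ripple_compare(a, b) != (int(x > y), int(x < y), int(x == y)):
                return False
    return True

if __name__ == "__main__":
    print(matches_integer_compare(3))
```

In this model exactly one of the three outputs of every stage is asserted, so the three comparison inputs seen by the next stage can only take one-hot values.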
The 1-bit extensible comparator shown in Figure 6 can be cascaded to form an N-bit comparator by simply connecting the comparison outputs of the $i$th stage to the comparison inputs of the $(i+1)$th stage. A 3-bit ripple comparator is shown in Figure 7. Note that the first stage has been simplified by using the fact that $(A>B)_{in} = 0$, $(A<B)_{in} = 0$, and $(A=B)_{in} = 1$. The ripple comparator is fully testable for path delay faults since each stage is fully testable and the inputs that affect each output of a given stage are fully controllable. Note that the entire input set for a given stage is not fully controllable, since the inputs $(A>B)_{in}$, $(A<B)_{in}$, and $(A=B)_{in}$ for a given stage can only take on the values <100>, <010>, or <001>. However, the output $A > B$ of the $i$th stage only depends on the inputs $A_i$, $B_i$, and $(A>B)_{i-1}$. Thus for $(A>B)_i$ to be testable for path delay faults only requires that $A_i$, $B_i$, and $(A>B)_{i-1}$ be independently controllable and that the stage itself be fully testable. $(A>B)_{i-1}$ is independent of $A_i$ and $B_i$, so $A > B$ is fully testable. Similar arguments can be made for the outputs $A < B$ and $A = B$.

Each circuit $C_i$ can be broken up into P parallel circuits, each receiving I, ancl l I -1 k . Then, given Composition Rule 3.1, we have the above result.

Figure 8 shows a method for implementing a parallel comparator. The comparator in Figure 8 is a &bit comparator, and is composed of 3

144 paths, all of which are robustly testable.

each of the comparators in the input stage. These outputs are $A > B$, $A < B$, EQA, and EQB. Output $A > B$ is asserted whenever A is greater than B, and $A < B$ is asserted whenever A is less than B, for the bits of the operands which are inputs to the particular comparator. Output EQA is the minimal expression obtained from using $A = B$ as the ON-set and $A > B$ as the DC-set. Thus whenever EQA is asserted, A is either greater than or equal to B, but EQA $\neq$ ($A \geq B$).
Likewise, output EQB is the minimal expression obtained from using $A = B$ as the ON-set and $A < B$ as the DC-set. Thus whenever EQB is asserted, A is either less than or equal to B. Figure 10 shows the detailed implementation of the final comparison stage used in the parallel comparator of Figure 8. This logic takes as inputs the 4 outputs of each of the input stages and generates the final outputs of the comparator. The output $A > B$ is asserted whenever input $(A>B)_i$ and inputs $EQA_{i+1}$ through $EQA_{n-1}$ are asserted, $n$ being the number of input comparator stages, numbered from 0 to $n-1$. The EQA term can be considered the equivalent of the propagate terms in the carry lookahead adder (see Section 3.3). In order for an assertion of the

comparator has a delay which grows linearly with the number of bits in the comparator. It does, however, have a compact area. The alternate parallel comparator has a delay which is largely independent of the size of the comparator. However, it has a large area penalty which becomes increasingly worse for large comparators.
Parallel Comparator
If full robust path-delay-fault testability is not essential, a comparator can be constructed which is fully gate-delay and stuck-open fault testable, has an area approximately 30 percent less than that of the ripple comparator, and which has performance approaching that of the parallel comparator. A block diagram of this comparator is shown in Figure 11. It is implemented using a binary tree structure.

4-bit blocks can be replicated to form arbitrarily large, completely testable adders.
Shown that a carry bypass adder can be made fully path-delay-fault testable and extensible for any number of bits with negligible area and no performance overhead.
Designed a ripple comparator that is completely path-delay-fault testable and extensible to an arbitrary number of bits.
Developed two parallel comparator designs, the first of which is completely path-delay-fault testable and has negligible performance overhead, but a significant area overhead. The second has comparable area and performance characteristics to the traditional parallel comparator design, and is not completely path-delay-fault testable, but is fully gate-delay-fault and stuck-open fault testable.
Analyzed various realizations of parity generators and ALUs for dynamic fault testability.
Designed a completely path-delay-fault testable $n \times 2$ parallel multiplier, for arbitrary $n$, and a completely gate-delay-fault and stuck-open fault testable $n \times m$ parallel multiplier, for arbitrary $n$ and $m$.
In the process of design modification to produce fully testable structures, we have derived a number of new composition rules that maintain robust testability in dynamic fault models. These composition rules can be used to analyze and design other regular structures for robust dynamic testability and to compose regular structures with control sections to create register-bounded subcircuits that are robustly testable for all dynamic faults.
