The decoder is a well-known functional block. This paper presents a fast and efficient method for designing layouts of n-to-2^n-line decoders. Two scenarios of layout arrangement are proposed and described. A decoder of any size can be built from only a few specially prepared building blocks and an appropriate placement procedure. Layouts of all required fundamental blocks were designed in CMOS technology as a standard library. Moreover, important parameters such as area, power dissipation, and delay were assessed and compared for decoders designed with the proposed method and with the traditional one. Power consumption was considered under an extended model that takes into account changes of input vectors, not only the switching activity factor. All designs were done in UMC 180 nm CMOS technology.
I. INTRODUCTION
One of the important functional blocks in digital circuits is the decoder, generally n-to-m-line. Decoders can be found in any selecting circuit, such as multiplexers, address decoders, etc. They can also be used to realize logic functions. Address decoders in particular play an important role in memory design. Because of the large number of storage cells in today's memories, various address decoder designs have been proposed to reduce power consumption and increase performance. Usually, different kinds of precharged dynamic decoders are used [1], [2]. Some solutions use hierarchical decoders with predecoding. In [3], the authors implement a binary tree decoder built of demultiplexers.
The decoder design presented in this paper can be placed between hierarchical decoders and decoders with predecoding. The original idea is drawn from [4]. It is based on extending a decoder by adding further levels of AND gates and a 1-to-2-line decoder. The procedure starts with a 2-to-4-line decoder built of four AND gates and two 1-to-2-line decoders, each an ordinary inverter (see Fig. 1).
The schematic diagram of the 3-to-8-line decoder (Fig. 1) shows that it can be partitioned into blocks. Consequently, the size of the decoder can be extended easily.
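The extension procedure can be illustrated with a small behavioral model. This is a minimal sketch, not the authors' design flow: the function names `base_2to4`, `extend`, and `decode` are hypothetical, and the output ordering (new input selects the upper half) is an assumption made for the example.

```python
def base_2to4(a0, a1):
    """2-to-4-line decoder: four AND gates plus two inverters."""
    n0, n1 = 1 - a0, 1 - a1
    return [n1 & n0, n1 & a0, a1 & n0, a1 & a0]


def extend(outputs, a):
    """Add one level of AND gates driven by a new input a.

    The 1-to-2-line decoder is modeled as an inverter: the lower half of the
    new outputs is gated with NOT a, the upper half with a.
    """
    na = 1 - a
    return [q & na for q in outputs] + [q & a for q in outputs]


def decode(bits):
    """bits[0] is the least significant input; returns 2**len(bits) outputs."""
    outputs = base_2to4(bits[0], bits[1])
    for a in bits[2:]:
        outputs = extend(outputs, a)
    return outputs


# Input value 5 (a2=1, a1=0, a0=1) selects output Q5 of a 3-to-8-line decoder.
print(decode([1, 0, 1]))  # [0, 0, 0, 0, 0, 1, 0, 0]
```

Each call to `extend` doubles the number of outputs, which mirrors the block partitioning visible in the schematic.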
Implementing the decoder in CMOS technology poses a problem, because AND gates are not "natural" in CMOS: their realization needs two gates, NAND and NOT, connected in series. This increases the number of transistors, power consumption, and delay. The structure of the decoder therefore has to be implemented directly with NOT, NAND, and NOR gates only. The idea of designing decoder layouts as standard cells was first presented in [5]. The authors have since improved the previously presented solutions, and the results obtained are quite satisfactory. The most important improvements are a reduction of the decoder area and a small increase in speed.
II. DECODER BUILDING BLOCKS IMPLEMENTED IN CMOS

A. General Idea
The classical implementation of a decoding function consists in the direct realization of all minterms of the function as products with AND gates. For an n-input decoder, the standard products can be represented as follows:

$$m_i = l_0 l_1 l_2 \dots l_{n-1} \tag{1}$$

where $l_0, l_1, \dots, l_{n-1}$ are literals, i.e. direct or negated input variables. Based on De Morgan's laws, (1) can be transformed so that it describes the implementation of minterms in CMOS technology with two-input NAND and NOR gates. A product can thus be replaced by a negated sum as follows:

$$l_{n-2}\, l_{n-1} = \overline{\overline{l_{n-2}} + \overline{l_{n-1}}} \tag{2}$$
Starting from the last variable $l_{n-1}$ and using double negation and (2), we can progressively obtain an equation that describes the implementation of a standard product with two-input NAND and NOR gates alternately connected in series. The consecutive steps of converting the minterm equation to a NAND/NOR realization can be written as:

$$
\begin{aligned}
m_i &= l_0 l_1 l_2 \dots l_{n-1} = \overline{\overline{l_0 l_1 l_2 \dots l_{n-4}\, l_{n-3}\, l_{n-2}}}\; l_{n-1} = \\
&= \overline{\overline{l_0 l_1 l_2 \dots l_{n-4}\, l_{n-3}\, l_{n-2}} + \overline{l_{n-1}}} = \\
&= \overline{\overline{\overline{\overline{l_0 l_1 l_2 \dots l_{n-4}} + \overline{l_{n-3}}} \cdot l_{n-2}} + \overline{l_{n-1}}} = \dots
\end{aligned}
\tag{3}
$$
Finally, the last level of the designed decoder always consists of NOR gates, the previous one of NAND gates, and so on alternately. Depending on the number of variables (the decoder inputs), the first level is based on NAND or NOR gates; what matters is whether this number is even or odd. To explain these dependencies, three-, four-, and five-variable minterms are considered below.
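Applying (3) to n = 3, 4, and 5 gives forms like the ones below. They are reconstructed from (3) and are offered as a sketch of (4), (5), and (6) rather than a verbatim copy; the assignment of equation numbers to the three cases is an assumption.

$$
\begin{aligned}
m_i &= l_0 l_1 l_2 = \overline{\overline{l_0 l_1} + \overline{l_2}} && (4) \\
m_i &= l_0 l_1 l_2 l_3 = \overline{\overline{\overline{\overline{l_0} + \overline{l_1}} \cdot l_2} + \overline{l_3}} && (5) \\
m_i &= l_0 l_1 l_2 l_3 l_4 = \overline{\overline{\overline{\overline{l_0 l_1} + \overline{l_2}} \cdot l_3} + \overline{l_4}} && (6)
\end{aligned}
$$

In the three- and five-variable cases the first stage is a NAND of $l_0$ and $l_1$, whereas in the four-variable case it is a NOR of the negated literals, in line with the even/odd dependence noted above.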
The first stage is always a 2-to-4-line decoder realized with NAND or NOR gates, described by literals $l_0$ and $l_1$ in (4), (5), and (6). Considering the above equations for a whole decoder, it can be noticed that literals are negated for NOR operations but not for NAND operations. This leads to a different placement of inverters in blocks of NOR and NAND gates. For NANDs the inverter has to be used for the less significant half of the decoder outputs, whereas for NORs it is used for the more significant half. This procedure ensures logic "1" at the selected output. The dual procedure can be applied to obtain logic "0" at the selected output; in that case the last stage of the decoder is made of NAND gates.
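The alternating NAND/NOR chain can be checked exhaustively against the plain product of literals. The sketch below is an illustration only; the helper names are hypothetical, and it assumes at least two literals and a NOR gate as the last stage (logic "1" at the selected output).

```python
from itertools import product


def nand(a, b):
    return 1 - (a & b)


def nor(a, b):
    return 1 - (a | b)


def inv(a):
    return 1 - a


def product_nor_last(lits):
    """Product of literals realized with a NOR as the final gate."""
    if len(lits) == 2:
        # First stage built of NOR gates: both literals are negated.
        return nor(inv(lits[0]), inv(lits[1]))
    # NOR stage: the new literal enters negated.
    return nor(negated_product_nand_last(lits[:-1]), inv(lits[-1]))


def negated_product_nand_last(lits):
    """Negated product of literals realized with a NAND as the final gate."""
    if len(lits) == 2:
        # First stage built of NAND gates: literals are used directly.
        return nand(lits[0], lits[1])
    # NAND stage: the new literal enters directly.
    return nand(product_nor_last(lits[:-1]), lits[-1])


def first_stage_gate(n):
    """Gate type of the first stage when the last stage is NOR."""
    return "NAND" if n % 2 == 1 else "NOR"


def chain_matches_product(n):
    """True if the chain equals the AND of all n literals for every input."""
    return all(
        product_nor_last(list(v)) == int(all(v))
        for v in product((0, 1), repeat=n)
    )


for n in (3, 4, 5):
    print(n, first_stage_gate(n), chain_matches_product(n))
```

For n = 3 the first stage comes out as NAND and for n = 4 as NOR, which agrees with the BASE NAND and BASE NOR blocks used in the 3-to-8-line and 4-to-16-line examples.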
The next subsection presents a detailed description of the building blocks with their schematic diagrams; two scenarios of interconnection are shown further on.
B. Schematic Diagrams
Based on the derivations presented above, all blocks needed to create a decoder of any size can be designed. The following blocks were developed. The blocks of the first level are BASE NOR and BASE NAND. These are 2-to-4-line decoders (Fig. 2). The next blocks are NORS and NANDS, which consist of four NOR or NAND gates, respectively (Fig. 3). The last ones are NNORS and NNANDS, consisting of four appropriate gates with an inverter. These blocks are used in the next levels. Their schematic diagrams are presented in Fig. 4.
With the blocks described above and the previously presented principles of their placement and connection, designing a decoder of any size at the schematic level is very simple.
It can be observed from Fig. 1 that for the third and subsequent stages of the decoder one inverter is enough for the upper or lower half of the gates. In the first scenario of block connections, however, the authors used the blocks presented in Fig. 4. This was dictated by the equalization of block dimensions and the simplification of interconnections. Additionally, the number of prepared blocks was kept as small as possible, and an easy way of placing them was developed. The second scenario of block routing is based on the same building blocks and new connection cells. Moreover, the previously mentioned inverter is placed only once per stage of a decoder. Even though the logic diagrams are the same, this required designing new layouts for the blocks consisting of NAND and NOR gates. The differences are explained in detail in the following subsection together with the layout descriptions.
C. Layouts
An important stage of the work was preparing the layouts of the building blocks so as to obtain an easy and flexible way of creating decoders of any size. Besides these functional blocks, which contain gates, auxiliary blocks with connection lines are needed too. The layouts of the connection blocks should be arranged in such a way that building any decoder is natural, without additional complex design rules. They should be universal, and their number has to be as small as possible. Because connections in bigger decoders can take much more space than functional blocks, the second target is to minimize the area of the connection cells, even at the cost of more complex layouts. The authors therefore prepared two versions of connection cells.
Layouts of all building blocks were designed in UMC 180 nm CMOS technology in the Cadence environment. Figure 5 shows the first functional blocks, BASE NOR and BASE NAND, which are used in both versions of decoder design. Figure 6 shows the remaining functional blocks designed in the first version: NORS1, NANDS1, NNORS1, and NNANDS1.
In the second scenario of decoder design the same BASE NAND and BASE NOR blocks are used at the inputs, but the remaining functional blocks are slightly different. Again, two blocks with four NAND gates and two blocks with four NOR gates were designed, but two lines for input signal A run through them. NANDS2 and NORS2 use the direct input signal (A), whereas NNORS2 and NNANDS2 use the negated signal (NA), as shown in Fig. 7. Line A is made in the metal4 layer and line NA in metal1, so line A is placed directly above line NA, which is connected to the gate inputs in the NNORS2 and NNANDS2 cells. In the NORS2 and NANDS2 cells the NA line has to bypass the vias from the A line to the gate inputs, made with metal1. Additionally, the inverter from these blocks was moved to the supply cell.
The connection blocks designed for the first version of decoders are shown in Fig. 8. These blocks consist of metal1 and metal2 lines and vias between them. The cells are named after the shapes of their metal lines: Z=1, ZH1, ZII1, ZL1, ZY1 ("1" indicates the first version). The connection blocks in the second version have a reduced area thanks to the use of four metal lines stacked one above another. Layers metal2 to metal5 form the vertical lines and metal1 the horizontal ones. Layouts of all cells are presented in Fig. 9a, and Fig. 9b shows a cross-section of one cell. This solution reduces the area, but the outputs of the decoder are not in increasing order.
Two other cells, containing metal lines only, are used to form supply lines for the blocks that include gates. They are named S c1 and S g1 (Fig. 10). In the second version the inverter from NNORS2 and NNANDS2 is placed in the supply cell and feeds the negated line NA (Fig. 11). The supply cells create the supply lines for all stages of the decoder.
The blocks described above compose a complete library for building decoders. To use the library efficiently, appropriate rules for placing the cells are needed. Generally, the rules come from the theoretical analysis presented in Section II-A. Details are explained in the next section based on examples of decoder designs.
III. LAYOUTS OF EXAMPLE DECODERS
Three example decoders were built with the prepared library of cells for universal design of n-to-2^n-line decoders. The examples are used to explain the placement of cells. Figure 12 shows the layout of a 3-to-8-line decoder designed with cells in the first version. The decoder is built of two stages. The last level consists of NOR gates arranged in two blocks. The NORS1 block is used for the lower outputs (here Q0 to Q3) and the NNORS1 block for the upper outputs (Q4 to Q7). The BASE NAND block is used at the first stage. Only ZY1 and ZL1 cells are used to connect the stages. Cells S c1 and S g1 provide the supply. Figure 13 shows the same decoder made with the corresponding cells designed in the second version. It can be seen that its outputs are not arranged in ascending order. The reason is the design of connection cells ZL2 and ZLY2, as shown in Fig. 9.
This is the simplest example, so only two connection blocks are enough. For larger decoders the realization of connections is a little more difficult, as illustrated by the following examples.
Let us consider the 4-to-16-line decoder. Its schematic diagram divided into blocks is shown in Fig. 14. As in the previous decoder, the last stage consists of NOR gates. Now, because of the number of decoder outputs, two NORS and two NNORS blocks have to be used to build this decoder. The previous stage consists of NAND gates. For these gates, NNANDS blocks are used for the lower outputs, whereas NANDS blocks have to be used for the higher ones; see (3). This is the inverse of the arrangement for blocks with NOR gates. Finally, the BASE NOR block is placed at the first level. Interconnections between the first and second levels are the same as in the 3-to-8-line decoder, but between the next stages (2nd and 3rd) all designed connection blocks are used. The layout of the decoder designed in the first version is shown in Fig. 15. The shape of the lines corresponds to the schematic diagram (Fig. 14). For comparison, Fig. 16 shows the same decoder made in the second version. The schematic diagram (Fig. 14) does not correspond exactly to the layout in the second version, because there are no inverters in NNORS2 and NNANDS2.
The method of placing connection cells is better explained using a larger design. Figure 17 therefore presents the layout of a 5-to-32-line decoder with connection cells marked in colors. The placing method can be described by observing the interconnections between the last two stages of gates. Cells ZY and ZL are placed only on the diagonals running from top right to bottom left. Other appropriate cells are placed above and below the diagonals, forming specific triangles. Above the ZY diagonal, cells with parallel horizontal lines (Z=) are placed. Under that diagonal cross connections are needed, so ZH cells are used. In the bottom part, with the ZH diagonal, parallel vertical connections (ZII) are used above the diagonal, and horizontal lines are provided by Z= cells below it. This connection scheme is also used in larger decoders; only the number of cells increases exponentially, according to the number of outputs.
IV. COMPARATIVE ANALYSIS OF DECODERS
Three decoders designed with the proposed method in two versions were compared with decoders designed in the traditional style, with multi-input gates. An example of such a decoder, a 3-to-8-line one, is shown in Fig. 18. Netlists with parasitic RC elements were extracted from the layouts of the decoders and simulated to assess their energy and time parameters. The area of the layouts and the number of transistors were compared as well.
A. Energy Parameters
To improve the accuracy of power estimation, the authors used an extended power model based on the gate driving way [6]. This power model takes changes of input vectors into account. Inputs of a circuit are driven together (as a vector of primary inputs), not separately. The parameter describing the circuit activity is called the probability of a gate driving way. Generally, this is the probability of a change of the gate input vector between two values [7]. For a considered gate, all possible changes of input vectors have to be taken into account. For an n-input gate this gives $2^{2n}$ changes. Figure 19 shows all driving ways for a 2-input gate.
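The count of driving ways follows from pairing every input vector with every input vector. A minimal sketch of the enumeration is given below; the function name `driving_ways` is hypothetical, and the list is assumed to include the pairs in which the vector does not change, since only then does the total equal 2^(2n).

```python
from itertools import product


def driving_ways(n):
    """All (previous vector, next vector) pairs for an n-input gate."""
    vectors = list(product((0, 1), repeat=n))
    return [(old, new) for old in vectors for new in vectors]


ways = driving_ways(2)
print(len(ways))  # 16, i.e. 2**(2*2)
print(ways[1])    # ((0, 0), (0, 1)): input vector changes from 00 to 01
```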
Currents flowing through the gate terminals are measured for each change of the gate input vector. Inputs and the supply terminal are taken into account. These values and the value of the supply voltage are used to calculate the equivalent capacitances. An equivalent capacitance thus represents the portion of charge flowing through a terminal of the gate under a specific input vector change. Considering all possible changes of input vectors, tables similar to that shown in Fig. 19 are obtained for each terminal of the gate. Generally, the model of power consumption can be represented as shown in Fig. 20, with a set of tables containing the energy parameter (equivalent capacitance) for each terminal of the gate.
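One way to turn a simulated terminal current into an equivalent capacitance is to integrate the current over the transition and divide the resulting charge by the supply voltage. The sketch below assumes exactly that relation, C_eq = Q / V_DD; the function name, the trapezoidal integration, the sample waveform, and the 1.8 V supply value are all assumptions made for illustration and are not taken from the paper.

```python
def equivalent_capacitance(times, currents, vdd):
    """Charge through a terminal (trapezoidal integral of i(t)) divided by vdd.

    times    -- sample instants in seconds, increasing
    currents -- terminal current samples in amperes, same length as times
    vdd      -- supply voltage in volts
    """
    charge = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        charge += 0.5 * (currents[k] + currents[k - 1]) * dt
    return charge / vdd


# Hypothetical triangular current pulse: 1.8 uA peak, 2 ns wide.
times = [0.0, 1e-9, 2e-9]
currents = [0.0, 1.8e-6, 0.0]
c_eq = equivalent_capacitance(times, currents, vdd=1.8)
print(c_eq)  # about 1e-15 F, i.e. 1 fF
```

Repeating this for every driving way and every terminal would fill one table per terminal, in the manner described above.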
Decoders designed with the proposed method (with blocks) and in the traditional way (with multi-input gates) were simulated under conditions that allow the equivalent capacitances for the extended power model described above to be obtained. The input driving signals had parameters such that only dynamic, capacitive power consumption occurred in the tested circuits. There was no quasi-short-circuit current. Static losses in the technology used (180 nm) can be neglected.
The results of the energy parameter assessment for the considered decoders are collected in tables and additionally compared in graphs. As an example, for the smallest decoder only, the tables below contain the values of internal load and all equivalent capacitances. Results for the block decoder designed with the first version of cells are collected in Tab. I and Tab. II. Equivalent capacitances for the decoder designed in the second version are shown in Tab. III and Tab. IV. The last two tables contain values for the traditionally designed decoder. The internal capacitance ($C_{Lint}$) represents energy losses related to the internal load of the decoder. The capacitance $C_{Lall}$ is the sum of all capacitance values for the corresponding driving way ($C_{Lint}$ and $C_{In\,x}$ of all inputs). For example, the input capacitance of the block decoder (version one) can easily be calculated from Tab. I and Tab. II. Values of equivalent capacitance versus driving way are useful for power consumption estimation, but comparing decoders with them is arduous. For easier assessment of the decoders, the above values can be summed. Table VII presents the summed values. It can be seen that for larger decoders the lowest power consumption is obtained for block decoders designed in the second scenario.
For better analysis, values of equivalent capacitance can be collected for the same driving ways of all considered decoders. Such bar graphs are presented in Fig. 21. In this way the energy parameters of the decoders can easily be compared for specific driving ways.
Detailed analysis of such graphs allows a designer to choose the decoder characterized by lower power consumption. However, the probabilities of input vector changes are needed.
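Once the probabilities of the driving ways are known, they can be combined with the equivalent capacitances into a single power figure. The sketch below assumes the common dynamic-power relation P = f · V_DD² · Σ p_k · C_k, where p_k is the probability of driving way k per clock cycle and C_k its equivalent capacitance. This formula, the function name, and all numbers are assumptions for illustration, not values or definitions from the paper.

```python
def average_power(c_eq, prob, vdd, freq):
    """Probability-weighted dynamic power estimate in watts.

    c_eq -- dict: driving way -> equivalent capacitance in farads
    prob -- dict: driving way -> probability of that way per clock cycle
    vdd  -- supply voltage in volts
    freq -- clock frequency in hertz
    """
    weighted_c = sum(p * c_eq[way] for way, p in prob.items())
    return freq * vdd ** 2 * weighted_c


# Hypothetical data for two driving ways of one circuit.
c_eq = {("00", "01"): 10e-15, ("01", "00"): 2e-15}
prob = {("00", "01"): 0.5, ("01", "00"): 0.5}
print(average_power(c_eq, prob, vdd=1.8, freq=100e6))
```

Evaluating the same probability table against the capacitance tables of each candidate decoder gives directly comparable estimates for a given input pattern.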
B. Time Parameters
The second analysis of the designed circuits was the assessment of delay times. Based on simulations, $t_{pHL}$ and $t_{pLH}$ were measured for the critical paths. The decoders were loaded with a capacitance of 10 fF, which is approximately equivalent to three inverters. The rise and fall times of the input signal were equal to 100 ps. Values of the worst delay times for the designed decoders are collected in Tab. IX. The average delay time is denoted by $t_p$. It can be seen that for bigger decoders the block decoders are faster than the traditional ones. The best results are obtained for decoders designed in version two. Block decoders have a multi-stage structure, so it would seem that they should be slower. However, traditional decoders have long lines at the gate inputs, which increase input capacitances. This is also observable in the results of the energy parameter assessment.
C. Area of Decoders
Another important parameter of integrated circuits is the layout area. Table X includes the dimensions of the designed decoders. The proposed method of decoder design results in an almost triangular layout shape, but the table contains the width and height of the circuits. In the case of the first version, some part of the area, approximately one third of the whole, can therefore be used for other circuits. Using cells designed in the second version decreases the dimensions, and for the 5-to-32-line decoder the area is reduced to 52% of that of the first version. Sometimes the area of integrated circuits is represented by the number of transistors. In the case of decoders, however, most of the area is occupied by connection lines, and for this reason the authors proposed the second version of cells for building block decoders. Table VII shows the number of transistors in traditional and block decoders versus the number of their inputs.
The above table shows that, when 2-input gates are used for decoder design, the total number of transistors increases more slowly for block decoders. For the 8-input decoder, the number of transistors in the traditional decoder is double that in the block decoder.
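The trend can be illustrated with a rough transistor count. The sketch below rests on assumptions that are not stated in the paper: static CMOS gates with 2 transistors per input, an inverter of 2 transistors, a traditional decoder built of 2^n n-input AND gates (NAND plus inverter) with n input inverters, and a block decoder with a 2-to-4 base stage followed by one level of two-input gates and a single inverter per added input. The resulting numbers therefore need not match the table.

```python
def traditional_transistors(n):
    """2**n n-input AND gates (NAND + inverter) plus n input inverters."""
    and_gate = 2 * n + 2
    return (2 ** n) * and_gate + 2 * n


def block_transistors(n):
    """2-to-4 base stage, then one level of two-input gates per extra input."""
    total = 4 * 4 + 2 * 2              # four 2-input gates and two inverters
    for k in range(3, n + 1):
        total += (2 ** k) * 4 + 2      # 2**k two-input gates, one inverter
    return total


for n in (3, 4, 5, 8):
    print(n, traditional_transistors(n), block_transistors(n))
# For n = 8 this prints 4624 and 2048, a ratio of roughly two.
```

Under these assumptions the per-output cost of the traditional decoder grows with n, while every added level of the block decoder costs a fixed four transistors per output, which is why the gap widens with the number of inputs.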
V. CONCLUSION
A universal method of decoder design was presented in this paper. A suitable library consisting of the cells needed for easy construction of decoders was prepared. Two versions of cells were developed and designed. The method can easily be automated and allows synthesis of decoders of any size.
Three decoders were designed with the proposed and traditional methods, and their parameters were assessed. Power consumption, time delay, and area were considered. The results show that decoders built with blocks have better parameters in almost all cases. The best parameters are obtained for the second version of block decoders. Thanks to the reduction of dimensions, parasitic capacitance was reduced and, in consequence, better performance of the decoders was reached. The proposed method makes decoder design easy and fast, and the resulting decoders have better parameters than traditional ones. Power consumption was considered using the extended model. This approach allows detailed analysis of the designed circuits through exploration of input vectors and, in consequence, selection of the best solution for given conditions.
