Abstract-Two novel memory decoder designs for reducing energy consumption and delay are presented in this paper. These two decoding schemes are compared to the conventional NOR decoder. The proposed schemes charge and discharge fewer word lines, which leads to less energy dissipation. Energy, delay, and area calculations are provided for all three designs under analysis. The two novel decoder schemes dissipate between 3.9% and 23.6% of the energy dissipated by the conventional decoder. The delays of these designs are 80.8% of the conventional decoder delay. Simulations of the three decoders are performed using a 90nm CMOS technology.
I. INTRODUCTION
Memory modules are widely used in most digital and computer systems. In the large majority of the systems where memory is used, it is accessed at least once at every clock cycle. In microprocessors, the clock network power consumption due to memory devices (i.e. cache memory, register, and pipeline registers) accounts for 51% of the total power [1]. This in turn has generated great interest in studying schemes and methods to reduce power in memories. Schemes used to reduce power consumption range from modifying the basic six-transistor SRAM cell [2] to incorporating sense-amplifiers into the bit lines [3]. However, not much research has been done on the design of address decoders, which require precharging of all the decoding lines.
In this paper we propose novel schemes to selectively precharge fewer word lines and to avoid connections between VDD and ground (through p- and n-type transistors). The organization of this paper is as follows. We present the designs of the conventional, the proposed AND-NOR, and the proposed Sense-Amp 4-to-16 memory decoders in Section II. In Section III we analyze the energy performance of all three decoding schemes. Section IV compares the performance of these schemes, followed by a few concluding remarks in Section V.
II. DECODING SCHEME DESIGNS
This section presents the design schematics and the input-output behavior for the conventional, AND-NOR, and Sense-Amp decoding schemes. Afterwards, the performance of each individual scheme is presented and analyzed.
A. Conventional Decoder
The most commonly used NOR decoder requires precharging of the word lines [4]. This conventional decoder, shown in Fig. 1, precharges all of the lines and then discharges all of them except for the selected word line. This leads to high power consumption, as all but one word line are needlessly charged and discharged.
This decoder operates on a precharge-evaluate cycle and uses a positive precharge signal. In the first half of the period, the decoder precharges all of the word lines. In the second half of the period, the selected word line remains charged while the decoder discharges all unselected word lines. An example of the operation of this decoder is provided in Fig. 2. The simulation shows the word line being charged and selected in the first cycle, deselected and discharged in the second cycle, and needlessly precharged and discharged in the third cycle, when the word line is not selected.
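The precharge-evaluate behavior described above can be illustrated with a small behavioral model. This is only a logic-level sketch, not the transistor-level circuit of Fig. 1; the function name `conventional_cycle` and its return format are assumptions made for illustration.

```python
def conventional_cycle(address, n_bits=4):
    """Behavioral model of one precharge-evaluate cycle (illustrative only).

    Returns the sets of word lines that are precharged, left selected,
    and discharged during the cycle.
    """
    n_lines = 1 << n_bits
    if not 0 <= address < n_lines:
        raise ValueError("address out of range")
    # Precharge phase: every word line is charged.
    precharged = set(range(n_lines))
    # Evaluate phase: only the addressed word line stays charged.
    selected = {address}
    discharged = precharged - selected
    return precharged, selected, discharged


pre, sel, dis = conventional_cycle(0b1010)
print(len(pre), len(dis))  # 16 15
```

For a 4-to-16 decoder, the model makes the overhead explicit: 16 lines are charged in every cycle and 15 of them are discharged again without ever being used.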
B. Proposed AND-NOR Decoder
In our first approach only one-quarter of the word lines charge and discharge each cycle. We use the most significant bits of the address to select the portion of the decoder that will be charged. The basic form of the AND-NOR decoding scheme is shown in Fig. 3. This decoder consumes less power than the conventional decoder because fewer word lines are charged and discharged.
This decoder operates on the same precharge-evaluate cycle as the conventional decoder and also uses a positive precharge signal. A simulation of the AND-NOR decoder's operation is provided in Fig. 4. In the first cycle of the simulation, the word line is charged and selected. During the second cycle, the word line is discharged and deselected. In the third cycle, the word line is charged and discharged because it has the same most-significant bits (MSBs) as the selected word line. Lastly, in the fourth cycle, the word line is not charged because its MSBs differ from those of the address. This is the case in which energy is saved, since unnecessary charging and discharging of unselected word lines is avoided.
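The selective precharging described above can be sketched at the logic level as follows. The sketch assumes that the two most-significant address bits pick one group of four word lines, which is consistent with one-quarter of the 16 lines being charged; the function name `and_nor_cycle` and the parameter `msb_bits` are hypothetical and do not come from the schematic in Fig. 3.

```python
def and_nor_cycle(address, n_bits=4, msb_bits=2):
    """Behavioral model of selective precharge (illustrative only).

    Assumption: the top `msb_bits` address bits select which group of
    word lines is precharged; all other groups stay low.
    """
    n_lines = 1 << n_bits
    if not 0 <= address < n_lines:
        raise ValueError("address out of range")
    lsb_bits = n_bits - msb_bits
    group = address >> lsb_bits
    # Precharge phase: only lines whose MSBs match the address are charged.
    precharged = {a for a in range(n_lines) if a >> lsb_bits == group}
    # Evaluate phase: the addressed line stays high, the rest discharge.
    selected = {address}
    discharged = precharged - selected
    return precharged, selected, discharged


pre, sel, dis = and_nor_cycle(0b1010)
print(sorted(pre), sorted(dis))  # [8, 9, 10, 11] [8, 9, 11]
```

Under this assumption, only 4 lines are charged and 3 discharged per cycle, compared with 16 and 15 for a decoder that precharges every line.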
C. Proposed Sense-Amp Decoder
The second scheme that we have designed senses when the line is being charged and allows only the selected line to be fully charged. This scheme is shown in Fig. 5. As with the AND-NOR decoder, only one-quarter of the word lines will charge. However, the word lines need not be fully charged, since the sense-amplifier on the selected word line will charge the line up to VDD. Therefore this decoder can have a shorter precharge pulse and will consume less power than the AND-NOR scheme, since only the selected word line is fully precharged.
The Sense-Amp decoder operates on a discharge-precharge-evaluate cycle and uses a negative precharge signal. In the first third of the cycle all word lines are discharged. In the second third of the cycle one-quarter of the word lines are precharged. During the final third of the cycle the sense-amplifier on the selected word line charges the selected line up to VDD while the other charged, unselected lines discharge. A diagram of the operation of this decoder is provided in Fig. 6. During the first cycle of the simulation the word line is charged and selected. In the second cycle of the simulation the word line is discharged and deselected. The third cycle illustrates the word line being charged and discharged when it has the same MSBs as the accessed address. The fourth cycle shows the word line not being charged when its MSBs differ from those of the selected word line.
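The three phases of this cycle can be sketched with idealized word-line voltages. The model below is an illustration under stated assumptions: the supply is taken as 1.1 V (the logic-one level reported in Section III), the partial precharge level `PARTIAL` is an arbitrary placeholder rather than a measured value, and the grouping by two MSBs is assumed from the one-quarter figure.

```python
VDD = 1.1            # supply voltage in volts, as reported in Section III
PARTIAL = 0.5 * VDD  # hypothetical partial precharge level, not a measured value


def sense_amp_cycle(address, n_bits=4, msb_bits=2):
    """Idealized word-line voltages for each third of the cycle (illustrative)."""
    n_lines = 1 << n_bits
    if not 0 <= address < n_lines:
        raise ValueError("address out of range")
    lsb_bits = n_bits - msb_bits
    group = address >> lsb_bits
    # First third: all word lines are discharged.
    discharge = [0.0] * n_lines
    # Second third: the quarter matching the MSBs is partially precharged.
    precharge = [PARTIAL if a >> lsb_bits == group else 0.0
                 for a in range(n_lines)]
    # Final third: the sense-amplifier restores the selected line to VDD,
    # and the other partially charged lines discharge.
    evaluate = [VDD if a == address else 0.0 for a in range(n_lines)]
    return discharge, precharge, evaluate


d, p, e = sense_amp_cycle(0b1010)
print(sum(1 for v in p if v > 0), e[0b1010])  # 4 1.1
```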
III. DECODING SCHEMES ENERGY PERFORMANCE
In this section energy measurements of the conventional and the two proposed decoders are presented. The contribution of different word lines to energy consumption is provided for each decoder.
A. Measurement Techniques
The Cadence Virtuoso Schematic Editor with 90nm CMOS technology was used to create the conventional, AND-NOR, and Sense-Amp 4-to-16 decoders. In order to simulate loading on the word lines, inverters with four times the minimum p- and n-type transistor widths were added on each word line. This provides loading comparable to driving a pipeline register, as used in high-performance pipelined memories [5, 6].
Circuit simulations were performed using Spectre. The energy dissipation of a word line was calculated by integrating the instantaneous power over one period of the decoder's cycle; the conventional and AND-NOR decoders use a precharge-evaluate cycle while the Sense-Amp decoder uses a discharge-precharge-evaluate cycle. The average word line energy dissipation of each decoder was then calculated for different address transitions.
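As an illustration of this kind of measurement, the energy over one period can be approximated from sampled waveforms by integrating the product of voltage and current. The sketch below uses a plain trapezoidal rule; it is not the Spectre calculation used for the reported results, and the function name and the sample values are hypothetical.

```python
def energy_over_period(times, voltages, currents):
    """Trapezoidal estimate of the energy (J) from sampled v(t) and i(t).

    Illustrative only: the samples are assumed to span exactly one
    period of the decoder's cycle.
    """
    if not (len(times) == len(voltages) == len(currents)):
        raise ValueError("waveforms must have the same length")
    energy = 0.0
    for k in range(1, len(times)):
        p_prev = voltages[k - 1] * currents[k - 1]  # instantaneous power, W
        p_curr = voltages[k] * currents[k]
        energy += 0.5 * (p_prev + p_curr) * (times[k] - times[k - 1])
    return energy


# Hypothetical waveform: a constant 1 mA drawn from 1.1 V for 260 ps.
t = [0.0, 130e-12, 260e-12]
v = [1.1, 1.1, 1.1]
i = [1e-3, 1e-3, 1e-3]
print(energy_over_period(t, v, i))  # about 2.86e-13 J
```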
B. Performance Results
The results of the energy dissipation calculations are provided in Section IV. The performance of each decoder is more closely analyzed in the following three subsections.
1) Conventional Decoder Performance: The conventional decoder was found to have the longest delay of the three decoders analyzed. A 260ps period (3.846 GHz) precharge signal was used to test the decoder, and logic zeros of approximately 0V and logic ones of approximately 1.1V (VDD) were obtained.
Equations (1) and (2) summarize the energy distributions of the conventional 4-to-16 decoder: (1) yields the energy dissipation when a word line is reselected by the decoder and (2) characterizes the energy dissipation when a different word line is selected and the previously selected word line must be discharged. The selected word line (E_s), reselected word line (E_rs), and discharged word line (E_dchg) dissipate the most energy of the decoder's word lines. However, unselected word lines (E_ns) also dissipate much energy since they are needlessly charged and discharged during each cycle. It is observed that the conventional decoder performs fairly uniformly for the variety of address transitions tested.
2) Proposed AND-NOR Decoder Performance: The proposed AND-NOR decoder has a delay of 210ps; a 4.76 GHz precharge signal was used to test the decoder. Equations (3)-(5) summarize the energy distributions of the AND-NOR 4-to-16 decoder: (3) characterizes when a word line is reselected by the decoder, (4) describes when a different word line with equal MSBs is selected and the previously selected word line must be discharged, and (5) summarizes when a different word line with unequal MSBs is selected and the previously selected word line must be discharged. The selected word line (E_s), reselected word line (E_rs), and discharged word line (E_dchg) dissipate the most energy of the decoder's word lines. Unselected word lines with equal MSBs (E_nsMSB) dissipate more energy than unselected word lines with unequal MSBs (E_ns). The AND-NOR decoder performs best when the address is held constant; energy dissipation increases as more address bits change.
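The quoted frequencies and the delay ratio given in the abstract follow directly from the precharge periods. The short check below assumes, as the text does, that the delay is taken to be the period of the precharge signal; the helper name `frequency_ghz` is introduced only for illustration.

```python
def frequency_ghz(period_ps):
    """Convert a period in picoseconds to a frequency in gigahertz."""
    return 1e3 / period_ps


conventional_ps = 260.0  # period used for the conventional decoder
proposed_ps = 210.0      # period used for the AND-NOR decoder

print(round(frequency_ghz(conventional_ps), 3))           # 3.846
print(round(frequency_ghz(proposed_ps), 2))               # 4.76
print(round(100.0 * proposed_ps / conventional_ps, 1))    # 80.8
```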
3) Proposed Sense-Amp Decoder Performance: The proposed Sense-Amp decoder has a delay of 210ps; it was operated at 4.76 GHz using precharge and discharge signals. Equations (6)-(9) summarize the energy distributions of the Sense-Amp 4-to-16 decoder: (6) describes when a word line is reselected by the decoder, (7) characterizes when a different word line with equal MSBs is selected and the previously selected word line must be discharged, (8) summarizes when a different word line with equal least-significant bits (LSBs) is selected and the previously selected word line must be discharged, and (9)
We see that the Sense-Amp decoder performs best when the address is held constant or changes in the LSBs occur. When changes in the MSBs occur, energy dissipation increases, especially in the case of the negative transition 1111 to 0000. Table I shows the energy consumption per cycle and the energy-delay products for the three approaches. It is clear that the new approaches have lower energy consumption per word line than the conventional decoder. We also see that the proposed methods benefit from the principle of locality; the decoders dissipate less energy due to changes in the address's LSBs than changes in the MSBs of the address.
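A rough feel for the energy-delay comparison can be obtained from the ratios stated in the abstract alone. The sketch below multiplies the relative energy (3.9% and 23.6%) by the relative delay (80.8%) to give an energy-delay product relative to the conventional decoder. These are derived illustrations under that simple assumption, not the entries of Table I, and the function name is hypothetical.

```python
def relative_edp(energy_ratio, delay_ratio):
    """Energy-delay product relative to a baseline whose EDP is 1.0."""
    return energy_ratio * delay_ratio


DELAY_RATIO = 0.808  # proposed delay / conventional delay (from the abstract)

sense_amp = relative_edp(0.039, DELAY_RATIO)  # Sense-Amp: 3.9% of the energy
and_nor = relative_edp(0.236, DELAY_RATIO)    # AND-NOR: 23.6% of the energy

print(f"{sense_amp:.4f} {and_nor:.4f}")  # 0.0315 0.1907
```

Under this assumption the energy-delay products come out at roughly 3% and 19% of the conventional value, so the energy reduction dominates the improvement.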
IV. ANALYSIS AND DISCUSSION
Transistor counts for the three 4-to-16 decoders are provided in Table II. This table highlights transistor counts for particular segments of the decoders and provides a total transistor count. The segments represent the transistors used in the precharge pull-up, discharge pull-down, column NOR gates used to control pull-up or pull-down transistors, column NAND gates used in the AND-NOR decoder to control a pull-down transistor based on the MSBs, and the sense-amplifier for the Sense-Amp decoder. A summary of the energy data and transistor counts is provided in Table III.
TABLE I. DECODER ENERGIES AND ENERGY-DELAY PRODUCTS
TABLE II. DECODER TRANSISTOR COUNTS
TABLE III. ENERGY, DELAY, AND TRANSISTOR COUNT SUMMARY
V. CONCLUDING REMARKS
This paper has presented two novel decoding schemes which demonstrate less delay and less energy dissipation than the conventional decoder. The two schemes compare with the conventional decoder as follows.
• Energy consumption ranges between 3.9% and 23.6% (Sense-Amp and AND-NOR decoders, respectively) of the energy consumption of the conventional decoder. The proposed schemes make use of selective precharging which precharges only a quarter of the word lines and saves energy.
• Delay is 80.8% of the delay of the conventional decoder. Word lines of the proposed schemes can be charged faster during precharging, which enables these decoders to be operated at higher frequencies.
The energy and delay savings do not come without tradeoffs, though. The conventional decoder is a very simple implementation which uses fewer transistors than the proposed methods. For systems that cannot devote much area to memory decoding, the conventional decoder may be the better choice. However, the proposed decoding schemes are good choices for high-speed and low-power designs due to the greatly decreased energy dissipation and reduced delay.
