In this paper, we propose a 'conforming inverted data store' scheme for reducing the power consumption of memory components. The scheme reduces power consumption by conforming memory contents to the precharging value of the memory: it selectively stores normal or inverted data so as to reduce the total number of accessed bits that differ from the precharging value. In this way, bitline toggling during memory access is minimized, which ultimately reduces power consumption. We develop two practical implementations of the proposed method, namely the vertical strip and horizontal strip inversion schemes. Simulation results indicate that the strip-based inversion schemes yield a power reduction of up to 50%.
Introduction
The investigation of power consumption in individual sub-blocks of processors has demonstrated that memory components consume the greatest percentage of the total processor power [1, 2]. The StrongARM processor, one of the leading low-power processors, dissipates over 40% of its total power in cache memory. Similarly, an x86-compatible CISC processor core called ACCENT, which has no internal cache, consumes 35% of its total power in the μ-code ROM, which occupies 8% of the whole chip area. The significant power consumption and large chip area of memory components warrant the application of design methods that limit their power usage. In addition, the development of integrated systems based on Memory Merged Logic (MML) technology indicates that the performance and power consumption of single-chip systems will be increasingly determined by memory.
Architecture design, circuit and logic design, process development, and CAD algorithms have all been researched as means of reducing the power consumption of microprocessors. For memory devices, this work has focused predominantly on circuit and architecture design. More specifically, the work reported in the literature on low-power implementation of memory includes special cell structures [3, 4], charge recycling [5], limited bitline swing [6], and selected block activation [7]. However, all of these schemes reduce power only from the physical point of view. Whilst they have demonstrated a correlation between physical memory cell design and power consumption, the relationship between memory content and power consumption has not been addressed.
In this paper, we establish a relationship between memory contents and power consumption of the memory. We propose a novel concept, 'conforming inverted data store', which can be used to alter the memory contents and minimize the power consumption by memory components.
2
To achieve high-speed memory access, most memories utilize sense-amplifying schemes with precharging. In large memory arrays, the majority of power is consumed in driving the high capacitance of the bitlines and wordlines. Moreover, it is observed that the power consumed by a memory component is dominated by the power required to drive the bitlines [6].
With the precharging mechanism, the 'conforming inverted data store' provides efficient power reduction by decreasing the number of bitline toggles. The key idea of the scheme is to make the majority of the memory contents conform to the preset precharging bit value: when data are stored in a memory, the scheme selectively inverts the data so that the resulting data maximally coincide with the precharging bit value.
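The key idea can be illustrated with a minimal Python sketch that applies it at the coarsest granularity, a single inversion decision for a whole block of words. The function names (`count_unconforming`, `conform`, `read_word`) and the word-list representation are assumptions made for illustration only; they are not taken from the paper's implementation.

```python
def count_unconforming(words, width, precharge_bit):
    """Count stored bits that differ from the precharging bit value."""
    mask = (1 << width) - 1
    ones = sum(bin(w & mask).count("1") for w in words)
    total = width * len(words)
    # With bitlines precharged to 1, every stored 0 is unconforming, and vice versa.
    return total - ones if precharge_bit == 1 else ones


def conform(words, width, precharge_bit):
    """Return (inverted, stored_words), choosing the image with fewer unconforming bits."""
    mask = (1 << width) - 1
    inverted_words = [~w & mask for w in words]
    normal_ucb = count_unconforming(words, width, precharge_bit)
    inverted_ucb = count_unconforming(inverted_words, width, precharge_bit)
    if inverted_ucb < normal_ucb:
        return True, inverted_words
    return False, list(words)


def read_word(stored_words, inverted, width, index):
    """Recover the original value; a stored inverted image is inverted again on read."""
    mask = (1 << width) - 1
    word = stored_words[index]
    return ~word & mask if inverted else word


if __name__ == "__main__":
    data = [0x01, 0x03, 0x00, 0x10]          # mostly-zero contents
    inverted, stored = conform(data, width=8, precharge_bit=1)
    print(inverted, [hex(w) for w in stored])
    print(count_unconforming(data, 8, 1), "->", count_unconforming(stored, 8, 1))
```

In this sketch the mostly-zero contents are stored inverted when the bitlines are precharged to 1, so that most accessed bits match the precharged value.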
For ease of explanation, we define the 'conforming bit value' as the bit value equal to the precharging bit value, and the 'unconforming bit value' as the bit value different from it. With this definition, a conforming bit value does not cause a bitline to discharge during memory access, and the precharged value on the bitline remains until the next precharging operation. No dynamic power is dissipated in accessing conforming bit values. Consequently, the number of bits holding the unconforming bit value represents the amount of dynamic power dissipated by bitline toggling. The more accessed bits we change to the conforming bit value, the more the power consumption is reduced. Of course, this scenario assumes single-bitline evaluation and a precharging value of Vdd or Vgnd.
In addition to the reduction of the switching probability of the bitline, the number of drain contacts in the bitline and, thus, the total capacitance of the bitline is reduced, as seen in Fig. 2. Therefore, the effect of the conforming inverted store is practically larger than could be expected from the decrease in the number of unconforming bit readings.
Table I
Whenever a memory write occurs, the decision on inversion is made based on the number of unconforming bits in the horizontal strip, assuming that the data requested to be stored has already been written on the strip. The inversion indicator is set to the unconforming value if the number of unconforming bits in the strip becomes larger than ⌈N/2⌉ (where N is the number of bits of the strip), and the inverted image of the data is stored.
Using this strategy, storing the inversion indicator never appears as an overhead that overwhelms the profit of inversion from the viewpoint of power. The decision always leads to a decreased number of unconforming bits after inversion, including the inversion indicator bit. Fig. 3 shows some examples of strip inversions, indicating the change in the number of unconforming bits.
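The write-time decision for a single horizontal strip can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's circuit: the function names are hypothetical, one indicator bit per strip is assumed, and the threshold is taken as the ceiling of N/2, which coincides with N/2 for the even strip sizes considered here.

```python
def store_strip(data, n, unconforming_bit=0):
    """Return (stored_bits, indicator) for one n-bit horizontal strip."""
    mask = (1 << n) - 1
    data &= mask
    ones = bin(data).count("1")
    ucb = ones if unconforming_bit == 1 else n - ones
    # Assumed threshold: invert only when more than ceil(n/2) bits are unconforming,
    # so that the extra unconforming indicator bit never outweighs the gain.
    if ucb > (n + 1) // 2:
        return ~data & mask, unconforming_bit       # indicator marks "inverted"
    return data, 1 - unconforming_bit               # indicator keeps the conforming value


def load_strip(stored, indicator, n, unconforming_bit=0):
    """Recover the original data from the stored image and its indicator."""
    mask = (1 << n) - 1
    return ~stored & mask if indicator == unconforming_bit else stored


def unconforming_total(stored, indicator, n, unconforming_bit=0):
    """Unconforming bits in the stored strip, counting the indicator bit as well."""
    ones = bin(stored).count("1")
    ucb = ones if unconforming_bit == 1 else n - ones
    return ucb + (1 if indicator == unconforming_bit else 0)


if __name__ == "__main__":
    stored, ind = store_strip(0b00000100, 8)        # 7 unconforming bits when '0' is unconforming
    print(bin(stored), ind, unconforming_total(stored, ind, 8))
```

With the unconforming value taken as '0', the 8-bit word 00000100 holds seven unconforming bits; it is stored as 11111011 with the indicator set to '0', leaving two unconforming bits including the indicator.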
Figure 3: Representation of data including the inversion indicator (II) in the horizontal strip inversion, reducing the number of unconforming bits (UCB), assuming the unconforming bit value is '0'.
Fig. 5 shows the simulation results for the horizontal strip inversion. To obtain the RAM access traces for the simulation, we used the 'Shade' tool from Sun Microsystems with the SPECint95 benchmark suite. In the simulation, we tried four fixed-size horizontal strips: 32-bit, 16-bit, 8-bit, and 4-bit. With real implementations in mind, we designed the simulation models so that a memory access that stores data narrower than the strip size does not change the inversion indicator. The inversion indicators are included in the simulation model and in the power estimation, but the input/output multiplexing logic is excluded from the power estimation, because we limited the object of power measurement to the memory 'array', which dominates the overall power consumption.
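A simulation model of this kind might be sketched as below. The class `StripMemory`, its fields, and the use of the unconforming-bit count on reads as a proxy for bitline power are all assumptions for illustration; the paper's actual simulator and trace format are not described in enough detail to reproduce. The sketch shows the stated rule that a write narrower than the strip leaves the inversion indicator unchanged.

```python
class StripMemory:
    """Toy model of a RAM array organized in horizontal strips (hypothetical)."""

    def __init__(self, strip_bits, num_strips, unconforming_bit=0):
        self.n = strip_bits
        self.ub = unconforming_bit
        self.mask = (1 << strip_bits) - 1
        conforming_word = self.mask if unconforming_bit == 0 else 0
        self.cells = [conforming_word] * num_strips   # raw stored bits per strip
        self.inverted = [False] * num_strips          # one inversion indicator per strip
        self.ucb_read = 0                             # proxy for bitline discharges

    def _ucb(self, bits, width):
        ones = bin(bits).count("1")
        return ones if self.ub == 1 else width - ones

    def write(self, strip, data, width=None, offset=0):
        width = self.n if width is None else width
        field = ((1 << width) - 1) << offset
        if width == self.n:
            # Full-strip write: re-evaluate the inversion decision (assumed threshold ceil(n/2)).
            self.inverted[strip] = self._ucb(data & self.mask, self.n) > (self.n + 1) // 2
        # Narrow writes keep the current indicator and follow the strip's polarity.
        bits = (data << offset) & field
        if self.inverted[strip]:
            bits = ~bits & field
        self.cells[strip] = (self.cells[strip] & ~field) | bits

    def read(self, strip):
        raw = self.cells[strip]
        # An inverted strip also reads one unconforming indicator bit.
        self.ucb_read += self._ucb(raw, self.n) + (1 if self.inverted[strip] else 0)
        return ~raw & self.mask if self.inverted[strip] else raw


if __name__ == "__main__":
    mem = StripMemory(strip_bits=8, num_strips=2)
    mem.write(0, 0x01)                       # full write, stored inverted
    mem.write(0, 0xF, width=4, offset=4)     # narrow write, indicator untouched
    print(hex(mem.read(0)), mem.inverted[0], mem.ucb_read)
```

Because narrow writes do not revisit the decision, a strip can temporarily hold more unconforming bits than a fresh decision would allow, which is one reason the indicator bits have to be included in the power estimate.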
In Fig. 5, the power reduction is expressed as the improvement over a proper whole-plane inversion. To get more information, we traced the RAM accesses separately for code RAM and for data RAM, and estimated the power reduction for each.
The simulation results show a steady increase in power reduction as the strip size becomes smaller. However, the percentage area overhead increases as the strip size decreases, because additional storage space is required to store the inversion indicators. In Fig. 5, the power reduction of code RAM is almost flat across the SPECint95 benchmarks. In contrast, the power reduction of data RAM reflects the characteristics of the benchmarks. For 'compress' and 'ijpeg', which are arithmetic-intensive, the power reduction of data RAM is very large, exceeding 50%, while 'li' and 'm88ksim', which have intensive logical operations, show a smaller power reduction than that of code RAM. Finally, we notice two irregular points in Fig. 5: 'compress' and 'ijpeg' show exceptional increases in power reduction when the horizontal strip size decreases from 8-bit to 4-bit. This means that, for these two benchmark programs, the inclusion of inversion indicators begins to become a burden at a strip size of 8 bits.
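The growth of the area overhead can be made concrete with a short calculation. The sketch below assumes exactly one inversion indicator bit per strip and ignores any multiplexing logic; the helper name `indicator_overhead` is hypothetical.

```python
def indicator_overhead(strip_bits):
    """Extra storage as a fraction of the data bits, assuming one indicator bit per strip."""
    return 1 / strip_bits


if __name__ == "__main__":
    # Strip sizes used in the simulation.
    for n in (32, 16, 8, 4):
        print(f"{n:2d}-bit strip: {100 * indicator_overhead(n):.3f}% extra storage")
```

Under this assumption the extra storage is 3.125%, 6.25%, 12.5%, and 25% for 32-bit, 16-bit, 8-bit, and 4-bit strips respectively, so halving the strip size doubles the indicator overhead.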
Conclusion
In this paper, we proposed an effective way to reduce the power consumption of memory components. The proposed scheme, the conforming inverted data store, reduces the power consumption of memory by selectively storing normal or inverted data, so that the majority of the bits being accessed have the precharging value. This leads to less bitline toggling and, consequently, less power consumption. Considering practical implementation in embedded systems, we developed the proposed scheme into two realistic solutions, the vertical strip and horizontal strip inversions. The former relies on statistics of application traces, whereas the latter works without any statistics.
Our simulation results show that the strip-based inversions achieve a power reduction of up to 50%, even under the assumption that the memory contents are already well biased for power. The improvement is dramatic considering the rather minor implementation overhead. In addition to the dramatic power reduction, the inherent generality of the proposed schemes and their compatibility with previous circuit-oriented approaches make them more promising.
Figure 5: Power reduction achieved by applying the horizontal strip inversion to an embedded RAM for various benchmarks.
