


# VHDL Implementation of High-Performance and Dynamically Configured Multi-Port Cache Memory

Hassan Bajwa, Isaac Macwan (Seed Grant Money Project)

Department of Electrical and Computer Engineering, University of Bridgeport, Bridgeport, CT 06604

### **ABSTRACT**

This project presents the implementation of a 64x64 multi-port, dynamically configured SRAM in VHDL (VHSIC Hardware Description Language). The design employs isolation nodes and a dynamic memory partitioning algorithm to allow simultaneous multi-port accesses without duplicating bit lines. A VHDL test bench was developed to verify the functionality of the dynamically configured memory. The results demonstrate that critical memory operations such as "read miss", "write miss" and "write bypass" can be performed using the newly proposed low-power, area-efficient dynamically configured memory.

### **INTRODUCTION**

As on-chip cache size has increased considerably in recent high-performance microprocessors, power dissipation and leakage current in SRAM have become critical. Sub-threshold leakage current, which flows from drain to source even when the transistor is off, has become a dominant leakage component in high-performance microprocessors. In high-performance systems that employ multi-core technologies, bit-line leakage current can contribute as much as 50% of the overall cache memory leakage power. The trend toward multi-port caches in modern microprocessors has exacerbated this further. Here we present the architecture of a low-power, area-efficient dynamically configured memory in detail. We also present the VHDL implementation of a multi-port SRAM and of the newly proposed dynamically configured multi-port SRAM.



State S0 is the reset state of the machine. Depending on whether the two addresses are the same or different, the next set of states is chosen; in each of these states one of the legal memory operations (Read-Read, Read-Write, Write-Read or Write-Write) is performed using the appropriate enable signals. A local signal called 'Address Enable' switches the array between operating as a single-port SRAM and operating as a dual-port SRAM. Based on whether the addresses are the same or different, the 'Address Enable' signal is asserted and the isolation nodes are turned OFF in order to virtually partition the memory array, which then acts as a dual-port memory with one pair of bit lines.
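The state selection described above can be sketched as a small behavioural model in Python. This is only an illustration under stated assumptions, not the authors' VHDL: the state names other than S0, the `next_state` function, and the rule that 'Address Enable' is asserted when the addresses differ are all hypothetical, and the sketch assumes each port requests exactly one operation per cycle.

```python
from enum import Enum


class Op(Enum):
    """The four legal operation pairs named in the text (port A, port B)."""
    READ_READ = "Read-Read"
    READ_WRITE = "Read-Write"
    WRITE_READ = "Write-Read"
    WRITE_WRITE = "Write-Write"


_OPS = {
    ("R", "R"): Op.READ_READ,
    ("R", "W"): Op.READ_WRITE,
    ("W", "R"): Op.WRITE_READ,
    ("W", "W"): Op.WRITE_WRITE,
}


def port_op(rd: bool, wr: bool) -> str:
    """Return 'R' or 'W' for one port; exactly one of rd/wr must be set."""
    if rd == wr:
        raise ValueError("each port needs exactly one of rd/wr asserted")
    return "R" if rd else "W"


def next_state(rst, addr_a, addr_b, rd_a, wr_a, rd_b, wr_b):
    """Return (state_name, address_enable) for one clock cycle.

    Assumption: 'Address Enable' is asserted when the addresses differ,
    which selects the hypothetical DIFF_* group of states.
    """
    if rst:
        return "S0", False  # S0 is the reset state
    op = _OPS[(port_op(rd_a, wr_a), port_op(rd_b, wr_b))]
    address_enable = addr_a != addr_b
    group = "DIFF" if address_enable else "SAME"
    return f"{group}_{op.name}", address_enable
```

For example, `next_state(False, 0b101001, 0b101010, True, False, False, True)` selects the hypothetical `DIFF_READ_WRITE` state with 'Address Enable' asserted.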



Figure 1. A Single Port SRAM Cell

[Figure labels recovered from the schematic: BPC, ICL(i), WL(i), WL(i+7), Cell(j), Cell(j+7), Set(x), Set(x+7), leakage current]

Figure 1 and Figure 2 show the classic 6-T and 8-T SRAM cells with dedicated word and bit lines. Pre-charging the bit lines and keeping them high causes significant power dissipation and contributes heavily to the total power dissipation. When "0" is stored, transistors T1, T5 and T4 conduct leakage current. When "1" is stored, T2, T3 and T6 conduct leakage current.





### **RTL VIEW OF THE PROPOSED DYNAMIC MEMORY**

Figure 6. RTL View of the Proposed Dynamic Memory

Figure 6 shows the top-level VHDL module of the proposed design, which consists of three main sub-modules: an ICL (Isolation Control Line) Generator, a Controller and the SRAM isolated module. The ICL Generator calculates the ICL signals based on whether the two incoming addresses are the same or different. The Controller uses the 'Read' and 'Write' signals of the main module to generate an RWE (Read Write Enable) signal, which in turn controls how the SRAM isolated module processes the incoming data.
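The way the three sub-modules fit together can be illustrated with a behavioural Python model. This is a rough sketch, not the RTL itself: the RWE encoding, the rule used by `icl_generator`, the function and class names, and the choice to service port A before port B are all assumptions made for illustration. Only the 64x64 array size and the three-module split come from the text.

```python
from dataclasses import dataclass, field

WORDS = 64                  # 64 words, from the 64x64 array in the abstract
WORD_MASK = (1 << 64) - 1   # 64-bit word width

# Hypothetical RWE (Read Write Enable) encoding
RWE_IDLE, RWE_READ, RWE_WRITE = 0, 1, 2


def icl_generator(addr_a: int, addr_b: int) -> bool:
    """Assumed rule: assert the ICL when the two addresses differ."""
    return addr_a != addr_b


def controller(read: bool, write: bool) -> int:
    """Derive an RWE code for one port from its Read and Write signals."""
    if read and write:
        raise ValueError("Read and Write asserted together on one port")
    if read:
        return RWE_READ
    if write:
        return RWE_WRITE
    return RWE_IDLE


@dataclass
class SramIsolated:
    """Storage array; the isolation hardware itself is not modelled."""
    cells: list = field(default_factory=lambda: [0] * WORDS)

    def access(self, rwe: int, addr: int, data_in: int = 0):
        if rwe == RWE_WRITE:
            self.cells[addr] = data_in & WORD_MASK
            return None
        if rwe == RWE_READ:
            return self.cells[addr]
        return None  # idle


def top(mem, addr_a, rd_a, wr_a, data_a, addr_b, rd_b, wr_b, data_b):
    """Top-level wiring: returns (icl, port_a_out, port_b_out)."""
    icl = icl_generator(addr_a, addr_b)
    out_a = mem.access(controller(rd_a, wr_a), addr_a, data_a)  # port A first
    out_b = mem.access(controller(rd_b, wr_b), addr_b, data_b)
    return icl, out_a, out_b
```

Writing 7 and 15 to addresses 41 and 42 and then reading both ports back returns those same values, with the ICL asserted because the addresses differ.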

### **SIMULATION RESULTS**

The proposed memory uses an isolation control line (ICL) and an isolation node on each of the bit lines to divide an SRAM block into upper and lower sections, which are accessed by the upper and lower ports, respectively. Dynamic partitioning is performed before a memory cell is accessed.
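The partitioning step can be illustrated with a short Python sketch. The placement of isolation nodes is not stated in the text, so the model assumes one node between each pair of adjacent 8-row blocks (a hypothetical choice, loosely suggested by the WL(i) and WL(i+7) labels in the schematic) and treats the lower-numbered rows as the "upper" section. The `partition` function and its return format are likewise illustrative.

```python
ROWS = 64    # rows in the 64x64 array
BLOCK = 8    # hypothetical: one isolation node between adjacent 8-row blocks


def partition(row_a: int, row_b: int):
    """Pick an isolation node that separates two addressed rows.

    Returns (node, upper_rows, lower_rows), where node k sits between
    block k and block k + 1, or None when both rows fall in the same
    block and no node can separate them.
    """
    for row in (row_a, row_b):
        if not 0 <= row < ROWS:
            raise ValueError(f"row {row} out of range")
    blk_a, blk_b = row_a // BLOCK, row_b // BLOCK
    if blk_a == blk_b:
        return None
    node = min(blk_a, blk_b)       # first node below the upper block
    split = (node + 1) * BLOCK     # first row of the lower section
    return node, range(0, split), range(split, ROWS)
```

Under these assumptions, rows 3 and 60 are separated by node 0, giving an upper section of rows 0 to 7 and a lower section of rows 8 to 63, while rows 41 and 42 share a block and cannot be separated.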



[Simulation waveforms, 40.0 ns to 80.0 ns. Signals: Addr\_A, Addr\_B, clk, data\_A, data\_B, iso\_node, port\_A, port\_B, rd\_a, rd\_b, rst, wr\_a, wr\_b. Legible values in the first trace: Addr\_A = 101001, Addr\_B = 101010, iso\_node = 000. Legible value in the second trace: Addr\_A = 010111.]

Figure 7. Same Addresses

Figure 8. Different Addresses

### **CONCLUSION**

Compared with the classic hardwired multi-port memory architecture, the new DMP facilitates efficient designs and reduces the use of silicon area, largely because it eliminates the additional bit lines. Shorter active bit lines also mean less latency. The results from the implemented VHDL architecture showed that the proposed design supports all the legal operations of a conventional memory.

Figure 2. A Dual Port SRAM Cell