International Journal of Science Engineering and Advance Technology (IJSEAT), ISSN 2321-6905, Vol. 6, Issue 5, May 2018



# Power Optimized CNN Based CAM Using Clock Gating Techniques

K.SaiLakshmi, K.SivaNagendra

M.Tech (VLSI Design), Department of ECE, Aditya Engineering College of Engineering (JNTUK), Surampalem, A.P.

M.Tech, Assistant Professor, Department of ECE, Aditya Engineering College of Engineering (JNTUK), Surampalem, A.P.

### **ABSTRACT:**

A Content Addressable Memory (CAM) is a memory used in certain very-high-speed search applications. It compares input data against a table of stored data and returns the address of the matching data. Because the basic lookup function is performed over all of the stored memory contents, power dissipation is high. This work proposes a clock gating technique to reduce the power consumption. Since delay buffers are accessed sequentially, the design adopts a ring counter addressing scheme. D flip-flops are used in the ring counter to reduce power consumption.

**Key Words**: Content Addressable Memory, Ring Counter, Delay Flip flops.

### **INTRODUCTION:**

A Content Addressable Memory (CAM) is a type of memory that is accessed using its contents rather than an explicit address. CAMs are fast, data-parallel search circuits. They are widely used in applications such as memory mapping, cache controllers for central processing units, and data compression and coding. On high-speed networks with huge traffic volumes, the search task must be performed quickly and with massive parallelism. However, supporting high speeds and large lookup tables costs silicon area and power. A CAM word is made of CAM cells. A CAM cell compares its stored bit against the corresponding search bit provided on the search line (SL). The combined search result for the whole word is generated on the match line (ML). A ring counter is used to point to the target words to be written in and read out. The ring counter is made up of an array of D-type flip-flops (DFFs) triggered by a clock signal.
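The search behaviour described above can be illustrated with a small behavioural model. This is only a sketch in Python, not the Verilog design evaluated in this paper; the class name `SimpleCAM` and its methods are hypothetical names chosen for illustration. Each stored word drives one match line, which stays high only if every cell agrees with its search-line bit.

```python
class SimpleCAM:
    """Behavioural model of a binary CAM (illustrative, not the paper's RTL)."""

    def __init__(self, depth, width):
        self.width = width
        self.words = [None] * depth  # None marks an entry that was never written

    def write(self, address, word):
        if not 0 <= word < (1 << self.width):
            raise ValueError("word does not fit in the CAM width")
        self.words[address] = word

    def match_lines(self, search_word):
        """Return one match-line bit per stored word."""
        mask = (1 << self.width) - 1
        lines = []
        for stored in self.words:
            if stored is None:
                lines.append(0)
                continue
            # XOR flags every cell whose stored bit differs from its search-line bit;
            # a single mismatch pulls the whole match line low.
            mismatch = (stored ^ search_word) & mask
            lines.append(1 if mismatch == 0 else 0)
        return lines

    def search(self, search_word):
        """Return the addresses whose match line is asserted."""
        return [addr for addr, ml in enumerate(self.match_lines(search_word)) if ml]


cam = SimpleCAM(depth=4, width=8)
cam.write(0, 0b10000011)
cam.write(2, 0b00000010)
hits = cam.search(0b00000010)  # content in, address out
```

In hardware all match lines are evaluated in parallel, which is the source of both the speed and the power cost; the loop here only models the result.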

#### **EXISTING METHOD:**

The CNN-based classifier produces the compare-enable signals for the CAM sub-blocks attached to it. It consists of an SRAM module. The input tag is first reduced in length to q bits and divided into c equal-length segments. Each segment is presented to its corresponding decoder, which determines which row of the SRAM is to be accessed. The encoding and decoding steps are used to deliver the data without distortion: the encoder encodes the data with an address, and the decoder decodes the data with that address. In the SRAM, when the read mode is selected, the output is enabled. When the write mode is in the off condition, the write operation is performed. Therefore, when a search is issued as a read, the stored value is displayed if it matches the input.



Fig. 1. Simplified block diagram of the CNN architecture generating compare-enable signals for the CAM array.
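The tag handling in the existing method (reduce to q bits, split into c equal segments, decode each segment to a row) can be sketched as follows. Several details are assumptions made only for illustration: the reduction is modelled as keeping the q least-significant bits, segment 0 is taken as the lowest bits, and each decoder is assumed to assert exactly one row-select line. The function names are hypothetical.

```python
def reduce_tag(tag, q):
    """Keep the q least-significant bits of the tag (assumed reduction rule)."""
    return tag & ((1 << q) - 1)


def split_segments(reduced, q, c):
    """Split a q-bit value into c equal-length segments of k = q // c bits."""
    if q % c != 0:
        raise ValueError("q must be divisible by c for equal-length segments")
    k = q // c
    mask = (1 << k) - 1
    # Segment 0 holds the lowest k bits (assumed ordering).
    return [(reduced >> (i * k)) & mask for i in range(c)]


def decode_rows(segments, k):
    """Decode each k-bit segment into a row-select vector with a single 1."""
    selects = []
    for seg in segments:
        row_select = [0] * (1 << k)
        row_select[seg] = 1  # this row of the corresponding SRAM section is accessed
        selects.append(row_select)
    return selects


q, c = 8, 2
segments = split_segments(reduce_tag(0b110101101011, q), q, c)
row_selects = decode_rows(segments, q // c)
```

With q = 8 and c = 2, each segment is 4 bits wide and each decoder chooses one of 16 rows.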

### **DRAWBACKS:**

1. More dynamic power consumption.
2. More leakage current.
3. Less accuracy.

### **PROPOSED METHOD:**

## **CONVENTIONAL DELAY BUFFERS**

The simplest way to implement a delay buffer is to use shift registers, as shown in Fig. 2. If the buffer length is N and the word length is b, then a total of Nb DFFs are required, which can be quite large if a standard cell is used for each DFF. In addition, this approach can consume a large amount of power, since on average Nb/2 binary signals make transitions in every clock cycle. Consequently, this implementation is normally used only for short delay buffers, where area and power are of less concern.



Fig. 2. Delay buffer implemented by shift registers
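The Nb flip-flop count and the roughly Nb/2 transitions per cycle can be checked with a small behavioural model. This is a sketch under stated assumptions: the input data are taken to be uniformly random, and the class `ShiftRegisterDelayBuffer` is a hypothetical name, not part of the paper's design.

```python
import random


class ShiftRegisterDelayBuffer:
    """N-stage, b-bit shift-register delay line that counts bit transitions."""

    def __init__(self, length, word_bits):
        self.length = length
        self.word_bits = word_bits
        self.stages = [0] * length  # one b-bit word per stage
        self.bit_toggles = 0

    @property
    def dff_count(self):
        return self.length * self.word_bits  # N * b flip-flops in total

    def clock(self, data_in):
        data_out = self.stages[-1]  # word that entered `length` cycles earlier
        new_stages = [data_in] + self.stages[:-1]
        # Every stage is clocked each cycle; count the bits that actually flip.
        for old, new in zip(self.stages, new_stages):
            self.bit_toggles += bin(old ^ new).count("1")
        self.stages = new_stages
        return data_out


rng = random.Random(0)  # fixed seed so the run is repeatable
buf = ShiftRegisterDelayBuffer(length=16, word_bits=8)
cycles = 2000
for _ in range(cycles):
    buf.clock(rng.getrandbits(8))
average_toggles = buf.bit_toggles / cycles  # compare against N*b/2 = 64
```

For random data each flip-flop changes state about half the time, so the measured average lands close to Nb/2, and every one of the Nb flip-flops receives the clock in every cycle regardless.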

SRAM-based delay buffers are more popular for long delay buffers because of the compact SRAM cell size and small total area. Also, the power consumption is much lower than that of shift registers, because only two words are accessed in each clock cycle: one for write-in and the other for read-out. A binary counter can be used for address generation, since the memory words are accessed sequentially.

Observing that only one of the DFFs in the ring counter is active at any time, the gated-clock technique has been proposed for the DFFs. In that approach, every eight DFFs in the ring counter are grouped into one block. A "gate" signal is then computed for each block to gate the frequently toggling clock signal whenever the block can be inactive, so that the power otherwise wasted in unnecessary clock transitions is saved. As shown in Fig. 3, when the input of the first DFF in a block is asserted, it sets the output of the R-S flip-flop to "1" at the next clock edge. The incoming "1" can therefore be captured in that block and continue to propagate inside it. In turn, the successful propagation of the "1" to the first DFF in the next block shuts down the now unnecessary clock signal in the current block.



Fig. 3. Ring counter with clock gated by R–S flip-flop
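The block-level gating described above can be modelled behaviourally to see how many flip-flops actually receive a clock. This is a sketch, not the circuit of Fig. 3: the gate update is modelled as happening between capture edges (for example on the opposite clock phase), which is an assumption about timing, and the class name `GatedRingCounter` is hypothetical.

```python
class GatedRingCounter:
    """One-hot ring counter whose DFFs are clock-gated in blocks."""

    def __init__(self, length=32, block_size=8):
        if length % block_size or length // block_size < 2:
            raise ValueError("need at least two whole blocks")
        self.length = length
        self.block_size = block_size
        self.num_blocks = length // block_size
        self.q = [0] * length
        self.q[0] = 1                      # the single circulating "1"
        self.gate = [0] * self.num_blocks  # R-S flip-flop output per block
        self.gate[0] = 1
        self.clocked_dffs = 0              # DFF clock events actually delivered

    def step(self):
        n, bs = self.length, self.block_size
        # Update each block's R-S flip-flop before the capture edge.
        for k in range(self.num_blocks):
            first = k * bs
            next_first = (first + bs) % n
            if self.q[(first - 1) % n]:    # input of the block's first DFF asserted
                self.gate[k] = 1           # set: the block must catch the "1"
            elif self.q[next_first]:       # "1" has reached the next block
                self.gate[k] = 0           # reset: stop clocking this block
        # Capture edge: only blocks with an enabled clock shift.
        new_q = list(self.q)
        for k in range(self.num_blocks):
            if not self.gate[k]:
                continue
            for i in range(k * bs, (k + 1) * bs):
                new_q[i] = self.q[(i - 1) % n]
            self.clocked_dffs += bs
        self.q = new_q
        return self.q.index(1)


ring = GatedRingCounter(length=32, block_size=8)
positions = [ring.step() for _ in range(32)]  # one full lap of the token
ungated_clock_events = 32 * 32                # every DFF clocked every cycle
```

In this model one block is clocked in most cycles and two blocks are clocked during each hand-over, so far fewer flip-flops see a clock edge than in the ungated counter.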

The proposed delay buffer uses a ring counter built from D flip-flops, with all of the above techniques applied. Fig. 4 shows the block diagram of the proposed delay buffer structure.



Fig. 4. Proposed delay buffer block diagram
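How a ring counter can address a memory-based delay buffer is sketched below. This is a simplified behavioural model and not the structure of Fig. 4: it assumes a single one-hot pointer, with the selected word read out and then overwritten in the same cycle, and the class name `RingAddressedDelayBuffer` is hypothetical.

```python
class RingAddressedDelayBuffer:
    """Delay buffer whose word-select lines come from a one-hot ring counter."""

    def __init__(self, length, fill=0):
        self.words = [fill] * length
        self.select = [1] + [0] * (length - 1)  # one-hot word select

    def clock(self, data_in):
        addr = self.select.index(1)
        data_out = self.words[addr]  # read-out: word written `length` cycles ago
        self.words[addr] = data_in   # write-in: reuse the location just read
        # Advance the ring counter by rotating the one-hot token.
        self.select = [self.select[-1]] + self.select[:-1]
        return data_out


delay = RingAddressedDelayBuffer(length=4)
outputs = [delay.clock(x) for x in range(1, 9)]
```

Because the word-select lines are already one-hot, no address decoder is needed between the counter and the memory, and only the selected word is accessed in each cycle.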

# **RESULTS:**

All synthesis and simulation were performed in Xilinx ISE 14.2 using Verilog HDL. The results are shown in the figures below.



Fig. 5. RTL schematic diagram



Fig. 6. Technology schematic diagram

## SIMULATION RESULT:

When the input data matches the stored value, the pass condition is displayed at the output.

Simulation waveform, covering approximately 1,999,750 ps to 1,999,950 ps. Legible signals: sel, ip, fop = 00000010, muop[7:0] = 10000011, dop[7:0] = XXXXXXXX, ringep[31:0] and addre[7:0] = 00000010. The remaining labels and values are not recoverable from the source.

Fig. 7. Simulation result

## **Power Report:**

| On-Chip | Power (W) | Used | Available | Utilization (%) |
|---------|-----------|------|-----------|-----------------|
| Clocks  | 0.000     | 4    | -         | -               |
| Logic   | 0.000     | 18   | 63400     | 0               |
| Signals | 0.000     | 37   | -         | -               |
| IOs     | 0.000     | 18   | 210       | 9               |
| Leakage | 0.082     |      |           |                 |
| Total   | 0.082     |      |           |                 |

| Thermal Properties | Effective TJA (°C/W) | Max Ambient (°C) | Junction Temp (°C) |
|--------------------|----------------------|------------------|--------------------|
|                    | 4.6                  | 84.6             | 25.4               |
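As a rough consistency check, the thermal figures in the power report follow from the total power and the effective thermal resistance. Two values below are assumptions that the report does not state: an ambient temperature of $T_A = 25\,^\circ\mathrm{C}$ and a maximum junction temperature of $T_{J,\max} = 85\,^\circ\mathrm{C}$.

$$
\begin{aligned}
T_J &= T_A + P_{\text{total}}\,\theta_{JA} = 25 + 0.082 \times 4.6 \approx 25.4\,^\circ\mathrm{C} \\
T_{A,\max} &= T_{J,\max} - P_{\text{total}}\,\theta_{JA} = 85 - 0.082 \times 4.6 \approx 84.6\,^\circ\mathrm{C}
\end{aligned}
$$

Under these assumptions both results agree with the reported junction temperature and maximum ambient temperature.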

### **Conclusion:**

This paper presented a low-power delay buffer architecture that adopts several techniques to reduce power consumption. The clock gating technique applied to the clock distribution network eliminates the power wasted on drivers that need not be activated. In addition, a gated multiplexer and demultiplexer are used in the input and output driving circuitry to decrease the loading on the input and output data buses.
