37 research outputs found
TCA<i>m</i>M<sup>CogniGron</sup>::Energy Efficient Memristor-Based TCAM for Match-Action Processing
The Internet relies heavily on programmable match-action processors for matching network packets against locally available network rules and taking actions, such as forwarding and modification of network packets. This match-action process must be performed at high speed, i.e., commonly within one clock cycle, using a specialized memory unit called Ternary Content Addressable Memory (TCAM). Building on transistor-based CMOS designs, state-of-the-art TCAM architectures have high energy consumption and lack resilient designs for incorporating novel technologies for performing appropriate actions. In this article, we motivate the use of a novel fundamental component, the ‘Memristor’, for the development of TCAM architecture for match-action processing. Memristors can provide energy efficiency, non-volatility and better resource density as compared to transistors. We have proposed a novel memristor-based TCAM architecture called TCAmMCogniGron, built upon the voltage divider principle and requiring only two memristors and five transistors for storage and search operations compared to sixteen transistors in the traditional TCAM architecture. We analyzed its performance over an experimental data set of Nb-doped SrTiO3-based memristor. The analysis of TCAmMCogniGron showed promising power consumption statistics of 16 uW and 1 uW for match and mismatch operations along with twice the improvement in resources density as compared to the traditional architectures
Towards Energy Efficient Memristor-based TCAM for Match-Action Processing
Match-action processors play a crucial role of communicating end-users in the Internet by computing network paths and enforcing administrator policies. The computation process uses a specialized memory called Ternary Content Addressable Memory (TCAM) to store processing rules and use header information of network packets to perform a match within a single clock cycle. Currently, TCAM memories consume huge amounts of energy resources due to the use of traditional transistor-based CMOS technology. In this article, we motivate the use of a novel component, the memristor, for the development of a TCAM architecture. Memristors can provide energy efficiency, non-volatility, and better resource density as compared to transistors. We have proposed a novel memristor-based TCAM architecture built upon the voltage divider principle for energy efficient match-action processing. Moreover, we have tested the performance of the memristor-based TCAM architecture using the experimental data of a novel Nb-doped SrTiO3 memristor. Energy analysis of the proposed TCAM architecture for given memristor shows promising power consumption statistics of 16 μW for a match operation and 1 μW for a mismatch operation
Memristor MOS Content Addressable Memory (MCAM): Hybrid Architecture for Future High Performance Search Engines
Large-capacity Content Addressable Memory (CAM) is a key element in a wide
variety of applications. The inevitable complexities of scaling MOS transistors
introduce a major challenge in the realization of such systems. Convergence of
disparate technologies, which are compatible with CMOS processing, may allow
extension of Moore's Law for a few more years. This paper provides a new
approach towards the design and modeling of Memristor (Memory resistor) based
Content Addressable Memory (MCAM) using a combination of memristor MOS devices
to form the core of a memory/compare logic cell that forms the building block
of the CAM architecture. The non-volatile characteristic and the nanoscale
geometry together with compatibility of the memristor with CMOS processing
technology increases the packing density, provides for new approaches towards
power management through disabling CAM blocks without loss of stored data,
reduces power dissipation, and has scope for speed improvement as the
technology matures.Comment: 10 pages, 11 figure
Long-Term Memory for Cognitive Architectures: A Hardware Approach Using Resistive Devices
A cognitive agent capable of reliably performing complex tasks over a long time will acquire a large store of knowledge. To interact with changing circumstances, the agent will need to quickly search and retrieve knowledge relevant to its current context. Real time knowledge search and cognitive processing like this is a challenge for conventional computers, which are not optimised for such tasks. This thesis describes a new content-addressable memory, based on resistive devices, that can perform massively parallel knowledge search in the memory array. The fundamental circuit block that supports this capability is a memory cell that closely couples comparison logic with non-volatile storage. By using resistive devices instead of transistors in both the comparison circuit and storage elements, this cell improves area density by over an order of magnitude compared to state of the art CMOS implementations. The resulting memory does not need power to maintain stored information, and is therefore well suited to cognitive agents with large long-term memories. The memory incorporates activation circuits, which bias the knowledge retrieval process according to past memory access patterns. This is achieved by approximating the widely used base-level activation function using resistive devices to store, maintain and compare activation values. By distributing an instance of this circuit to every row in memory, the activation for all memory objects can be updated in parallel. A test using the word sense disambiguation task shows this circuit-based activation model only incurs a small loss in accuracy compared to exact base-level calculations. A variation of spreading activation can also be achieved in-memory. Memory objects are encoded with high-dimensional vectors that create association between correlated representations. By storing these high-dimensional vectors in the new content-addressable memory, activation can be spread to related objects during search operations. The new memory is scalable, power and area efficient, and performs operations in parallel that are infeasible in real-time for a sequential processor with a conventional memory hierarchy.Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
Analog Content-Addressable Memory from Complementary FeFETs
To address the increasing computational demands of artificial intelligence
(AI) and big data, compute-in-memory (CIM) integrates memory and processing
units into the same physical location, reducing the time and energy overhead of
the system. Despite advancements in non-volatile memory (NVM) for matrix
multiplication, other critical data-intensive operations, like parallel search,
have been overlooked. Current parallel search architectures, namely
content-addressable memory (CAM), often use binary, which restricts density and
functionality. We present an analog CAM (ACAM) cell, built on two complementary
ferroelectric field-effect transistors (FeFETs), that performs parallel search
in the analog domain with over 40 distinct match windows. We then deploy it to
calculate similarity between vectors, a building block in the following two
machine learning problems. ACAM outperforms ternary CAM (TCAM) when applied to
similarity search for few-shot learning on the Omniglot dataset, yielding
projected simulation results with improved inference accuracy by 5%, 3x denser
memory architecture, and more than 100x faster speed compared to central
processing unit (CPU) and graphics processing unit (GPU) per similarity search
on scaled CMOS nodes. We also demonstrate 1-step inference on a kernel
regression model by combining non-linear kernel computation and matrix
multiplication in ACAM, with simulation estimates indicating 1,000x faster
inference than CPU and GPU