3,409 research outputs found

    Analog Content-Addressable Memory from Complementary FeFETs

    Full text link
    To address the increasing computational demands of artificial intelligence (AI) and big data, compute-in-memory (CIM) integrates memory and processing units into the same physical location, reducing the time and energy overhead of the system. Despite advancements in non-volatile memory (NVM) for matrix multiplication, other critical data-intensive operations, like parallel search, have been overlooked. Current parallel search architectures, namely content-addressable memory (CAM), often use binary, which restricts density and functionality. We present an analog CAM (ACAM) cell, built on two complementary ferroelectric field-effect transistors (FeFETs), that performs parallel search in the analog domain with over 40 distinct match windows. We then deploy it to calculate similarity between vectors, a building block in the following two machine learning problems. ACAM outperforms ternary CAM (TCAM) when applied to similarity search for few-shot learning on the Omniglot dataset, yielding projected simulation results with improved inference accuracy by 5%, 3x denser memory architecture, and more than 100x faster speed compared to central processing unit (CPU) and graphics processing unit (GPU) per similarity search on scaled CMOS nodes. We also demonstrate 1-step inference on a kernel regression model by combining non-linear kernel computation and matrix multiplication in ACAM, with simulation estimates indicating 1,000x faster inference than CPU and GPU

    CNFET-based design ternary logic design and arithmetic circuit simulation using HSPICE

    Get PDF
    This project report focuses on the multiple-value logic (MVL) or commonly known as ternary logic gates by using carbon nanotube (CNT) FETs devices (CNTFETs). It is shown ternary logic has promising future in CNTFETs when compare to conventional binary logic design, due to its simplicity and energy efficiency in digital design reduced circuit overhead such as chip area and interconnection. In this research, existing CNTFET-based binary inverter and standard ternary inverter with resistive-load (STI-R) for comparison with the other three types of inverter are proposed - Complementary Standard Ternary Inverter (CSTI); Standard Ternary Inverter with 1 resistor and 3 NCNTFET (NSTI-R); Standard Ternary Inverter with 1 resistor and 3 PCNTFET (PSTI-R) to analysis the performance, structure design and application. In addition, the research covers all the basic logic Ternary NAND gate and Ternary NOR gate for further benchmarking. All simulation results using SPICE are obtained and analyzed in the Direct Current (DC) setting and verifed using half adder. Further study behavior of ternary logic includes the implementation of partial binary design into the ternary design and performance benchmarking. The result shows the CSTI have advantage on low power design with low leakage while NSTI-R has advantage on high-speed design inverter. In addition, partial binary design in the arithmetic circuit ternary design with CSTI shows added advantage in a low power design

    Digital Circuit Design Using Floating Gate Transistors

    Get PDF
    Floating gate (flash) transistors are used exclusively for memory applications today. These applications include SD cards of various form factors, USB flash drives and SSDs. In this thesis, we explore the use of flash transistors to implement digital logic circuits. Since the threshold voltage of flash transistors can be modified at a fine granularity during programming, several advantages are obtained by our flash-based digital circuit design approach. For one, speed binning at the factory can be controlled with precision. Secondly, an IC can be re-programmed in the field, to negate effects such as aging, which has been a significant problem in recent times, particularly for mission-critical applications. Thirdly, unlike a regular MOSFET, which has one threshold voltage level, a flash transistor can have multiple threshold voltage levels. The benefit of having multiple threshold voltage levels in a flash transistor is that it allows the ability to encode more symbols in each device, unlike a regular MOSFET. This allows us to implement multi-valued logic functions natively. In this thesis, we evaluate different flash-based digital circuit design approaches and compare their performance with a traditional CMOS standard cell-based design approach. We begin by evaluating our design approach at the cell level to optimize the design’s delay, power energy and physical area characteristics. The flash-based approach is demonstrated to be better than the CMOS standard cell approach, for these performance metrics. Afterwards, we present the performance of our design approach at the block level. We describe a synthesis flow to decompose a circuit block into a network of interconnected flash-based circuit cells. We also describe techniques to optimize the resulting network of flash-based circuit cells using don’t cares. Our optimization approach distinguishes itself from other optimization techniques that use don’t cares, since it a) targets a flash-based design flow, b) optimizes clusters of logic nodes at once instead of one node at a time, c) attempts to reduce the number of cubes instead of reducing the number of literals in each cube and d) performs optimization on the post-technology mapped netlist which results in a direct improvement in result quality, as compared to pre-technology mapping logic optimization that is typically done in the literature. The resulting network characteristics (delay, power, energy and physical area) are presented. These results are compared with a standard cell-based realization of the same block (obtained using commercial tools) and we demonstrate significant improvements in all the design metrics. We also study flash-based FPGA designs (both static and dynamic), and present the tradeoff of delay, power dissipation and energy consumption of the various designs. Our work differs from previously proposed flash-based FPGAs, since we embed the flash transistors (which store the configuration bits) directly within the logic and interconnect fabrics. We also present a detailed description of how the programming of the configuration bits is accomplished, for all the proposed designs

    Digital Circuit Design Using Floating Gate Transistors

    Get PDF
    Floating gate (flash) transistors are used exclusively for memory applications today. These applications include SD cards of various form factors, USB flash drives and SSDs. In this thesis, we explore the use of flash transistors to implement digital logic circuits. Since the threshold voltage of flash transistors can be modified at a fine granularity during programming, several advantages are obtained by our flash-based digital circuit design approach. For one, speed binning at the factory can be controlled with precision. Secondly, an IC can be re-programmed in the field, to negate effects such as aging, which has been a significant problem in recent times, particularly for mission-critical applications. Thirdly, unlike a regular MOSFET, which has one threshold voltage level, a flash transistor can have multiple threshold voltage levels. The benefit of having multiple threshold voltage levels in a flash transistor is that it allows the ability to encode more symbols in each device, unlike a regular MOSFET. This allows us to implement multi-valued logic functions natively. In this thesis, we evaluate different flash-based digital circuit design approaches and compare their performance with a traditional CMOS standard cell-based design approach. We begin by evaluating our design approach at the cell level to optimize the design’s delay, power energy and physical area characteristics. The flash-based approach is demonstrated to be better than the CMOS standard cell approach, for these performance metrics. Afterwards, we present the performance of our design approach at the block level. We describe a synthesis flow to decompose a circuit block into a network of interconnected flash-based circuit cells. We also describe techniques to optimize the resulting network of flash-based circuit cells using don’t cares. Our optimization approach distinguishes itself from other optimization techniques that use don’t cares, since it a) targets a flash-based design flow, b) optimizes clusters of logic nodes at once instead of one node at a time, c) attempts to reduce the number of cubes instead of reducing the number of literals in each cube and d) performs optimization on the post-technology mapped netlist which results in a direct improvement in result quality, as compared to pre-technology mapping logic optimization that is typically done in the literature. The resulting network characteristics (delay, power, energy and physical area) are presented. These results are compared with a standard cell-based realization of the same block (obtained using commercial tools) and we demonstrate significant improvements in all the design metrics. We also study flash-based FPGA designs (both static and dynamic), and present the tradeoff of delay, power dissipation and energy consumption of the various designs. Our work differs from previously proposed flash-based FPGAs, since we embed the flash transistors (which store the configuration bits) directly within the logic and interconnect fabrics. We also present a detailed description of how the programming of the configuration bits is accomplished, for all the proposed designs

    8x8 Reconfigurable quantum photonic processor based on silicon nitride waveguides

    Get PDF
    The development of large-scale optical quantum information processing circuits ground on the stability and reconfigurability enabled by integrated photonics. We demonstrate a reconfigurable 8x8 integrated linear optical network based on silicon nitride waveguides for quantum information processing. Our processor implements a novel optical architecture enabling any arbitrary linear transformation and constitutes the largest programmable circuit reported so far on this platform. We validate a variety of photonic quantum information processing primitives, in the form of Hong-Ou-Mandel interference, bosonic coalescence/anticoalescence and high-dimensional single-photon quantum gates. We achieve fidelities that clearly demonstrate the promising future for large-scale photonic quantum information processing using low-loss silicon nitride.Comment: Added supplementary materials, extended introduction, new figures, results unchange

    Memristor MOS Content Addressable Memory (MCAM): Hybrid Architecture for Future High Performance Search Engines

    Full text link
    Large-capacity Content Addressable Memory (CAM) is a key element in a wide variety of applications. The inevitable complexities of scaling MOS transistors introduce a major challenge in the realization of such systems. Convergence of disparate technologies, which are compatible with CMOS processing, may allow extension of Moore's Law for a few more years. This paper provides a new approach towards the design and modeling of Memristor (Memory resistor) based Content Addressable Memory (MCAM) using a combination of memristor MOS devices to form the core of a memory/compare logic cell that forms the building block of the CAM architecture. The non-volatile characteristic and the nanoscale geometry together with compatibility of the memristor with CMOS processing technology increases the packing density, provides for new approaches towards power management through disabling CAM blocks without loss of stored data, reduces power dissipation, and has scope for speed improvement as the technology matures.Comment: 10 pages, 11 figure

    High-Performance and Energy-Efficient Leaky Integrate-and-Fire Neuron and Spike Timing-Dependent Plasticity Circuits in 7nm FinFET Technology

    Get PDF
    In designing neuromorphic circuits and systems, developing compact and energy-efficient neuron and synapse circuits is essential for high-performance on-chip neural architectures. Toward that end, this work utilizes the advanced low-power and compact 7nm FinFET technology to design leaky integrate-and-fire (LIF) neuron and spike-timing-dependent plasticity (STDP) circuits. In the proposed STDP circuit, only six FinFETs and three small capacitors (two 10fF and 20fF) have been utilized to realize STDP learning. Moreover, 12 transistors and two capacitors (20fF) have been employed for designing the LIF neuron circuit. The evaluation results demonstrate that besides 60% area saving, the proposed STDP circuit achieves 68% improvement in total average power consumption and 43% lower energy dissipation compared to previous works. The proposed LIF neuron circuit demonstrates 34% area saving, 46% power, and 40% energy saving compared to its counterparts. The neuron can also tune the firing frequency within 5MHz-330MHz using an external control voltage. These results emphasize the potential of the proposed neuron and STDP learning circuits for compact and energy-efficient neuromorphic computing systems
    • …
    corecore