27 research outputs found
Compact and High-Performance TCAM Based on Scaled Double-Gate FeFETs
Ternary content addressable memory (TCAM), widely used in network routers and
high-associativity caches, is gaining popularity in machine learning and
data-analytic applications. Ferroelectric FETs (FeFETs) are a promising
candidate for implementing TCAM owing to their high ON/OFF ratio,
non-volatility, and CMOS compatibility. However, conventional single-gate
FeFETs (SG-FeFETs) suffer from relatively high write voltage, low endurance,
potential read disturbance, and face scaling challenges. Recently, a
double-gate FeFET (DG-FeFET) has been proposed and outperforms SG-FeFETs in
many aspects. This paper investigates TCAM design challenges specific to
DG-FeFETs and introduces a novel 1.5T1Fe TCAM design based on DG-FeFETs. A
2-step search with early termination is employed to reduce the cell area and
improve energy efficiency. A shared driver design is proposed to reduce the
peripherals area. Detailed analysis and SPICE simulation show that the 1.5T1Fe
DG-TCAM leads to superior search speed and energy efficiency. The 1.5T1Fe TCAM
design can also be built with SG-FeFETs, which achieve search latency and
energy improvement compared with 2FeFET TCAM.Comment: Accepted by Design Automation Conference (DAC) 202
Long-Term Memory for Cognitive Architectures: A Hardware Approach Using Resistive Devices
A cognitive agent capable of reliably performing complex tasks over a long time will acquire a large store of knowledge. To interact with changing circumstances, the agent will need to quickly search and retrieve knowledge relevant to its current context. Real time knowledge search and cognitive processing like this is a challenge for conventional computers, which are not optimised for such tasks. This thesis describes a new content-addressable memory, based on resistive devices, that can perform massively parallel knowledge search in the memory array. The fundamental circuit block that supports this capability is a memory cell that closely couples comparison logic with non-volatile storage. By using resistive devices instead of transistors in both the comparison circuit and storage elements, this cell improves area density by over an order of magnitude compared to state of the art CMOS implementations. The resulting memory does not need power to maintain stored information, and is therefore well suited to cognitive agents with large long-term memories. The memory incorporates activation circuits, which bias the knowledge retrieval process according to past memory access patterns. This is achieved by approximating the widely used base-level activation function using resistive devices to store, maintain and compare activation values. By distributing an instance of this circuit to every row in memory, the activation for all memory objects can be updated in parallel. A test using the word sense disambiguation task shows this circuit-based activation model only incurs a small loss in accuracy compared to exact base-level calculations. A variation of spreading activation can also be achieved in-memory. Memory objects are encoded with high-dimensional vectors that create association between correlated representations. By storing these high-dimensional vectors in the new content-addressable memory, activation can be spread to related objects during search operations. The new memory is scalable, power and area efficient, and performs operations in parallel that are infeasible in real-time for a sequential processor with a conventional memory hierarchy.Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
Application Centric Networks-On-Chip Design Solutions for Future Multicore Systems
With advances in technology, future multicore systems scaled to 100s and 1000s of cores/accelerators are being touted as an effective solution for extracting huge performance gains using parallel programming paradigms. However with the failure of Dennard Scaling all the components on the chip cannot be run simultaneously without breaking the power and thermal constraints leading to strict chip power envelops. The scaling up of the number of on chip components has also brought upon Networks-On-Chip (NoC) based interconnect designs like 2D mesh. The contribution of NoC to the total on chip power and overall performance has been increasing steadily and hence high performance power-efficient NoC designs are becoming crucial.
Future multicore paradigms can be broadly classified, based on the applications they are tailored to, into traditional Chip Multi processor(CMP) based application based systems, characterized by low core and NoC utilization, and emerging big data application based systems, characterized by large amounts of data movement necessitating high throughput requirements. To this order, we propose NoC design solutions for power-savings in future CMPs tailored to traditional applications and higher effective throughput gains in multicore systems tailored to bandwidth intensive applications. First, we propose Fly-over, a light-weight distributed mechanism for power-gating routers attached to switched off cores to reduce NoC power consumption in low load CMP environment. Secondly, we plan on utilizing a promising next generation memory technology, Spin-Transfer Torque Magnetic RAM(STT-MRAM), to achieve enhanced NoC performance to satisfy the high throughput demands in emerging bandwidth intensive applications, while reducing the power consumption simultaneously. Thirdly, we present a hardware data approximation framework for NoCs, APPROX-NoC, with an online data error control mechanism, which can leverage the approximate computing paradigm in the emerging data intensive big data applications to attain higher performance per watt
Intelligent Computing: The Latest Advances, Challenges and Future
Computing is a critical driving force in the development of human
civilization. In recent years, we have witnessed the emergence of intelligent
computing, a new computing paradigm that is reshaping traditional computing and
promoting digital revolution in the era of big data, artificial intelligence
and internet-of-things with new computing theories, architectures, methods,
systems, and applications. Intelligent computing has greatly broadened the
scope of computing, extending it from traditional computing on data to
increasingly diverse computing paradigms such as perceptual intelligence,
cognitive intelligence, autonomous intelligence, and human-computer fusion
intelligence. Intelligence and computing have undergone paths of different
evolution and development for a long time but have become increasingly
intertwined in recent years: intelligent computing is not only
intelligence-oriented but also intelligence-driven. Such cross-fertilization
has prompted the emergence and rapid advancement of intelligent computing.
Intelligent computing is still in its infancy and an abundance of innovations
in the theories, systems, and applications of intelligent computing are
expected to occur soon. We present the first comprehensive survey of literature
on intelligent computing, covering its theory fundamentals, the technological
fusion of intelligence and computing, important applications, challenges, and
future perspectives. We believe that this survey is highly timely and will
provide a comprehensive reference and cast valuable insights into intelligent
computing for academic and industrial researchers and practitioners
SpiNNaker - A Spiking Neural Network Architecture
20 years in conception and 15 in construction, the SpiNNaker project has delivered the world’s largest neuromorphic computing platform incorporating over a million ARM mobile phone processors and capable of modelling spiking neural networks of the scale of a mouse brain in biological real time. This machine, hosted at the University of Manchester in the UK, is freely available under the auspices of the EU Flagship Human Brain Project. This book tells the story of the origins of the machine, its development and its deployment, and the immense software development effort that has gone into making it openly available and accessible to researchers and students the world over. It also presents exemplar applications from ‘Talk’, a SpiNNaker-controlled robotic exhibit at the Manchester Art Gallery as part of ‘The Imitation Game’, a set of works commissioned in 2016 in honour of Alan Turing, through to a way to solve hard computing problems using stochastic neural networks. The book concludes with a look to the future, and the SpiNNaker-2 machine which is yet to come
SpiNNaker - A Spiking Neural Network Architecture
20 years in conception and 15 in construction, the SpiNNaker project has delivered the world’s largest neuromorphic computing platform incorporating over a million ARM mobile phone processors and capable of modelling spiking neural networks of the scale of a mouse brain in biological real time. This machine, hosted at the University of Manchester in the UK, is freely available under the auspices of the EU Flagship Human Brain Project. This book tells the story of the origins of the machine, its development and its deployment, and the immense software development effort that has gone into making it openly available and accessible to researchers and students the world over. It also presents exemplar applications from ‘Talk’, a SpiNNaker-controlled robotic exhibit at the Manchester Art Gallery as part of ‘The Imitation Game’, a set of works commissioned in 2016 in honour of Alan Turing, through to a way to solve hard computing problems using stochastic neural networks. The book concludes with a look to the future, and the SpiNNaker-2 machine which is yet to come
A Practical Hardware Implementation of Systemic Computation
It is widely accepted that natural computation, such as brain computation, is far superior to typical computational approaches addressing tasks such as learning and parallel processing. As conventional silicon-based technologies are about to reach their physical limits, researchers have drawn inspiration from nature to found new computational paradigms. Such a newly-conceived paradigm is Systemic Computation (SC). SC is a bio-inspired model of computation. It incorporates natural characteristics and defines a massively parallel non-von Neumann computer architecture that can model natural systems efficiently. This thesis investigates the viability and utility of a Systemic Computation hardware implementation, since prior software-based approaches have proved inadequate in terms of performance and flexibility. This is achieved by addressing three main research challenges regarding the level of support for the natural properties of SC, the design of its implied architecture and methods to make the implementation practical and efficient. Various hardware-based approaches to Natural Computation are reviewed and their compatibility and suitability, with respect to the SC paradigm, is investigated. FPGAs are identified as the most appropriate implementation platform through critical evaluation and the first prototype Hardware Architecture of Systemic computation (HAoS) is presented. HAoS is a novel custom digital design, which takes advantage of the inbuilt parallelism of an FPGA and the highly efficient matching capability of a Ternary Content Addressable Memory. It provides basic processing capabilities in order to minimize time-demanding data transfers, while the optional use of a CPU provides high-level processing support. It is optimized and extended to a practical hardware platform accompanied by a software framework to provide an efficient SC programming solution. The suggested platform is evaluated using three bio-inspired models and analysis shows that it satisfies the research challenges and provides an effective solution in terms of efficiency versus flexibility trade-off
High-Density Solid-State Memory Devices and Technologies
This Special Issue aims to examine high-density solid-state memory devices and technologies from various standpoints in an attempt to foster their continuous success in the future. Considering that broadening of the range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all the most relevant solid-state memory devices and technologies currently on stage. Even the subjects dealt with in this Special Issue are widespread, ranging from process and design issues/innovations to the experimental and theoretical analysis of the operation and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories to pursue new computing paradigms