10 research outputs found

    Computation-in-Memory based on Memristive Devices

    No full text
    In-memory computing is a promising computing paradigm due to its capability to alleviate the memory bottleneck. Its potential is even higher when it is implemented using memristive devices, or memristors, which offer beneficial characteristics such as nonvolatility, high scalability, near-zero standby power consumption, high density, and CMOS compatibility. The exploration of in-memory computing architectures in combination with memristor technology is still in its infancy, and therefore faces challenges with respect to the development of devices, circuits, architectures, compilers, and applications. This thesis focuses on exploring and developing in-memory computing in terms of architectures (including classification, limited instruction-set schemes, micro-architecture, communication and controller, as well as automation and a simulator) and circuits (including a logic synthesis flow and interconnect network schemes).

    GPU-based simulation of brain neuron models

    No full text
    The human brain is an incredible system that can process, store, and transfer information at high speed and volume. Inspired by this system, engineers and scientists are cooperating to construct a digital brain with the same characteristics. The brain is composed of billions of neurons, which can be modeled by mathematical equations, and a first step towards that goal is to simulate these neuron models in real time. The Inferior Olive (IO) model was selected to explore the real-time simulation of a large neuron network. The model is quite complex, with three compartments that are based on the Hodgkin-Huxley model. Although the Hodgkin-Huxley model is considered the most biologically plausible neuron model, it has high computational complexity, and the three compartments make the model even more computationally intensive. A CPU platform takes a long time to simulate such a complex model, and FPGA platforms do not handle floating-point operations efficiently. With its capability for high-performance computing and floating-point operations, the GPU promises to accelerate such computationally intensive applications. In this thesis, GPUs of the two latest Nvidia architectures are used to simulate the IO model in a network setting. On both platforms, performance improves significantly compared with the CPU platform: the speed-up of double-precision simulation is 68.1 on the Tesla C2075 and 21.0 on the GeForce GT640, and single-precision simulation is nearly twice as fast as double-precision simulation. The performance of the GeForce GT640 is 67% lower than that of the Tesla C2075, while its cost efficiency is eight times higher. Real-time execution is achieved with approximately 256 neural cells. In conclusion, the Tesla C2075 platform is essential for double-precision simulation, while the GeForce GT640 platform is more suitable for reducing the execution time of single-precision simulation.
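    The Hodgkin-Huxley formalism underlying the IO model can be sketched with a minimal single-compartment forward-Euler integration. This sketch uses the textbook squid-axon parameters for illustration only; it is not the three-compartment IO model simulated in the thesis.

```python
import math

# Standard Hodgkin-Huxley gating-variable rate functions (V in mV).
def alpha_n(V): return 0.01 * (V + 55) / (1 - math.exp(-(V + 55) / 10))
def beta_n(V):  return 0.125 * math.exp(-(V + 65) / 80)
def alpha_m(V): return 0.1 * (V + 40) / (1 - math.exp(-(V + 40) / 10))
def beta_m(V):  return 4.0 * math.exp(-(V + 65) / 18)
def alpha_h(V): return 0.07 * math.exp(-(V + 65) / 20)
def beta_h(V):  return 1.0 / (1 + math.exp(-(V + 35) / 10))

def simulate(I_ext=10.0, dt=0.01, steps=5000):
    """Forward-Euler integration of one HH compartment.
    I_ext in uA/cm^2, dt in ms; returns the membrane-potential trace."""
    C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3        # capacitance, conductances
    ENa, EK, EL = 50.0, -77.0, -54.387            # reversal potentials (mV)
    V = -65.0
    # Start gating variables at their steady state for the resting potential.
    n = alpha_n(V) / (alpha_n(V) + beta_n(V))
    m = alpha_m(V) / (alpha_m(V) + beta_m(V))
    h = alpha_h(V) / (alpha_h(V) + beta_h(V))
    trace = []
    for _ in range(steps):
        INa = gNa * m**3 * h * (V - ENa)
        IK = gK * n**4 * (V - EK)
        IL = gL * (V - EL)
        V += dt * (I_ext - INa - IK - IL) / C
        n += dt * (alpha_n(V) * (1 - n) - beta_n(V) * n)
        m += dt * (alpha_m(V) * (1 - m) - beta_m(V) * m)
        h += dt * (alpha_h(V) * (1 - h) - beta_h(V) * h)
        trace.append(V)
    return trace
```

Even this single compartment requires several exponentials and multiplications per state variable per time step, which is why a three-compartment network of such cells is so costly on a CPU and benefits from the GPU's floating-point throughput.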

    Memristive devices for computation-in-memory

    No full text
    CMOS technology and its continuous scaling have made electronics and computers accessible and affordable for almost everyone on the globe; in addition, they have enabled solutions for a wide range of societal problems and applications. Today, however, both the technology and the computer architectures face severe challenges, or "walls", that make them incapable of providing the demanded computing power under tight constraints. This motivates the exploration of novel architectures based on new device technologies, not only to sustain the financial benefit of technology scaling, but also to develop solutions for extremely demanding emerging applications. This paper presents two computation-in-memory-based accelerators that make use of emerging memristive devices: the Memristive Vector Processor and the RRAM Automata Processor. Preliminary results for these two accelerators show significant improvements in latency, energy, and area compared with today's architectures and designs.

    Memristive Device Based Circuits for Computation-in-Memory Architectures

    No full text
    Emerging computing applications (such as big data and the Internet of Things) are extremely demanding in terms of storage, energy, and computational efficiency, while today's architectures and device technologies face major challenges that make them incapable of meeting these demands. The Computation-in-Memory (CIM) architecture based on memristive devices is one of the alternative computing architectures being explored to address these limitations. Enabling such architectures relies on the development of efficient memristive circuits that can perform logic and arithmetic operations within the non-volatile memory core. This paper addresses memristive circuit designs for CIM architectures. It gives a complete overview of the designs for both logic and arithmetic operations, and presents the most popular designs in detail. In addition, it analyzes and classifies them, shows how they result in different CIM flavours, and explains how these architectures distinguish themselves from traditional ones. The paper also presents different potential applications whose kernels could be accelerated and that could therefore significantly benefit from CIM architectures.
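    As an illustration of how logic can be performed inside a memristive array, here is a behavioral sketch of stateful NOR-based logic in the style of schemes such as MAGIC. It models memristor states as plain bits and ignores device physics; the helper names are illustrative assumptions, not taken from the paper.

```python
def magic_nor(inputs, out_state=1):
    """Behavioral model of a stateful in-memory NOR: the output memristor,
    pre-initialized to logic '1' (low resistance), is switched to '0'
    whenever any input memristor stores '1'."""
    assert out_state == 1, "stateful NOR needs the output cell initialized to 1"
    return 0 if any(inputs) else 1

# NOR is functionally complete, so other gates are built from it,
# each extra gate costing one more in-array operation (cycle).
def magic_not(a):
    return magic_nor([a])

def magic_or(a, b):
    return magic_not(magic_nor([a, b]))

def magic_and(a, b):
    # De Morgan: a AND b == NOR(NOT a, NOT b)
    return magic_nor([magic_not(a), magic_not(b)])
```

Because every derived gate adds an operation on a fresh output cell, arithmetic circuits built this way trade latency (cycles) and cell count against the data movement they eliminate, which is exactly the design space such circuit papers explore.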

    A Mapping Methodology of Boolean Logic Circuits on Memristor Crossbar

    No full text
    Alternatives to CMOS logic circuit implementations are being researched for future scaled electronics. Memristor crossbar-based logic circuits are among the promising candidates to at least partially replace CMOS technology, which faces many challenges such as reduced scalability, reliability, and performance gains. The memristor crossbar offers many advantages, including scalability, high integration density, and nonvolatility. However, the state of the art in memristor crossbar logic design can only implement simple and small circuits. This paper proposes a methodology for mapping large Boolean logic circuits onto a memristor crossbar. Appropriate place-and-route schemes that efficiently map the circuits on the crossbar, as well as several optimization schemes, are also proposed. To illustrate the potential of the methodology, a multibit adder and nine other more complex benchmarks are studied; the delay, area, and power consumption induced by both the crossbar and its CMOS control part are evaluated.
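    The scheduling side of such a mapping can be hinted at with a toy levelizer: gates whose fanins are ready are grouped into one crossbar evaluation step, so the number of levels approximates the crossbar delay. The netlist representation and the `schedule_levels` helper are simplifying assumptions for illustration, not the paper's actual place-and-route algorithm.

```python
def schedule_levels(netlist):
    """netlist: {gate_name: [fanin names]}; names absent from the dict
    are primary inputs (level 0). Returns the evaluation level of each
    gate; gates on the same level can share one crossbar operation,
    so max(level) approximates the mapped circuit's delay in steps."""
    level = {}

    def lv(g):
        if g not in netlist:          # primary input: ready at level 0
            return 0
        if g not in level:
            level[g] = 1 + max(lv(f) for f in netlist[g])
        return level[g]

    for g in netlist:
        lv(g)
    return level
```

For example, a chain `g1 = f(a, b)`, `g2 = f(g1, c)`, `g3 = f(g1, g2)` levelizes to depths 1, 2, 3, i.e. three crossbar steps regardless of how many gates each level holds.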

    Time-division Multiplexing Automata Processor

    No full text
    The Automata Processor (AP) is a special implementation of non-deterministic finite automata that performs pattern matching by exploring parallel state transitions. The implementation typically contains a hierarchical switching network, which causes long latency. This paper proposes a methodology to split such a hierarchical switching network into multiple pipelined stages, making it possible to process several input sequences in parallel using time-division multiplexing. We use a new resistive-RAM-based AP (instead of the known DRAM- or SRAM-based ones) to illustrate the potential of our method. The experimental results show that our approach increases throughput by almost a factor of 2 at the cost of a marginal area overhead.
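    The throughput argument can be sketched with a back-of-the-envelope cycle count: without pipelining, each symbol traverses the whole switching network before the next one enters; with pipelined stages and time-division multiplexing, a symbol from the next stream (round-robin) enters every cycle. The two helpers below are an illustrative model under those assumptions, not the paper's simulator.

```python
def cycles_unpipelined(n_symbols, depth):
    """Unpipelined switching network: every symbol occupies all `depth`
    stages before the next symbol may enter."""
    return n_symbols * depth

def cycles_tdm(n_streams, n_symbols, depth):
    """Pipelined network with time-division multiplexing: each cycle a
    symbol from the next stream enters, so the stages stay busy; the
    last symbol drains through depth-1 extra cycles."""
    return n_streams * n_symbols + depth - 1
```

With a 2-stage pipeline and two streams of 1000 symbols each, the unpipelined design needs 2000 cycles for 1000 symbols, while the TDM design needs 2001 cycles for 2000 symbols: throughput improves by almost exactly a factor of 2, matching the abstract's claim.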

    APmap: An Open-Source Compiler for Automata Processors

    No full text
    A novel type of hardware accelerator, the automata processor (AP), has been proposed to accelerate finite-state automata. The backbone of an AP is a hierarchical routing matrix that connects many memory arrays. With this structure, an AP can process an input symbol every clock cycle and hence achieve much higher performance than conventional architectures. However, design automation for APs is not well researched. This article proposes a fully automated tool named APmap for mapping automata onto APs that use a two-level routing matrix. APmap first partitions a large automaton into small graphs and then maps them; multiple transformations are applied to the automaton to meet hardware constraints. Experiments on a standard benchmark suite show that our approach leads to around 19% less storage utilization than the state of the art.
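    The partitioning step can be illustrated with a greedy BFS partitioner that grows each partition up to a fixed state budget, a stand-in for the capacity of one AP memory array. This is a simplified sketch under that assumption, not the actual APmap algorithm.

```python
from collections import deque

def partition_automaton(adj, max_states):
    """adj: {state: [successor states]} of an automaton's transition graph.
    Greedily grows each partition by BFS from an unassigned state until it
    holds max_states states; returns the list of partitions (state lists)."""
    assigned = {}
    parts = []
    for start in adj:
        if start in assigned:
            continue
        part = []
        q = deque([start])
        while q and len(part) < max_states:
            s = q.popleft()
            if s in assigned:
                continue
            assigned[s] = len(parts)
            part.append(s)
            # Prefer keeping successors in the same partition to reduce
            # inter-array routing through the two-level matrix.
            q.extend(t for t in adj[s] if t not in assigned)
        parts.append(part)
    return parts
```

On a 10-state chain with a budget of 4 states per array, this yields partitions of sizes 4, 4, and 2; a real mapper would additionally minimize the transitions that cross partition boundaries, since those consume routing-matrix resources.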

    The Power of Computation-in-Memory Based on Memristive Devices

    No full text
    Conventional computing architectures, and the CMOS technology they are based on, face major challenges such as the memory bottleneck, which makes memory access for data transfer a major drain on both energy and performance. The computation-in-memory (CIM) paradigm is seen as a potential alternative that could alleviate these problems by adding computational resources to the memory, significantly reducing communication. Memristive devices are promising enablers of such a CIM paradigm, as they can support both storage and computing. This paper shows the power of the memristive-device-based CIM paradigm in enabling new, efficient application-specific architectures as well as efficient implementations of some known domain-specific architectures. In addition, the paper discusses the potential applications that could benefit from this paradigm and highlights the major challenges.

    A Classification of Memory-Centric Computing

    No full text
    Technological and architectural improvements have constantly been required to sustain the demand for faster and cheaper computers. However, CMOS down-scaling is suffering from three technology walls: the leakage wall, the reliability wall, and the cost wall. On top of that, the performance increase due to architectural improvements is also gradually saturating because of three well-known architecture walls: the memory wall, the power wall, and the instruction-level parallelism (ILP) wall. Hence, a lot of research focuses on proposing and developing new technologies and architectures. In this article, we present a comprehensive classification of memory-centric computing architectures based on three metrics: computation location, level of parallelism, and the memory technology used. The classification not only provides an overview of existing architectures with their pros and cons, but also unifies the terminology that uniquely identifies these architectures and highlights potential future architectures that can be further explored. It thereby sets a direction for future research in the field.

    A Survey on Memory-centric Computer Architectures

    No full text
    The constant demand for faster and cheaper computers has driven technological and architectural improvements. However, current technology suffers from three technology walls: the leakage wall, the reliability wall, and the cost wall. Meanwhile, the performance of existing architectures is also saturating due to three well-known architecture walls: the memory wall, the power wall, and the instruction-level parallelism (ILP) wall. Hence, many novel technologies and architectures have been introduced and are being developed intensively. Our previous work presented a comprehensive classification and broad overview of memory-centric computer architectures. In this article, we discuss the most important classes of memory-centric architectures thoroughly and evaluate their advantages and disadvantages. Moreover, for each class, the article provides a comprehensive survey of the memory-centric architectures available in the literature.