20 research outputs found
Beyond Moore's technologies: operation principles of a superconductor alternative
The predictions of Moore's law are considered by experts to be valid until 2020, giving rise to "post-Moore's" technologies afterwards. Energy efficiency is one of the major challenges in high-performance computing that must be addressed. Superconductor digital technology is a promising post-Moore's alternative for the development of supercomputers. In this paper, we consider the operation principles of energy-efficient superconductor logic and memory circuits, with a short retrospective review of their evolution. We analyze their shortcomings with respect to computer circuit design, and outline possible directions for further research.
New Logic-In-Memory Paradigms: An Architectural and Technological Perspective
Processing systems are in continuous evolution thanks to constant technological advancement and architectural progress. Over the years, computing systems have become more and more powerful, providing support for applications, such as Machine Learning, that require high computational power. However, the growing complexity of modern computing units and applications has had a strong impact on power consumption. In addition, the memory plays a key role in the overall power consumption of the system, especially for data-intensive applications. These applications require a lot of data movement between the memory and the computing unit, with a twofold consequence: memory accesses are expensive in terms of energy, and a lot of time is wasted in accessing the memory rather than processing, because of the performance gap between memories and processing units. This gap is known as the memory wall or the von Neumann bottleneck and is due to the different rates of progress of complementary metal-oxide semiconductor (CMOS) technology and memories. Moreover, CMOS scaling is itself reaching a limit beyond which further progress will not be possible. This work addresses these problems from an architectural and technological point of view by: (1) proposing a novel Configurable Logic-in-Memory Architecture that exploits the in-memory computing paradigm to reduce the memory wall problem while also providing high performance thanks to its flexibility and parallelism; and (2) exploring a non-CMOS technology as a candidate technology for the Logic-in-Memory paradigm.
In-memory computing on a photonic platform
Collocated data processing and storage are the norm in biological computing systems such as the mammalian brain. As our ability to create better hardware improves, new computational paradigms beyond von Neumann architectures are being explored. Integrated photonic circuits are an attractive solution for on-chip computing that can leverage the increased speed and bandwidth potential of the optical domain and, importantly, remove the need for electro-optical conversions. Here we show that we can combine integrated optics with collocated data storage and processing to enable all-photonic in-memory computations. By employing nonvolatile photonic elements based on the phase-change material Ge2Sb2Te5, we achieve direct scalar and matrix-vector multiplication, featuring a novel single-shot write/erase and a drift-free process. The output pulse, carrying the information of the light-matter interaction, is the result of the computation. Our all-optical approach is novel, easy to fabricate and operate, and sets the stage for the development of entirely photonic computers.
Custom Memory Design for Logic-in-Memory: Drawbacks and Improvements over Conventional Memories
The speed of modern digital systems is severely limited by memory latency (the "Memory Wall" problem). Data exchange between logic and memory is also responsible for a large part of the system's energy consumption. Logic-in-Memory (LiM) represents an attractive solution to this problem: by performing part of the computations directly inside the memory, the system speed can be improved while reducing its energy consumption. The LiM solutions that offer the largest boost in performance are based on modifying the memory cell. However, what is the cost of such modifications, and how do they impact memory array performance? In this work, these questions are addressed by analysing a LiM memory array implementing an algorithm for maximum/minimum value computation. The memory array is designed at the physical level using the FreePDK 45 nm CMOS process, with three memory cell variants, and its performance is compared to SRAM and CAM memories. The results highlight that read and write performance is degraded, but in-memory operations prove very efficient: a 55.26% reduction in the energy-delay product is measured for the AND operation with respect to an SRAM read. The LiM approach is therefore a very promising solution for low-density, high-performance memories.
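The maximum search such an array performs is typically bit-serial. A minimal software model (a sketch of the usual scheme, not the paper's circuit) scans one bit column at a time from MSB to LSB and keeps only the words that can still be the maximum:

    # Bit-serial in-memory maximum search: assumed 8-bit unsigned words.
    def in_memory_max(words, bits=8):
        candidates = list(words)           # every row starts as a candidate
        for b in range(bits - 1, -1, -1):  # MSB -> LSB, one column per step
            ones = [w for w in candidates if (w >> b) & 1]
            if ones:                       # a 1 in this column beats any 0
                candidates = ones
        return candidates[0]               # all survivors are equal

    values = [23, 200, 157, 89, 200, 4]
    assert in_memory_max(values) == max(values)

Each outer step corresponds to one parallel in-memory operation over a bit column, which is why the search needs only as many steps as there are bits, independent of the number of stored words.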
Hybrid Analog-Digital Co-Processing for Scientific Computation
In the past 10 years, computer architecture research has moved toward more heterogeneity and less adherence to conventional abstractions. Scientists and engineers hold an unshakable belief that computing holds keys to unlocking humanity's Grand Challenges. Acting on that belief, they have looked deeper into computer architecture to find specialized support for their applications. Likewise, computer architects have looked deeper into circuits and devices in search of untapped performance and efficiency. The lines between computer architecture layers (applications, algorithms, architectures, microarchitectures, circuits, and devices) have blurred. Against this backdrop, a menagerie of computer architectures is on the horizon: ones that forgo basic assumptions about computer hardware and require new thinking about how such hardware supports problems and algorithms.
This thesis revisits hybrid analog-digital computing in support of diverse modern workloads. Hybrid computing had extensive applications in early computing history and has been revisited for small-scale applications in embedded systems. But architectural support for using hybrid computing in modern workloads, at scale and with high-accuracy solutions, has been lacking.
I demonstrate solving a variety of scientific computing problems, including stochastic ODEs, partial differential equations, linear algebra, and nonlinear systems of equations, as case studies in hybrid computing. I solve these problems on a system of multiple prototype analog accelerator chips built by a team at Columbia University. On that team I made contributions toward programming the chips, building the digital interface, and validating the chips' functionality. The analog accelerator chip is intended for use in conjunction with a conventional digital host computer.
The appeal of an analog accelerator is efficiency and performance, but it comes with limitations in accuracy and problem size that we have to work around.
The first problem is how to express problems in this unconventional computation model. Scientific computing phrases problems as differential equations and algebraic equations. Differential equations are a continuous view of the world, while algebraic equations are a discrete one. Prior work in analog computing focused mostly on differential equations; algebraic equations played only a minor role. The secret to using the analog accelerator to support modern workloads on conventional computers is that these two viewpoints are interchangeable: the algebraic equations that underlie most workloads can be solved as differential equations, and differential equations are naturally solvable in the analog accelerator chip. A hybrid analog-digital computer architecture can therefore focus on solving linear and nonlinear algebra problems to support many workloads.
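As a worked illustration of this interchangeability (a sketch of the general technique, not code from the thesis): the linear system Ax = b can be recast as the ODE dx/dt = b - Ax, whose steady state is the algebraic solution. An analog accelerator integrates such a flow in continuous time; here forward Euler stands in for the integrator.

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])     # positive definite, so the flow settles
    b = np.array([1.0, 2.0])

    x = np.zeros(2)                # initial condition
    dt = 0.01                      # emulated integration step
    for _ in range(5000):          # evolve dx/dt = b - A x to steady state
        x += dt * (b - A @ x)

    assert np.allclose(x, np.linalg.solve(A, b), atol=1e-6)
    print(x)                       # ~ [0.0909, 0.6364]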
The second problem is how to get accurate solutions using hybrid analog-digital computing. The analog computation model gives less accurate solutions because it gives up representing numbers as digital binary values, instead using the full range of analog voltage and current to represent real numbers. Prior work has established that encoding data in analog signals gives an energy-efficiency advantage as long as the analog data precision is limited. While the analog accelerator alone may be useful for energy-constrained applications where inputs and outputs are imprecise, we are more interested in using analog in conjunction with digital for precise solutions. This thesis gives the novel insight that the trick is to solve nonlinear problems, for which low-precision guesses are useful starting points for conventional digital algorithms.
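A minimal sketch of that insight, with the analog accelerator emulated by quantizing the answer to a few bits (an assumption made purely for illustration): the low-precision guess seeds Newton's method, whose quadratic convergence restores full digital precision in a handful of steps.

    # Solve x^3 = 2. The "analog" stage returns only ~4 fractional bits.
    f = lambda x: x**3 - 2.0
    df = lambda x: 3.0 * x**2

    true_root = 2.0 ** (1.0 / 3.0)
    analog_guess = round(true_root * 16) / 16   # emulated low-precision solve

    x = analog_guess
    for _ in range(4):                          # digital Newton refinement
        x -= f(x) / df(x)

    print(analog_guess, abs(analog_guess - true_root))  # error ~1e-2
    print(x, abs(x - true_root))                        # ~machine epsilon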
The third problem is how to solve large problems using hybrid analog-digital computing. The analog computation model cannot handle large problems because it gives up step-by-step discrete-time operation, instead allowing variables to evolve smoothly in continuous time. To make that happen, the analog accelerator chains hardware for mathematical operations end-to-end; during computation, analog data flows through the hardware with no overhead from control logic or memory accesses. The downside is that the required hardware size grows with the problem size. While scientific computing researchers have long split large problems into smaller subproblems to fit the constraints of digital computers, this thesis is a first attempt to treat these divide-and-conquer algorithms as an essential tool in using the analog model of computation.
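One standard divide-and-conquer pattern, sketched here under assumed sizes (my example, not the thesis's system): a block-Jacobi iteration in which each diagonal block is small enough to fit the accelerator, while an outer digital loop stitches the per-block solves into the global solution.

    import numpy as np

    rng = np.random.default_rng(1)
    n, blk = 8, 2                       # 8 unknowns; accelerator "fits" 2
    A = rng.uniform(size=(n, n)) + 10 * np.eye(n)   # diagonally dominant
    b = rng.uniform(size=n)

    x = np.zeros(n)
    for _ in range(150):                # outer digital iteration
        x_new = x.copy()
        for i in range(0, n, blk):      # each block: one accelerator-sized solve
            s = slice(i, i + blk)
            r = b[s] - A[s] @ x + A[s, s] @ x[s]   # residual vs. other blocks
            x_new[s] = np.linalg.solve(A[s, s], r)
        x = x_new

    assert np.allclose(x, np.linalg.solve(A, b), atol=1e-8)
    print(x)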
As we enter the post-Moore's law era of computing, unconventional architectures will offer specialized models of computation that uniquely support specific problem types. Two prominent examples are deep neural networks and quantum computers. Recent trends in computer science research suggest these unconventional architectures will soon see broad adoption. In this thesis I show that another specialized, unconventional architecture, the analog accelerator, can solve problems in scientific computing. Computer architecture researchers will discover other important models of computation in the future. This thesis is an example of the discovery process, implementation, and evaluation of how an unconventional architecture supports specialized workloads.
A Model for the Evaluation of Monostable Molecule Signal Energy in Molecular Field-Coupled Nanocomputing
Molecular Field-Coupled Nanocomputing (FCN) is a computational paradigm promising high-frequency information elaboration at ambient temperature. This work proposes a model to evaluate the signal energy involved in propagating and elaborating information. It splits the evaluation into several energy contributions calculated with closed-form expressions, avoiding computationally expensive calculations. The essential features of the 1,4-diallylbutane cation are evaluated with Density Functional Theory (DFT) and used in the model to evaluate circuit energy. The model enables an understanding of the information propagation mechanism in the FCN paradigm based on monostable molecules. We use the model to verify the bistable factor theory, which describes information propagation in molecular FCN based on monostable molecules and has so far been analyzed only from an electrostatic standpoint. Finally, the model is integrated into the SCERPA tool and used to quantify the stability of information encoding and possible memory effects. The obtained results are consistent with state-of-the-art considerations and comparable with DFT calculations.
An Ultra-Energy-Efficient Reversible Quantum-Dot Cellular Automata 8:1 Multiplexer Circuit
Energy efficiency, in terms of reduced power dissipation, is a significant issue in the design of digital circuits for very-large-scale integration (VLSI) systems. Quantum-dot cellular automata (QCA) is an emerging ultralow-power-dissipation approach, distinct from traditional complementary metal-oxide semiconductor (CMOS) technology, for building digital computing circuits. Developing fully reversible QCA circuits has the potential to reduce energy dissipation significantly. Multiplexers are fundamental elements in the construction of useful digital circuits. In this paper, a novel multilayer, fully reversible QCA 8:1 multiplexer circuit with ultralow energy dissipation is introduced. The power dissipation of the proposed multiplexer is simulated with the QCADesigner-E version 2.2 tool, which models the microscopic physical mechanisms underlying QCA operation. The results show that the proposed reversible QCA 8:1 multiplexer consumes 89% less energy than the most energy-efficient 8:1 multiplexer circuit previously presented in the literature.
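For reference, the function being implemented is the standard 8:1 multiplexer: three select bits choose which of eight data inputs reaches the output. The sketch below is a plain behavioral model, not the QCA layout; in a reversible realization the circuit must additionally preserve enough information to reconstruct its inputs, which is the property exploited to reduce dissipation.

    # Behavioral 8:1 multiplexer: out = d[index], index = (s2 s1 s0).
    def mux8(d, s2, s1, s0):
        index = (s2 << 2) | (s1 << 1) | s0   # decode the three select lines
        return d[index]

    data = [0, 1, 1, 0, 1, 0, 0, 1]
    assert mux8(data, 1, 0, 1) == data[5]    # s = 101 selects input d5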