Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques
The rapid growth of demanding applications in domains such as multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and typical computing paradigms in embedded systems and data centers are therefore stressed to meet the worldwide demand for high performance. Concurrently, the semiconductor landscape of the last 15 years has established power as a first-class design concern. As a result, the computing-systems community is forced to find alternative design approaches that facilitate high-performance and/or power-efficient computing. Among the examined solutions, Approximate Computing has attracted ever-increasing interest, with research works applying approximations across the entire traditional computing stack, i.e., at the software, hardware, and architectural levels. Over the last decade, a plethora of approximation techniques has emerged in software (programs, frameworks, compilers, runtimes, languages), hardware (circuits, accelerators), and architectures (processors, memories). The current article is Part I of our comprehensive survey on Approximate Computing: it reviews its motivation, terminology, and principles, and it classifies and presents the technical details of state-of-the-art software and hardware approximation techniques.
Comment: Under Review at ACM Computing Surveys
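A classic example of the software-level approximation techniques in the survey's scope is loop perforation: skipping a fraction of loop iterations to trade accuracy for speed. A minimal illustrative sketch (the function and parameter names are ours, not from the survey):

```python
def mean_perforated(values, skip=2):
    """Approximate the mean via loop perforation: sample only every
    `skip`-th element, trading accuracy for reduced work."""
    sampled = values[::skip]
    return sum(sampled) / len(sampled)

data = list(range(1, 101))               # exact mean is 50.5
approx = mean_perforated(data, skip=4)   # touches 25 of 100 elements
```

With `skip=4` the loop does a quarter of the work and returns 49.0 instead of 50.5, illustrating the accuracy/effort trade-off that approximate-computing techniques exploit.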
Serial Biasing Technique for Rapid Single Flux Quantum Circuits
Superconductor electronics based on the Single Flux Quantum (SFQ) technology are considered a strong contender for the “beyond CMOS” future of digital circuits because of the high speed and low power dissipation associated with them. In fact, digital operations beyond tens of GHz have been routinely demonstrated in SFQ technology. These circuits have widespread applications such as high-speed analog-to-digital conversion, digital signal processing, high-speed computing, and emerging topics such as control circuitry for superconducting quantum computing.
Rapid Single Flux Quantum (RSFQ) circuits have emerged as a promising candidate within SFQ technology, with information encoded in picosecond-wide, millivolt voltage pulses. As with any integrated circuit technology, scalability of RSFQ circuits is essential to realizing their applications. These circuits, based on the Josephson junction, require a DC bias current for correct operation. The DC bias current requirement increases with circuit complexity, and this has multiple implications for circuit operation. Large currents produce magnetic fields that can interfere with logic operation. Furthermore, the heat load delivered to the superconducting chip also increases with current, which could result in the circuit becoming “normal” rather than superconducting. These problems make reduction of the bias current necessary.
Serial Biasing (SB) is a bias current reduction technique that has been proposed in the past. In this technique, a digital circuit is partitioned into multiple identical islands, and bias current is provided to each island in a serial manner. While this scheme is promising, it poses multiple challenges, such as designing a driver-receiver pair circuit with robust and wide operating bias margins, managing current on the floating islands, etc.
This thesis investigates SB in a systematic manner, focusing on the design and measurement of the fundamental components of this technique, with an emphasis on reliability and scalability. It presents circuit techniques that achieve high-speed, serially biased RSFQ circuits with robust operating margins, along with experimental evidence to support the ideas. It develops a framework for serial biasing that could be used by electronic design tools to automate the design and synthesis of complex RSFQ circuits. It also investigates Passive Transmission Lines (PTLs) for use as passive interconnects between library cells in a complex design, reducing the DC bias current required by the active circuitry.
Improving low latency applications for reconfigurable devices
This thesis seeks to improve low latency application performance via architectural improvements in reconfigurable devices. This is achieved by improving resource utilisation and access, and by exploiting the different environments within which reconfigurable devices are deployed.
Our first contribution leverages devices deployed at the network level to enable the low latency processing of financial market data feeds. Financial exchanges transmit messages via two identical data feeds to reduce the chance of message loss. We present an approach to arbitrate these redundant feeds at the network level using a Field-Programmable Gate Array (FPGA). With support for any messaging protocol, we evaluate our design using the NASDAQ TotalView-ITCH, OPRA, and ARCA data feed protocols, and provide two simultaneous outputs: one prioritising low latency, and one prioritising high reliability with three dynamically configurable windowing methods.
Our second contribution is a new ring-based architecture for low latency, parallel access to FPGA memory. Traditional FPGA memory is formed by grouping block memories (BRAMs) together and accessing them as a single device. Our architecture accesses these BRAMs independently and in parallel. Targeting memory-based computing, which stores pre-computed function results in memory, our architecture benefits low-latency applications that rely on highly complex functions, iterative computation, or many parallel accesses to a shared resource. We assess square root, power, trigonometric, and hyperbolic functions within the FPGA, and provide a tool to convert Python functions to our new architecture.
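The memory-based computing idea above replaces runtime arithmetic with a lookup into a table of pre-computed results. A minimal software analogue, assuming an illustrative 10-bit fixed-point input format (the table size and names are our own choices, not the thesis's design):

```python
import math

BITS = 10                  # table index width: 2**10 = 1024 entries
SCALE = (2 ** BITS) - 1

# Pre-compute sqrt(x) for x in [0, 1] at 10-bit input resolution,
# mimicking a result table held in BRAM on the FPGA.
SQRT_TABLE = [math.sqrt(i / SCALE) for i in range(2 ** BITS)]

def sqrt_lut(x):
    """Approximate sqrt for x in [0, 1] with a single table lookup,
    replacing an iterative square-root computation."""
    return SQRT_TABLE[round(x * SCALE)]
```

Each call costs one memory access regardless of how expensive the underlying function is, which is why the approach suits highly complex functions and many parallel accesses.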
Our third contribution extends the ring-based architecture to support any FPGA processing element. We unify E heterogeneous processing elements within compute pools, with each element implementing the same function, and the pool serving D parallel function calls. Our implementation-agnostic approach supports processing elements with different latencies, implementations, and pipeline lengths, as well as non-deterministic latencies. Compute pools evenly balance access to processing elements across the entire application, and are evaluated by implementing eight different neural network activation functions within an FPGA.
A Phase Change Memory and DRAM Based Framework For Energy-Efficient and High-Speed In-Memory Stochastic Computing
Convolutional Neural Networks (CNNs) have proven to be highly effective in various fields related to Artificial Intelligence (AI) and Machine Learning (ML). However, the significant computational and memory requirements of CNNs make their processing highly compute- and memory-intensive. In particular, the multiply-accumulate (MAC) operation, a fundamental building block of CNNs, requires an enormous number of arithmetic operations. As input dataset sizes increase, the traditional processor-centric von Neumann computing architecture becomes ill-suited for CNN-based applications. This results in exponentially higher latency and energy costs, making the processing of CNNs highly challenging.
To overcome these challenges, researchers have explored the Processing-In-Memory (PIM) technique, which involves placing the processing unit inside or near the memory unit. This approach reduces data migration length and utilizes the internal memory bandwidth at the memory chip level. However, developing a reliable PIM-based system with minimal hardware modifications and design complexity remains a significant challenge.
The proposed solution in the report suggests utilizing different memory technologies, such as Dynamic RAM (DRAM) and phase change memory (PCM), with stochastic arithmetic and minimal add-on logic. Stochastic computing is a technique that uses random bitstreams to perform arithmetic operations instead of the traditional binary representation. This technique reduces the hardware requirements for CNNs' arithmetic operations, making it possible to implement them with minimal add-on logic.
The report details the workflow for performing the arithmetic operations used by CNNs, including MAC, activation, and floating-point functions. The proposed solution includes designs for a scalable Stochastic Number Generator (SNG), a DRAM-based CNN accelerator, a non-volatile memory (NVM) class PCRAM-based CNN accelerator, and DRAM-based stochastic-to-binary (StoB) conversion for in-situ deep learning. These designs utilize stochastic computing to reduce the hardware requirements of CNNs' arithmetic operations and enable energy- and time-efficient processing of CNNs.
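In the standard unipolar stochastic-computing encoding the abstract refers to, a value p in [0, 1] is represented by a random bitstream whose fraction of 1s equals p, so multiplication of two independent streams reduces to a bitwise AND — a single gate instead of a full binary multiplier. A small sketch of the encoding (the stream length here is an illustrative choice, not from the report):

```python
import random

def to_stream(p, n, rng):
    """Encode p in [0, 1] as an n-bit stochastic stream: P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    """Decode: the fraction of 1s estimates the encoded value."""
    return sum(bits) / len(bits)

rng = random.Random(0)
n = 4096
a = to_stream(0.5, n, rng)
b = to_stream(0.25, n, rng)
prod = [x & y for x, y in zip(a, b)]   # bitwise AND multiplies the values
# from_stream(prod) ~ 0.5 * 0.25 = 0.125, up to stochastic noise
```

The price of the tiny arithmetic logic is precision that only improves with stream length, which is why the report pairs the technique with dedicated stochastic number generators and stochastic-to-binary converters.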
The report also identifies future research directions for the proposed designs, including in-situ PCRAM-based SNG, ODIN (A Bit-Parallel Stochastic Arithmetic Based Accelerator for In-Situ Neural Network Processing in Phase Change RAM), ATRIA (Bit-Parallel Stochastic Arithmetic Based Accelerator for In-DRAM CNN Processing), and AGNI (In-Situ, Iso-Latency Stochastic-to-Binary Number Conversion for In-DRAM Deep Learning), and presents initial findings for these ideas.
In summary, the proposed solution in the report offers a comprehensive approach to address the challenges of processing CNNs, and the proposed designs have the potential to improve the energy and time efficiency of CNNs significantly. Using stochastic computing and different memory technologies enables the development of reliable PIM-based systems with minimal hardware modifications and design complexity, providing a promising path for the future of CNN-based applications.
Undergraduate and Graduate Course Descriptions, 2023 Spring
Wright State University undergraduate and graduate course descriptions from Spring 2023.
Engineering metabolic time-sharing in a clonal Escherichia coli population
The “division of labour” strategy is common among microbial communities, as dividing burdensome tasks between members of a community alleviates the strain placed on individual cells. Exploiting this phenomenon in heterogeneous microbial co-cultures for industrial synthesis of valuable compounds is limited by inefficiencies in nutrient exchange and conflicting growth requirements. Here, we demonstrate a synthetic gene circuit that enables cells of an isogenic Escherichia coli population to carry out “metabolic time-sharing” by shifting between alternate metabolic states via temporal changes in gene expression. Further, we review techniques for monitoring such dynamic processes at the single-cell level, and discuss their current applications in bacterial studies. To validate that our circuit may be used to induce cooperative behaviours in microbial populations, we adapted this circuit to engineer cells that oscillate between distinct amino acid auxotrophy phenotypes, driven by the periodic silencing of key biosynthetic genes. Culturing a clonal time-sharing population with unsynchronized oscillators permits reciprocal amino acid cross-feeding, ultimately ensuring population viability. Through comparative growth experiments, we found that the fitness of our time-sharing population was comparable to that of a heterogeneous co-culture composed of E. coli auxotrophs similarly capable of cross-feeding amino acids. Although future studies would be needed to confirm this, our preliminary results suggest that metabolic time-sharing may be a viable alternative to synthetic heterogeneous co-cultures. As it may enable an entire complex biosynthetic pathway to be engineered into a single host with reduced metabolic burden, the metabolic time-sharing strategy demonstrated here could potentially be implemented for microbial bioproduction, among other widespread applications.
Improved precision scaling for simulating coupled quantum-classical dynamics
We present a super-polynomial improvement in the precision scaling of quantum
simulations for coupled classical-quantum systems in this paper. Such systems
are found, for example, in molecular dynamics simulations within the
Born-Oppenheimer approximation. By employing a framework based on the
Koopman-von Neumann formalism, we express the Liouville equation of motion as
unitary dynamics and utilize phase kickback from a dynamical quantum simulation
to calculate the quantum forces acting on classical particles. This approach
allows us to simulate the dynamics of these particles without the overheads
associated with measuring gradients and solving the equations of motion on a
classical computer, resulting in a super-polynomial advantage at the price of
increased space complexity. We demonstrate that these simulations can be
performed in both microcanonical and canonical ensembles, enabling the
estimation of thermodynamic properties from the prepared probability density.
Comment: 19 + 51 pages
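The Koopman-von Neumann construction mentioned above rests on a standard fact (stated here in its textbook form, not taken from the paper): the classical Liouville equation is generated by a self-adjoint operator on phase space, so it can be rewritten as Schrödinger-like unitary dynamics,

```latex
i\,\frac{\partial \psi(x,p,t)}{\partial t} = \hat{L}\,\psi(x,p,t),
\qquad
\hat{L} = -i \sum_j \left(
  \frac{\partial H}{\partial p_j}\,\frac{\partial}{\partial x_j}
  - \frac{\partial H}{\partial x_j}\,\frac{\partial}{\partial p_j}
\right),
```

where $H(x,p)$ is the classical Hamiltonian. Because $\hat{L}$ is self-adjoint on $L^2$ of phase space, the evolution $e^{-i\hat{L}t}$ is unitary, which is what allows quantum simulation algorithms to be applied to the classical degrees of freedom.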
Analysing and Reducing Costs of Deep Learning Compiler Auto-tuning
Deep Learning (DL) is significantly impacting many industries, including automotive, retail, and medicine, enabling autonomous driving, recommender systems, and genomics modelling, amongst other applications. At the same time, demand for complex and fast DL models is continually growing. The most capable models tend to exhibit the highest operational costs, primarily due to their large computational resource footprint and the inefficient utilisation of computational resources by DL systems. In an attempt to tackle these problems, DL compilers and auto-tuners emerged, automating the traditionally manual task of DL model performance optimisation. While auto-tuning improves model inference speed, it is a costly process, which limits its wider adoption within DL deployment pipelines.
The high operational costs associated with DL auto-tuning have multiple causes. During operation, DL auto-tuners explore large search spaces consisting of billions of tensor programs to propose potential candidates that improve DL model inference latency. Subsequently, DL auto-tuners measure candidate performance in isolation on the target device, which constitutes the majority of auto-tuning compute time. Suboptimal candidate proposals, combined with their serial measurement on an isolated target device, lead to prolonged optimisation time and reduced resource availability, ultimately reducing the cost-efficiency of the process.
In this thesis, we investigate the reasons behind prolonged DL auto-tuning and quantify their impact on optimisation costs, revealing directions for improved DL auto-tuner design. Based on these insights, we propose two complementary systems: Trimmer and DOPpler. Trimmer improves tensor program search efficacy by filtering out poorly performing candidates, and controls end-to-end auto-tuning using cost objectives, monitoring optimisation cost. Simultaneously, DOPpler breaks long-held assumptions about serial candidate measurement by successfully parallelising measurements intra-device, with minimal penalty to optimisation quality. Through extensive experimental evaluation of both systems, we demonstrate that they significantly improve the cost-efficiency of auto-tuning (by up to 50.5%) across a plethora of tensor operators, DL models, auto-tuners, and target devices.
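The candidate-filtering idea described above can be pictured as a cheap cost model screening proposals before the expensive on-device measurements. Everything in this sketch — the function names, the toy cost model, the keep-fraction split — is our own illustration of the general pattern, not Trimmer's actual implementation:

```python
def tune(candidates, predict_cost, measure, keep_fraction=0.1):
    """Rank candidate tensor programs by a cheap cost-model prediction,
    then run the expensive on-device measurement only on the most
    promising fraction, filtering out poorly performing candidates."""
    ranked = sorted(candidates, key=predict_cost)
    k = max(1, int(len(ranked) * keep_fraction))
    measured = [(measure(c), c) for c in ranked[:k]]  # expensive step
    return min(measured)  # (best measured latency, best candidate)

# Toy stand-ins: "candidates" are tile sizes; the cost model is a
# heuristic that only roughly agrees with the true latency.
cands = [1, 2, 4, 8, 16, 32, 64]
best_latency, best = tune(
    cands,
    predict_cost=lambda t: abs(t - 16),   # cheap, imperfect estimate
    measure=lambda t: abs(t - 8) + 1.0,   # "real" device latency
    keep_fraction=0.5,
)
```

Only half of the candidates reach the measurement step, which is where the cost savings come from; the risk, which such systems must manage, is that an imperfect cost model filters out the true optimum.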
Shedding in the Timber Rattlesnake: Natural Patterns, Endocrinological Underpinnings, Temporal and Energetic Effort, and Integration as a Reptilian Life History Trait
The semi-frequent replacement of the epidermis (ecdysis) is a characteristic trait of reptiles. Whereas all reptiles regularly engage in some degree of skin shedding, skin morphology in snakes necessitates the synchronous replacement of the entire epidermis and facilitates the subsequent removal of the old layer as a single sheet. To date, this ubiquitous process has garnered little attention from researchers because snakes shed with unpredictable timing and frequency and are exceedingly cryptic during ecdytic cycles, which has previously impeded detailed physiological or ecological investigations of the process in the clade. Because of this lack of study, ecdysis is often viewed as a maintenance function, occurring whenever change in body size necessitates the generation of a new epidermal layer. However, recent observations that skin shedding plays a role in conspecific sexual signaling in some snakes suggest that the predominant view of ecdysis as a growth function may be overly simplistic. By studying population-scale patterns of shed, I was able to describe the timing and frequency of ecdysis in a population of Timber Rattlesnakes, solving a long-standing problem in the continued study of ecdysis: predicting the occurrence of shed events. Coupling my knowledge of patterns of shed timing with novel methodologies for inducing shed, I was able to induce ecdytic cycles in a laboratory setting and herein provide the first measurements of the metabolic effort and duration of shedding in any reptile. I integrated data on the frequency, duration, and metabolic effort of shed into an individual-based computer model of the Timber Rattlesnake to address larger questions about the selective pressures that may shape patterns of shed in snakes. I found that Timber Rattlesnakes shed infrequently (1-2 times per year) and often in close proximity to the mating season regardless of sex.
However, the physiological conditions that best correlated with shed frequency differed between males (body condition) and females (reproductive condition). Each shed event required a significant metabolic (3% of the total annual energy budget) and temporal (~28 days at 25 °C, with ~50% of that including some degree of visual limitation from occluded spectacles) investment. In my computer simulations, I found that time spent in shed limited lifetime energy budgets (decreasing lifetime resource acquisition via foraging) and that the energetic effort of ecdysis may serve to limit shed frequency in low-resource environments. In my observations of patterns of shed in the wild and through simulations of expected female fecundity under alternate shed frequencies, I found evidence that ecdysis may play a vital role in the reproductive biology of rattlesnakes. Ecdysis is a vital and omnipresent feature of reptilian biology. My data are the first to demonstrate that the frequency of the process is constrained in a population. I provide evidence for the role of growth and body condition, time-energy budgets, environmental conditions, and reproductive events in dictating patterns of shed. As such, patterns of shed may be population specific and serve as an indicator of the important environmental and biophysical forces which shape life histories across populations and species.