autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components
Approximate computing is an emerging paradigm for developing highly
energy-efficient computing systems such as various accelerators. In the
literature, many libraries of elementary approximate circuits have already been
proposed to simplify the design process of approximate accelerators. Because
these libraries contain from tens to thousands of approximate implementations
for a single arithmetic operation it is intractable to find an optimal
combination of approximate circuits in the library even for an application
consisting of a few operations. An open problem is "how to effectively combine
circuits from these libraries to construct complex approximate accelerators".
This paper proposes a novel methodology for searching, selecting and combining
the most suitable approximate circuits from a set of available libraries to
generate an approximate accelerator for a given application. To enable fast
design space generation and exploration, the methodology utilizes machine
learning techniques to create computational models estimating the overall
quality of processing and hardware cost without performing full synthesis at
the accelerator level. Using the methodology, we construct hundreds of
approximate accelerators (for a Sobel edge detector) showing different but
relevant tradeoffs between the quality of processing and hardware cost and
identify a corresponding Pareto-frontier. Furthermore, when searching for
approximate implementations of a generic Gaussian filter consisting of 17
arithmetic operations, the proposed approach allows us to identify
approximately highly important implementations from possible
solutions in a few hours, while the exhaustive search would take four months on
a high-end processor.
Comment: Accepted for publication at the Design Automation Conference 2019
(DAC'19), Las Vegas, Nevada, US
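To make the Pareto-frontier step in the abstract above concrete, here is a minimal sketch that filters candidate accelerators described by (error, cost) pairs, where lower is better on both axes. The function name `pareto_front` and the sample values are illustrative assumptions; they are not taken from the autoAx paper, which also relies on learned estimators that are not modelled here.

```python
def pareto_front(points):
    """Return the points that are not dominated by any other point.

    Each point is an (error, cost) tuple; lower is better on both axes.
    """
    front = []
    best_cost = float("inf")
    # Sorting by error (then cost) lets one sweep track the lowest cost so far:
    # a point is kept only if it is cheaper than every point with lower or equal error.
    for error, cost in sorted(set(points)):
        if cost < best_cost:
            front.append((error, cost))
            best_cost = cost
    return front


# Hypothetical candidates: (quality loss, hardware cost) in arbitrary units.
candidates = [(0.01, 120), (0.02, 100), (0.02, 130),
              (0.05, 90), (0.04, 110), (0.10, 95)]
print(pareto_front(candidates))
```

The sweep runs in O(n log n) time, which matters when the candidate set contains many estimated design points rather than a handful.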
Synthesis and Optimization of Reversible Circuits - A Survey
Reversible logic circuits have been historically motivated by theoretical
research in low-power electronics as well as practical improvement of
bit-manipulation transforms in cryptography and computer graphics. Recently,
reversible circuits have attracted interest as components of quantum
algorithms, as well as in photonic and nano-computing technologies where some
switching devices offer no signal gain. Research in generating reversible logic
distinguishes between circuit synthesis, post-synthesis optimization, and
technology mapping. In this survey, we review algorithmic paradigms ---
search-based, cycle-based, transformation-based, and BDD-based --- as well as
specific algorithms for reversible synthesis, both exact and heuristic. We
conclude the survey by outlining key open challenges in synthesis of reversible
and quantum logic, as well as the most common misconceptions.
Comment: 34 pages, 15 figures, 2 tables
Optimization of Circuits for IBM's five-qubit Quantum Computers
IBM has made several quantum computers available to researchers around the
world via cloud services. Two architectures with five qubits, one with 16, and
one with 20 qubits are available to run experiments. The IBM architectures
implement gates from the Clifford+T gate library. However, each architecture
only implements a subset of the possible CNOT gates. In this paper, we show how
Clifford+T circuits can efficiently be mapped into the two IBM quantum
computers with 5 qubits. We further present an algorithm and a set of circuit
identities that may be used to optimize the Clifford+T circuits in terms of
gate count and number of levels. It is further shown that the optimized
circuits can considerably reduce the gate count and number of levels and thus
produce results with better fidelity.
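When an architecture implements a CNOT in only one direction, a standard circuit identity recovers the other direction: surrounding the gate with Hadamard gates on both qubits swaps control and target. Whether the paper uses exactly this identity is not stated in the abstract, so the sketch below is only an assumed example of the kind of identity involved. It checks the identity numerically with plain Python matrices; the basis ordering (index = 2*q0 + q1) and all names are illustrative choices.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hadamard on both qubits (H tensor H), including the 1/2 normalisation.
SIGNS = [[1, 1, 1, 1],
         [1, -1, 1, -1],
         [1, 1, -1, -1],
         [1, -1, -1, 1]]
HH = [[0.5 * s for s in row] for row in SIGNS]

# CNOT with control q0 and target q1: swaps |10> and |11>.
CNOT_01 = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]

# CNOT with control q1 and target q0: swaps |01> and |11>.
CNOT_10 = [[1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0],
           [0, 1, 0, 0]]

# Conjugating by H on both qubits reverses the direction of the CNOT.
reversed_cnot = matmul(matmul(HH, CNOT_01), HH)
print(reversed_cnot == CNOT_10)
```

The reversal costs four extra single-qubit gates, which is one reason gate count and depth are the optimization targets once a circuit has been mapped.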
Design of ALU and Cache Memory for an 8 bit ALU
The design of an ALU and a cache memory for use in a high-performance processor was examined in this thesis. Advanced architectures employing increased parallelism were analyzed to minimize the number of execution cycles needed for 8-bit integer arithmetic operations. In addition to the arithmetic unit, an optimized SRAM memory cell was designed to be used as cache memory and as a fast look-up table. The ALU consists of stand-alone units for bit-parallel computation of basic integer arithmetic operations. Addition and subtraction were performed using Kogge-Stone parallel prefix hardware operating at 330 MHz. A high-performance multiplier was built using a radix-4 Modified Booth Encoder (MBE) and a Wallace tree summation array. The multiplier requires a single clock cycle for 8-bit integer multiplication and operates at a maximum frequency of 100 MHz. Multiplicative division hardware was built for executing both integer division and square root. The division hardware computes 8-bit division and square root in 4 clock cycles. The multiplier forms the basic building block of all these functional units, making a high level of resource sharing feasible with this architecture. The optimal operating frequency for the arithmetic unit is 70 MHz. A 6T CMOS SRAM cell measuring 90 µm² was designed using minimum-size transistors. The layout allows for horizontal overlap, resulting in an effective area of 76 µm² for an 8x8 array. By substituting the equivalent bit-line capacitance of a P4 L1 cache, the memory was simulated to have a read time of 3.27 ns. An optimized set of test vectors was identified to enable high fault coverage without the need for any additional test circuitry. Sixteen test cases were identified that would toggle all the nodes and provide all possible inputs to the sub-units of the multiplier. A correlation-based semi-automatic method was investigated to facilitate test case identification for large multipliers.
This method of testability eliminates the performance and area overhead associated with conventional testability hardware. A bottom-up design methodology was employed for the design. The performance and area metrics are presented along with estimated power consumption. A set of Monte Carlo analyses was carried out to ensure the dependability of the design under process variations as well as fluctuations in operating conditions. The arithmetic unit was found to require a total die area of approximately 2 mm² in a 0.35 micron process.
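The Kogge-Stone parallel prefix scheme mentioned above can be sketched in software as follows. This is a bit-level behavioural model written for illustration only, assuming unsigned operands and a carry-in of zero; the function name `kogge_stone_add` is hypothetical and the code says nothing about the thesis's actual circuit, timing, or layout.

```python
def kogge_stone_add(a, b, width=8):
    """Add two unsigned integers with a Kogge-Stone prefix network.

    Returns (sum modulo 2**width, carry_out). Carry-in is assumed to be 0.
    """
    # Bitwise generate and propagate signals, least significant bit first.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    half_sum = p[:]  # original propagate bits, needed for the final XOR

    # ceil(log2(width)) prefix levels; level k combines spans 2**k bits apart.
    dist = 1
    while dist < width:
        new_g, new_p = g[:], p[:]
        for i in range(dist, width):
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2

    # After the prefix tree, g[i] is the carry out of bit position i.
    carries = [0] + g[:-1]
    total = 0
    for i in range(width):
        total |= (half_sum[i] ^ carries[i]) << i
    return total, g[-1]


print(kogge_stone_add(200, 100))  # (44, 1): 300 = 256 + 44
```

For 8-bit operands the loop runs three prefix levels (distances 1, 2, and 4), which is the logarithmic depth that makes this adder family attractive for high-frequency designs.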
A Study of Optimal 4-bit Reversible Toffoli Circuits and Their Synthesis
Optimal synthesis of reversible functions is a non-trivial problem. One of
the major limiting factors in computing such circuits is the sheer number of
reversible functions. Even restricting synthesis to 4-bit reversible functions
results in a huge search space (16! ≈ 2^44 functions). The output of
such a search alone, counting only the space required to list Toffoli gates for
every function, would require over 100 terabytes of storage. In this paper, we
present two algorithms: one that synthesizes an optimal circuit for any 4-bit
reversible specification, and another that synthesizes all optimal
implementations. We employ several techniques to make the problem tractable. We
report results from several experiments, including synthesis of all optimal
4-bit permutations, synthesis of random 4-bit permutations, optimal synthesis
of all 4-bit linear reversible circuits, synthesis of existing benchmark
functions; we compose a list of the hardest permutations to synthesize, and
show distribution of optimal circuits. We further illustrate that our proposed
approach may be extended to accommodate physical constraints via reporting
LNN-optimal reversible circuits. Our results have important implications in the
design and optimization of reversible and quantum circuits, testing circuit
synthesis heuristics, and performing experiments in the area of quantum
information processing.
Comment: arXiv admin note: substantial text overlap with arXiv:1003.191
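The search-space figure quoted in the abstract above can be checked with a few lines of arithmetic: a 4-bit reversible function is a permutation of the 16 possible input patterns, so there are 16! of them. The variable names below are illustrative.

```python
import math

# A 4-bit reversible function permutes the 2**4 = 16 input patterns.
num_functions = math.factorial(16)

# Exponent e such that 2**e equals 16!, to compare with the quoted 2^44.
bits = math.log2(num_functions)

print(num_functions)   # 20922789888000
print(round(bits, 2))
```

The exponent lies between 44 and 45, consistent with the abstract's approximation.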
Development of Urban Electric Bus Drivetrain
This paper presents the development of the drivetrain for a new series of urban electric buses. The traction and design properties of several drive variants are compared. The efficiency of the drive was tested by simulating vehicle runs based on data from real bus lines in Prague. The results of the design work and the simulations are reported.