Automated Circuit Approximation Method Driven by Data Distribution

Author: Mrazek Vojtech
Sekanina Lukas
Vasicek Zdenek
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/03/2019

We propose an application-tailored, data-driven, fully automated method for functional approximation of combinational circuits. We demonstrate how an application-level error metric such as the classification accuracy can be translated to a component-level error metric needed for an efficient and fast search in the space of approximate low-level components that are used in the application. This is possible by employing a weighted mean error distance (WMED) metric for steering the circuit approximation process, which is conducted by means of genetic programming. WMED introduces a set of weights (calculated from the data distribution measured on a selected signal in a given application) determining the importance of each input vector for the approximation process. The method is evaluated using synthetic benchmarks and application-specific approximate MAC (multiply-and-accumulate) units that are designed to provide the best trade-offs between the classification accuracy and power consumption of two image classifiers based on neural networks. Comment: Accepted for publication at Design, Automation and Test in Europe (DATE 2019). Florence, Italy
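The abstract does not give the WMED formula, so the following is only a minimal sketch of one plausible reading: a mean of absolute error distances in which each input vector is weighted by its importance. The function names (`wmed`, `truncated_mul`, `uniform`, `even_a_only`) and the toy approximate multiplier are hypothetical illustrations, not taken from the paper.

```python
from itertools import product


def wmed(exact, approx, weight, bits):
    """Weighted mean error distance over all pairs of `bits`-wide operands.

    Assumed form: sum(w(x) * |f(x) - f_approx(x)|) / sum(w(x)).
    """
    inputs = list(product(range(1 << bits), repeat=2))
    total_weight = sum(weight(a, b) for a, b in inputs)
    weighted_error = sum(
        weight(a, b) * abs(exact(a, b) - approx(a, b)) for a, b in inputs
    )
    return weighted_error / total_weight


def exact_mul(a, b):
    return a * b


def truncated_mul(a, b):
    # Hypothetical approximate multiplier: ignores the least significant bit of a.
    return (a & ~1) * b


def uniform(a, b):
    # Every input vector is equally important (plain mean error distance).
    return 1.0


def even_a_only(a, b):
    # Hypothetical data distribution in which only even values of a occur.
    return 1.0 if a % 2 == 0 else 0.0


if __name__ == "__main__":
    print(wmed(exact_mul, truncated_mul, uniform, 2))
    print(wmed(exact_mul, truncated_mul, even_a_only, 2))
```

Under the second weighting the truncated multiplier is never exercised on inputs where it errs, which illustrates why a data-driven weighting can admit circuits that a uniform metric would penalise.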

arXiv.org e-Print Archive

Crossref

AC OPF in Radial Distribution Networks - Parts I,II

Author: Boudec Jean-Yves Le
Christakou Konstantina
Paolone Mario
Tomozei Dan-Cristian
Publication date: 08/06/2015

The optimal power-flow problem (OPF) has played a key role in the planning and operation of power systems. Due to the non-linear nature of the AC power-flow equations, the OPF problem is known to be non-convex and therefore hard to solve. Most proposed methods for solving the OPF rely on approximations that render the problem convex, but that may yield inexact solutions. Recently, Farivar and Low proposed a method that is claimed to be exact for radial distribution systems, despite no apparent approximations. In our work, we show that it is, in fact, not exact. On the one hand, there is a misinterpretation of the physical network model related to the ampacity constraint of the lines' current flows. On the other hand, the proof of the exactness of the proposed relaxation requires unrealistic assumptions related to the unboundedness of specific control variables. We also show that the extension of this approach to account for exact line models might provide physically infeasible solutions. Recently, several contributions have proposed OPF algorithms that rely on the use of the alternating-direction method of multipliers (ADMM). However, as we show in this work, there are cases for which the ADMM-based solution of the non-relaxed OPF problem fails to converge. To overcome the aforementioned limitations, we propose an algorithm for the solution of a non-approximated, non-convex OPF problem in radial distribution systems that is based on the method of multipliers and on a primal decomposition of the OPF. This work is divided into two parts. In Part I, we specifically discuss the limitations of BFM and ADMM in solving the OPF problem. In Part II, we provide a centralized version and a distributed asynchronous version of the proposed OPF algorithm, and we evaluate its performance using both small-scale electrical networks and a modified IEEE 13-node test feeder.
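For readers unfamiliar with the method of multipliers named in the abstract, the sketch below applies it to a toy convex problem: minimise x^2 + y^2 subject to x + y = 1. This is not the authors' OPF algorithm and involves no power-flow equations; the function name, the penalty parameter `rho`, and the toy problem itself are assumptions chosen so that the primal step has a closed form.

```python
def method_of_multipliers(rho=1.0, iterations=50):
    """Augmented-Lagrangian iteration for: min x^2 + y^2  s.t.  x + y = 1.

    Augmented Lagrangian:
        L(x, y, lam) = x^2 + y^2 + lam*(x + y - 1) + (rho/2)*(x + y - 1)^2
    """
    lam = 0.0
    x = y = 0.0
    for _ in range(iterations):
        # Primal step: minimise L over (x, y) with lam fixed. By symmetry
        # x == y == t, and dL/dt = 0 gives t = (rho - lam) / (2 + 2*rho).
        x = y = (rho - lam) / (2.0 + 2.0 * rho)
        # Dual step: move the multiplier along the constraint violation.
        lam += rho * (x + y - 1.0)
    return x, y, lam


if __name__ == "__main__":
    x, y, lam = method_of_multipliers()
    print(x, y, lam)  # approaches x = y = 0.5, lam = -1
```

On this toy problem the constraint violation shrinks by a factor of 1/(1 + rho) per iteration; the abstract's point is that such convergence is not guaranteed for the non-convex, non-relaxed OPF.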

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Indicating Asynchronous Array Multipliers

Author: Balasubramanian P
Maskell D L
Publication date: 01/01/2019

Multiplication is an important arithmetic operation that is frequently encountered in microprocessing and digital signal processing applications, and multiplication is physically realized using a multiplier. This paper discusses the physical implementation of many indicating asynchronous array multipliers, which are inherently elastic and modular and are robust to timing, process and parametric variations. We consider the physical realization of many indicating asynchronous array multipliers using a 32/28nm CMOS technology. The weak-indication array multipliers comprise strong-indication or weak-indication full adders, and strong-indication 2-input AND functions to realize the partial products. The multipliers were synthesized in a semi-custom ASIC design style using standard library cells including a custom-designed 2-input C-element. 4x4 and 8x8 multiplication operations were considered for the physical implementations. The 4-phase return-to-zero (RTZ) and the 4-phase return-to-one (RTO) handshake protocols were utilized for data communication, and the delay-insensitive dual-rail code was used for data encoding. Among several weak-indication array multipliers, a weak-indication array multiplier utilizing a biased weak-indication full adder and the strong-indication 2-input AND function is found to have reduced cycle time and power-cycle time product with respect to RTZ and RTO handshaking for 4x4 and 8x8 multiplications. Further, the 4-phase RTO handshaking is found to be preferable to the 4-phase RTZ handshaking for achieving enhanced optimizations of the design metrics. Comment: arXiv admin note: text overlap with arXiv:1903.0943

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

A Scalable Correlator Architecture Based on Modular FPGA Hardware, Reuseable Gateware, and Data Packetization

Author: Aaron Parsons
Andrew Siemion
Arash Parsa
Blackman R.
Bradley R.
Dan Werthimer
David MacMahon
Demorest P.
Donald Backer
Heiles C.
Henry Chen
Jason Manley
Melvyn Wright
Peter McMahon
Pierre Droz
Terry Filiba
Weinreb S.
Yen J. L.
Publication venue: 'University of Chicago Press'
Publication date: 17/03/2009

A new generation of radio telescopes is achieving unprecedented levels of sensitivity and resolution, as well as increased agility and field-of-view, by employing high-performance digital signal processing hardware to phase and correlate large numbers of antennas. The computational demands of these imaging systems scale in proportion to BMN^2, where B is the signal bandwidth, M is the number of independent beams, and N is the number of antennas. The specifications of many new arrays lead to demands in excess of tens of PetaOps per second. To meet this challenge, we have developed a general purpose correlator architecture using standard 10-Gbit Ethernet switches to pass data between flexible hardware modules containing Field Programmable Gate Array (FPGA) chips. These chips are programmed using open-source signal processing libraries we have developed to be flexible, scalable, and chip-independent. This work reduces the time and cost of implementing a wide range of signal processing systems, with correlators foremost among them, and facilitates upgrading to new generations of processing technology. We present several correlator deployments, including a 16-antenna, 200-MHz bandwidth, 4-bit, full Stokes parameter application deployed on the Precision Array for Probing the Epoch of Reionization. Comment: Accepted to Publications of the Astronomical Society of the Pacific. 31 pages. v2: corrected typo, v3: corrected Fig. 1
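The BMN^2 scaling stated in the abstract can be illustrated with a short calculation. The sketch below returns only a relative load, since the abstract gives no constant of proportionality; the function name and the choice of a single beam (M = 1) are assumptions, while the 16-antenna, 200-MHz figures come from the deployment mentioned above.

```python
def relative_correlator_load(bandwidth_hz, beams, antennas):
    """Quantity proportional to correlator compute demand: B * M * N^2."""
    return bandwidth_hz * beams * antennas ** 2


if __name__ == "__main__":
    # 16 antennas at 200 MHz; a single beam is assumed for illustration.
    base = relative_correlator_load(200e6, 1, 16)
    # Doubling the antenna count quadruples the load at fixed B and M.
    doubled = relative_correlator_load(200e6, 1, 32)
    print(doubled / base)
```

The quadratic term reflects the number of antenna pairs to be correlated, which is why antenna count dominates the cost of large arrays.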

arXiv.org e-Print Archive

Crossref

autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components

Author: Hanif Muhammad Abdullah
Mrazek Vojtech
Sekanina Lukas
Shafique Muhammad
Vasicek Zdenek
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2019

Approximate computing is an emerging paradigm for developing highly energy-efficient computing systems such as various accelerators. In the literature, many libraries of elementary approximate circuits have already been proposed to simplify the design process of approximate accelerators. Because these libraries contain from tens to thousands of approximate implementations for a single arithmetic operation, it is intractable to find an optimal combination of approximate circuits in the library even for an application consisting of a few operations. An open problem is "how to effectively combine circuits from these libraries to construct complex approximate accelerators". This paper proposes a novel methodology for searching, selecting and combining the most suitable approximate circuits from a set of available libraries to generate an approximate accelerator for a given application. To enable fast design space generation and exploration, the methodology utilizes machine learning techniques to create computational models estimating the overall quality of processing and hardware cost without performing full synthesis at the accelerator level. Using the methodology, we construct hundreds of approximate accelerators (for a Sobel edge detector) showing different but relevant tradeoffs between the quality of processing and hardware cost and identify a corresponding Pareto-frontier. Furthermore, when searching for approximate implementations of a generic Gaussian filter consisting of 17 arithmetic operations, the proposed approach allows us to identify approximately 10^3 highly important implementations from 10^23 possible solutions in a few hours, while the exhaustive search would take four months on a high-end processor. Comment: Accepted for publication at the Design Automation Conference 2019 (DAC'19), Las Vegas, Nevada, US
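The Pareto frontier mentioned in the abstract can be illustrated with a brute-force filter over candidate designs. This is a generic sketch, not the autoAx methodology: each candidate is assumed to be a hypothetical `(quality_loss, hardware_cost)` pair where lower is better on both axes, and the sample values are made up for illustration.

```python
def pareto_front(points):
    """Return the points that no other point dominates.

    A point q dominates p when q is no worse on both objectives
    and differs from p. Both objectives are minimised.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


if __name__ == "__main__":
    # Hypothetical (quality_loss, hardware_cost) estimates for five designs.
    candidates = [(0.1, 9.0), (0.2, 5.0), (0.3, 6.0), (0.5, 2.0), (0.5, 3.0)]
    print(pareto_front(candidates))
```

The pairwise comparison is quadratic in the number of candidates, which is fine for hundreds of designs but is one reason a space of 10^23 combinations cannot be enumerated directly.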

arXiv.org e-Print Archive

Crossref