Search CORE

4,391 research outputs found

autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components

Author: Hanif Muhammad Abdullah
Mrazek Vojtech
Sekanina Lukas
Shafique Muhammad
Vasicek Zdenek
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2019
Field of study

Approximate computing is an emerging paradigm for developing highly energy-efficient computing systems such as various accelerators. In the literature, many libraries of elementary approximate circuits have already been proposed to simplify the design process of approximate accelerators. Because these libraries contain from tens to thousands of approximate implementations for a single arithmetic operation it is intractable to find an optimal combination of approximate circuits in the library even for an application consisting of a few operations. An open problem is "how to effectively combine circuits from these libraries to construct complex approximate accelerators". This paper proposes a novel methodology for searching, selecting and combining the most suitable approximate circuits from a set of available libraries to generate an approximate accelerator for a given application. To enable fast design space generation and exploration, the methodology utilizes machine learning techniques to create computational models estimating the overall quality of processing and hardware cost without performing full synthesis at the accelerator level. Using the methodology, we construct hundreds of approximate accelerators (for a Sobel edge detector) showing different but relevant tradeoffs between the quality of processing and hardware cost and identify a corresponding Pareto-frontier. Furthermore, when searching for approximate implementations of a generic Gaussian filter consisting of 17 arithmetic operations, the proposed approach allows us to identify approximately

10^3

highly important implementations from

10^{23}

possible solutions in a few hours, while the exhaustive search would take four months on a high-end processor.Comment: Accepted for publication at the Design Automation Conference 2019 (DAC'19), Las Vegas, Nevada, US

arXiv.org e-Print Archive

Crossref

Recent Progress in Image Deblurring

Author: Tao Dacheng
Wang Ruxin
Publication venue
Publication date: 24/09/2014
Field of study

This paper comprehensively reviews the recent development of image deblurring, including non-blind/blind, spatially invariant/variant deblurring techniques. Indeed, these techniques share the same objective of inferring a latent sharp image from one or several corresponding blurry images, while the blind deblurring techniques are also required to derive an accurate blur kernel. Considering the critical role of image restoration in modern imaging systems to provide high-quality images under complex environments such as motion, undesirable lighting conditions, and imperfect system components, image deblurring has attracted growing attention in recent years. From the viewpoint of how to handle the ill-posedness which is a crucial issue in deblurring tasks, existing methods can be grouped into five categories: Bayesian inference framework, variational methods, sparse representation-based methods, homography-based modeling, and region-based methods. In spite of achieving a certain level of development, image deblurring, especially the blind case, is limited in its success by complex application conditions which make the blur kernel hard to obtain and be spatially variant. We provide a holistic understanding and deep insight into image deblurring in this review. An analysis of the empirical evidence for representative methods, practical issues, as well as a discussion of promising future directions are also presented.Comment: 53 pages, 17 figure

arXiv.org e-Print Archive

CiteSeerX

Biblioteca para diseño basado en modelos de algoritmos de procesado de imágenes en FPGA

Author: Brox Jiménez Piedad
Cabrera Sarmiento Alejandro José
Garcés Socarrás Luis Manuel
Sánchez Solano Santiago
Publication venue: 'Universidad Pedagogica y Tecnologica de Colombia'
Publication date: 01/01/2013
Field of study

This paper describes a library (XSGImgLib) that includes parameterizable blocks to implement low-level image processing tasks on FPGAs. A modelbased design technique provided by Xilinx System Generator (XSG) has been used to design the blocks, which implement point operation (binarization) and neighborhood operations (linear and non-linear filtering) in grayscale images. The blocks are parameterizable for input/output data precision, window size, normalization strategy, and implementation options (area versus speed optimization). The paper includes the implementation results obtained after fixing these options and exemplifies the combination of several blocks of the library to build a complete design for image segmentation purposes.Este artículo describe una biblioteca de bloques parametrizables (XSGImgLib) para la implementación de tareas de procesado de imágenes en FPGA. Se ha utilizado la técnica de diseño basado en modelos proporcionada por Xilinx System Generator (XSG) para diseñar diferentes bloques de procesado que implementan operaciones puntuales (binarización) y basadas en vecindad (filtros lineales y no-lineales) para imágenes en escala de grises. La parametrización de los bloques permite configurar la precisión de los datos de entrada/salida, el tamaño de la ventana, la estrategia de normalización y distintas opciones de implementación (optimización en área o velocidad). El artículo muestra los resultados de implementación para las diferentes opciones de configuración y ejemplifica la combinación de los bloques de procesado en el desarrollo de un sistema para segmentado de imágenes.Agencia Española de Cooperación Internacional para el Desarrollo PCID/024124/09, PCID/030769/1

idUS. Depósito de Investigación Universidad de Sevilla

Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add

Author: Cristal Kestelman Adrián
Palomar Pérez Óscar
Ratkovic Ivan
Stanic Milan
Unsal Osman Sabri
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion.The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P. The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Energy proportional computing with OpenCL on a FPGA-based overlay architecture

Author: Nunez-Yanez Jose Luis
Sani Awais Hussain
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/12/2016
Field of study

Crossref

Explore Bristol Research