Search CORE

199 research outputs found

High throughput spatial convolution filters on FPGAs

Author: Al-Dujaili Abdullah
Fahmy Suhaib A.
Ioannou Lenos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/04/2020
Field of study

Digital signal processing (DSP) on field- programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations that can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, included additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility

Warwick Research Archives Portal Repository

Diseño hardware de la transformada wavelet discreta: un análisis de complejidad, precisión y frecuencia de operación

Author: Ballesteros Dora Maria
Pedraza Luis Fernando
Renza Diego
Publication venue: 'Universidad EAFIT'
Publication date: 01/01/2016
Field of study

The purpose of this paper is to present a comparative analysis of hardware design of the Discrete Wavelet Transform (DWT) in terms of three design goals: accuracy, hardware cost and operating frequency. Every design should take into account the following facts: method (non-polyphase, polyphase and lifting), topology (multiplier-based and multiplierless-based), structure (conventional or pipelined), and quantization format (floatingpoint, fixed-point, CSD or integer). Since DWT is widely used in several applications (e.g. compression, filtering, coding, pattern recognition among others), selection of adequate parameters plays an important role in the performance of these systems.El propósito de este documento es presentar un análisis comparativo de esquemas hardware de la Transformada Wavelet Discreta, DWT, en términos de tres objetivos de diseño: precisión, complejidad y frecuencia de operación. Cada diseño debe considerar los siguientes aspectos: método (no polifásico, polifásico y lifting), topología (basados en multiplicadores y sin multiplicadores), estructura (convencional o pipeline) y formato de cuantización (punto flotante, punto fijo, CSD o entero). Dado que la DWT es ampliamente utilizada en diversas aplicaciones (por ejemplo en compresión, filtrado, codificación, reconocimiento de patrones, entre otras), la selección adecuada de parámetros de diseño desempeña un papel importante en el diseño de estos sistemas

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Repositorio Institucional Universidad EAFIT

DIALNET

Revistas académicas Universidad EAFIT

Multiplierless CSD techniques for high performance FPGA implementation of digital filters.

Author: Wang Yunhua.
Publication venue
Publication date: 01/01/2007
Field of study

I leverage FastCSD to develop a new, high performance iterative multiplierless structure based on a novel real-time CSD recoding, so that more zero partial products are introduced. Up to 66.7% zero partial products occur compared to 50% in the traditional modified Booth's recoding. Also, this structure reduces the non-zero partial products to a minimum. As a result, the number of arithmetic operations in the carry-save structure is reduced. Thus, an overall speed-up, as well as low-power consumption can be achieved. Furthermore, because the proposed structure involves real time CSD recoding and does not require a fixed value for the multiplier input to be known a priori, the proposed multiplier can be applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters.My work is based on a dramatic new technique for converting between 2's complement and CSD number systems, and results in high-performance structures that are particularly effective for implementing adaptive systems in reconfigurable logic.My research focus is on two key ideas for improving DSP performance: (1) Develop new high performance, efficient shift-add techniques ("multiplierless") to implement the multiply-add operations without the need for a traditional multiplier structure. (2) There is a growing trend toward design prototyping and even production in FPGAs as opposed to dedicated DSP processors or ASICs; leverage this trend synergistically with the new multiplierless structures to improve performance.Implementation of digital signal processing (DSP) algorithms in hardware, such as field programmable gate arrays (FPGAs), requires a large number of multipliers. Fast, low area multiply-adds have become critical in modern commercial and military DSP applications. In many contemporary real-time DSP and multimedia applications, system performance is severely impacted by the limitations of currently available speed, energy efficiency, and area requirement of an onboard silicon multiplier.I also introduce a new multi-input Canonical Signed Digit (CSD) multiplier unit, which requires fewer shift/add/subtract operations and reduced CSD number conversion overhead compared to existing techniques. This results in reduced power consumption and area requirements in the hardware implementation of DSP algorithms. Furthermore, because all the products are produced simultaneously, the multiplication speed and thus the throughput are improved. The multi-input multiplier unit is applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters. The implementation cost of these digital filters can be further reduced by limiting the wordlength of the input signal with little or no sacrifice to the filter performance, which is confirmed by my simulation results. The proposed multiplier unit can also be applied to other DSP algorithms, such as digital filter banks or matrix and vector multiplications.Finally, the tradeoff between filter order and coefficient length in the design and implementation of high-performance filters in Field Programmable Gate Arrays (FPGAs) is discussed. Non-minimum order FIR filters are designed for implementation using Canonical Signed Digit (CSD) multiplierless implementation techniques. By increasing the filter order, the length of the coefficients can be decreased without reducing the filter performance. Thus, an overall hardware savings can be achieved.Adaptive system implementations require real-time conversion of coefficients to Canonical Signed Digit (CSD) or similar representations to benefit from multiplierless techniques for implementing filters. Multiplierless approaches are used to reduce the hardware and increase the throughput. This dissertation introduces the first non-iterative hardware algorithm to convert 2's complement numbers to their CSD representations (FastCSD) using a fixed number of shift and logic operations. As a result, the power consumption and area requirements required for hardware implementation of DSP algorithms in which the coefficients are not known a priori can be greatly reduced. Because all CSD digits are produced simultaneously, the conversion speed and thus the throughput are improved when compared to overlap-and-scan techniques such as Booth's recoding

SHAREOK repository

Programmable architectures for the automated design of digital FIR filters using evolvable hardware

Author: Hounsell Benjamin Iain
Publication venue: The University of Edinburgh
Publication date: 01/01/2001
Field of study

Edinburgh Research Archive

Design and multiplierless implementation of two-channel biorthogonal IIR filter banks with low system delay

Author: Antoniou A
Chan SC
Lu WS
Mao JS
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2001
Field of study

An efficient method for the design of low-delay two-channel, perfect reconstruction IIR filter banks is proposed. The design problem is formulated in terms of minimax designs of a general stable IIR filter that can be obtained using semidefinite programming and an FIR filter that can be obtained using the Remez exchange algorithm. A multiplierless implementation on this filter bank is also proposed and investigated.published_or_final_versio

HKU Scholars Hub

Reconfigurable Adaptive Multiple Transform Hardware Solutions for Versatile Video Coding

Author: Fanni T.
Ligas D.
Palumbo F.
Raffo L.
Sau C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Computer aided design is nowadays a must to quickly provide optimized circuits, to cope with stringent time to market constraints, and to be able to guarantee colliding constrained requirements. Design automation is exploited, whenever possible, to speed up the design process and relieve the developers from error prone customization, optimization and tuning phases. In this work we study the possibility of adopting automated algorithms for the optimization of reconfigurable multiple constant multiplication circuits. In particular, an exploration of novel reconfigurable Adaptive Multiple Transform circuital solutions adoptable in video coding applications has been conducted. These solutions have also been compared with the unique similar work at the state of the art, revealing to be beneficial under certain constraints. Moreover, the proposed approach has been generalized with some guidelines helpful to designers facing similar problems

Archivio istituzionale della ricerca - Università di Cagliari

An area-efficient 2-D convolution implementation on FPGA for space applications

Author: Di Carlo Stefano
Gambardella Giulio
Indaco Marco
Prinetto Paolo Ernesto
Rolfo Daniele
Tiotto Gabriele
Publication venue: IEEE Computer Society
Publication date: 01/01/2011
Field of study

The 2-D Convolution is an algorithm widely used in image and video processing. Although its computation is simple, its implementation requires a high computational power and an intensive use of memory. Field Programmable Gate Arrays (FPGA) architectures were proposed to accelerate calculations of 2-D Convolution and the use of buffers implemented on FPGAs are used to avoid direct memory access. In this paper we present an implementation of the 2-D Convolution algorithm on a FPGA architecture designed to support this operation in space applications. This proposed solution dramatically decreases the area needed keeping good performance, making it appropriate for embedded systems in critical space application

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Digital Filters

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

The new technology advances provide that a great number of system signals can be easily measured with a low cost. The main problem is that usually only a fraction of the signal is useful for different purposes, for example maintenance, DVD-recorders, computers, electric/electronic circuits, econometric, optimization, etc. Digital filters are the most versatile, practical and effective methods for extracting the information necessary from the signal. They can be dynamic, so they can be automatically or manually adjusted to the external and internal conditions. Presented in this book are the most advanced digital filters including different case studies and the most relevant literature

Directory of Open Access Books (DOAB)