606 research outputs found
High throughput spatial convolution filters on FPGAs
Digital signal processing (DSP) on field-programmable gate arrays (FPGAs) has long been appealing because the inherent parallelism in these computations can be readily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, including additional hard blocks such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility.
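To make the 1-D versus 2-D distinction concrete, here is a minimal software sketch of a spatial filter expressed as a sum of row-wise 1-D multiply-accumulate chains. It is only an illustration under stated assumptions: the function names `fir_row` and `spatial_filter` are hypothetical, the row-wise decomposition is not claimed to be the architecture of the paper, and the code computes the correlation form (no kernel flip), which matches convolution for symmetric kernels.

```python
def fir_row(window_row, coeff_row):
    """1-D multiply-accumulate chain over one row of the window."""
    acc = 0
    for x, c in zip(window_row, coeff_row):
        acc += x * c
    return acc


def spatial_filter(image, kernel):
    """'Valid' 2-D filter built from one 1-D MAC chain per kernel row.

    Coefficients are ordinary run-time data, so they can be changed
    without altering the structure (hypothetical helper, not from the paper).
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    out = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                # Each kernel row contributes one 1-D dot product.
                total += fir_row(image[r + i][c:c + k_cols], kernel[i])
            out_row.append(total)
        out.append(out_row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(spatial_filter(image, box))  # [[54, 63], [90, 99]]
```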
Top-down design of digital signal processing systems
Thesis (M.Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (leaves 45-46). By Amy M. Singer.
Approaches towards Implementation of Multi-bit Digital Receiver using Fast Fourier Transform
This paper compares different digital receiver signal processing schemes as applied to current ESM/RWR systems. The schemes include fast Fourier transform (FFT)-based, FIR filter-based and mixed architectures. The use of polyphase FFT and IIR filters is also discussed. The specifications and signal processing requirements of a modern digital electronic warfare (EW) receiver are discussed. The design procedures and architectures for all the schemes are presented. The tradeoffs involved in selecting different parameters for these schemes are also discussed. The digital receiver schemes are modeled and analyzed for different metrics, such as parameter measurement accuracy, pulse handling capability, frequency separation capability and the number of multipliers required for implementation. The analysis is done for a 500 MHz bandwidth digital receiver and assumes an 8-bit ADC in the front end. The results obtained for the comparison are discussed in the paper. Limited simulations show that the overlapped FFT scheme is a better approach for digital receiver processing. Defence Science Journal, 2013, 63(2), pp.198-203, DOI:http://dx.doi.org/10.14429/dsj.63.426
An efficient design of fractional-delay digital FIR filters using the Farrow structure
The fractional-delay digital filter (FD-DF), implemented using the Farrow (1988) structure, is very attractive for online tuning of the delay of digital signals. This paper proposes a new method for the design of such Farrow-based FD-DFs using sum-of-powers-of-two (SOPOT) coefficients. Using the SOPOT coefficient representation, coefficient multiplication can be implemented with a limited number of shifts and additions. Design examples show that the proposed method can greatly reduce the design time and complexity of the Farrow structure while providing comparable phase and amplitude responses.
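The SOPOT idea can be sketched as follows. This is an illustrative example only, not the design method proposed in the paper: the greedy term selection, the helper names `sopot_terms` and `shift_add_multiply`, and the parameters `max_terms`, `min_exp` and `frac_bits` are all assumptions made for the sketch.

```python
import math


def sopot_terms(coeff, max_terms, min_exp=-8):
    """Greedily approximate coeff as a sum of signed powers of two.

    Returns a list of (sign, exponent) pairs. Each step picks the power
    of two closest to the residual in the log domain (assumed strategy).
    """
    terms = []
    residual = coeff
    for _ in range(max_terms):
        if residual == 0:
            break
        exp = round(math.log2(abs(residual)))
        if exp < min_exp:
            break
        sign = 1 if residual > 0 else -1
        terms.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return terms


def shift_add_multiply(sample, terms, frac_bits=8):
    """Multiply an integer sample by a SOPOT coefficient with shifts and adds.

    The result is a fixed-point value scaled by 2**frac_bits.
    """
    acc = 0
    for sign, exp in terms:
        shift = exp + frac_bits
        if shift < 0:
            raise ValueError("exponent below fixed-point resolution")
        term = sample << shift
        acc = acc + term if sign > 0 else acc - term
    return acc


terms = sopot_terms(0.625, max_terms=2)      # 0.625 = 2**-1 + 2**-3
product = shift_add_multiply(40, terms)      # 40 * 0.625, scaled by 256
print(terms, product / 256)
```

Fewer terms per coefficient mean fewer adders in hardware, at the cost of a coarser approximation of the ideal coefficient.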
Efficient FPGA implementation and power modelling of image and signal processing IP cores
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number of image and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power, leading to a number of issues including reduced mobility, reliability concerns and increased design cost, among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining, has been performed. Initial results are very promising and indicate significant potential for future research in this area.
A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed.
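As background on the Distributed Arithmetic (DA) technique named in this abstract, the sketch below shows the basic bit-serial, lookup-table form of an inner product with fixed coefficients. It is a generic textbook-style illustration, not the thesis's implementation or any of its variations (offset binary coding, sparse factorisation); the function names are hypothetical and the samples are assumed to be unsigned integers.

```python
def build_da_lut(coeffs):
    """Precompute, for every bit pattern, the sum of the selected coefficients."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(samples, lut, bits=8):
    """Inner product of unsigned samples with the LUT's coefficients.

    One LUT lookup and one shifted add per bit plane; no multiplications.
    """
    acc = 0
    for b in range(bits):
        addr = 0
        for k, x in enumerate(samples):
            # Gather bit b of every sample into a LUT address.
            addr |= ((x >> b) & 1) << k
        acc += lut[addr] << b
    return acc


coeffs = [3, -1, 4, 2]
lut = build_da_lut(coeffs)
print(da_inner_product([10, 20, 30, 40], lut))  # 210
```

The table grows as 2^N with the number of coefficients N, which is one reason practical designs split long inner products into several smaller tables.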
Hardware acceleration for real time processing systems
This Master's thesis presents different hardware acceleration algorithms and their benefits compared to a software implementation. The proposed algorithms are implemented on a Xilinx ZYNQ-7000 series XC7Z020 SoC using a High-Level Synthesis (HLS) tool. With today's Systems-on-Chip from Xilinx or Intel, a process can be implemented either in the Programmable Logic or in the Processing System. In order to obtain a better acceleration factor, different approximate and accurate adders and multipliers were instantiated in Verilog, synthesized and simulated using Vivado, and finally compared with each other to see whether they really offer benefits. The approximate adders showed very promising results for the application written in this thesis. The approximate multipliers, on the other hand, exhibited worse results than the accurate ones.
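The abstract does not say which approximate adders were evaluated. As one well-known example of the idea, the behavioural sketch below models a lower-part OR adder (LOA), in which the low bits are combined with a bitwise OR instead of a carry chain. The function name `loa_add` and the parameters `width` and `approx_bits` are assumptions for illustration, and the model is written in Python rather than the Verilog used in the thesis.

```python
def loa_add(a, b, width=8, approx_bits=3):
    """Behavioural model of a lower-part OR adder for unsigned inputs.

    The lowest approx_bits bits are OR-ed; the upper bits are added
    exactly, with a carry-in taken from the AND of the top approximate bits.
    """
    mask_low = (1 << approx_bits) - 1
    low = (a | b) & mask_low
    if approx_bits > 0:
        carry_in = (a >> (approx_bits - 1)) & (b >> (approx_bits - 1)) & 1
    else:
        carry_in = 0
    high = (a >> approx_bits) + (b >> approx_bits) + carry_in
    # Keep width + 1 bits so the carry-out of the exact part is preserved.
    return ((high << approx_bits) | low) & ((1 << (width + 1)) - 1)


for a, b in [(20, 8), (5, 6), (3, 1)]:
    print(a, b, "exact:", a + b, "approx:", loa_add(a, b))
```

Setting `approx_bits=0` reduces the model to an exact adder, which gives a convenient baseline when measuring the error introduced by the approximation.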