2,054 research outputs found
Efficient FPGA implementation of high-throughput mixed radix multipath delay commutator FFT processor for MIMO-OFDM
This article presents and evaluates pipelined architecture designs for an improved high-frequency Fast Fourier
Transform (FFT) processor implemented on Field Programmable Gate Arrays (FPGA) for Multiple Input Multiple Output
Orthogonal Frequency Division Multiplexing (MIMO-OFDM). The architecture presented is a Mixed-Radix Multipath Delay
Commutator. The presented parallel architecture utilizes fewer hardware resources compared to Radix-2 architecture,
while maintaining simple control and butterfly structures inherent to Radix-2 implementations. The high-frequency
design presented allows enhancing system throughput without requiring additional parallel data paths common in
other current approaches, the presented design can process two and four independent data streams in parallel
and is suitable for scaling to any power of two FFT size N. FPGA implementation of the architecture demonstrated
significant resource efficiency and high-throughput in comparison to relevant current approaches within
literature. The proposed architecture designs were realized with Xilinx System Generator (XSG) and evaluated
on both Virtex-5 and Virtex-7 FPGA devices. Post place and route results demonstrated maximum frequency
values over 400 MHz and 470 MHz for Virtex-5 and Virtex-7 FPGA devices respectively
Hardware/Software Co-design Applied to Reed-Solomon Decoding for the DMB Standard
This paper addresses the implementation of Reed-
Solomon decoding for battery-powered wireless
devices. The scope of this paper is constrained by the
Digital Media Broadcasting (DMB). The most critical
element of the Reed-Solomon algorithm is implemented
on two different reconfigurable hardware
architectures: an FPGA and a coarse-grained
architecture: the Montium, The remaining parts are
executed on an ARM processor. The results of this
research show that a co-design of the ARM together
with an FPGA or a Montium leads to a substantial
decrease in energy consumption. The energy
consumption of syndrome calculation of the Reed-
Solomon decoding algorithm is estimated for an FPGA
and a Montium by means of simulations. The Montium
proves to be more efficient
High throughput spatial convolution filters on FPGAs
Digital signal processing (DSP) on field- programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations that can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, included additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility
Parametric, Secure and Compact Implementation of RSA on FPGA
We present a fast, efficient, and parameterized modular multiplier and a secure exponentiation circuit especially intended for FPGAs on the low end of the price range. The design utilizes dedicated block multipliers as the main functional unit and Block-RAM as storage unit for the operands. The adopted design methodology allows adjusting the number of multipliers, the radix used in the multipliers, and number of words to meet the system requirements such as
available resources, precision and timing constraints. The architecture, based on the Montgomery modular multiplication algorithm, utilizes a pipelining technique that allows concurrent operation of hardwired multipliers. Our
design completes 1020-bit and 2040-bit modular multiplications in 7.62 ÎĽs and 27.0 ÎĽs, respectively. The multiplier uses a moderate amount of system resources while achieving the best area-time product in literature. 2040-bit modular exponentiation engine can easily fit into Xilinx Spartan-3E 500; moreover the exponentiation circuit withstands known side channel attacks
- …