7,468 research outputs found

    A New Low Complexity Uniform Filter Bank Based on the Improved Coefficient Decimation Method

    Get PDF
    In this paper, we propose a new uniform filter bank (FB) based on the improved coefficient decimation method (ICDM). In the proposed FB’s design, the ICDM is used to obtain different multi-band frequency responses using a single lowpass prototype filter. The desired subbands are individually obtained from these multi-band frequency responses by using low order frequency response masking filters and their corresponding ICDM output frequency responses. We show that the proposed FB is a very low complexity alternative to the other FBs in literature, especially the widely used discrete Fourier transform based FB (DFTFB) and the CDM based FB (CDFB). The proposed FB can have a higher number of subbands with twice the center frequency resolution when compared with the CDFB and DFTFB. Design example and implementation results show that our FB achieves 86.59% and 58.84% reductions in resource utilizations and 76.95% and 47.09% reductions in power consumptions when compared with the DFTFB and CDFB respectively

    A Reconfigurable Tile-Based Architecture to Compute FFT and FIR Functions in the Context of Software-Defined Radio

    Get PDF
    Software-defined radio (SDR) is the term used for flexible radio systems that can deal with multiple standards. For an efficient implementation, such systems require appropriate reconfigurable architectures. This paper targets the efficient implementation of the most computationally intensive kernels of two significantly different standards, viz. Bluetooth and HiperLAN/2, on the same reconfigurable hardware. These kernels are FIR filtering and FFT. The designed architecture is based on a two-dimensional arrangement of 17 tiles. Each tile contains a multiplier, an adder, local memory and multiplexers allowing flexible communication with the neighboring tiles. The tile-base data path is complemented with a global controller and various memories. The design has been implemented in SystemC and simulated extensively to prove equivalence with a reference all-software design. It has also been synthesized and turns out to outperform significantly other reconfigurable designs with respect to speed and area

    Embedded FIR filter design for real-time refocusing using a standard plenoptic video camera

    Get PDF
    Copyright 2014 Society of Photo-Optical Instrumentation Engineers and IS&T—The Society for Imaging Science and Technology. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited.A novel and low-cost embedded hardware architecture for real-time refocusing based on a standard plenoptic camera is presented in this study. The proposed layout design synthesizes refocusing slices directly from micro images by omitting the process for the commonly used sub-aperture extraction. Therefore, intellectual property cores, containing switch controlled Finite Impulse Response (FIR) filters, are developed and applied to the Field Programmable Gate Array (FPGA) XC6SLX45 from Xilinx. Enabling the hardware design to work economically, the FIR filters are composed of stored product as well as upsampling and interpolation techniques in order to achieve an ideal relation between image resolution, delay time, power consumption and the demand of logic gates. The video output is transmitted via High-Definition Multimedia Interface (HDMI) with a resolution of 720p at a frame rate of 60 fps conforming to the HD ready standard. Examples of the synthesized refocusing slices are presented

    Non-uniform wordlength delay lines for FIR filters

    Get PDF
    When FIR filters are designed floating point arithmetic is generally used. However when implemented on hardware such as ASICs, fixed point arithmetic must be used to minimise cost and power requirements. Research to minimise hardware costs has mainly focused on the quantization effects of fixed point wordlengths for the coefficients, multipliers and adders of FIR filters, but with the actual data delays assigned a uniform wordlength and essentially not optimised. This paper proposes that the wordlengths of the delay line can be non-uniform with a minimal increase in quantization noise for parallel implementation of FIR filters where there are differences in the magnitudes of the coefficients. A non-uniform delay line allows hardware savings in terms of delay register wordlengths, delay signal wordlengths and multiplier wordlengths. Results for an FIR design are presented which demonstrate the hardware savingswhen using a non-uniform wordlength delay lin

    High throughput spatial convolution filters on FPGAs

    Get PDF
    Digital signal processing (DSP) on field- programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations that can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, included additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility

    A Scalable Correlator Architecture Based on Modular FPGA Hardware, Reuseable Gateware, and Data Packetization

    Full text link
    A new generation of radio telescopes is achieving unprecedented levels of sensitivity and resolution, as well as increased agility and field-of-view, by employing high-performance digital signal processing hardware to phase and correlate large numbers of antennas. The computational demands of these imaging systems scale in proportion to BMN^2, where B is the signal bandwidth, M is the number of independent beams, and N is the number of antennas. The specifications of many new arrays lead to demands in excess of tens of PetaOps per second. To meet this challenge, we have developed a general purpose correlator architecture using standard 10-Gbit Ethernet switches to pass data between flexible hardware modules containing Field Programmable Gate Array (FPGA) chips. These chips are programmed using open-source signal processing libraries we have developed to be flexible, scalable, and chip-independent. This work reduces the time and cost of implementing a wide range of signal processing systems, with correlators foremost among them,and facilitates upgrading to new generations of processing technology. We present several correlator deployments, including a 16-antenna, 200-MHz bandwidth, 4-bit, full Stokes parameter application deployed on the Precision Array for Probing the Epoch of Reionization.Comment: Accepted to Publications of the Astronomy Society of the Pacific. 31 pages. v2: corrected typo, v3: corrected Fig. 1

    Normalizing or not normalizing? An open question for floating-point arithmetic in embedded systems

    Get PDF
    Emerging embedded applications lack of a specific standard when they require floating-point arithmetic. In this situation they use the IEEE-754 standard or ad hoc variations of it. However, this standard was not designed for this purpose. This paper aims to open a debate to define a new extension of the standard to cover embedded applications. In this work, we only focus on the impact of not performing normalization. We show how eliminating the condition of normalized numbers, implementation costs can be dramatically reduced, at the expense of a moderate loss of accuracy. Several architectures to implement addition and multiplication for non-normalized numbers are proposed and analyzed. We show that a combined architecture (adder-multiplier) can halve the area and power consumption of its counterpart IEEE-754 architecture. This saving comes at the cost of reducing an average of about 10 dBs the Signal-to-Noise Ratio for the tested algorithms. We think these results should encourage researchers to perform further investigation in this issue.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
    corecore