484 research outputs found

    Evolvable hardware platform for fault-tolerant reconfigurable sensor electronics

    Get PDF

    Object-oriented domain specific compilers for programming FPGAs

    No full text
    Published versio

    RECONFIGURABLE LOW POWER AND AREA EFFICIENT ESPFFIR FILTER USING VHBCSE MULTIPLIER

    Get PDF
    Reconfigurable Even Symmetric Parallel Fast Finite Impulse Response (RESPFFIR) filter shall be utilized as the Processing Element (PE) in Software Defined Radio (SDR) design to improve the throughput. The number of multipliers required in RESPFFIR filter increases when parallelism length increases. The Constant Multiplier (CM) technique is used to diminish the power consumption in FIR filters by reducing the number of Logical Operators (LO) and Logical Depth (LD). Binary Common Subexpression Elimination (BCSE) method is suitable to exploit symmetric coefficient in FIR filters. The Vertical Horizontal Binary Common Subexpression Elimination (VHBCSE) technique based Constant Multiplier (CM) design further diminish the number of LO and LD. The 2-bit BCSE algorithm has been applied vertically across neighboring coefficients and HCSE makes use of CSs that arise within each coefficient to eradicate redundant computations, which intern reduce logical operator in constant multiplier. This paper presents the design of Reconfigurable Even Symmetric Parallel Fast Finite Impulse Response (RESPFFIR) filter using VHBCSE based CM multiplier, which is reconfigurable with reduced power and area consumption without degrading the throughput. The power consumption reduces by 12% and the area required gets reduced by 24% in the proposed design when compared with existing CSE Hcub-n Multiple Constant Multiplier based  ESPFFIR filter design. The analysis is done using Cadence RC synthesize tools

    High throughput spatial convolution filters on FPGAs

    Get PDF
    Digital signal processing (DSP) on field- programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations that can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, included additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility

    Low power techniques and architectures for multicarrier wireless receivers

    Get PDF

    Multiplierless CSD techniques for high performance FPGA implementation of digital filters.

    Get PDF
    I leverage FastCSD to develop a new, high performance iterative multiplierless structure based on a novel real-time CSD recoding, so that more zero partial products are introduced. Up to 66.7% zero partial products occur compared to 50% in the traditional modified Booth's recoding. Also, this structure reduces the non-zero partial products to a minimum. As a result, the number of arithmetic operations in the carry-save structure is reduced. Thus, an overall speed-up, as well as low-power consumption can be achieved. Furthermore, because the proposed structure involves real time CSD recoding and does not require a fixed value for the multiplier input to be known a priori, the proposed multiplier can be applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters.My work is based on a dramatic new technique for converting between 2's complement and CSD number systems, and results in high-performance structures that are particularly effective for implementing adaptive systems in reconfigurable logic.My research focus is on two key ideas for improving DSP performance: (1) Develop new high performance, efficient shift-add techniques ("multiplierless") to implement the multiply-add operations without the need for a traditional multiplier structure. (2) There is a growing trend toward design prototyping and even production in FPGAs as opposed to dedicated DSP processors or ASICs; leverage this trend synergistically with the new multiplierless structures to improve performance.Implementation of digital signal processing (DSP) algorithms in hardware, such as field programmable gate arrays (FPGAs), requires a large number of multipliers. Fast, low area multiply-adds have become critical in modern commercial and military DSP applications. In many contemporary real-time DSP and multimedia applications, system performance is severely impacted by the limitations of currently available speed, energy efficiency, and area requirement of an onboard silicon multiplier.I also introduce a new multi-input Canonical Signed Digit (CSD) multiplier unit, which requires fewer shift/add/subtract operations and reduced CSD number conversion overhead compared to existing techniques. This results in reduced power consumption and area requirements in the hardware implementation of DSP algorithms. Furthermore, because all the products are produced simultaneously, the multiplication speed and thus the throughput are improved. The multi-input multiplier unit is applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters. The implementation cost of these digital filters can be further reduced by limiting the wordlength of the input signal with little or no sacrifice to the filter performance, which is confirmed by my simulation results. The proposed multiplier unit can also be applied to other DSP algorithms, such as digital filter banks or matrix and vector multiplications.Finally, the tradeoff between filter order and coefficient length in the design and implementation of high-performance filters in Field Programmable Gate Arrays (FPGAs) is discussed. Non-minimum order FIR filters are designed for implementation using Canonical Signed Digit (CSD) multiplierless implementation techniques. By increasing the filter order, the length of the coefficients can be decreased without reducing the filter performance. Thus, an overall hardware savings can be achieved.Adaptive system implementations require real-time conversion of coefficients to Canonical Signed Digit (CSD) or similar representations to benefit from multiplierless techniques for implementing filters. Multiplierless approaches are used to reduce the hardware and increase the throughput. This dissertation introduces the first non-iterative hardware algorithm to convert 2's complement numbers to their CSD representations (FastCSD) using a fixed number of shift and logic operations. As a result, the power consumption and area requirements required for hardware implementation of DSP algorithms in which the coefficients are not known a priori can be greatly reduced. Because all CSD digits are produced simultaneously, the conversion speed and thus the throughput are improved when compared to overlap-and-scan techniques such as Booth's recoding

    Low Latency Prefix Accumulation Driven Compound MAC Unit for Efficient FIR Filter Implementation

    Get PDF
    135–138This article presents hierarchical single compound adder-based MAC with assertion based error correction for speculation variations in the prefix addition for FIR filter design. The VLSI implementation of approximation in prefix adder results show a significant delay and complexity reductions, all this at the cost of latency measures when speculation fails during carry propagation, which is the main reason preventing the use of speculation in parallel-prefix adders in DSP applications. The speculative adder which is based on Han Carlson parallel prefix adder structure accomplishes better reduction in latency. Introducing a structured and efficient shift-add technique and explore latency reduction by incorporating approximation in addition. The improvements made in terms of reduction in latency and merits in performance by the proposed MAC unit are showed through the synthesis done by FPGA hardware. Results show that proposed method outpaces both formerly projected MAC designs using multiplication methods for attaining high speed

    Design Of FIR Filter For Important Capability In Reconfigurable Capabilities

    Get PDF
    In addition, the MCM concern about damage is caused by a low complexity pattern of mass application of composite FIR filters. To obtain an identical filter period and thus the same block size, the projected structure requires thirteen sets of much less ADP and 12. EPS is eight percent less than that of the FIR block's direct victory form. To support these results, we generally tend to provide a subject for direct shape selection and transform shape configuration that is mainly based on filter lengths and block period to obtain location delay and maximum strength and inexpensive block FIR structures. The ASIC synthesis indicates the final result indicating that the projected model for block length four and filter length sixty-four contains less than 40 days of ADP product (ADP) and four hundred less EPS than the most effective FIR rate. The variable-shaped finite impulse response (FIR) structures are inherently tubular, and the manual multiple fixed multiplexers (MCM) ultimately culminate in the bioavailability of the computation. However, the configuration of the transfer form may not immediately direct the assessment block technique to direct configuration. We have currently obtained a form multiplier that is mainly based on the expected mass of the conversion form scanned for the reconfigurable packages

    Joint Optimization of Low-power DCT Architecture and Effcient Quantization Technique for Embedded Image Compression

    Get PDF
    International audienceThe Discrete Cosine Transform (DCT)-based image com- pression is widely used in today's communication systems. Signi cant research devoted to this domain has demonstrated that the optical com- pression methods can o er a higher speed but su er from bad image quality and a growing complexity. To meet the challenges of higher im- age quality and high speed processing, in this chapter, we present a joint system for DCT-based image compression by combining a VLSI archi- tecture of the DCT algorithm and an e cient quantization technique. Our approach is, rstly, based on a new granularity method in order to take advantage of the adjacent pixel correlation of the input blocks and to improve the visual quality of the reconstructed image. Second, a new architecture based on the Canonical Signed Digit and a novel Common Subexpression Elimination technique is proposed to replace the constant multipliers. Finally, a recon gurable quantization method is presented to e ectively save the computational complexity. Experimental results obtained with a prototype based on FPGA implementation and com- parisons with existing works corroborate the validity of the proposed optimizations in terms of power reduction, speed increase, silicon area saving and PSNR improvement
    corecore