347 research outputs found

    The design and multiplier-less realization of software radio receivers with reduced system delay

    This paper studies the design and multiplier-less realization of a new software radio receiver (SRR) with reduced system delay. It employs low-delay finite-impulse response (FIR) and digital allpass filters to effectively reduce the system delay of the multistage decimators in SRRs. The optimal least-squares and minimax designs of these low-delay FIR and allpass-based filters are formulated as a semidefinite programming (SDP) problem, which allows a zero-magnitude constraint at ω = π to be incorporated readily as additional linear matrix inequalities (LMIs). By implementing the sampling rate converter (SRC) using a variable digital filter (VDF) immediately after the integer decimators, the need for an expensive programmable FIR filter in the traditional SRR is avoided. A new method for the optimal minimax design of this VDF-based SRC using SDP is also proposed and compared with the traditional weighted least-squares method. Other implementation issues, including the multiplier-less and digital signal processor (DSP) realizations of the SRR and the generation of the clock signal in the SRC, are also studied. Design results show that the system delay and implementation complexities (especially in terms of high-speed variable multipliers) of the proposed architecture are considerably reduced compared with conventional approaches. © 2004 IEEE.

    A versatile iterative framework for the reconstruction of bandlimited signals from their nonuniform samples

    In this paper, we study a versatile iterative framework for the reconstruction of uniform samples from nonuniform samples of bandlimited signals. Assuming the input signal is slightly oversampled, we first show that its uniform and nonuniform samples in the frequency band of interest can be expressed as a system of linear equations using fractional delay digital filters. Then we develop an iterative framework, which enables the development and convergence analysis of efficient iterative reconstruction algorithms. In particular, we study the Richardson iteration in detail to illustrate how the reconstruction problem can be solved iteratively, and show that the iterative method can be efficiently implemented using Farrow-based variable digital filters with few general-purpose multipliers. Under the proposed framework, we also present a complete and systematic convergence analysis to determine the convergence conditions. Simulation results show that the iterative method converges more rapidly and closer to the true solution (i.e. the uniform samples) than conventional iterative methods using truncation of sinc series. © 2010 The Author(s). Springer Open Choice, 21 Feb 201
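
    The Richardson iteration named in this abstract can be illustrated with a small toy model. The sketch below is not the paper's Farrow-based implementation. It assumes a finite block of N uniform samples and, as a hypothetical stand-in for the fractional delay filters, a matrix A of sinc values that maps the uniform samples to mildly jittered sampling instants. The names `sinc`, `richardson`, `jitter`, and the step size `mu` are illustrative choices.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def matvec(A, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def richardson(A, b, mu=1.0, iters=100):
    """Richardson iteration x <- x + mu * (b - A x), started from x = b."""
    x = list(b)
    for _ in range(iters):
        Ax = matvec(A, x)
        x = [xi + mu * (bi - axi) for xi, bi, axi in zip(x, b, Ax)]
    return x

# Toy setup (assumption): sampling instants t_n = n + jitter[n], small jitter,
# so A stays close to the identity and the iteration contracts.
N = 16
jitter = [0.05 * math.sin(n) for n in range(N)]
A = [[sinc(n + jitter[n] - m) for m in range(N)] for n in range(N)]

x_true = [math.cos(0.3 * n) for n in range(N)]   # "uniform samples"
b = matvec(A, x_true)                            # "nonuniform samples"

x_hat = richardson(A, b)
max_err = max(abs(u - v) for u, v in zip(x_hat, x_true))
```

    In this toy case the iteration converges because the jitter is small enough that every row of I - A has absolute sum below one. The paper's own convergence conditions are derived for its filter-based formulation and are not reproduced here.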

    New design and realization techniques for a class of perfect reconstruction two-channel FIR filterbanks and wavelets bases

    This paper proposes two new methods for designing a class of two-channel perfect reconstruction (PR) finite impulse response (FIR) filterbanks (FBs) and wavelets with K-regularity of high order and studies their multiplier-less implementation. It is based on the two-channel structural PR FB proposed by Phoong et al. The basic principle is to represent the K-regularity condition as a set of linear equality constraints in the design variables so that the least-squares and minimax design problems can be solved, respectively, as a quadratic programming problem with linear equality constraints (QPLC) and a semidefinite programming (SDP) problem. We also demonstrate that it is always possible to realize such FBs with sum-of-powers-of-two (SOPOT) coefficients while preserving the regularity constraints using Bernstein polynomials. However, this implementation usually requires long coefficient wordlengths, so another direct-form implementation, which can realize multiplier-less wavelets with K-regularity conditions up to fifth order, is proposed. Several design examples are given to demonstrate the effectiveness of the proposed methods. © 2004 IEEE.
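
    To make the idea of SOPOT coefficients concrete, here is a minimal sketch of a generic greedy signed-power-of-two approximation and the corresponding shift-and-add multiplication. It is not the Bernstein-polynomial or direct-form method of the paper, and the function names, the term limit `max_terms`, and the smallest exponent `min_exp` are assumptions made for illustration.

```python
import math

def sopot_terms(value, max_terms=3, min_exp=-8):
    """Greedily approximate value by a sum of terms s * 2**e, with s = +1 or -1."""
    terms = []
    residual = value
    for _ in range(max_terms):
        mag = abs(residual)
        if mag < 2.0 ** (min_exp - 1):
            break                      # residual is below the finest resolution
        _, ex = math.frexp(mag)        # mag = m * 2**ex with 0.5 <= m < 1
        e = max(ex - 1, min_exp)       # largest power of two not exceeding mag
        if abs(mag - 2.0 ** (e + 1)) < abs(mag - 2.0 ** e):
            e += 1                     # the next power of two is closer
        s = 1 if residual > 0 else -1
        terms.append((s, e))
        residual -= s * 2.0 ** e
    return terms

def sopot_value(terms):
    """Real value represented by a list of (sign, exponent) terms."""
    return sum(s * 2.0 ** e for s, e in terms)

def shift_add_multiply(x, terms, min_exp=-8):
    """Multiply integer x by the SOPOT coefficient using shifts and adds only.

    The result is scaled by 2**(-min_exp), i.e. it is a fixed-point number
    with -min_exp fractional bits.
    """
    return sum(s * (x << (e - min_exp)) for s, e in terms)

terms = sopot_terms(0.40625)           # 0.40625 = 2**-1 - 2**-4 - 2**-5
approx = sopot_value(terms)
product = shift_add_multiply(64, terms)
```

    Each term costs one shift and one add or subtract in hardware, which is why limiting the number of terms per coefficient is the usual measure of multiplier-less complexity.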

    Design and implementation of DA FIR filter for bio-inspired computing architecture

    This paper describes the system design of a distributed arithmetic (DA) finite impulse response (FIR) filter based on an architecture with tightly coupled co-processor-based data processing units. The DA-based FIR filter emulates multiply-and-accumulate operations with a series of look-up-table (LUT) accesses and is implemented on an FPGA. The very high speed integrated circuit hardware description language (VHDL) is used to implement the proposed filter, and the design is verified using simulation. This paper discusses two optimization algorithms, and the resulting optimizations are incorporated into the LUT layer and architecture extractions. The proposed method offers an optimized design with, on average, fewer LUTs, fewer populated slices, and fewer gates for the DA FIR filter. This research points towards the development of bio-inspired computing architectures built without logically intensive operations that meet the desired specifications for performance, timing, and reliability.
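
    The way LUT accesses can stand in for multiply-and-accumulate operations is easiest to see in software. The following is a minimal bit-serial sketch of textbook distributed arithmetic for integer coefficients and signed two's-complement samples. It is not the paper's VHDL design or its optimized LUT layer, and all names (`build_lut`, `da_dot`, `da_fir`, `direct_fir`) are hypothetical.

```python
def build_lut(coeffs):
    """LUT[addr] = sum of coeffs[k] for every bit k that is set in addr."""
    n_taps = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n_taps)]

def da_dot(lut, samples, bits):
    """Inner product sum(coeffs[k] * samples[k]) without any multiplication.

    Samples are signed integers that fit in `bits`-bit two's complement.
    One LUT access is made per bit plane; the sign-bit plane is subtracted.
    """
    acc = 0
    for b in range(bits):
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k      # gather bit b of every sample
        term = lut[addr] << b                # weight the partial sum by 2**b
        acc += -term if b == bits - 1 else term
    return acc

def da_fir(coeffs, signal, bits=8):
    """FIR filter y[n] = sum_k coeffs[k] * x[n-k] using distributed arithmetic."""
    lut = build_lut(coeffs)
    window = [0] * len(coeffs)               # window[k] holds x[n-k]
    out = []
    for x in signal:
        window = [x] + window[:-1]
        out.append(da_dot(lut, window, bits))
    return out

def direct_fir(coeffs, signal):
    """Reference multiply-accumulate FIR filter for comparison."""
    return [sum(c * signal[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(signal))]

coeffs = [3, -1, 4, 2]
signal = [5, -7, 12, -128, 127, 0, -3]
y_da = da_fir(coeffs, signal)
y_ref = direct_fir(coeffs, signal)
```

    The LUT has one entry per combination of tap bits, so its size doubles with each added tap. That growth is the usual motivation for LUT-reduction techniques in DA designs.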

    Digital Filters

    Advances in technology allow a great number of system signals to be measured easily and at low cost. The main problem is that usually only a fraction of the signal is useful for a given purpose, for example maintenance, DVD recorders, computers, electric/electronic circuits, econometrics, optimization, etc. Digital filters are the most versatile, practical and effective methods for extracting the necessary information from the signal. They can be dynamic, so they can be automatically or manually adjusted to external and internal conditions. This book presents the most advanced digital filters, including different case studies and the most relevant literature.

    Towards adaptive balanced computing (ABC) using reconfigurable functional caches (RFCs)

    The general-purpose computing processor performs a wide range of functions. Although the performance of general-purpose processors has been steadily increasing, certain software technologies like multimedia and digital signal processing applications demand ever more computing power. Reconfigurable computing has emerged to combine the versatility of general-purpose processors with the customization ability of ASICs. The basic premise of reconfigurability is to provide better performance and higher computing density than fixed configuration processors. Most of the research in reconfigurable computing is dedicated to on-chip functional logic. If computing resources are adaptable to the computing requirement, the maximum performance can be achieved. To overcome the gap between processor and memory technology, the size of on-chip cache memory has been consistently increasing. The larger cache memory capacity, though beneficial in general, does not guarantee a higher performance for all the applications as they may not utilize all of the cache efficiently. To utilize on-chip resources effectively and to accelerate the performance of multimedia applications specifically, we propose a new architecture, Adaptive Balanced Computing (ABC). ABC uses dynamic resource configuration of on-chip cache memory by integrating Reconfigurable Functional Caches (RFC). RFC can work as a conventional cache or as a specialized computing unit when necessary. In order to convert a cache memory to a computing unit, we include additional logic to embed multi-bit output LUTs into the cache structure. We add the reconfigurability of cache memory to a conventional processor with minimal modification to the load/store microarchitecture and with minimal compiler assistance. ABC architecture utilizes resources more efficiently by reconfiguring the cache memory to computing units dynamically.
The area penalty for this reconfiguration is about 50-60% of the memory cell cache array-only area with faster cache access time. In a base array cache (parallel decoding caches), the area penalty is 10-20% of the data array with a 1-2% increase in the cache access time. However, we save 27% for FIR and 44% for DCT/IDCT in area with respect to memory cell array cache and about 80% for both applications with respect to base array cache if we were to implement all these units separately (such as ASICs). The simulations with multimedia and DSP applications (DCT/IDCT and FIR/IIR) show that the resource configuration with the RFC yields speedups ranging from 1.04X to 3.94X in overall applications and from 2.61X to 27.4X in the core computations. The simulations with various parameters indicate that the impact of reconfiguration can be minimized if an appropriate cache organization is selected.

    On the design and multiplierless realization of perfect reconstruction triplet-based FIR filter banks and wavelet bases

    This paper proposes new methods for the efficient design and realization of perfect reconstruction (PR) two-channel finite-impulse response (FIR) triplet filter banks (FBs) and wavelet bases. It extends the linear-phase FIR triplet FBs of Ansari et al. to include FIR triplet FBs with lower system delay and a prescribed order of K regularity. The design problem using either the minimax error or least-squares criteria is formulated as a semidefinite programming problem, which is a very flexible framework to incorporate linear and convex quadratic constraints. The K regularity conditions are also expressed as a set of linear equality constraints in the variables to be optimized and they are structurally imposed into the design problem by eliminating the redundant variables. The design method is applicable to linear-phase as well as low-delay triplet FBs. Design examples are given to demonstrate the effectiveness of the proposed method. Furthermore, it was found that the analysis and synthesis filters of the triplet FB have more symmetric frequency responses. This property is exploited to construct a class of PR M-channel uniform FBs and wavelets with M = 2^L, where L is a positive integer, using a particular tree structure. The filter lengths of the two-channel FBs down the tree are approximately reduced by a factor of two at each level or stage, while the transition bandwidths are successively increased by the same factor. Because of the downsampling operations, the frequency responses of the final analysis filters closely resemble those in a uniform FB with identical transition bandwidth. This triplet-based uniform M-channel FB has very low design complexity and the PR condition and K regularity conditions are structurally imposed. Furthermore, it has considerably lower arithmetic complexity and system delay than a conventional tree structure using identical FBs at all levels.
The multiplierless realization of these FBs using sum-of-powers-of-two (SOPOT) coefficients and a multiplier block is also studied. © 2004 IEEE.
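
    As a hedged illustration of why regularity conditions are linear in the filter coefficients, consider a generic FIR lowpass filter with impulse response h[n] of length N. This generic filter is an assumption for illustration and not the triplet parameterization used in the paper. Requiring a zero of order K at z = -1 (that is, at ω = π) means the first K derivatives of the frequency response vanish there, which gives K linear equations in h[n]:

```latex
H(e^{j\omega}) = \sum_{n=0}^{N-1} h[n]\, e^{-j\omega n},
\qquad
\left.\frac{d^{k} H(e^{j\omega})}{d\omega^{k}}\right|_{\omega=\pi}
  = (-j)^{k} \sum_{n=0}^{N-1} (-1)^{n}\, n^{k}\, h[n] = 0,
\qquad k = 0, 1, \dots, K-1 .
```

    Because each equation is linear in the coefficients (with the convention 0^0 = 1 for k = 0), constraints of this kind can be added to a least-squares or minimax design as linear equalities, or removed in advance by eliminating dependent variables.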

    Design of approximate overclocked datapath

    Embedded applications often have stringent latency requirements. While high degrees of parallelism within custom FPGA-based accelerators may help to some extent, it may also be necessary to limit the precision used in the datapath to boost the operating frequency of the implementation. However, by reducing the precision, the engineer introduces quantisation error into the design. In this thesis, we describe an alternative circuit design methodology when considering trade-offs between accuracy, performance and silicon area. We compare two different approaches that could trade accuracy for performance. One is the traditional approach where the precision used in the datapath is limited to meet a target latency. The other is a proposed new approach which simply allows the datapath to operate without timing closure. We demonstrate analytically and experimentally that for many applications it would be preferable to simply overclock the design and accept that timing violations may arise. Since the errors introduced by timing violations occur rarely, they will cause less noise than quantisation errors. Furthermore, we show that conventional forms of computer arithmetic do not fail gracefully when pushed beyond the deterministic clocking region. In this thesis we take a fresh look at Online Arithmetic, originally proposed for digit serial operation, and synthesize unrolled digit parallel online arithmetic operators to allow for graceful degradation. We quantify the impact of timing violations on key arithmetic primitives, and show that substantial performance benefits can be obtained in comparison to binary arithmetic. Since timing errors are caused by long carry chains, these result in errors in least significant digits with online arithmetic, causing less impact than conventional implementations.