4,053 research outputs found

    A Unifying Framework for Finite Wordlength Realizations.

    No full text
    A general framework for the analysis of the finite wordlength (FWL) effects of linear time-invariant digital filter implementations is proposed. By means of a special implicit system description, all realization forms can be described. An algebraic characterization of the equivalent classes is provided, which enables a search for realizations that minimize the FWL effects to be made. Two suitable FWL coefficient sensitivity measures are proposed for use within the framework, these being a transfer function sensitivity measure and a pole sensitivity measure. An illustrative example is presented

    On error-spectrum shaping in state-space digital filters

    Get PDF
    A new scheme for shaping the error spectrum in state-space digital filter structures is proposed. The scheme is based on the application of diagonal second-order error feedback, and can be used in any arbitrary state-space structure having arbitrary order. A method to obtain noise-optimal state-space structures for fixed error feedback coefficients, starting from noise optimal structures in absence of error feedback (the Mullis and Roberts Structures), is also outlined. This optimization is based on the theory of continuous equivalence for state-space structures

    Optimal realizations of floating-point implemented digital controllers with finite word length considerations.

    Get PDF
    The closed-loop stability issue of finite word length (FWL) realizations is investigated for digital controllers implemented in floating-point arithmetic. Unlike the existing methods which only address the effect of the mantissa bits in floating-point implementation to the sensitivity of closed-loop stability, the sensitivity of closed-loop stability is analysed with respect to both the mantissa and exponent bits of floating-point implementation. A computationally tractable FWL closed-loop stability measure is then defined, and the method of computing the value of this measure is given. The optimal controller realization problem is posed as searching for a floating-point realization that maximizes the proposed FWL closed-loop stability measure, and a numerical optimization technique is adopted to solve for the resulting optimization problem. Simulation results show that the proposed design procedure yields computationally efficient controller realizations with enhanced FWL closed-loop stability performance

    FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations

    Full text link
    Neural network-based methods for image processing are becoming widely used in practical applications. Modern neural networks are computationally expensive and require specialized hardware, such as graphics processing units. Since such hardware is not always available in real life applications, there is a compelling need for the design of neural networks for mobile devices. Mobile neural networks typically have reduced number of parameters and require a relatively small number of arithmetic operations. However, they usually still are executed at the software level and use floating-point calculations. The use of mobile networks without further optimization may not provide sufficient performance when high processing speed is required, for example, in real-time video processing (30 frames per second). In this study, we suggest optimizations to speed up computations in order to efficiently use already trained neural networks on a mobile device. Specifically, we propose an approach for speeding up neural networks by moving computation from software to hardware and by using fixed-point calculations instead of floating-point. We propose a number of methods for neural network architecture design to improve the performance with fixed-point calculations. We also show an example of how existing datasets can be modified and adapted for the recognition task in hand. Finally, we present the design and the implementation of a floating-point gate array-based device to solve the practical problem of real-time handwritten digit classification from mobile camera video feed

    Fractionally-addressed delay lines

    Full text link
    While traditional implementations of variable-length digital delay lines are based on a circular buffer accessed by two pointers, we propose an implementation where a single fractional pointer is used both for read and write operations. On modern general-purpose architectures, the proposed method is nearly as efficient as the popularinterpolated circular buffer, and it behaves well for delay-length modulations commonly found in digital audio effects. The physical interpretation of the new implementation shows that it is suitable for simulating tension or density modulations in wave-propagating media.Comment: 11 pages, 19 figures, to be published in IEEE Transactions on Speech and Audio Processing Corrected ACM-clas

    Dynamically reconfigurable management of energy, performance, and accuracy applied to digital signal, image, and video Processing Applications

    Get PDF
    There is strong interest in the development of dynamically reconfigurable systems that can meet real-time constraints in energy/power-performance-accuracy (EPA/PPA). In this dissertation, I introduce a framework for implementing dynamically reconfigurable digital signal, image, and video processing systems. The basic idea is to first generate a collection of Pareto-optimal realizations in the EPA/PPA space. Dynamic EPA/PPA management is then achieved by selecting the Pareto-optimal implementations that can meet the real-time constraints. The systems are then demonstrated using Dynamic Partial Reconfiguration (DPR) and dynamic frequency control on FPGAs. The framework is demonstrated on: i) a dynamic pixel processor, ii) a dynamically reconfigurable 1-D digital filtering architecture, and iii) a dynamically reconfigurable 2-D separable digital filtering system. Efficient implementations of the pixel processor are based on the use of look-up tables and local-multiplexes to minimize FPGA resources. For the pixel-processor, different realizations are generated based on the number of input bits, the number of cores, the number of output bits, and the frequency of operation. For each parameters combination, there is a different pixel-processor realization. Pareto-optimal realizations are selected based on measurements of energy per frame, PSNR accuracy, and performance in terms of frames per second. Dynamic EPA/PPA management is demonstrated for a sequential list of real-time constraints by selecting optimal realizations and implementing using DPR and dynamic frequency control. Efficient FPGA implementations for the 1-D and 2-D FIR filters are based on the use a distributed arithmetic technique. Different realizations are generated by varying the number of coefficients, coefficient bitwidth, and output bitwidth. Pareto-optimal realizations are selected in the EPA space. Dynamic EPA management is demonstrated on the application of real-time EPA constraints on a digital video. The results suggest that the general framework can be applied to a variety of digital signal, image, and video processing systems. It is based on the use of offline-processing that is used to determine the Pareto-optimal realizations. Real-time constraints are met by selecting Pareto-optimal realizations pre-loaded in memory that are then implemented efficiently using DPR and/or dynamic frequency control

    Nonlinear Switched-Capacitor Networks: Basic Principles and Piecewise-Linear Design

    Get PDF
    The applicability of switched-capacitor (SC) components to the design of nonlinear networks is extensively discussed in this paper. The main objective is to show that SC's can be efficiently used for designing nonlinear networks. Moreover, the design methods to be proposed here are fully compatible with general synthesis methods for nonlinear n -ports. Different circuit alternatives are given and their potentials are evaluated.Office of Naval Research (USA) N00014-76-C-0572Comisión Interministerial de Ciencia y Tecnología 0235/81Semiconductor Research Corporation (USA) 82-11-00

    FPGA based Uniform Channelizer Implementation

    Get PDF
    Channelizers are widely used in modern digital communication systems. Advanced uniform multirate channelization have been theoretically proved to be capable of reducing the computational load, with a better performance. Therefore, in this thesis, we implement these designs on a FPGA board for the sake of the comprehensive evaluation of resource usage, performance and frequency response. The uniform filter-banks are one of the most essential unit in channelization. The Generalised Discrete Fourier Transform Modulated Filter Bank (GDFT-FB), as an important variant of basic a DFT-FB, has been implemented in FPGA and demonstrated with a better computational saving rather than traditional schemes. Moreover the oversampling version is demonstrated to have a better frequency response with an acceptable amount of extra resources. On the other hand, frequency response masking (FRM) techniques is able to reduce the number of coefficients. Therefore, the full FRM GDFT-FB and alternative narrowband FRM GDFT-FB are both implemented in FPGA platform, in order to achieve a better performance and hardware efficiency

    Low power, compact charge coupled device signal processing system

    Get PDF
    A variety of charged coupled devices (CCDs) for performing programmable correlation for preprocessing environmental sensor data preparatory to its transmission to the ground were developed. A total of two separate ICs were developed and a third was evaluated. The first IC was a CCD chirp z transform IC capable of performing a 32 point DFT at frequencies to 1 MHz. All on chip circuitry operated as designed with the exception of the limited dynamic range caused by a fixed pattern noise due to interactions between the digital and analog circuits. The second IC developed was a 64 stage CCD analog/analog correlator for performing time domain correlation. Multiplier errors were found to be less than 1 percent at designed signal levels and less than 0.3 percent at the measured smaller levels. A prototype IC for performing time domain correlation was also evaluated
    corecore