4,053 research outputs found
A Unifying Framework for Finite Wordlength Realizations.
A general framework for the analysis of the finite
wordlength (FWL) effects of linear time-invariant digital filter
implementations is proposed. By means of a special implicit system
description, all realization forms can be described. An algebraic
characterization of the equivalence classes is provided, which
enables a search for realizations that minimize the FWL effects
to be made. Two suitable FWL coefficient sensitivity measures
are proposed for use within the framework: a transfer
function sensitivity measure and a pole sensitivity measure. An
illustrative example is presented.
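As a rough illustration of why pole sensitivity matters, the sketch below rounds the denominator coefficients of a direct-form second-order section and reports how far the poles move. It is not the sensitivity measure proposed in the paper; the helper names (`quantize`, `poles`, `pole_shift`) and the choice of a second-order direct form are assumptions made for this example only.

```python
import cmath


def quantize(x, frac_bits):
    """Round x to a multiple of 2**-frac_bits (Python's round, ties to even)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def poles(a1, a2):
    """Roots of z**2 + a1*z + a2, the denominator of a second-order section."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return ((-a1 + disc) / 2, (-a1 - disc) / 2)


def pole_shift(a1, a2, frac_bits):
    """Largest distance from an exact pole to the nearest quantized pole."""
    exact = poles(a1, a2)
    quant = poles(quantize(a1, frac_bits), quantize(a2, frac_bits))
    return max(min(abs(p - q) for q in quant) for p in exact)
```

For a pole pair close to z = 1 (for example radius 0.98 and angle 0.05 rad, giving `a1 = -2 * 0.98 * cos(0.05)` and `a2 = 0.98 ** 2`), a short wordlength moves the poles noticeably more than a long one, which is the kind of effect a realization search tries to reduce.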
On error-spectrum shaping in state-space digital filters
A new scheme for shaping the error spectrum in state-space digital filter structures is proposed. The scheme is based on the application of diagonal second-order error feedback and can be used in any state-space structure of arbitrary order. A method to obtain noise-optimal state-space structures for fixed error feedback coefficients, starting from noise-optimal structures in the absence of error feedback (the Mullis and Roberts structures), is also outlined. This optimization is based on the theory of continuous equivalence for state-space structures.
Optimal realizations of floating-point implemented digital controllers with finite word length considerations.
The closed-loop stability issue of finite word length (FWL) realizations is
investigated for digital controllers implemented in floating-point arithmetic.
Unlike existing methods, which only address the effect of the mantissa bits
in a floating-point implementation on the sensitivity of closed-loop stability,
the sensitivity of closed-loop stability is analysed here with respect to both the
mantissa and exponent bits of floating-point implementation. A computationally
tractable FWL closed-loop stability measure is then defined, and the method of
computing the value of this measure is given. The optimal controller realization
problem is posed as searching for a floating-point realization that maximizes
the proposed FWL closed-loop stability measure, and a numerical optimization
technique is adopted to solve the resulting optimization problem. Simulation
results show that the proposed design procedure yields computationally efficient
controller realizations with enhanced FWL closed-loop stability performance.
FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations
Neural network-based methods for image processing are becoming widely used in
practical applications. Modern neural networks are computationally expensive
and require specialized hardware, such as graphics processing units. Since such
hardware is not always available in real life applications, there is a
compelling need for the design of neural networks for mobile devices. Mobile
neural networks typically have a reduced number of parameters and require a
relatively small number of arithmetic operations. However, they are usually still
executed at the software level and use floating-point calculations. The use
of mobile networks without further optimization may not provide sufficient
performance when high processing speed is required, for example, in real-time
video processing (30 frames per second). In this study, we suggest
optimizations to speed up computations in order to efficiently use already
trained neural networks on a mobile device. Specifically, we propose an
approach for speeding up neural networks by moving computation from software to
hardware and by using fixed-point calculations instead of floating-point. We
propose a number of methods for neural network architecture design to improve
the performance with fixed-point calculations. We also show an example of how
existing datasets can be modified and adapted for the recognition task at hand.
Finally, we present the design and the implementation of a field-programmable gate
array-based device to solve the practical problem of real-time handwritten
digit classification from mobile camera video feed.
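To make the fixed-point idea concrete, here is a minimal sketch of a dot product (the core operation of a convolution) computed with integer multiply-accumulate instead of floating-point arithmetic. The Q-format parameters `frac_bits` and `total_bits`, and the function names, are assumptions for illustration and are not taken from the paper.

```python
def to_fixed(x, frac_bits, total_bits=16):
    """Convert a float to a saturated signed fixed-point integer."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * (1 << frac_bits))))


def fixed_dot(weights, inputs, frac_bits, total_bits=16):
    """Dot product with integer multiply-accumulate, rescaled once at the end."""
    acc = 0  # Python ints do not overflow; hardware would need a wide accumulator
    for w, x in zip(weights, inputs):
        acc += to_fixed(w, frac_bits, total_bits) * to_fixed(x, frac_bits, total_bits)
    # Each product carries 2 * frac_bits fractional bits.
    return acc / float(1 << (2 * frac_bits))
```

Values that are exact multiples of `2**-frac_bits` pass through without error; other values pick up a rounding error of at most half a step per operand, and values outside the representable range saturate.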
Workshop on Verification and Theorem Proving for Continuous Systems (NetCA Workshop 2005)
Oxford, UK, 26 August 200
Fractionally-addressed delay lines
While traditional implementations of variable-length digital delay lines are
based on a circular buffer accessed by two pointers, we propose an
implementation where a single fractional pointer is used both for read and
write operations. On modern general-purpose architectures, the proposed method
is nearly as efficient as the popular interpolated circular buffer, and it
behaves well for delay-length modulations commonly found in digital audio
effects. The physical interpretation of the new implementation shows that it is
suitable for simulating tension or density modulations in wave-propagating
media.
Comment: 11 pages, 19 figures, to be published in IEEE Transactions on Speech
and Audio Processing
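For context, the conventional baseline mentioned in the abstract, a circular buffer with an integer write pointer and a linearly interpolated fractional read position, can be sketched as follows. This is not the single-fractional-pointer method the paper proposes; the class name and the choice of linear interpolation are assumptions for illustration.

```python
class InterpolatedDelayLine:
    """Circular-buffer delay line with a linearly interpolated read tap."""

    def __init__(self, max_delay):
        # Two extra slots so the interpolation neighbour never aliases the write slot.
        self.buf = [0.0] * (max_delay + 2)
        self.write = 0

    def process(self, x, delay):
        """Write sample x, return the input delayed by `delay` samples.

        `delay` may be fractional and must satisfy 0 <= delay <= max_delay.
        """
        n = len(self.buf)
        self.buf[self.write] = x
        whole = int(delay)
        frac = delay - whole
        i0 = (self.write - whole) % n   # sample delayed by `whole`
        i1 = (i0 - 1) % n               # sample delayed by `whole + 1`
        y = (1.0 - frac) * self.buf[i0] + frac * self.buf[i1]
        self.write = (self.write + 1) % n
        return y
```

Because `delay` is passed per sample, it can be modulated over time, which is the use case (delay-length modulation in audio effects) that the abstract discusses.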
Dynamically reconfigurable management of energy, performance, and accuracy applied to digital signal, image, and video Processing Applications
There is strong interest in the development of dynamically reconfigurable systems that can meet real-time constraints in energy/power-performance-accuracy (EPA/PPA). In this dissertation, I introduce a framework for implementing dynamically reconfigurable digital signal, image, and video processing systems. The basic idea is to first generate a collection of Pareto-optimal realizations in the EPA/PPA space. Dynamic EPA/PPA management is then achieved by selecting the Pareto-optimal implementations that can meet the real-time constraints. The systems are then demonstrated using Dynamic Partial Reconfiguration (DPR) and dynamic frequency control on FPGAs. The framework is demonstrated on: i) a dynamic pixel processor, ii) a dynamically reconfigurable 1-D digital filtering architecture, and iii) a dynamically reconfigurable 2-D separable digital filtering system. Efficient implementations of the pixel processor are based on the use of look-up tables and local-multiplexes to minimize FPGA resources. For the pixel processor, different realizations are generated based on the number of input bits, the number of cores, the number of output bits, and the frequency of operation. For each parameter combination, there is a different pixel-processor realization. Pareto-optimal realizations are selected based on measurements of energy per frame, PSNR accuracy, and performance in terms of frames per second. Dynamic EPA/PPA management is demonstrated for a sequential list of real-time constraints by selecting optimal realizations and implementing them using DPR and dynamic frequency control. Efficient FPGA implementations for the 1-D and 2-D FIR filters are based on the use of a distributed arithmetic technique. Different realizations are generated by varying the number of coefficients, coefficient bitwidth, and output bitwidth. Pareto-optimal realizations are selected in the EPA space.
Dynamic EPA management is demonstrated on the application of real-time EPA constraints on a digital video. The results suggest that the general framework can be applied to a variety of digital signal, image, and video processing systems. It is based on the use of offline processing to determine the Pareto-optimal realizations. Real-time constraints are met by selecting Pareto-optimal realizations pre-loaded in memory that are then implemented efficiently using DPR and/or dynamic frequency control.
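The two-stage idea described above (compute a Pareto front offline, then pick a realization that satisfies the current constraints at run time) can be sketched roughly as below. The `Realization` fields, the tie-breaking rule of choosing the lowest-energy feasible point, and all names are assumptions for illustration; the dissertation's actual selection policy may differ.

```python
from typing import List, NamedTuple, Optional


class Realization(NamedTuple):
    name: str
    energy: float  # energy per frame, lower is better
    fps: float     # frames per second, higher is better
    psnr: float    # accuracy in dB, higher is better


def dominates(a: Realization, b: Realization) -> bool:
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = a.energy <= b.energy and a.fps >= b.fps and a.psnr >= b.psnr
    better = a.energy < b.energy or a.fps > b.fps or a.psnr > b.psnr
    return no_worse and better


def pareto_front(items: List[Realization]) -> List[Realization]:
    """Offline step: keep only realizations that no other one dominates."""
    return [r for r in items if not any(dominates(o, r) for o in items)]


def select(front: List[Realization], max_energy: float,
           min_fps: float, min_psnr: float) -> Optional[Realization]:
    """Run-time step: lowest-energy realization meeting all constraints."""
    feasible = [r for r in front
                if r.energy <= max_energy and r.fps >= min_fps and r.psnr >= min_psnr]
    return min(feasible, key=lambda r: r.energy) if feasible else None
```

Keeping only the front in memory means the run-time step is a simple filter over a short list, which fits the abstract's description of pre-loaded realizations being swapped in as constraints change.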
Nonlinear Switched-Capacitor Networks: Basic Principles and Piecewise-Linear Design
The applicability of switched-capacitor (SC) components to the design of nonlinear networks is extensively discussed in this paper. The main objective is to show that SCs can be efficiently used for designing nonlinear networks. Moreover, the design methods proposed here are fully compatible with general synthesis methods for nonlinear n-ports. Different circuit alternatives are given and their potential is evaluated. Office of Naval Research (USA) N00014-76-C-0572; Comisión Interministerial de Ciencia y Tecnología 0235/81; Semiconductor Research Corporation (USA) 82-11-00
FPGA based Uniform Channelizer Implementation
Channelizers are widely used in modern digital communication systems.
Advanced uniform multirate channelization schemes have been shown theoretically to
reduce the computational load while delivering better performance. Therefore,
in this thesis, we implement these designs on an FPGA board for a
comprehensive evaluation of resource usage, performance and frequency
response.
Uniform filter banks are among the most essential units in channelization. The
Generalised Discrete Fourier Transform Modulated Filter Bank (GDFT-FB), an
important variant of the basic DFT-FB, has been implemented on an FPGA and
shown to offer greater computational savings than traditional schemes.
Moreover, the oversampled version is shown to have a better frequency
response at the cost of an acceptable amount of extra resources. On the other hand,
frequency response masking (FRM) techniques are able to reduce the number of
coefficients. Therefore, the full FRM GDFT-FB and the alternative narrowband FRM
GDFT-FB are both implemented on an FPGA platform in order to achieve better
performance and hardware efficiency.
Low power, compact charge coupled device signal processing system
A variety of charge-coupled devices (CCDs) for performing programmable correlation for preprocessing environmental sensor data preparatory to its transmission to the ground were developed. A total of two separate ICs were developed and a third was evaluated. The first IC was a CCD chirp z-transform IC capable of performing a 32-point DFT at frequencies up to 1 MHz. All on-chip circuitry operated as designed, with the exception of the limited dynamic range caused by fixed-pattern noise due to interactions between the digital and analog circuits. The second IC developed was a 64-stage CCD analog/analog correlator for performing time-domain correlation. Multiplier errors were found to be less than 1 percent at designed signal levels and less than 0.3 percent at the measured smaller levels. A prototype IC for performing time-domain correlation was also evaluated.