581 research outputs found

    A survey of the state of the art and focused research in range systems, task 2

    Contract-generated publications describing the research activities for the reporting period are compiled. Study topics include: equivalent configurations of systolic arrays; least squares estimation algorithms with systolic array architectures; modeling and equalization of nonlinear bandlimited satellite channels; and least squares estimation and Kalman filtering by systolic arrays.

    A survey of the state-of-the-art and focused research in range systems

    In this one-year renewal of NASA Contract No. 2-304, basic research, development, and implementation in the areas of modern estimation algorithms and digital communication systems have been performed. In the first area, the conversion of general classes of practical signal processing algorithms into systolic array algorithms was studied, producing four publications. Also studied were the finite word length effects and convergence rates of lattice algorithms, producing two publications. In the second area, the use of an efficient importance sampling simulation technique for evaluating digital communication system performance was studied, producing two publications.
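
    As a rough illustration of why importance sampling helps when simulating rare error events, the sketch below estimates a Gaussian tail probability by drawing from a mean-shifted density and reweighting. It is not taken from the reports listed here; the function names, the choice of a unit-variance Gaussian, and the shift to the threshold are all assumptions made for the example.

```python
import math
import random


def gaussian_tail_exact(t):
    """Exact P(X > t) for X ~ N(0, 1), used only as a reference value."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def tail_by_importance_sampling(t, n, rng):
    """Estimate P(X > t) by sampling from the biased density N(t, 1).

    Each sample that lands in the tail is weighted by the likelihood
    ratio p(x) / q(x) = exp(-t * x + t**2 / 2), which keeps the
    estimator unbiased.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(t, 1.0)  # draw from the shifted (biased) density
        if x > t:
            total += math.exp(-t * x + 0.5 * t * t)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    print("exact     :", gaussian_tail_exact(4.0))
    print("estimated :", tail_by_importance_sampling(4.0, 50000, rng))
```

    With a threshold of 4 the true probability is about 3.17e-5, so a plain Monte Carlo run of the same size would see only one or two tail events, whereas roughly half of the shifted samples fall in the tail and contribute.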

    Architectures for block Toeplitz systems

    In this paper, efficient VLSI architectures of highly concurrent algorithms for the solution of block linear systems with Toeplitz or near-to-Toeplitz entries are presented. The main features of the proposed scheme are the use of scalar-only operations (multiplications/divisions and additions) and local communication, which enables the development of a wavefront array architecture. Both the mean squared error and the total squared error formulations are described, and a variety of implementations are given.

    On recursive least-squares filtering algorithms and implementations

    In many real-time signal processing applications, fast and numerically stable algorithms for solving least-squares problems are necessary and important. In particular, under non-stationary conditions, these algorithms must be able to adapt to changes in the system and make appropriate adjustments to achieve optimum performance. Among existing algorithms, the QR-decomposition (QRD)-based recursive least-squares (RLS) methods have been shown to be useful and effective for adaptive signal processing. In order to increase the speed of processing and achieve a high throughput rate, many algorithms are being vectorized and/or pipelined to facilitate high degrees of parallelism. A time-recursive formulation of RLS filtering employing block QRD is considered first. Several methods, including a new non-continuous windowing scheme based on selectively rejecting contaminated data, were investigated for adaptive processing. Based on systolic triarrays, many other forms of systolic arrays are shown to be capable of implementing different algorithms. Various updating and downdating systolic algorithms and architectures for RLS filtering are examined and compared in detail, including the Householder reflector, the Gram-Schmidt procedure, and the Givens rotation. A unified approach encompassing existing square-root-free algorithms is also proposed. For the sinusoidal spectrum estimation problem, a judicious method of separating the noise from the signal is of great interest. Various truncated QR methods are proposed for this purpose and compared to the truncated SVD method. Computer simulations provided for detailed comparisons show the effectiveness of these methods. This thesis deals with fundamental issues of numerical stability, computational efficiency, adaptivity, and VLSI implementation for RLS filtering problems. In all, various new and modified algorithms and architectures are proposed and analyzed; the significance of any of the new methods depends crucially on the specific application.
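
    To make the Givens-rotation form of QRD-based RLS updating concrete, here is a minimal sequential sketch. It is a textbook-style illustration and not the thesis's systolic implementation; the function names, the forgetting-factor parameter `lam`, and the toy data are assumptions chosen for the example.

```python
import math


def qrd_rls_update(R, z, x, y, lam=1.0):
    """Fold one observation (x, y) into the upper-triangular factor R
    and right-hand side z, in place, using Givens rotations.

    lam is an exponential forgetting factor (1.0 means no forgetting).
    Returns the component of y left over after the rotations.
    """
    n = len(x)
    s = math.sqrt(lam)
    for i in range(n):  # down-weight the past data
        z[i] *= s
        for j in range(i, n):
            R[i][j] *= s
    x = list(x)
    for i in range(n):
        r = math.hypot(R[i][i], x[i])
        if r == 0.0:
            continue  # nothing to annihilate in this column
        c, sn = R[i][i] / r, x[i] / r
        for j in range(i, n):  # rotate row i of R against the new row
            rij, xj = R[i][j], x[j]
            R[i][j] = c * rij + sn * xj
            x[j] = -sn * rij + c * xj
        zi = z[i]
        z[i] = c * zi + sn * y
        y = -sn * zi + c * y
    return y


def back_substitute(R, z):
    """Solve R w = z for upper-triangular R."""
    n = len(z)
    w = [0.0] * n
    for i in reversed(range(n)):
        acc = z[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        w[i] = acc / R[i][i]
    return w


if __name__ == "__main__":
    R = [[0.0, 0.0], [0.0, 0.0]]
    z = [0.0, 0.0]
    # Toy data generated by y = 2*x1 + 3*x2 (hypothetical example).
    for x, y in [((1, 0), 2), ((0, 1), 3), ((1, 1), 5), ((2, 1), 7)]:
        qrd_rls_update(R, z, x, y)
    print(back_substitute(R, z))
```

    Each rotation touches only one row of R and the incoming data row, which is the locality that makes this update a natural fit for a triangular systolic array.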

    Reliable and Efficient Parallel Processing Algorithms and Architectures for Modern Signal Processing

    Least-squares (LS) estimation and spectral decomposition algorithms constitute the heart of modern signal processing and communication problems. Implementations of recursive LS and spectral decomposition algorithms on parallel processing architectures such as systolic arrays, with efficient fault-tolerant schemes, are the major concerns of this dissertation. There are four major results. First, we propose the systolic block Householder transformation with application to recursive least-squares minimization. It is successfully implemented on a systolic array with a two-level pipelined implementation at the vector level as well as at the word level. Second, a real-time algorithm-based concurrent error detection scheme based on the residual method is proposed for the QRD RLS systolic array. The fault diagnosis, order degraded reconfiguration, and performance analysis are also considered. Third, the dynamic range, stability, error detection capability under finite-precision implementation, order degraded performance, and residual estimation under faulty situations for the QRD RLS systolic array are studied in detail. Finally, we propose the use of multiphase systolic algorithms for spectral decomposition based on the QR algorithm. Two systolic architectures, one based on a triangular array and another based on a rectangular array, are presented for the multiphase operations with fault-tolerant considerations. Eigenvectors and singular vectors can be easily obtained by using the multiphase operations. Performance issues are also considered.

    CP-Based SBHT-RLS Algorithms for Tracking Channel Estimates in Multicarrier Modulation Systems


    An inverse factorization algorithm for linear prediction

    A new inverse factorization technique is presented for solving linear prediction problems arising in signal processing. The algorithm is similar to a scheme of Qiao in that it uses the rectangular Toeplitz structure of the data to recursively compute the prediction error and to solve the problem when the optimum filter order has been found. The novelty of the scheme presented here is the use of an inverse factorization scheme due to Pan and Plemmons for solving the linear prediction problem with low computational complexity and without the need for solving triangular systems. We also provide a linear systolic array for solving these problems.

    Efficient Parallel Algorithms and VLSI Architectures for Manipulator Jacobian Computation

    Real-time computation of the manipulator Jacobian is examined for execution on uniprocessor computers, parallel computers, and VLSI pipelines. The Jacobian equations are found to have the form of a first-order linear recurrence. The time lower bound for computing the first-order linear recurrence, and hence the Jacobian, is of order O(N) on uniprocessor computers and of order O(log₂ N) on parallel SIMD computers, where N is the number of degrees of freedom of the manipulator. The Generalized-k method, which achieves the time lower bound on uniprocessor computers, is derived to compute the Jacobian at any desired reference coordinate frame k from the base coordinate frame to the end-effector coordinate frame. We find that if the reference coordinate frame k is in the range [3, N-4], then the computational effort is at its minimum. To reduce the computational complexity from O(N) to O(log₂ N), we derive the parallel forward and backward recursive doubling algorithm to compute the Jacobian on parallel computers. Again, any reference coordinate frame k can be used, and the minimum computation occurs at k = (N-1)/2. To further reduce the Jacobian computation complexity, we design two VLSI systolic pipelined architectures. A linear VLSI pipe, which uses the least number of modular processors, takes 3N floating-point operations to compute the Jacobian, and a parallel VLSI pipe takes 3 floating-point operations. We also show that if the reference coordinate frame is selected at k = (N-1)/2, then the parallel pipe requires the least number of modular processors and the communication paths are much shorter.
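
    The step from O(N) to O(log₂ N) rests on recursive doubling applied to a first-order linear recurrence. The sketch below shows the idea for a scalar recurrence x_i = a_i * x_(i-1) + b_i, simulated sequentially; the Jacobian recurrence in the paper involves vector and matrix quantities, so this is only an assumed, simplified stand-in, and the function names are hypothetical.

```python
def compose(f, g):
    """Return the affine map 'apply f, then g'.

    Each map is stored as (a, b) and stands for x -> a * x + b.
    """
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)


def recursive_doubling(a, b, x0):
    """Solve x_i = a[i] * x_(i-1) + b[i] for all i by recursive doubling.

    Returns the list of x_i and the number of doubling rounds used.
    Within a round every position could be updated at the same time on
    a SIMD machine; here the round is simulated with a list rebuild.
    """
    maps = list(zip(a, b))
    n = len(maps)
    step, rounds = 1, 0
    while step < n:
        maps = [
            maps[i] if i < step else compose(maps[i - step], maps[i])
            for i in range(n)
        ]
        step *= 2
        rounds += 1
    return [ai * x0 + bi for ai, bi in maps], rounds


def sequential(a, b, x0):
    """Reference O(N) evaluation of the same recurrence."""
    out, x = [], x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        out.append(x)
    return out


if __name__ == "__main__":
    a, b = [2, 3, 1, 4, 2], [1, 0, 5, 2, 3]
    print(recursive_doubling(a, b, 1))
    print(sequential(a, b, 1))
```

    Because composing affine maps is associative, each round doubles the span of steps that every position has absorbed, so N terms need about log₂ N rounds instead of N sequential steps.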
