8 research outputs found

    An Algorithm and Architecture Based on Orthonormal µ-Rotations for Computing the Symmetric EVD

    In this paper an algorithm and architecture for computing the eigenvalue decomposition (EVD) of a symmetric matrix are presented. The EVD is computed using a Jacobi-type method, where the angle of the rotations is approximated by an angle α_k corresponding to an orthonormal µ-rotation. These orthonormal µ-rotations are based on the idea of CORDIC and share the property that performing the rotation requires a minimal number of shift-add operations. We present various methods of construction for such orthonormal µ-rotations of increasing complexity. Moreover, the computations to determine which angle α_k to use in the approximation of the optimal angle can themselves be expressed purely in orthonormal µ-rotations on the matrix data. The complexity of the used type of orthonormal µ-rotation decreases during the diagonalization of the matrix. A significant reduction of the number of required shift-add operations is achieved. All types of fast, orthonormal µ-rotations (and the co..

    An Approach for the Mapping of Jacobi Algorithms onto a Jacobi Specific Dataflow Processor

    Dedicated application specific processors often suffer from long design times and short lifecycles. It makes sense to confine ourselves to the class of Jacobi algorithms, because this enables us to come up with an initial template for a processing element (PE). The processor consists of a series of such PEs. The PE itself is composed of four units. One of these units is the memory and contains a series of logical storage structures (LSSs). These LSSs can be seen as data re-ordering units. At the write port an LSS behaves like a queue. An LSS is special in that a single command can be used to request a whole sequence of data elements. The problem is to map the algorithms onto a series of PEs. Our point of departure is the single assignment code (SAC) of the algorithms to be mapped. Given an algorithm, we go through a number of transformations, such that we eventually arrive at a specification whose semantics corresponds with that of the processor. To verify the correctness of the transf..

    Efficient Orthogonal Realization of Image Transforms

    One can find ample examples in the literature of implementations of image transforms such as the discrete cosine transform and the lapped orthogonal transform. The objective is invariably the minimization of the number of multiplies and adds. Of course, a reduction of operations from O(N²) to O(N log N) is a great achievement, yet the cost resulting from non-local communication and operation accuracy is seldom taken into account. In particular, the accuracy needed to preserve the dominant properties of the transforms may turn out to be expensive. These properties are that the transforms are a collection of highly structured orthonormal basis functions. The first concern should be to preserve these properties by enforcing them through a decomposition of the transforms in terms of inexpensive elementary operations, which can be inaccurate without violating the global properties. This paper presents a decomposition of image transforms into a network of 2 × 2 so-called fast rotations which..

    SVD-Updating Using Orthonormal µ-Rotations

    In this paper the implementation of the SVD-updating algorithm using orthonormal µ-rotations is presented. An orthonormal µ-rotation is a rotation by an angle from a given set of µ-rotation angles (e.g. the angles Φ_i = arctan 2^(-i)), which are chosen such that the rotation can be implemented by a small number of shift-add operations. A version of the SVD-updating algorithm is used where all computations are entirely based on the evaluation and application of orthonormal rotations. Therefore, in this form the SVD-updating algorithm is amenable to an implementation using orthonormal µ-rotations, i.e., each rotation executed in the SVD-updating algorithm will be approximated by orthonormal µ-rotations. For all the approximations the same accuracy is used, i.e., only r ≪ w (w: wordlength) orthonormal µ-rotations are used to approximate the exact rotation. The rotation evaluation can also be performed by the execution of µ-rotations, such that the complete SVD-updating algorithm can be expressed in terms of orthonormal µ-rotations. Simulations show the efficiency of the SVD-updating algorithm based on orthonormal µ-rotations.
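
To make the idea in the abstract above concrete, here is a minimal sketch of a rotation by an angle arctan 2^(-i) and of approximating an arbitrary rotation with only r such steps. It is an illustration under stated assumptions, not the paper's implementation: the names `mu_rotation` and `approx_rotate` are hypothetical, the greedy angle selection is a simple stand-in for whatever selection rule the authors use, and the scaling factor is computed with a floating-point square root, whereas a hardware realization would build both the rotation and the scaling from shift-add operations.

```python
import math


def mu_rotation(x, y, i, sign=1):
    """Rotate (x, y) by sign * arctan(2**-i), keeping the vector norm unchanged."""
    # Tangent of the rotation angle; in fixed point this is a shift by i bits.
    t = sign * 2.0 ** -i
    # Scaling factor that makes the shift-add rotation orthonormal.
    # Assumption: computed in floating point here for clarity.
    k = 1.0 / math.sqrt(1.0 + t * t)
    return k * (x - t * y), k * (y + t * x)


def approx_rotate(x, y, theta, w=16, r=4):
    """Approximate a rotation by theta with r mu-rotations (greedy, hypothetical rule).

    w plays the role of the wordlength: angles arctan(2**-i) for i in 0..w-1.
    Returns the rotated vector and the angle that is still left over.
    """
    remaining = theta
    for _ in range(r):
        # Choose the angle from the set that is closest to what is left.
        i = min(range(w), key=lambda j: abs(abs(remaining) - math.atan(2.0 ** -j)))
        sign = 1 if remaining >= 0 else -1
        x, y = mu_rotation(x, y, i, sign)
        remaining -= sign * math.atan(2.0 ** -i)
    return x, y, remaining


if __name__ == "__main__":
    x, y, left = approx_rotate(1.0, 0.0, 0.3)
    print("rotated vector:", (x, y))
    print("norm:", math.hypot(x, y))
    print("residual angle:", left)
```

Because every step is scaled to be orthonormal, the norm of the vector is preserved no matter how few steps are used; only the rotation angle is approximate, which is the property the abstract relies on when it uses r ≪ w rotations.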

    Jacobi-Specific Processor Arrays

    We present a processor and a compiler for prototyping array implementations of algorithms from the class of Jacobi algorithms. We use adaptive matrix QR decomposition as an illustrative example.

    The Formal Derivation of a Systolic Array for Recursive Least Squares Estimation

    A formal proof is presented for a recently presented systolic array for recursive least squares estimation by inverse updates. The derivation of this systolic array is highly non-trivial due to the presence of data contra-flow and feedback loops in the underlying signal flow graph. This would normally prohibit pipelined processing. However, it is shown that suitable delays may be introduced into the signal flow graph by performing a simple algorithmic transformation which compensates for the interference of crossing data flows. The pipelined systolic array is then obtained by retiming the signal flow graph and applying the cut theorem.
    I. Introduction. In this paper we derive a novel systolic array for implementing recursive least squares (RLS) computations based on the method of inverse updates. Recursive least squares estimation is required in a wide range of applications from adaptive beamforming for antenna arrays to data communications, space navigation and system identification. ..

    Efficient Implementations of Pipelined CORDIC Based IIR Digital Filters using Fast Orthonormal µ-rotations

    CORDIC-based IIR digital filters are orthogonal filters whose internal computations consist of orthogonal transformations. These filters possess desirable properties for VLSI implementations such as regularity, local connection, low sensitivity to finite word-length implementation, and elimination of limit cycles. Recently, fine-grain pipelined CORDIC-based IIR digital filter architectures have been developed which can perform the filtering operations at arbitrarily high sample rates at the cost of a linear increase in hardware complexity. These pipelined architectures consist of only Givens rotations and a few additions, which can be mapped onto CORDIC-arithmetic-based processors. However, in practical applications, implementations of Givens rotations using traditional CORDIC arithmetic are quite expensive. For example, for 16-bit accuracy, using a floating-point data format with a 16-bit mantissa and a 5-bit exponent, it will require approximately 20 pairs of shift-add operations for one Give..