56 research outputs found
A parallel processor for fast execution of time-adaptive Jacobi algorithms
In this paper we take the class of Jacobi-type algorithms and present a systematic way to derive an architecture for executing the time-adaptive QR and QR⁻¹ algorithms, two members of the class. Jacobi-type algorithms find natural expression in Cordic arithmetic, and high-throughput implementations call for parallel, pipelined Cordic processor elements. Based on this knowledge, we perform algorithmic transformations that exploit class-specific properties to reduce critical paths, increase throughput and improve structure utilization. The techniques illustrated in the paper are currently used to derive the specifications of a 'class optimal' processor on which several Jacobi-type algorithms can execute simultaneously. Keywords: parallel processors, Jacobi-type algorithms, algorithmic transformations, pipelined processors, Cordic arithmetic. I. Introduction: Matrix computations are increasingly finding application in real-time signal processing. A num..
Affine nested loop programs and their binary cyclo-static dataflow counterparts
Parameterized static affine nested loop programs can be automatically converted to input-output equivalent Kahn Process Network specifications. These networks turn out to be close relatives of parameterized cyclo-static dataflow graphs. Token production and consumption can be cyclic with a finite number of cycles or finite non-cyclic. Moreover, the token production and consumption sequences are binary.
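To make the notion of binary, finitely cyclic token sequences concrete, here is a minimal sketch of a single FIFO edge whose producer and consumer each follow a 0/1 pattern repeated a finite number of times. The names `binary_cyclic_sequence` and `run_edge`, the example patterns, and the lock-step firing order are assumptions made for illustration only; they are not taken from the paper.

```python
def binary_cyclic_sequence(pattern, cycles):
    """Yield a 0/1 token-count sequence: `pattern` repeated `cycles` times."""
    if any(bit not in (0, 1) for bit in pattern):
        raise ValueError("pattern must contain only 0 and 1")
    for _ in range(cycles):
        yield from pattern


def run_edge(production, consumption):
    """Fire producer then consumer in lock-step over one FIFO edge.

    Returns (tokens left in the buffer, peak buffer occupancy).
    The lock-step schedule is an illustrative assumption.
    """
    buffered = 0
    peak = 0
    for produced, consumed in zip(production, consumption):
        buffered += produced          # producer writes 0 or 1 token
        peak = max(peak, buffered)
        if consumed:                  # consumer reads 0 or 1 token
            if buffered == 0:
                raise RuntimeError("consumer fired on an empty buffer")
            buffered -= 1
    return buffered, peak


# Hypothetical patterns: each is binary and repeats for two cycles.
production = list(binary_cyclic_sequence([1, 0, 1], cycles=2))
consumption = list(binary_cyclic_sequence([0, 1, 1], cycles=2))
remaining, peak = run_edge(production, consumption)
```

With these example patterns the producer and consumer each handle four tokens in total, so the buffer drains by the end of the run. A finite non-cyclic sequence corresponds to the special case `cycles=1`.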
- …