Search CORE

667 research outputs found

Multidimensional Systolic Arrays of LMS AlgorithmAdaptive (FIR) Digital Filters

Author: Bakir A. R. Al-Hashemy
Riyadh A.H. AL-Helali
Publication venue: Al-Khwarizmi College of Engineering – University of Baghdad
Publication date: 01/01/2009
Field of study

A multidimensional systolic arrays realization of LMS algorithm by a method of mapping regular algorithm onto processor array, are designed. They are based on appropriately selected 1-D systolic array filter that depends on the inner product sum systolic implementation. Various arrays may be derived that exhibit a regular arrangement of the cells (processors) and local interconnection pattern, which are important for VLSI implementation. It reduces latency time and increases the throughput rate in comparison to classical 1-D systolic arrays. The 3-D multilayered array consists of 2-D layers, which are connected with each other only by edges. Such arrays for LMS-based adaptive (FIR) filter may be opposed the fundamental requirements of fast convergence rate in most adaptive filter applications

Directory of Open Access Journals

A block algorithm for the algebraic path problem and its execution on a systolic array

Author: Núñez Fernando J.
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1989
Field of study

The solution of the algebraic path problem (APP) for arbitrarily sized graphs by a fixed-size systolic array processor (SAP) is addressed. The APP is decomposed into two subproblems, and SAP is designed for each one. Both SAPs combined produce a highly implementable versatile SAP. The proposed SAP has p*p processing elements (PEs) solving the APP of an N-vertex graph in N/sup 3//p/sup 2/+N/sup 2//p+3p-2 cycles. With slight modifications in the operations performed by the PEs, the problem is optimally solved in N/sup 3//p/sup 2/+3p-2 cycles.Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

A highly parameterized and efficient FPGA-based skeleton for pairwise biological sequence alignment

Author: Benkrid Abdsamad
Benkrid K.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2009
Field of study

Portsmouth University Research Portal (Pure)

A Comprehensive Methodology for Algorithm Characterization, Regularization and Mapping Into Optimal VLSI Arrays.

Author: Barada Hassan Reda
Publication venue: LSU Digital Commons
Publication date: 01/01/1989
Field of study

This dissertation provides a fairly comprehensive treatment of a broad class of algorithms as it pertains to systolic implementation. We describe some formal algorithmic transformations that can be utilized to map regular and some irregular compute-bound algorithms into the best fit time-optimal systolic architectures. The resulted architectures can be one-dimensional, two-dimensional, three-dimensional or nonplanar. The methodology detailed in the dissertation employs, like other methods, the concept of dependence vector to order, in space and time, the index points representing the algorithm. However, by differentiating between two types of dependence vectors, the ordering procedure is allowed to be flexible and time optimal. Furthermore, unlike other methodologies, the approach reported here does not put constraints on the topology or dimensionality of the target architecture. The ordered index points are represented by nodes in a diagram called Systolic Precedence Diagram (SPD). The SPD is a form of precedence graph that takes into account the systolic operation requirements of strictly local communications and regular data flow. Therefore, any algorithm with variable dependence vectors has to be transformed into a regular indexed set of computations with local dependencies. This can be done by replacing variable dependence vectors with sets of fixed dependence vectors. The SPD is transformed into an acyclic, labeled, directed graph called the Systolic Directed Graph (SDG). The SDG models the data flow as well as the timing for the execution of the given algorithm on a time-optimal array. The target architectures are obtained by projecting the SDG along defined directions. If more than one valid projection direction exists, different designs are obtained. The resulting architectures are then evaluated to determine if an improvement in the performance can be achieved by increasing PE fan-out. If so, the methodology provides the corresponding systolic implementation. By employing a new graph transformation, the SDG is manipulated so that it can be mapped into fixed-size and fixed-depth multi-linear arrays. The latter is a new concept of systolic arrays that is adaptable to changes in the state of technology. It promises a bonded clock skew, higher throughput and better performance than the linear implementation

Louisiana State University

Systolic Array Implementations With Reduced Compute Time.

Author: Barbir Abdulkader Omar
Publication venue: LSU Digital Commons
Publication date: 01/01/1989
Field of study

The goal of the research is the establishment of a formal methodology to develop computational structures more suitable for the changing nature of real-time signal processing and control applications. A major effort is devoted to the following question: Given a systolic array designed to execute a particular algorithm, what other algorithms can be executed on the same array? One approach for answering this question is based on a general model of array operations using graph-theoretic techniques. As a result, a systematic procedure is introduced that models array operations as a function of the compute cycle. As a consequence of the analysis, the dissertation develops the concept of fast algorithm realizations. This concept characterizes specific realizations that can be evaluated in a reduced number of cycles. It restricts the operations to remain in the same class but with reduced execution time. The concept takes advantage of the data dependencies of the algorithm at hand. This feature allows the modification of existing structures by reordering the input data. Applications of the principle allows optimum time band and triangular matrix product on arrays designed for dense matrices. A second approach for analyzing the families of algorithms implementable in an array, is based on the concept of array time constrained operation. The principle uses the number of compute cycle as an additional degree of freedom to expand the class of transformations generated by a single array. A mathematical approach, based on concepts from multilinear algebra, is introduced to model the recursive transformations implemented in linear arrays at each compute cycle. The proposed representation is general enough to encompass a large class of signal processing and control applications. A complete analytical model of the linear maps implementable by the array at each compute cycle is developed. The proposed methodology results in arrays that are more adaptable to the changing nature of operations. Lessons learned from analyzing existing arrays are used to design smart arrays for special algorithm realizations. Applications of the methodology include the design of flexible time structures and the ability to decompose a full size array into subarrays implementing smaller size problems

Louisiana State University

Real Time Signal Processing Using Systolic Arrays

Author: Boulay Jack
Publication venue: University of Central Florida
Publication date: 01/01/1987
Field of study

This thesis discusses and presents the design of systolic arrays used in modern real time signal processing. A methodology to map a given algorithm into a systolized VLSI implementation is described. The architectural alternatives for a given signal processing algorithm are discussed and investigated at a function level using a simulation package that has been developed using the “C” programming language. The similarities and differences between wavefront array processors and systolic array processors are presented

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Recommended from our members

Two-dimensional DCT/IDCT architecture

Author: Aggoun A
Jollah I
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2003
Field of study

A fully parallel architecture for the computation of a two-dimensional (2-D) discrete cosine transform (DCT), based on row-column decomposition is presented. It uses the same one dimensional (1-D) DCT unit for the row and column computations and (N2+N) registers to perform the transposition. It possesses features of regularity and modularity, and is thus well suited for VLSI implementation. It can be used for the computation of either the forward or the inverse 2-D DCT. Each 1-D DCT unit uses N fully parallel vector inner product (VIP) units. The design of the VIP units is based on a systematic design methodology using radix-2” arithmetic, which allows partitioning of the elements of each vector into small groups. Array multipliers without the final adder are used to produce the different partial product terms. This allows a more efficient use of 4:2 compressors for the accumulation of the products in the intermediate stages and reduces the number of accumulators from N to one. Using this procedure, the 2-D DCT architecture requires less than N2 multipliers (in terms of area occupied) and only 2N adders. It can compute a N x N-point DCT at a rate of one complete transform per N cycles after an appropriate initial delay

Brunel University Research Archive

Architectures for block Toeplitz systems

Author: Bouras Ilias
Glentis George-Othon
Kalouptsidis Nicholas
Publication venue: Elsevier
Publication date: 01/01/1995
Field of study

In this paper efficient VLSI architectures of highly concurrent algorithms for the solution of block linear systems with Toeplitz or near-to-Toeplitz entries are presented. The main features of the proposed scheme are the use of scalar only operations, multiplications/divisions and additions, and the local communication which enables the development of wavefront array architecture. Both the mean squared error and the total squared error formulations are described and a variety of implementations are given

CiteSeerX

University of Twente Research Information