1,443 research outputs found
Systolic Array Implementations With Reduced Compute Time.
The goal of the research is the establishment of a formal methodology to develop computational structures more suitable for the changing nature of real-time signal processing and control applications. A major effort is devoted to the following question: Given a systolic array designed to execute a particular algorithm, what other algorithms can be executed on the same array? One approach for answering this question is based on a general model of array operations using graph-theoretic techniques. As a result, a systematic procedure is introduced that models array operations as a function of the compute cycle. As a consequence of the analysis, the dissertation develops the concept of fast algorithm realizations. This concept characterizes specific realizations that can be evaluated in a reduced number of cycles. It restricts the operations to remain in the same class but with reduced execution time. The concept takes advantage of the data dependencies of the algorithm at hand. This feature allows the modification of existing structures by reordering the input data. Applications of the principle allows optimum time band and triangular matrix product on arrays designed for dense matrices. A second approach for analyzing the families of algorithms implementable in an array, is based on the concept of array time constrained operation. The principle uses the number of compute cycle as an additional degree of freedom to expand the class of transformations generated by a single array. A mathematical approach, based on concepts from multilinear algebra, is introduced to model the recursive transformations implemented in linear arrays at each compute cycle. The proposed representation is general enough to encompass a large class of signal processing and control applications. A complete analytical model of the linear maps implementable by the array at each compute cycle is developed. The proposed methodology results in arrays that are more adaptable to the changing nature of operations. Lessons learned from analyzing existing arrays are used to design smart arrays for special algorithm realizations. Applications of the methodology include the design of flexible time structures and the ability to decompose a full size array into subarrays implementing smaller size problems
Generalized Methodology for Array Processor Design of Real-time Systems
Many techniques and design tools have been developed for mapping algorithms to array processors. Linear mapping is usually used for regular algorithms. Large and complex problems are not regular by nature and regularization may cause a computational overhead which prevents the ability to meet real-time deadlines. In this paper, a systematic design methodology for mapping partially-regular as well as regular Dependence Graphs is presented. In this approach the set of all optimal solutions is generated under the given constraints. Due to nature of the problem and the tight timing constraints of real-time systems the set of alternative solutions is limited. An image processing example is discusse
A 2D DWT architecture suitable for the Embedded Zerotree Wavelet Algorithm
Digital Imaging has had an enormous impact on industrial applications such as the Internet and video-phone systems. However, demand for industrial applications is growing enormously. In particular, internet application users are, growing at a near exponential rate. The sharp increase in applications using digital images has caused much emphasis on the fields of image coding, storage, processing and communications. New techniques are continuously developed with the main aim of increasing efficiency. Image coding is in particular a field of great commercial interest. A digital image requires a large amount of data to be created. This large amount of data causes many problems when storing, transmitting or processing the image. Reducing the amount of data that can be used to represent an image is the main objective of image coding. Since the main objective is to reduce the amount of data that represents an image, various techniques have been developed and are continuously developed to increase efficiency. The JPEG image coding standard has enjoyed widespread acceptance, and the industry continues to explore its various implementation issues. However, recent research indicates multiresolution based image coding is a far superior alternative.
A recent development in the field of image coding is the use of Embedded Zerotree Wavelet (EZW) as the technique to achieve image compression. One of The aims of this theses is to explain how this technique is superior to other current coding standards. It will be seen that an essential part orthis method of image coding is the use of multi resolution analysis, a subband system whereby the subbands arc logarithmically spaced in frequency and represent an octave band decomposition. The block structure that implements this function is termed the two dimensional Discrete Wavelet Transform (2D-DWT). The 20 DWT is achieved by several architectures and these are analysed in order to choose the best suitable architecture for the EZW coder. Finally, this architecture is implemented and verified using the Synopsys Behavioural Compiler and recommendations are made based on experimental findings
Recommended from our members
Efficient FPGA implementation and power modelling of image and signal processing IP cores
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number ofimage
and signal processing application areas such as consumer electronics, instrumentation,
medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA
devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the
work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of
cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining has been performed. Initial results are very promising and indicated significant potential for future research in this area.
A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM
is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed
A Distributed System for Robot Manipulator Control, NSF Grant ECS-11879 Fourth Report
This is the fourth annual report representing our last year\u27s work under the current grant. This work was directed to the development of a distributed computer architecture to function as a force and motion server to a robot system. In the course of this work we developed a compliant contact sensor to provide for transitions between position and force control; developed an end-effector capable of securing a stable grasp on an object and a theory of grasping; developed and built a controller which minimizes control delays; explored a parallel kinematics algorithms for the controller; developed a consistent approach to the definition of motion both in joint coordinates and in Cartesian coordinates; developed a symbolic simplification software package to generate the dynamics equations of a manipulator such that the calculations may be split between background and foreground
Moving to Opportunity and Tranquility: Neighborhood Effects on Adult Economic Self-Sufficiency and Health From a Randomized Housing Voucher Experiment
neighborhood effects, housing vouchers
Application of constrained optimisation techniques in electrical impedance tomography
A Constrained Optimisation technique is described for the reconstruction of temporal resistivity images. The approach solves the Inverse problem by optimising a cost function under constraints, in the form of normalised boundary potentials.
Mathematical models have been developed for two different data collection methods for the chosen criterion. Both of these models express the reconstructed image in terms of one dimensional (I-D) Lagrange multiplier functions. The reconstruction problem becomes one of estimating these 1-D functions from the
normalised boundary potentials. These models are based on a cost criterion of the minimisation of the variance between the reconstructed resistivity distribution and the true resistivity distribution.
The methods presented In this research extend the algorithms previously developed for X-ray systems. Computational efficiency is enhanced by exploiting the structure of the associated system matrices. The structure of the system matrices was preserved in the Electrical Impedance Tomography (EIT) implementations by applying a weighting due to non-linear current distribution during the backprojection of the Lagrange multiplier functions.
In order to obtain the best possible reconstruction it is important to consider the effects of noise in the boundary data. This is achieved by using a fast algorithm which matches the statistics of the error in the approximate inverse of the associated system matrix with the statistics of the noise error in the boundary data. This yields the optimum solution with the available boundary data. Novel approaches have been developed to produce the Lagrange multiplier functions.
Two alternative methods are given for the design of VLSI implementations of hardware accelerators to improve computational efficiencies. These accelerators are designed to implement parallel geometries and are modelled using a verification
description language to assess their performance capabilities
- …