Application of constrained optimisation techniques in electrical impedance tomography
A constrained optimisation technique is described for the reconstruction of temporal resistivity images. The approach solves the inverse problem by optimising a cost function under constraints in the form of normalised boundary potentials.
Mathematical models have been developed for two different data collection methods for the chosen criterion. Both of these models express the reconstructed image in terms of one-dimensional (1-D) Lagrange multiplier functions, so the reconstruction problem becomes one of estimating these 1-D functions from the normalised boundary potentials. These models are based on a cost criterion that minimises the variance between the reconstructed resistivity distribution and the true resistivity distribution.
The methods presented in this research extend algorithms previously developed for X-ray systems. Computational efficiency is enhanced by exploiting the structure of the associated system matrices. This structure was preserved in the Electrical Impedance Tomography (EIT) implementations by applying a weighting, which accounts for the non-linear current distribution, during the backprojection of the Lagrange multiplier functions.
In order to obtain the best possible reconstruction it is important to consider the effects of noise in the boundary data. This is achieved by using a fast algorithm which matches the statistics of the error in the approximate inverse of the associated system matrix with the statistics of the noise error in the boundary data. This yields the optimum solution with the available boundary data. Novel approaches have been developed to produce the Lagrange multiplier functions.
Two alternative methods are given for the design of VLSI implementations of hardware accelerators to improve computational efficiency. These accelerators are designed to implement parallel geometries and are modelled using a verification description language to assess their performance capabilities.
Singular value computations on the AP1000 array computer
The increasing popularity of singular value decomposition algorithms, used as a tool in many areas of science and engineering, demands the rapid development of fast and reliable implementations. These implementations are no longer confined to single-processor environments, since more and more parallel computers are available on the market. As a result, software must often be re-implemented efficiently on these new parallel architectures.
In this thesis we show, using the example of a singular value decomposition algorithm, how this change of working environment can be accomplished with non-trivial gains in performance. We present several optimisation techniques and their impact on the algorithm's performance at all levels of the parallel memory hierarchy (register, cache, main memory and external processor memory). The central principle in all of the optimisations presented herein is to increase the number of columns (column-segments) held in each level of the memory hierarchy and thereby increase the data reuse factors. In the optimisations for the parallel memory hierarchy, the techniques used are rectangular processor configuration, partitioning, and four-column rotation.
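The kernel these optimisations accelerate, orthogonalising pairs of columns (or column-segments) with plane rotations as in one-sided Jacobi SVD, can be sketched serially as follows. This is a minimal illustration only; the function and variable names are assumptions, not taken from the thesis, and the parallel mapping, partitioning, and register-blocking techniques described above are not shown.

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=30):
    """One-sided Jacobi SVD sketch: sweep over all column pairs,
    applying a plane rotation that makes each pair orthogonal.
    On convergence the singular values are the column norms."""
    U = A.astype(float).copy()
    n = U.shape[1]
    for _ in range(max_sweeps):
        off = 0.0  # largest normalised off-diagonal inner product this sweep
        for i in range(n - 1):
            for j in range(i + 1, n):
                a = U[:, i] @ U[:, i]
                b = U[:, j] @ U[:, j]
                c = U[:, i] @ U[:, j]
                denom = np.sqrt(a * b)
                off = max(off, abs(c) / denom)
                if abs(c) < tol * denom:
                    continue  # pair already (numerically) orthogonal
                # Rotation angle annihilating the (i, j) inner product
                zeta = (b - a) / (2.0 * c)
                t = np.sign(zeta) / (abs(zeta) + np.hypot(1.0, zeta))
                cs = 1.0 / np.hypot(1.0, t)
                sn = cs * t
                Ui = U[:, i].copy()
                U[:, i] = cs * Ui - sn * U[:, j]
                U[:, j] = sn * Ui + cs * U[:, j]
        if off < tol:
            break
    # Singular values, sorted in descending order
    return np.sort(np.linalg.norm(U, axis=0))[::-1]
```

Because each rotation touches only two columns, grouping more columns per memory level (the "column-segments" above) lets consecutive rotations reuse data already held in registers or cache, which is the common thread of the thesis's optimisations.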
In the rectangular processor configuration technique, the data are mapped onto a rectangular network of processors instead of a linear one. This improves communication and cache performance such that, on average, we reduced the execution time by a factor of 2 and, in the case of long column-segments, by a factor of 5.
The partitioning technique involves rearranging the data and the order of computations in the cells, which increases the cache hit ratio for large matrices. For relatively modest improvements in cache performance of 2 to 5%, we achieved a significant reduction in execution times of 10 to 20%.
The four-column rotation technique improves performance through better register reuse. When a large number of columns is stored per processor, this technique gave a 2 to 10% improvement in execution time over the 'classic' two-column rotation.
Apart from the optimisations at the memory hierarchy levels, several floating-point optimisations of the algorithm itself are presented, which can be applied on any architecture. The main ideas behind these optimisations are reducing the number of floating-point instructions executed and balancing the floating-point operations. This was accomplished by reshaping the relevant parts of the code to use the AP1000's processor architecture (SPARC) to its full potential.
After combining all of the optimisations, we achieved a sustained 60% reduction in execution time, which corresponds to a 2.5-fold speedup. In cases where long columns of the input matrix were used, we achieved a nearly 5-fold reduction in execution time, without adversely affecting the accuracy of the singular values and while maintaining the quadratic convergence of the algorithm.
The algorithm was implemented on Fujitsu's AP1000 array multiprocessor, but all of the optimisations described can easily be applied to any MIMD architecture with a mesh or hypercube topology, and all but one can also be applied to register-cache uniprocessors.
Despite many changes in the structure of the algorithm, we found that convergence was not adversely affected and that the accuracy of the orthogonalisation was no worse than for the uniprocessor implementation of the noted SVD algorithm.
The application of parallel computer technology to the dynamic analysis of suspension bridges
This research is concerned with the application of distributed computer technology to the solution of non-linear structural dynamic problems, in particular the onset of aerodynamic instabilities in long-span suspension bridge structures, such as flutter, a catastrophic aeroelastic phenomenon.
The thesis is set out in two distinct parts:
Part I presents the theoretical background of the main forms of aerodynamic instability and describes in detail the main solution techniques used to solve the flutter problem. The previously written analysis package ANSUSP, which was developed specifically to predict numerically the onset of flutter instability, is presented. The various solution techniques employed to predict the onset of flutter for the Severn Bridge are discussed. All of the results presented in Part I were obtained using a 486DX2 66 MHz serial personal computer.
Part II examines the main solution techniques in detail and applies them to a large distributed supercomputer, which allows the solution to be achieved considerably faster than is possible on the serial computer system. The results in Part II are expressed as Performance Indices (PI), which give the ratio of the time taken to perform a specific calculation using a serial algorithm to the time taken by a parallel algorithm running on the same computer system.
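As a minimal illustration of the metric (the function name and the timing values below are assumptions for the example, not figures from the thesis), the Performance Index is simply the serial-to-parallel time ratio, i.e. the speedup:

```python
def performance_index(t_serial, t_parallel):
    """Ratio of serial to parallel execution time for the same
    calculation on the same computer system (the speedup)."""
    return t_serial / t_parallel

# Hypothetical timings: 120 s serially vs 15 s in parallel
pi = performance_index(120.0, 15.0)  # PI of 8.0
```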