Search CORE

5,287 research outputs found

Design and analysis of numerical algorithms for the solution of linear systems on parallel and distributed architectures

Author: Rosni Abdullah (7169939)
Publication date: 01/01/1997

The increasing availability of parallel computers is having a very significant impact on all aspects of scientific computation, including algorithm research and software development in numerical linear algebra. In particular, the solution of linear systems, which lies at the heart of most calculations in scientific computing, is an important computation found in many engineering and scientific applications. In this thesis, well-known parallel algorithms for the solution of linear systems are compared with implicit parallel algorithms, the Quadrant Interlocking (QI) class of algorithms, to solve linear systems. These implicit algorithms are (2x2) block algorithms expressed in explicit point form notation. [Continues.]

Loughborough University Institutional Repository

Performance Analysis of Hardware/Software Co-Design of Matrix Solvers

Author: Huang Peng
Publication venue: University of Saskatchewan Library
Publication date: 01/01/2008

Solving systems of linear and nonlinear equations lies at the heart of many scientific and engineering applications such as circuit simulation, applications in electric power networks, and structural analysis. The exponentially increasing complexity of these computing applications and the high cost of supercomputing force us to explore affordable high-performance computing platforms. The ultimate goal of this research is to develop hardware-friendly parallel processing algorithms and build cost-effective, high-performance parallel systems using hardware in order to enable the solution of large linear systems. In this thesis, FPGA-based general hardware architectures of selected iterative methods and direct methods are discussed. Xilinx Embedded Development Kit (EDK) hardware/software (HW/SW) codesigns of these methods are also presented. For iterative methods, FPGA-based hardware architectures of Jacobi, combined Jacobi and Gauss-Seidel, and conjugate gradient (CG) are proposed. The convergence analysis of the LNS-based Jacobi processor demonstrates to what extent the hardware resource constraints and additional conversion error affect the convergence of the Jacobi iterative method. Matlab simulations were performed to compare the performance of the three iterative methods in three ways, i.e., number of iterations for any given tolerance, number of iterations for different matrix sizes, and computation time for different matrix sizes. The simulation results indicate that the key to a fast implementation of the three methods is a fast implementation of matrix multiplication. The simulation results also show that the CG method takes fewer iterations for any given tolerance, but more computation time as matrix size increases compared to the other two methods, since matrix-vector multiplication is a more dominant factor in the CG method than in the other two methods.
By implementing the matrix multiplications of the three methods in hardware with Xilinx EDK HW/SW codesign, the performance is significantly improved over a pure-software PowerPC (PPC) based implementation. The EDK implementation results show that CG takes less computation time for any size of matrix compared to the other two methods in HW/SW codesign, due to the fact that matrix multiplications dominate the computation time of all three methods while CG requires fewer iterations to converge than the other two methods. For direct methods, an FPGA-based general hardware architecture and a Xilinx EDK HW/SW codesign of WZ factorization are presented. Single-unit and scalable hardware architectures of WZ factorization are proposed and analyzed under different constraints. The results of Matlab simulations show that WZ runs faster than LU on parallel processors but slower on a single processor. The simulation results also indicate that the most time-consuming part of WZ factorization is the matrix update. By implementing the matrix update of WZ factorization in hardware with Xilinx EDK HW/SW codesign, the performance is also apparently improved over the PPC-based pure-software implementation.

eCommons@USASK

University of Saskatchewan Research Archive

Scoring functions for protein docking and drug design

Author: Viswanath Shruthi
Publication date: 26/06/2014

Predicting the structure of complexes formed by two interacting proteins is an important problem in computational structural biology. Proteins perform many of their functions by binding to other proteins. The structure of protein-protein complexes provides atomic details about protein function and biochemical pathways, and can help in designing drugs that inhibit binding. Docking computationally models the structure of protein-protein complexes, given three-dimensional structures of the individual chains. Protein docking methods have two phases. In the first phase, a comprehensive, coarse search is performed for optimally docked models. In the second refinement and reranking phase, the models from the first phase are refined and reranked, with the expectation of extracting a small set of accurate models from the pool of thousands of models obtained from the first phase. In this thesis, new algorithms are developed for the refinement and reranking phase of docking. New scoring functions, or potentials, that rank models are developed. These potentials are learnt using large-scale machine learning methods based on mathematical programming. The procedure for learning these potentials involves examining hundreds of thousands of correct and incorrect models. In this thesis, hierarchical constraints were introduced into the learning algorithm. First, an atomic potential was developed using this learning procedure. A refinement procedure involving side-chain remodeling and conjugate gradient-based minimization was introduced. The refinement procedure combined with the atomic potential was shown to improve docking accuracy significantly. Second, a hydrogen bond potential was developed. Molecular dynamics-based sampling combined with the hydrogen bond potential improved docking predictions. Third, mathematical programming compared favorably to SVMs and neural networks in terms of accuracy, training time, and test time for the task of designing potentials to rank docking models.
The methods described in this thesis are implemented in the docking package DOCK/PIERR. DOCK/PIERR was shown to be among the best automated docking methods in community wide assessments. Finally, DOCK/PIERR was extended to predict membrane protein complexes. A membrane-based score was added to the reranking phase, and shown to improve the accuracy of docking. This docking algorithm for membrane proteins was used to study the dimers of amyloid precursor protein, implicated in Alzheimer's disease.

Computer Science

Texas ScholarWorks

Calculation of three-dimensional compressible laminar and turbulent boundary layers. Calculation of three-dimensional compressible boundary layers on arbitrary wings

Author: Cebeci T.; Kaups K.; Moser A.; Ramsey J.

A very general method for calculating compressible three-dimensional laminar and turbulent boundary layers on arbitrary wings is described. The method utilizes a nonorthogonal coordinate system for the boundary-layer calculations and includes a geometry package that represents the wing analytically. In the calculations all the geometric parameters of the coordinate system are accounted for. The Reynolds shear-stress terms are modeled by an eddy-viscosity formulation developed by Cebeci. The governing equations are solved by a very efficient two-point finite-difference method used earlier by Keller and Cebeci for two-dimensional flows and later by Cebeci for three-dimensional flows.

NASA Technical Reports Server

Computational methods and software systems for dynamics and control of large space structures

Author: Farhat C.; Felippa C. A.; Park K. C.; Pramono E.

Two key areas of crucial importance to the computer-based simulation of large space structures are discussed. The first area involves multibody dynamics (MBD) of flexible space structures, with applications directed to deployment, construction, and maneuvering. The second area deals with advanced software systems, with emphasis on parallel processing. The latest research thrust in the second area involves massively parallel computers.

NASA Technical Reports Server

Analysis and performance of the gas-lubricated tilting pad thrust bearing Interim report

Author: Colsher R.; Shapiro W.

Optimal design and performance criteria for gas-lubricated tilting pad thrust bearing.

NASA Technical Reports Server

Sound propagation in a duct of periodic wall structure

Author: Kurze U.

A boundary condition, which accounts for the coupling in the sections behind the duct boundary, is given for the sound-absorbing duct with a periodic structure of the wall lining and regular partition walls. The sound field in the duct is suitably described by the method of differences. For locally active walls this renders an explicit approximate solution for the propagation constant. Coupling may be accounted for by the method of differences in a clear manner. Numerical results agree with measurements and yield information which has technical applications.

NASA Technical Reports Server

The Identification of Unbalance in a Nonlinear Squeeze-Film Damped System using an Inverse method - a Computational and Experimental study

Author: Torres Cedillo Sergio
Publication date: 31/12/2015

The University of Manchester - Institutional Repository

Institute for Computational Mechanics in Propulsion (ICOMP) fourth annual review, 1989

The Institute for Computational Mechanics in Propulsion (ICOMP) is operated jointly by Case Western Reserve University and the NASA Lewis Research Center. The purpose of ICOMP is to develop techniques to improve problem-solving capabilities in all aspects of computational mechanics related to propulsion. The activities at ICOMP during 1989 are described.

NASA Technical Reports Server