
    Stable Sparse Orthogonal Factorization of Ill-Conditioned Banded Matrices for Parallel Computing

    Sequential and parallel algorithms based on the LU or QR factorization have been intensively studied and widely used for computations with large-scale ill-conditioned banded matrices. Major concerns with existing methods include ill-conditioning, sparsity of the factor matrices, computational complexity, and scalability. In this dissertation, we study a sparse orthogonal factorization of a banded matrix motivated by parallel computing. Specifically, we develop a process that factorizes a banded matrix as the product of a sparse orthogonal matrix and a sparse matrix that can be transformed into an upper triangular matrix by column permutations. We prove that the proposed process has low complexity and is numerically stable, with stability results similar to those of the modified Gram-Schmidt process. On this basis, we develop a parallel algorithm for the factorization in a distributed computing environment. Through an analysis of its performance, we show that the communication costs reach the theoretical least upper bounds, while the parallel complexity, or speedup, approaches the optimal bound. For an ill-conditioned banded system, we construct a sequential solver that breaks the system down into small-scale underdetermined systems, which are solved by the proposed factorization with high accuracy. We also implement a parallel solver with strategies to handle the memory issues that arise in extremely large linear systems of size over one billion. Numerical experiments confirm the theoretical results derived in this thesis and demonstrate the superior accuracy and scalability of the proposed solvers for ill-conditioned linear systems compared to the most commonly used direct solvers.
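
    In symbols (the letters below are illustrative, not notation taken from the thesis), the factorization described above has the form

        A = Q S, \qquad Q^{\mathsf{T}} Q = I, \qquad S P = R,

    where A is the banded matrix, Q is the sparse orthogonal factor, S is the sparse factor, P is a column permutation matrix, and R is upper triangular, so that A P = Q R is a column-permuted QR factorization with sparse factors.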

    BSPlib Java Interface for Parallel Scientific Computing Applications

    This work presents a Java interface to a native BSPlib library for implementing parallel algorithms in a structured way, as described by the Bulk Synchronous Parallel (BSP) model, using the Java programming language. To compare the created library with existing parallel programming solutions, a typical physics simulation application is created: heat conduction in solid materials, simulated by solving systems of linear algebraic equations. The application employs the parallel conjugate gradient method, a common scientific computing algorithm that is highly communication-intensive and therefore challenging from a parallelization standpoint. Using the results of running this test application, the Java BSPlib interface is compared with various MPI (Message Passing Interface) implementations in terms of efficiency and scalability.
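
    For reference, the following is a minimal NumPy sketch of the (sequential) conjugate gradient iteration for a symmetric positive definite system A x = b, the algorithm the test application above parallelizes; all names and parameters here are illustrative and are not taken from the created Java BSPlib interface.

        import numpy as np

        def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
            """Unpreconditioned CG for a symmetric positive definite system A x = b."""
            x = np.zeros_like(b)
            r = b - A @ x                      # initial residual
            p = r.copy()                       # initial search direction
            rs_old = r @ r
            for _ in range(max_iter):
                Ap = A @ p                     # matrix-vector product: the communication-heavy step in parallel
                alpha = rs_old / (p @ Ap)      # step length along p
                x += alpha * p
                r -= alpha * Ap
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:      # converged
                    break
                p = r + (rs_new / rs_old) * p  # next conjugate search direction
                rs_old = rs_new
            return x

        # Small illustrative system
        A = np.array([[4.0, 1.0], [1.0, 3.0]])
        b = np.array([1.0, 2.0])
        print(conjugate_gradient(A, b))

    In a BSP formulation, each iteration typically maps onto supersteps: local matrix-vector and vector updates, followed by communication and a global synchronization for the inner products.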

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial-boundary value problems. The intent is to point out attractive methods, as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
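
    As one concrete instance of the explicit methods for initial-boundary value problems mentioned above, here is a minimal Python sketch of the forward-time, centred-space (FTCS) update for the one-dimensional heat equation u_t = kappa * u_xx; grid sizes and parameter names are illustrative. Each grid point is updated independently from the previous time level, which is what makes such schemes attractive on vector and parallel architectures.

        import numpy as np

        nx, nt = 101, 500                 # grid points and time steps (illustrative)
        kappa = 1.0
        dx = 1.0 / (nx - 1)
        dt = 0.4 * dx**2 / kappa          # respects the explicit stability limit dt <= dx**2 / (2 * kappa)

        x = np.linspace(0.0, 1.0, nx)
        u = np.exp(-100.0 * (x - 0.5) ** 2)   # initial condition: narrow Gaussian pulse

        for _ in range(nt):
            u[1:-1] += kappa * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])
            u[0] = u[-1] = 0.0            # homogeneous Dirichlet boundary conditions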

    Teadusarvutuse algoritmide taandamine hajusarvutuse raamistikele

    Scientific computing uses computers and algorithms to solve problems in various sciences such as genetics, biology, and chemistry. Often the goal is to model and simulate natural phenomena that would otherwise be very difficult to study in real environments. For example, it is possible to create a model of a solar storm or a meteor impact and run computer simulations to assess the impact of the disaster on the environment. The more sophisticated and accurate the simulations are, the more computing power is required. It is often necessary to use a large number of computers, all working simultaneously on a single problem. These kinds of computations are called parallel or distributed computing. However, creating distributed computing programs is complicated and requires much more time and resources, because the work done simultaneously on different computers must be synchronized. A number of software frameworks have been created to simplify this process by automating part of the distributed programming. The goal of this research was to assess the suitability of such distributed computing frameworks for complex scientific computing algorithms. The results showed that the existing frameworks are very different from each other and that none of them is suitable for all types of algorithms. Some frameworks are suitable only for simpler algorithms; others are not suitable when the data does not fit into the computers' memory. Choosing the most appropriate distributed computing framework for an algorithm can be a very complex task, because it requires studying and applying the existing frameworks. As a solution to this problem, a Dynamic Algorithms Modelling Application (DAMA) was created, which is able to simulate the implementation of an algorithm in different distributed computing frameworks. DAMA helps to estimate which distributed framework is the most appropriate for a given algorithm, without actually implementing it in any of the available frameworks. The main contribution of this study is simplifying the adoption of distributed computing frameworks for researchers who are not yet familiar with them, which should save significant time and resources, since it is no longer necessary to study and apply each of the available frameworks in detail.

    Research in computerized structural analysis and synthesis

    Computer applications in dynamic structural analysis and structural design modeling are discussed.

    Parallel algorithms for the solution of elliptic and parabolic problems on transputer networks

    This thesis is a study of the implementation of parallel algorithms for solving elliptic and parabolic partial differential equations on a network of transputers. The thesis commences with a general introduction to parallel processing, discussing the various ways of introducing parallelism into computer systems and the classification of parallel architectures. In chapter 2, the transputer architecture and the associated language OCCAM are described, together with the transputer development system (TDS) and a short account of other transputer programming languages. A brief description of the methodologies for programming transputer networks is also given. The chapter concludes with a detailed description of the hardware used for the research. [Continues.]

    The Investigation of Efficiency of Physical Phenomena Modelling Using Differential Equations on Distributed Systems

    This work is dedicated to the development of mathematical modelling software. In this dissertation, numerical methods and algorithms are investigated in the context of software construction. When applying a numerical method it is important to take into account the limited computer resources, the architecture of those resources, and how the methods affect software robustness. The three main requirements of this investigation are that a software implementation must be efficient, must be robust, and must be able to utilize specific hardware resources. The hardware specificity in this work is related to distributed computations of different types: a single CPU with multiple cores, multiple CPUs with multiple cores, and highly parallel multithreaded GPU devices. The investigation proceeds in three directions: GPU usage for 3D FDTD calculations, use of the FVM to implement efficient calculations for a very specific heat transfer problem, and development of special techniques for software for a specific bacterial self-organization problem in which the results are sensitive to the numerical methods, the initial data, and even computer round-off errors. All these directions are dedicated to creating correct technological components that make a software implementation robust and efficient. A time prediction model for 3D FDTD calculations is proposed, which makes it possible to evaluate the efficiency of different GPUs. A reasonable speedup with the GPU compared to the CPU is obtained. For the FVM implementation, the OpenFOAM open-source software is selected as the basis for the calculations, and several algorithms and their modifications are proposed to address efficiency issues. The FVM parallel solver is implemented, analyzed, and adapted to the heterogeneous cluster Vilkas. To create robust software for the simulation of bacterial self-organization, mathematically robust methods are applied and the results are analyzed; the algorithm is modified for parallel computations.
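
    For context, the following is a minimal Python sketch of a one-dimensional FDTD update in normalized units (Courant number 1), the kind of staggered-grid stencil computation that the 3D FDTD calculations above generalize; all field and parameter names are illustrative.

        import numpy as np

        n_cells, n_steps = 200, 400
        ez = np.zeros(n_cells)            # electric field samples
        hy = np.zeros(n_cells - 1)        # magnetic field samples, staggered half a cell

        for step in range(n_steps):
            hy += ez[1:] - ez[:-1]                    # update H from the spatial difference of E
            ez[1:-1] += hy[1:] - hy[:-1]              # update E from the spatial difference of H
            ez[n_cells // 2] += np.exp(-((step - 30.0) / 10.0) ** 2)   # soft Gaussian source

    Each cell update depends only on its immediate neighbours, so the grid can be partitioned across GPU threads or cluster nodes with only boundary data exchanged between partitions.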