Search CORE

6 research outputs found

Greedy Givens algorithms for computing the rank-k updating of the QR decomposition

Author: John Kontoghiorghes
Q Erricos
Publication venue
Publication date: 03/04/2020
Field of study

Abstract A Greedy Givens algorithm for computing the rank-1 updating of the QR decomposition is proposed. An exclusive-read exclusive-write parallel random access machine computational model is assumed. The complexity of the algorithms is calculated in two different ways. In the unlimited parallelism case a single time unit is required to apply a compound disjoint Givens rotation of any size. In the limited parallelism case all the disjoint Givens rotations can be applied simultaneously, but one time unit is required to apply a rotation to a two-element vector. The proposed Greedy algorithm requires approximately 5=8 the number of steps performed by the conventional sequential Givens rank-1 algorithm under unlimited parallelism. A parallel implementation of the sequential Givens algorithm outperforms the Greedy one under limited parallelism. An adaptation of the Greedy algorithm to compute the rank-k updating of the QR decomposition has been developed. This algorithm outperforms a recently reported parallel method for small k, but its efficiency decreases as k increases

CiteSeerX

Communication Optimization by Data Replication for Distributed Graph Algorithms

Author: Ribizel Tobias
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2019
Field of study

KITopen

Harnessing the power of GPUs for problems in real algebraic geometry

Author: Emeliyanenko Pavel
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/2012
Field of study

This thesis presents novel parallel algorithms to leverage the power of GPUs (Graphics Processing Units) for exact computations with polynomials having large integer coefficients. The significance of such computations, especially in real algebraic geometry, is hard to undermine. On massively-parallel architectures such as GPU, the degree of datalevel parallelism exposed by an algorithm is the main performance factor. We attain high efficiency through the use of structured matrix theory to assist the realization of relevant operations on polynomials on the graphics hardware. A detailed complexity analysis, assuming the PRAM model, also confirms that our approach achieves a substantially better parallel complexity in comparison to classical algorithms used for symbolic computations. Aside from the theoretical considerations, a large portion of this work is dedicated to the actual algorithm development and optimization techniques where we pay close attention to the specifics of the graphics hardware. As a byproduct of this work, we have developed high-throughput modular arithmetic which we expect to be useful for other GPU applications, in particular, open-key cryptography. We further discuss the algorithms for the solution of a system of polynomial equations, topology computation of algebraic curves and curve visualization which can profit to the full extent from the GPU acceleration. Extensive benchmarking on a real data demonstrates the superiority of our algorithms over several state-of-the-art approaches available to date. This thesis is written in English.Diese Arbeit beschäftigt sich mit neuen parallelen Algorithmen, die das Leistungspotenzial der Grafik-Prozessoren (GPUs) zur exakten Berechnungen mit ganzzahlige Polynomen nutzen. Solche symbolische Berechnungen sind von großer Bedeutung zur Lösung vieler Probleme aus der reellen algebraischen Geometrie. Für die effziente Implementierung eines Algorithmus auf massiv-parallelen Hardwarearchitekturen, wie z.B. GPU, ist vor allem auf eine hohe Datenparallelität zu achten. Unter Verwendung von Ergebnissen aus der strukturierten Matrix-Theorie konnten wir die entsprechenden Operationen mit Polynomen auf der Grafikkarte leicht übertragen. Außerdem zeigt eine Komplexitätanalyse im PRAM-Rechenmodell, dass die von uns entwickelten Verfahren eine deutlich bessere Komplexität aufweisen als dies für die klassischen Verfahren der Fall ist. Neben dem theoretischen Ergebnis liegt ein weiterer Schwerpunkt dieser Arbeit in der praktischen Implementierung der betrachteten Algorithmen, wobei wir auf der Besonderheiten der Grafikhardware achten. Im Rahmen dieser Arbeit haben wir hocheffiziente modulare Arithmetik entwickelt, von der wir erwarten, dass sie sich für andere GPU Anwendungen, insbesondere der Public-Key-Kryptographie, als nützlich erweisen wird. Darüber hinaus betrachten wir Algorithmen für die Lösung eines Systems von Polynomgleichungen, Topologie Berechnung der algebraischen Kurven und deren Visualisierung welche in vollem Umfang von der GPU-Leistung profitieren können. Zahlreiche Experimente belegen dass wir zur Zeit die beste Verfahren zur Verfügung stellen. Diese Dissertation ist in englischer Sprache verfasst

Universaar

MPG.PuRe

Acronym

Design and analysis of numerical algorithms for the solution of linear systems on parallel and distributed architectures

Author: Rosni Abdullah (7169939)
Publication venue
Publication date: 01/01/1997
Field of study

The increasing availability of parallel computers is having a very significant impact on all aspects of scientific computation, including algorithm research and software development in numerical linear algebra. In particular, the solution of linear systems, which lies at the heart of most calculations in scientific computing is an important computation found in many engineering and scientific applications. In this thesis, well-known parallel algorithms for the solution of linear systems are compared with implicit parallel algorithms or the Quadrant Interlocking (QI) class of algorithms to solve linear systems. These implicit algorithms are (2x2) block algorithms expressed in explicit point form notation. [Continues.

Loughborough University Institutional Repository

Optimal algorithms for parallel Givens factorization on a coarse grained PRAM

Author: Centre National de la Recherche Scientifique (CNRS) 69 - Lyon (France). Lab. de l'Informatique du Parallelisme
Cosnard Michel
Daoudi El Mostafa
Ecole Normale Superieure de Lyon 69 (France). Lab. de l'Informatique du Parallelisme
Institut Informatique et Mathematiques Appliquees de Grenoble (IMAG) 69 - Lyon (France). Lab. de l'Informatique du Parallelisme
Lyon-1 Univ. 69 (France). Lab. de l'Informatique du Parallelisme
Publication venue
Publication date: 01/01/1990
Field of study

SIGLEAvailable at INIST (FR), Document Supply Service, under shelf-number : RP 10293 / INIST-CNRS - Institut de l'Information Scientifique et TechniqueFRFranc

OpenGrey Repository

LIPIcs, Volume 244, ESA 2022, Complete Volume

Author: Chechik Shiri
Herman Grzegorz
Navarro Gonzalo
Rotenberg Eva
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 01/01/2022
Field of study

LIPIcs, Volume 244, ESA 2022, Complete Volum

Dagstuhl Research Online Publication Server