
    Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods

    We consider the solution of sparse linear systems using direct methods via LU factorization. Unless the matrix is positive definite, numerical pivoting is usually needed to ensure stability, which is costly to implement, especially in the sparse case. The Random Butterfly Transformations (RBT) technique provides an alternative to pivoting and is easily parallelizable. The RBT transforms the original matrix into another one that can be factorized without pivoting with probability one. This approach has been successful for dense matrices; in this work, we investigate the sparse case. In particular, we address the issue of fill-in in the transformed system. (Also appeared as LAPACK Working Note 285.)
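
    A minimal NumPy sketch of the idea described above, for the dense case and a single recursion level: two random butterfly matrices U and V are applied as U^T A V, the transformed matrix is factorized by plain LU without pivoting, and the solution of the original system is then recovered. The exp(r/10) scaling of the random diagonals, the depth-1 butterflies, and the helper names are illustrative assumptions, not the paper's implementation (which targets the sparse case and its fill-in).

import numpy as np

def butterfly(n, rng):
    """n-by-n depth-1 butterfly B = 1/sqrt(2) * [[R0, R1], [R0, -R1]] with random diagonals."""
    assert n % 2 == 0, "this sketch assumes n is even"
    h = n // 2
    r0 = np.exp(rng.uniform(-0.5, 0.5, h) / 10.0)  # random diagonal R0 (scaling is an assumption)
    r1 = np.exp(rng.uniform(-0.5, 0.5, h) / 10.0)  # random diagonal R1
    B = np.zeros((n, n))
    B[:h, :h], B[:h, h:] = np.diag(r0), np.diag(r1)
    B[h:, :h], B[h:, h:] = np.diag(r0), -np.diag(r1)
    return B / np.sqrt(2.0)

def lu_nopivot(A):
    """Plain Doolittle LU with no pivoting; returns L (unit lower) and U (upper)."""
    A = A.copy()
    n = A.shape[0]
    for k in range(n - 1):
        A[k + 1:, k] /= A[k, k]                              # multipliers below the pivot
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])
    return np.tril(A, -1) + np.eye(n), np.triu(A)

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

U, V = butterfly(n, rng), butterfly(n, rng)
Ar = U.T @ A @ V                    # transformed matrix, factorized without pivoting
L, R = lu_nopivot(Ar)
y = np.linalg.solve(L, U.T @ b)     # forward substitution (dense solve kept short here)
z = np.linalg.solve(R, y)           # back substitution
x = V @ z                           # recover the solution of the original system A x = b
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))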

    A Randomized Algorithm for the Gaussian Elimination Method (Ένας τυχαιοποιημένος αλγόριθμος για την μέθοδο απαλοιφής του Gauss)

    This Master's thesis modifies the Gaussian elimination method using randomness so that it is suitable for high-performance parallel computers. Gaussian elimination is one of the best-known methods for solving linear systems, but it cannot be applied on its own: to produce accurate results, pivoting (e.g., complete or partial) is also required. On parallel architectures, pivoting introduces not only additional computational cost but also significant overhead from the communication required among the processors. The method described in this thesis, the Random Butterfly Transformation (RBT), transforms, with probability close to 1, a linear system into a form in which pivoting is not required. This is achieved by multiplying the matrices of the original linear system by suitably random recursive butterfly matrices. The thesis also presents the effect of the transformation on the condition number of the matrices and discusses tuning choices such as the range of the random numbers and the recursion depth. Implementation issues are then examined, including how to store the recursive butterfly matrices efficiently, along with the main elements of a parallel implementation of RBT on GPUs. Finally, to verify the performance of the method experimentally, test classes of matrices from the literature are constructed, the RBT method is applied to them, and the resulting numerical results are reported, together with comments on numerical accuracy and overall performance.
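
    The storage question mentioned in the abstract can be illustrated with a small sketch: a recursive butterfly matrix of depth d never needs to be formed explicitly, since keeping only its random diagonals (n values per level) lets W @ v be applied in O(n*d) operations. The packing convention, the application order (full-size butterfly as the rightmost factor), and the helper names below are assumptions for illustration, not the thesis code.

import numpy as np

def pack_recursive_butterfly(n, depth, rng):
    """(depth, n) array; row k holds the random diagonals of the level with 2**k butterfly blocks."""
    return np.exp(rng.uniform(-0.5, 0.5, (depth, n)) / 10.0)

def apply_level(diag, v, nblocks):
    """Apply one block-diagonal butterfly level (nblocks blocks of size n/nblocks) to v."""
    n = v.size
    m = n // nblocks
    out = np.empty_like(v)
    for blk in range(nblocks):
        s, h = blk * m, m // 2
        r0, r1 = diag[s:s + h], diag[s + h:s + m]         # R0 and R1 for this block
        v1, v2 = v[s:s + h], v[s + h:s + m]
        out[s:s + h]     = (r0 * v1 + r1 * v2) / np.sqrt(2.0)
        out[s + h:s + m] = (r0 * v1 - r1 * v2) / np.sqrt(2.0)
    return out

def apply_recursive_butterfly(packed, v):
    """Compute W @ v; the full-size butterfly (level 0) is the rightmost factor, applied first."""
    for k in range(packed.shape[0]):
        v = apply_level(packed[k], v, nblocks=2 ** k)
    return v

rng = np.random.default_rng(1)
n, depth = 16, 2                              # depth 2 is a common practical choice
W = pack_recursive_butterfly(n, depth, rng)   # n*depth stored values instead of n*n
v = rng.standard_normal(n)
print(apply_recursive_butterfly(W, v))        # O(n*depth) work per application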

    Resilience for Asynchronous Iterative Methods for Sparse Linear Systems

    Large-scale simulations are used in a variety of application areas in science and engineering to help advance innovation. Many of them spend the vast majority of their computational time solving large systems of linear equations, typically arising from discretizations of the partial differential equations used to model various phenomena mathematically. The algorithms used to solve these problems are typically iterative, and making efficient use of computational time on High Performance Computing (HPC) clusters means constantly improving these iterative algorithms. Future HPC platforms are expected to face three main problem areas: scalability of code, reliability of hardware, and energy efficiency of the platform. The HPC resources expected to run these large programs are planned to consist of billions of processing units, drawn from traditional multicore processors as well as a variety of hardware accelerators, and this growth in parallelism gives rise to all three problems. Previous work on algorithm development has focused primarily on fault tolerance mechanisms for traditional iterative solvers. Recent work has begun to revisit asynchronous methods for solving large-scale applications, and this dissertation presents research into fault tolerance for fine-grained methods that are asynchronous in nature. Classical convergence results for asynchronous methods are revisited and modified to account for the possible occurrence of a fault, and a variety of techniques for recovering from the effects of a fault are proposed. Examples of how these techniques can be used are shown for various algorithms, including an analysis of a fine-grained algorithm for computing incomplete factorizations. Lastly, modeling and simulation tools for the further construction of iterative algorithms for HPC applications are developed, including numerical models for simulating faults and a simulation framework that can be used to extrapolate the performance of algorithms toward future HPC systems.
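
    As a toy illustration of the fault-recovery theme (not the dissertation's algorithms), the sketch below runs a Jacobi-style fixed-point iteration, injects a single corrupted component to simulate a transient fault, and recovers by resetting any wildly out-of-range entry so the self-correcting iteration can continue. The fault model, detection threshold, test matrix, and helper names are illustrative assumptions.

import numpy as np

def jacobi_with_fault(A, b, sweeps=200, fault_sweep=10, fault_index=3, tol=1e-10):
    """Jacobi iteration with one injected soft error and a reset-style recovery."""
    D = np.diag(A)
    x = np.zeros_like(b)
    limit = 1e6 * (1.0 + np.abs(b).max() / np.abs(D).min())  # crude detection threshold
    for k in range(sweeps):
        x_new = (b - A @ x + D * x) / D                      # one Jacobi sweep
        if k == fault_sweep:
            x_new[fault_index] = 1e12                        # simulated transient fault
        x_new[np.abs(x_new) > limit] = 0.0                   # recovery: reset corrupted entries
        x = x_new
        res = np.linalg.norm(b - A @ x)
        if res < tol:
            break
    return x, k, res

rng = np.random.default_rng(2)
n = 50
A = rng.standard_normal((n, n))
A += np.diag(2.0 * np.abs(A).sum(axis=1))    # make A diagonally dominant so Jacobi converges
b = rng.standard_normal(n)
x, sweeps_used, res = jacobi_with_fault(A, b)
print(sweeps_used, res)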