317 research outputs found

    A Novel Partitioning Method for Accelerating the Block Cimmino Algorithm

    We propose a novel block-row partitioning method to improve the convergence rate of the block Cimmino algorithm for solving general sparse linear systems of equations. The convergence rate of the block Cimmino algorithm depends on the orthogonality among the block rows obtained by the partitioning method. The proposed method takes numerical orthogonality among block rows into account through a row inner-product graph model of the coefficient matrix. In the graph partitioning formulation defined on this graph model, the partitioning objective of minimizing the cutsize directly corresponds to minimizing the sum of inter-block inner products between block rows, thus improving the eigenvalue spectrum of the iteration matrix. This in turn leads to a significant reduction in the number of iterations required for convergence. Extensive experiments conducted on a large set of matrices confirm the validity of the proposed method against a state-of-the-art method.
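
    The correspondence between cutsize and inter-block inner products can be illustrated with a minimal sketch. It assumes edge weights equal to the absolute inner product of two rows, which is one plausible reading of the abstract and not necessarily the authors' exact weighting. The matrix, the two candidate partitions, and the function names are hypothetical.

```python
from itertools import combinations


def row_inner_product_graph(rows):
    """Map each row pair (i, j), i < j, to |<row_i, row_j>| when it is nonzero."""
    edges = {}
    for i, j in combinations(range(len(rows)), 2):
        weight = abs(sum(x * y for x, y in zip(rows[i], rows[j])))
        if weight > 0:
            edges[(i, j)] = weight
    return edges


def cutsize(edges, part):
    """Total weight of edges whose two rows are assigned to different blocks."""
    return sum(w for (i, j), w in edges.items() if part[i] != part[j])


# Hypothetical 4x4 example matrix (dense lists for brevity).
A = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
]
edges = row_inner_product_graph(A)

good = [0, 0, 1, 1]  # rows 0-1 in block 0, rows 2-3 in block 1
bad = [0, 1, 0, 1]   # splits both strongly coupled row pairs
print(cutsize(edges, good), cutsize(edges, bad))
```

    On this toy matrix the first partition cuts a total weight of 3 and the second a total weight of 10, so the first leaves the two block rows closer to mutually orthogonal. A real implementation would hand the weighted graph to a graph partitioner instead of comparing partitions by hand.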

    On the Easy Use of Scientific Computing Services for Large Scale Linear Algebra and Parallel Decision Making with the P-Grade Portal

    Scientific research is becoming increasingly dependent on the large-scale analysis of data using distributed computing infrastructures (Grid, cloud, GPU, etc.). Scientific computing (Petitet et al. 1999) aims at constructing mathematical models and numerical solution techniques for solving problems arising in science and engineering. In this paper, we describe the services of an integrated portal based on the P-Grade (Parallel Grid Run-time and Application Development Environment) portal (http://www.p-grade.hu) that enables the solution of large-scale linear systems of equations using direct solvers, simplifies the use of parallel block iterative algorithms, and provides an interface for parallel decision-making algorithms. The ultimate goal is to develop a single sign-on, integrated multi-service environment that provides easy access to different kinds of mathematical calculations and algorithms to be performed on hybrid distributed computing infrastructures, combining the benefits of large clusters, Grid, or cloud when needed.

    Méthodes hybrides pour la résolution de grands systèmes linéaires creux sur calculateurs parallèles (Hybrid methods for solving large sparse linear systems on parallel computers)

    We are interested in solving large sparse systems of linear equations in parallel. Computing the solution of such systems requires a large amount of memory and computational power. The two main ways to obtain the solution are direct and iterative approaches. The former is fast and accurate but has a large memory footprint, while the latter is memory friendly but can be slow to converge. In this work we first combine both approaches to create a hybrid solver that is memory efficient while being fast and robust. We then discuss a novel approach that yields a pseudo-direct solver and compensates for some drawbacks of the earlier approach. In the first chapters we look at row projection techniques, especially the block Cimmino method, and examine some of their numerical aspects and how they affect convergence. We then discuss the acceleration of convergence using conjugate gradients and show that a block version improves the convergence. Next, we see how partitioning the linear system affects the convergence and show how to improve its quality. We finish by discussing the parallel implementation of the hybrid solver, its performance, and how it can be improved. The last two chapters focus on an improvement to this hybrid solver: we improve the numerical properties of the linear system so that the method converges in a single iteration, which results in a pseudo-direct solver. We first discuss the numerical properties of the new system, then how it works in parallel and how it performs versus the iterative version and versus a direct solver. We finally consider some possible improvements to the pseudo-direct solver. This work led to the implementation of a hybrid solver, the "ABCD solver" (Augmented Block Cimmino Distributed solver), which can work either in a fully iterative mode or in a pseudo-direct mode.
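
    As a rough illustration of the row projection idea behind block Cimmino, the sketch below implements the special case in which every block holds a single row, so each block projection reduces to a scaled copy of that row. It is not the ABCD solver's implementation: the function name, the relaxation parameter omega, the fixed iteration count, and the 2x2 system are all assumptions made for this example, and no conjugate gradient acceleration is applied.

```python
def cimmino(rows, b, omega=1.0, iterations=200):
    """Cimmino iteration with one row per block (an illustrative special case)."""
    n = len(rows[0])
    x = [0.0] * n
    for _ in range(iterations):
        update = [0.0] * n
        # Each row contributes a correction that projects the current iterate
        # onto its hyperplane. All corrections use the same iterate, so they
        # could be computed independently (for example, in parallel).
        for a, beta in zip(rows, b):
            residual = beta - sum(ai * xi for ai, xi in zip(a, x))
            scale = residual / sum(ai * ai for ai in a)
            for k in range(n):
                update[k] += scale * a[k]
        # Sum the corrections and apply them with relaxation factor omega.
        x = [xi + omega * ui for xi, ui in zip(x, update)]
    return x


# Hypothetical consistent system whose exact solution is [1, 2].
A = [[2.0, 1.0], [1.0, 3.0]]
b = [4.0, 7.0]
x = cimmino(A, b)
print(x)
```

    The closer the rows (or block rows) are to mutually orthogonal, the faster this plain iteration converges, which is why the partitioning of the system matters for convergence.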

    Applications in GNSS water vapor tomography

    Algebraic reconstruction algorithms are iterative algorithms that are used in many areas, including medicine, seismology, and meteorology. These algorithms are known to be highly computationally intensive. This may be especially troublesome for real-time applications or when they are run on conventional low-cost personal computers. One of these real-time applications is the reconstruction of water vapor images from Global Navigation Satellite System (GNSS) observations. The parallelization of algebraic reconstruction algorithms has the potential to significantly reduce the required resources, making it possible to obtain valid solutions in time to be used for nowcasting and forecasting weather models. The main objective of this dissertation was to present and analyse diverse shared-memory libraries and techniques on CPU and GPU for algebraic reconstruction algorithms. It was concluded that parallelization pays off compared with sequential implementations. Overall, the GPU implementations were found to be only slightly faster than the CPU implementations, depending on the size of the problem being studied. A secondary objective was to develop software to perform the GNSS water vapor reconstruction using the implemented parallel algorithms. This software was developed successfully and tested with both synthetic and real data; the preliminary results were satisfactory. This dissertation was written in the Space & Earth Geodetic Analysis Laboratory (SEGAL) and was carried out in the framework of the Structure of Moist convection in high-resolution GNSS observations and models (SMOG) (PTDC/CTE-ATM/119922/2010) project funded by FCT (Fundação para a Ciência e a Tecnologia).
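
    For readers unfamiliar with algebraic reconstruction, the following is a minimal sequential sketch of the classic Kaczmarz update that underlies the algebraic reconstruction technique (ART). It is not taken from the dissertation: the function name, the relaxation parameter, the fixed number of sweeps, and the tiny system (three hypothetical measurements of two unknowns) are assumptions chosen for illustration.

```python
def kaczmarz(rows, b, relaxation=1.0, sweeps=100):
    """Cyclic Kaczmarz (ART) iteration: project onto one equation at a time."""
    n = len(rows[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for a, beta in zip(rows, b):
            # Move x toward the hyperplane a . x = beta along the direction a.
            residual = beta - sum(ai * xi for ai, xi in zip(a, x))
            step = relaxation * residual / sum(ai * ai for ai in a)
            x = [xi + step * ai for xi, ai in zip(x, a)]
    return x


# Hypothetical consistent system: three measurements of two unknowns,
# constructed so that the exact solution is [3, 5].
A = [[1.0, 2.0], [3.0, 1.0], [1.0, 1.0]]
b = [13.0, 14.0, 8.0]
x = kaczmarz(A, b)
print(x)
```

    Each update depends on the result of the previous one, so this form is inherently sequential. Parallel variants typically process groups of equations simultaneously and combine the resulting corrections.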

    Improved analysis of algorithms based on supporting halfspaces and quadratic programming for the convex intersection and feasibility problems

    This paper improves, in several directions, the algorithms based on supporting halfspaces and quadratic programming for convex set intersection problems from our earlier paper. First, we give conditions under which much smaller quadratic programs (QPs), and approximate projections arising from partially solving the QPs, are sufficient for multiple-term superlinear convergence for nonsmooth problems. Second, we identify an additional regularity condition, which we call the second order supporting hyperplane property (SOSH), that gives multiple-term quadratic convergence. Third, we show that these fast convergence results carry over to the convex inequality problem. Fourth, we show that infeasibility can be detected in finitely many operations. Lastly, we explain how the dual active set QP algorithm of Goldfarb and Idnani can be used to get useful iterates by solving the QPs partially, overcoming the problem of solving large QPs in our algorithms. Comment: 27 pages, 2 figures

    Hybrid direct and iterative solvers for sparse indefinite and overdetermined systems on future exascale architectures

    In scientific computing, the numerical simulation of systems is crucial to get a deep understanding of the physics underlying real-world applications. The models used in simulation are often based on partial differential equations (PDEs) which, after fine discretisation, give rise to huge sparse systems of equations to solve. Historically, two classes of methods were designed for the solution of such systems: direct methods, robust but expensive in both computation and memory; and iterative methods, cheap but with very problem-dependent convergence properties. In the context of high performance computing, hybrid direct-iterative methods were then introduced in order to combine the advantages of both methods, while making efficient use of the increasingly large and fast supercomputing facilities. In this thesis, we focus on the latter type of methods with two complementary research axes. In the first chapter, we detail the mechanisms behind the efficient implementation of multigrid methods. These methods make use of several levels of increasingly refined grids to solve linear systems with a combination of fine grid smoothing and coarse grid corrections. The efficient parallel implementation of such a scheme is a difficult task. We focus on the solution of the problem on the coarse grid, whose scalability is often observed to be limiting at very large scales. We propose an agglomeration technique to gather the data of the coarse grid problem on a subset of the computing resources in order to minimise the execution time of a direct solver. Combined with a relaxation of the solution accuracy, we demonstrate an increased overall scalability of the multigrid scheme when using our approach compared to classical iterative methods, when the problem is numerically difficult. At extreme scale, this study is carried out in the HHG framework (Hierarchical Hybrid Grids) for the solution of a Stokes problem with jumping coefficients, inspired by Earth's mantle convection simulation. The direct solver used on the coarse grid is MUMPS, combined with block low-rank approximation and single precision arithmetic. In the following chapters, we study some hybrid methods derived from the classical row-projection method block Cimmino, and interpreted as domain decomposition methods. These methods are based on the partitioning of the matrix into blocks of rows. Due to its known slow convergence, the original iterative scheme is accelerated with a stabilised block version of the conjugate gradient algorithm. While an optimal choice of block size improves the efficiency of this approach, the convergence stays problem dependent. An alternative solution is then introduced which enforces convergence in one iteration by embedding the linear system into a carefully augmented space. These two approaches are extended in order to compute the minimum norm solution of indefinite systems and the solution of least-squares problems. The latter problems require a partitioning in blocks of columns. We show how to improve the numerical properties of the iterative and pseudo-direct methods with scaling, partitioning and better augmentation methods. Both methods are implemented in the parallel solver ABCD-Solver (Augmented Block Cimmino Distributed solver), whose parallelisation we improve through a combination of load balancing and communication minimising techniques. Finally, for the solution of discretised PDE problems, we propose a new approach which augments the linear system using a coarse representation of the space. The size of the augmentation is controlled by the choice of a more or less refined mesh. We obtain an iterative method with fast linear convergence demonstrated on Helmholtz and Convection-Diffusion problems. The central point of the approach is the iterative construction and solution of a Schur complement …