
967 research outputs found

Using a multifrontal sparse solver in a high performance, finite element code

Author: King Scott D.
Lucas Robert
Raefsky Arthur

We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix, and full-matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
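The effect of a fill-reducing reordering, as mentioned in the abstract above, can be illustrated with a small symbolic elimination. This is a minimal sketch, not the solver described in the paper: it uses a plain greedy minimum-degree rule rather than the Multiple Minimum Degree variant, and the function names `fill_in` and `minimum_degree_order` as well as the arrowhead test pattern are assumptions made for illustration.

```python
from itertools import combinations


def fill_in(adjacency, order):
    """Count fill edges created by symbolically eliminating vertices in `order`."""
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = 0
    for v in order:
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        # Eliminating v makes its remaining neighbours pairwise adjacent.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill += 1
    return fill


def minimum_degree_order(adjacency):
    """Greedy minimum-degree ordering (illustrative; not the MMD heuristic itself)."""
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    while adj:
        # Pick the vertex with the fewest remaining neighbours; ties go to the lowest label.
        v = min(adj, key=lambda x: (len(adj[x]), x))
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        for a, b in combinations(sorted(nbrs), 2):
            adj[a].add(b)
            adj[b].add(a)
        order.append(v)
    return order


# Hypothetical arrowhead pattern: vertex 0 is coupled to every other vertex.
n = 6
star = {0: set(range(1, n))}
star.update({i: {0} for i in range(1, n)})

natural = list(range(n))
md = minimum_degree_order(star)
print(fill_in(star, natural), fill_in(star, md))  # prints "10 0"
```

Eliminating the hub vertex first couples all of its neighbours, whereas deferring it produces no fill at all for this pattern; fewer fill entries mean fewer operations in the numeric factorization.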

NASA Technical Reports Server

Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

Author: Uçar Bora
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/08/2014

Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21st to 23rd July, 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. The three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program ''Investissements d'Avenir'' ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

The Reverse Cuthill-McKee Algorithm in Distributed-Memory

Author: Azad Ariful
Buluc Aydin
Jacquelin Mathias
Ng Esmond G.
Publication date: 01/01/2016

Ordering the vertices of a graph is key to minimizing fill-in and data structure size in sparse direct solvers, maximizing locality in iterative solvers, and improving performance in graph algorithms. Except for naturally parallelizable ordering methods such as nested dissection, many important ordering methods have not been efficiently mapped to distributed-memory architectures. In this paper, we present the first-ever distributed-memory implementation of the reverse Cuthill-McKee (RCM) algorithm for reducing the profile of a sparse matrix. Our parallelization uses a two-dimensional sparse matrix decomposition. We achieve high performance by decomposing the problem into a small number of primitives and utilizing optimized implementations of these primitives. Our implementation shows strong scaling up to 1024 cores for smaller matrices and up to 4096 cores for larger matrices.
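For orientation, the serial RCM idea behind the abstract above can be sketched in a few lines. This is a simplified, single-process illustration and not the distributed-memory algorithm of the paper: it starts each component from a lowest-degree vertex instead of searching for a pseudo-peripheral one, and it reports bandwidth rather than profile. The function names and the small path graph are assumptions.

```python
from collections import deque


def reverse_cuthill_mckee(adjacency):
    """Serial RCM sketch: BFS visiting neighbours by increasing degree, then reverse."""
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    visited = set()
    order = []
    # Handle every connected component, starting from its lowest-degree vertex.
    for start in sorted(adjacency, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for u in sorted(adjacency[v] - visited, key=lambda x: (degree[x], x)):
                visited.add(u)
                queue.append(u)
    return order[::-1]


def bandwidth(adjacency, order):
    """Largest distance between the positions of two adjacent vertices."""
    position = {v: i for i, v in enumerate(order)}
    return max(
        (abs(position[u] - position[v]) for v in adjacency for u in adjacency[v]),
        default=0,
    )


# Hypothetical example: the path 0-3-1-4-2 with scrambled vertex labels.
path = {0: {3}, 3: {0, 1}, 1: {3, 4}, 4: {1, 2}, 2: {4}}
rcm = reverse_cuthill_mckee(path)
print(bandwidth(path, [0, 1, 2, 3, 4]), bandwidth(path, rcm))  # prints "3 1"
```

The breadth-first traversal is the part that is hard to distribute, which is why the paper expresses it through a small set of sparse-matrix primitives.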

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

A Parallel Solver for Graph Laplacians

Author: Boman Erik G.
Brannick James
Kepner Jeremy
Napov Artem
Ruge John W.
Spielman Daniel A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2018

Problems from graph drawing, spectral clustering, network flow and graph partitioning can all be expressed in terms of graph Laplacian matrices. There are a variety of practical approaches to solving these problems in serial. However, as problem sizes increase and single-core speeds stagnate, parallelism is essential to solve such problems quickly. We present an unsmoothed aggregation multigrid method for solving graph Laplacians in a distributed-memory setting. We introduce new parallel aggregation and low-degree elimination algorithms targeted specifically at irregular-degree graphs. These algorithms are expressed in terms of sparse matrix-vector products using generalized sum and product operations. This formulation is amenable to linear algebra using arbitrary distributions and allows us to operate on a 2D sparse matrix distribution, which is necessary for parallel scalability. Our solver outperforms the natural parallel extension of the current state of the art in an algorithmic comparison. We demonstrate scalability to 576 processes and graphs with up to 1.7 billion edges.
Comment: PASC '18, Code: https://github.com/ligmg/ligm
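A sparse matrix-vector product with generalized sum and product operations, as mentioned in the abstract above, can be sketched as follows. This is a serial illustration only; the function name `spmv`, the dictionary-of-rows storage, and the small weighted path graph are assumptions and are not taken from the paper's code.

```python
import operator


def spmv(rows, x, add, mul, zero):
    """Compute y[i] = add-reduction over j of mul(A[i][j], x[j]).

    `rows` stores the sparse matrix as {row: {col: value}}; `zero` is the
    identity element of `add`.
    """
    y = []
    for i in range(len(x)):
        acc = zero
        for j, a_ij in rows.get(i, {}).items():
            acc = add(acc, mul(a_ij, x[j]))
        y.append(acc)
    return y


# Hypothetical weighted adjacency matrix of the path 0-1-2.
A = {0: {1: 1.0}, 1: {0: 1.0, 2: 2.0}, 2: {1: 2.0}}
ones = [1.0, 1.0, 1.0]

# Ordinary (+, *): the weighted degree of every vertex.
weighted_degree = spmv(A, ones, operator.add, operator.mul, 0.0)
# (max, *): the weight of the strongest edge at every vertex.
strongest_edge = spmv(A, ones, max, operator.mul, float("-inf"))
print(weighted_degree, strongest_edge)  # prints "[1.0, 3.0, 2.0] [1.0, 2.0, 2.0]"
```

Swapping the two operations changes what the product computes without changing the traversal of the matrix, so one distributed kernel can serve several graph operations.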

arXiv.org e-Print Archive

Crossref

On High Performance Computing in Geodesy : Applications in Global Gravity Field Determination

Author: Brockmann Jan Martin
Publication venue: Universitäts- und Landesbibliothek Bonn

Autonomously working sensor platforms deliver an increasing volume of precise data sets, which are often usable in geodetic applications. Owing to the volume and quality of these data, the models derived from them can be parameterized with greater complexity and in more detail. Deriving model parameters from these observations often requires solving a high-dimensional inverse data-fitting problem. To solve such high-dimensional adjustment problems, this thesis proposes a systematic, end-to-end use of a massively parallel implementation of the geodetic data analysis, built on standard concepts of massively parallel high performance computing. It is shown how these concepts can be integrated into a typical geodetic problem that requires the solution of a high-dimensional adjustment problem. The proposed parallel use of the computing and memory resources of a compute cluster makes general Gauss-Markoff models solvable that previously could be solved only by means of computationally motivated simplifications and approximations. A basic, easy-to-use framework is developed that performs all relevant operations needed to solve a typical geodetic least squares adjustment problem. It provides the interface to the standard concepts and libraries used. Examples covering adjustment problems with different characteristics show how the framework is used and how it can be adapted for specific applications. Computationally rigorous solutions become possible for hundreds of thousands to millions of unknown parameters, which have to be estimated from a huge number of observations. Three special problems with different characteristics, as they arise in global gravity field recovery, are chosen, and massively parallel implementations of the solution processes are derived.
The first application covers global gravity field determination from real data collected by the GOCE satellite mission (comprising 440 million highly correlated observations and 80,000 parameters). In the second application, high-dimensional global gravity field models are estimated from the combination of complementary data sets via the assembly and solution of full normal equations (scenarios with 520,000 parameters and 2 TB of normal equations). The third application solves a comparable problem but uses an iterative least squares solver, allowing for a parameter space of even higher dimension (scenarios with two million parameters). This thesis forms the basis for a flexible, massively parallel software package that can be extended to cover current and future research topics studied in the department. Within this thesis, the main focus lies on the computational aspects.
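The assembly and solution of full normal equations mentioned in the abstract above can be illustrated on a toy scale. This is a minimal serial sketch, not the framework developed in the thesis: it assumes uncorrelated, equally weighted observations, and the helper names, the straight-line model, and the data values are made up for illustration. Accumulating block by block mirrors how observations could be spread over several processes.

```python
def accumulate_normal_equations(blocks, n_params):
    """Sum N = A^T A and rhs = A^T l over blocks of observation equations."""
    N = [[0.0] * n_params for _ in range(n_params)]
    rhs = [0.0] * n_params
    for design_rows, observations in blocks:
        for row, obs in zip(design_rows, observations):
            for i in range(n_params):
                rhs[i] += row[i] * obs
                for j in range(n_params):
                    N[i][j] += row[i] * row[j]
    return N, rhs


def solve(N, rhs):
    """Solve N x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [b] for row, b in zip(N, rhs)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x


# Hypothetical data: observations l = 1 + 2 t at t = 0..3, split into two blocks.
block_1 = ([[1.0, 0.0], [1.0, 1.0]], [1.0, 3.0])
block_2 = ([[1.0, 2.0], [1.0, 3.0]], [5.0, 7.0])

N, rhs = accumulate_normal_equations([block_1, block_2], n_params=2)
estimate = solve(N, rhs)
print([round(value, 6) for value in estimate])  # prints "[1.0, 2.0]"
```

Because the normal matrix is a plain sum over observation blocks, each block can be processed independently and the partial sums combined afterwards; the dense solve is the step that grows with the square of the number of parameters in memory.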

bonndoc – Der Publikationsserver der Universität Bonn

On high performance computing in geodesy : applications in global gravity field determination

Author: Brockmann Jan Martin
Publication venue: Rheinische Friedrich-Wilhelms-Universität Bonn, Landwirtschaftliche Fakultät, IGG - Institut für Geodäsie und Geoinformation

bonndoc – Der Publikationsserver der Universität Bonn