Search CORE

2,032 research outputs found

Research Projects, Technical Reports and Publications

Author: Oliger Joseph
Publication venue
Publication date
Field of study

The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on June 6, 1983. RIACS is privately operated by USRA, a consortium of universities with research programs in the aerospace sciences, under contract with NASA. The primary mission of RIACS is to provide research and expertise in computer science and scientific computing to support the scientific missions of NASA ARC. The research carried out at RIACS must change its emphasis from year to year in response to NASA ARC's changing needs and technological opportunities. A flexible scientific staff is provided through a university faculty visitor program, a post doctoral program, and a student visitor program. Not only does this provide appropriate expertise but it also introduces scientists outside of NASA to NASA problems. A small group of core RIACS staff provides continuity and interacts with an ARC technical monitor and scientific advisory group to determine the RIACS mission. RIACS activities are reviewed and monitored by a USRA advisory council and ARC technical monitor. Research at RIACS is currently being done in the following areas: Advanced Methods for Scientific Computing High Performance Networks During this report pefiod Professor Antony Jameson of Princeton University, Professor Wei-Pai Tang of the University of Waterloo, Professor Marsha Berger of New York University, Professor Tony Chan of UCLA, Associate Professor David Zingg of University of Toronto, Canada and Assistant Professor Andrew Sohn of New Jersey Institute of Technology have been visiting RIACS. January 1, 1996 through September 30, 1996 RIACS had three staff scientists, four visiting scientists, one post-doctoral scientist, three consultants, two research associates and one research assistant. RIACS held a joint workshop with Code 1 29-30 July 1996. The workshop was held to discuss needs and opportunities in basic research in computer science in and for NASA applications. There were 14 talks given by NASA, industry and university scientists and three open discussion sessions. There were approximately fifty participants. A proceedings is being prepared. It is planned to have similar workshops on an annual basis. RIACS technical reports are usually preprints of manuscripts that have been submitted to research 'ournals or conference proceedings. A list of these reports for the period January i 1, 1996 through September 30, 1996 is in the Reports and Abstracts section of this report

NASA Technical Reports Server

RIACS

Author: Oliger Joseph
Publication venue
Publication date
Field of study

Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible navier-stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project

NASA Technical Reports Server

The DUNE-ALUGrid Module

Author: Alkämper Martin
Dedner Andreas
Klöfkorn Robert
Nolte Martin
Publication venue
Publication date: 15/08/2015
Field of study

In this paper we present the new DUNE-ALUGrid module. This module contains a major overhaul of the sources from the ALUgrid library and the binding to the DUNE software framework. The main changes include user defined load balancing, parallel grid construction, and an redesign of the 2d grid which can now also be used for parallel computations. In addition many improvements have been introduced into the code to increase the parallel efficiency and to decrease the memory footprint. The original ALUGrid library is widely used within the DUNE community due to its good parallel performance for problems requiring local adaptivity and dynamic load balancing. Therefore, this new model will benefit a number of DUNE users. In addition we have added features to increase the range of problems for which the grid manager can be used, for example, introducing a 3d tetrahedral grid using a parallel newest vertex bisection algorithm for conforming grid refinement. In this paper we will discuss the new features, extensions to the DUNE interface, and explain for various examples how the code is used in parallel environments.Comment: 25 pages, 11 figure

arXiv.org e-Print Archive

UiS Brage

Load Balancing Unstructured Adaptive Grids for CFD Problems

Author: Biswas Rupak
Oliker Leonid
Publication venue
Publication date
Field of study

Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. A dynamic load balancing method is presented that balances the workload across all processors with a global view. After each parallel tetrahedral mesh adaption, the method first determines if the new mesh is sufficiently unbalanced to warrant a repartitioning. If so, the adapted mesh is repartitioned, with new partitions assigned to processors so that the redistribution cost is minimized. The new partitions are accepted only if the remapping cost is compensated by the improved load balance. Results indicate that this strategy is effective for large-scale scientific computations on distributed-memory multiprocessors

NASA Technical Reports Server

Solving multi-physics problems using adaptive finite elements with independently refined meshes

Author: Ling Siqi
Publication venue
Publication date: 16/12/2016
Field of study

In this thesis, we study a numerical tool named multi-mesh method within the framework of the adaptive finite element method. The aim of this method is to minimize the size of the linear system to get the optimal performance of simulations. Multi-mesh methods are typically used in multi-physics problems, where more than one component is involved in the system. During the discretization of the weak formulation of partial differential equations, a finite-dimensional space associated with an independently refined mesh is assigned to each component respectively. The usage of independently refined meshes leads less degrees of freedom from a global point of view. To our best knowledge, the first multi-mesh method was presented at the beginning of the 21st Century. Similar techniques were announced by different mathematics researchers afterwards. But, due to some common restrictions, this method is not widely used in the field of numerical simulations. On one hand, only the case of two-mesh is taken into scientists\' consideration. But more than two components are common in multi-physics problems. Each is, in principle, allowed to be defined on an independent mesh. Besides that, the multi-mesh methods presented so far omit the possibility that coefficient function spaces live on the different meshes from the trial and test function spaces. As a ubiquitous numerical tool, the multi-mesh method should comprise the above circumstances. On the other hand, users are accustomed to improving the performance by taking the advantage of parallel resources rather than running simulations with the multi-mesh approach on one single processor, so it would be a pity if such an efficient method was only available in sequential. The multi-mesh method is actually used within local assembling process, which should not be conflict with parallelization. In this thesis, we present a general multi-mesh method without the limitation of the number of meshes used in the system, and it can be applied to parallel environments as well. Chapter 1 introduces the background knowledge of the adaptive finite element method and the pioneering work, on which this thesis is based. Then, the main idea of the multi-mesh method is formally derived and the detailed implementation is discussed in Chapter 2 and 3. In Chapter 4, applications, e.g. the multi-phase flow problem and the dendritic growth, are shown to prove that our method is superior in contrast to the standard single-mesh finite element method in terms of performance, while accuracy is not reduced

Technische Universität Dresden: Qucosa

Software concepts and algorithms for an efficient and scalable parallel finite element method

Author: Witkowski Thomas
Publication venue
Publication date: 19/12/2013
Field of study

Software packages for the numerical solution of partial differential equations (PDEs) using the finite element method are important in different fields of research. The basic data structures and algorithms change in time, as the user\'s requirements are growing and the software must efficiently use the newest highly parallel computing systems. This is the central point of this work. To make efficiently use of parallel computing systems with growing number of independent basic computing units, i.e.~CPUs, we have to combine data structures and algorithms from different areas of mathematics and computer science. Two crucial parts are a distributed mesh and parallel solver for linear systems of equations. For both there exists multiple independent approaches. In this work we argue that it is necessary to combine both of them to allow for an efficient and scalable implementation of the finite element method. First, we present concepts, data structures and algorithms for distributed meshes, which allow for local refinement. The central point of our presentation is to provide arbitrary geometrical information of the mesh and its distribution to the linear solver. A large part of the overall computing time of the finite element method is spend by the linear solver. Thus, its parallelization is of major importance. Based on the presented concept for distributed meshes, we preset several different linear solver methods. Hereby we concentrate on general purpose linear solver, which makes only little assumptions about the systems to be solver. For this, a new FETI-DP (Finite Element Tearing and Interconnect - Dual Primal) method is proposed. Those the standard FETI-DP method is quasi optimal from a mathematical point of view, its not possible to implement it efficiently for a large number of processors (> 10,000). The main reason is a relatively small but globally distributed coarse mesh problem. To circumvent this problem, we propose a new multilevel FETI-DP method which hierarchically decompose the coarse grid problem. This leads to a more local communication pattern for solver the coarse grid problem and makes it possible to scale for a large number of processors. Besides the parallelization of the finite element method, we discuss an approach to speed up serial computations of existing finite element packages. In many computations the PDE to be solved consists of more than one variable. This is especially the case in multi-physics modeling. Observation show that in many of these computation the solution structure of the variables is different. But in the standard finite element method, only one mesh is used for the discretization of all variables. We present a multi-mesh finite element method, which allows to discretize a system of PDEs with two independently refined meshes.Softwarepakete zur numerischen Lösung partieller Differentialgleichungen mit Hilfe der Finiten-Element-Methode sind in vielen Forschungsbereichen ein wichtiges Werkzeug. Die dahinter stehenden Datenstrukturen und Algorithmen unterliegen einer ständigen Neuentwicklung um den immer weiter steigenden Anforderungen der Nutzergemeinde gerecht zu werden und um neue, hochgradig parallel Rechnerarchitekturen effizient nutzen zu können. Dies ist auch der Kernpunkt dieser Arbeit. Um parallel Rechnerarchitekturen mit einer immer höher werdenden Anzahl an von einander unabhängigen Recheneinheiten, z.B.~Prozessoren, effizient Nutzen zu können, müssen Datenstrukturen und Algorithmen aus verschiedenen Teilgebieten der Mathematik und Informatik entwickelt und miteinander kombiniert werden. Im Kern sind dies zwei Bereiche: verteilte Gitter und parallele Löser für lineare Gleichungssysteme. Für jedes der beiden Teilgebiete existieren unabhängig voneinander zahlreiche Ansätze. In dieser Arbeit wird argumentiert, dass für hochskalierbare Anwendungen der Finiten-Elemente-Methode nur eine Kombination beider Teilgebiete und die Verknüpfung der darunter liegenden Datenstrukturen eine effiziente und skalierbare Implementierung ermöglicht. Zuerst stellen wir Konzepte vor, die parallele verteile Gitter mit entsprechenden Adaptionstrategien ermöglichen. Zentraler Punkt ist hier die Informationsaufbereitung für beliebige Löser linearer Gleichungssysteme. Beim Lösen partieller Differentialgleichung mit der Finiten Elemente Methode wird ein großer Teil der Rechenzeit für das Lösen der dabei anfallenden linearen Gleichungssysteme aufgebracht. Daher ist deren Parallelisierung von zentraler Bedeutung. Basierend auf dem vorgestelltem Konzept für verteilten Gitter, welches beliebige geometrische Informationen für die linearen Löser aufbereiten kann, präsentieren wir mehrere unterschiedliche Lösermethoden. Besonders Gewicht wird dabei auf allgemeine Löser gelegt, die möglichst wenig Annahmen über das zu lösende System machen. Hierfür wird die FETI-DP (Finite Element Tearing and Interconnect - Dual Primal) Methode weiterentwickelt. Obwohl die FETI-DP Methode vom mathematischen Standpunkt her als quasi-optimal bezüglich der parallelen Skalierbarkeit gilt, kann sie für große Anzahl an Prozessoren (> 10.000) nicht mehr effizient implementiert werden. Dies liegt hauptsächlich an einem verhältnismäßig kleinem aber global verteilten Grobgitterproblem. Wir stellen eine Multilevel FETI-DP Methode vor, die dieses Problem durch eine hierarchische Komposition des Grobgitterproblems löst. Dadurch wird die Kommunikation entlang des Grobgitterproblems lokalisiert und die Skalierbarkeit der FETI-DP Methode auch für große Anzahl an Prozessoren sichergestellt. Neben der Parallelisierung der Finiten-Elemente-Methode beschäftigen wir uns in dieser Arbeit mit der Ausnutzung von bestimmten Voraussetzung um auch die sequentielle Effizienz bestehender Implementierung der Finiten-Elemente-Methode zu steigern. In vielen Fällen müssen partielle Differentialgleichungen mit mehreren Variablen gelöst werden. Sehr häufig ist dabei zu beobachten, insbesondere bei der Modellierung mehrere miteinander gekoppelter physikalischer Phänomene, dass die Lösungsstruktur der unterschiedlichen Variablen entweder schwach oder vollständig voneinander entkoppelt ist. In den meisten Implementierungen wird dabei nur ein Gitter zur Diskretisierung aller Variablen des Systems genutzt. Wir stellen eine Finite-Elemente-Methode vor, bei der zwei unabhängig voneinander verfeinerte Gitter genutzt werden können um ein System partieller Differentialgleichungen zu lösen

Technische Universität Dresden: Qucosa

Hybrid parallelization of an adaptive finite element code

Author: Voigt Axel
Witkowski Thomas
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2010
Field of study

summary:We present a hybrid OpenMP/MPI parallelization of the finite element method that is suitable to make use of modern high performance computers. These are usually built from a large bulk of multi-core systems connected by a fast network. Our parallelization method is based firstly on domain decomposition to divide the large problem into small chunks. Each of them is then solved on a multi-core system using parallel assembling, solution and error estimation. To make domain decomposition for both, the large problem and the smaller sub-problems, sufficiently fast we make use of a hierarchical mesh structure. The partitioning is done on a coarser mesh level, resulting in a very fast method that shows good computational balancing results. Numerical experiments show that both parallelization methods achieve good scalability in computing solution of nonlinear, time dependent, higher order PDEs on large domains. The parallelization is realized in the adaptive finite element software AMDiS

Institute of Mathematics AS CR, v. v. i.

Parallel Eulerian-Lagrangian Method with Adaptive Mesh Refinement for Moving Boundary Computation

Author: Agbaglah G.
Burstedde C.
Deiterding R.
Kirk B. S.
MacNeice P.
Uzgoren E.
Zuzio D.
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 01/01/2013
Field of study

Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/106477/1/AIAA2013-370.pd

Crossref

Hong Kong University of Science and Technology Institutional Repository

Deep Blue Documents at the University of Michigan

HARP: A Dynamic Inertial Spectral Partitioner

Author: Biswas Rupak
Simon Horst D.
Sohn Andrew
Publication venue
Publication date
Field of study

Partitioning unstructured graphs is central to the parallel solution of computational science and engineering problems. Spectral partitioners, such recursive spectral bisection (RSB), have proven effecfive in generating high-quality partitions of realistically-sized meshes. The major problem which hindered their wide-spread use was their long execution times. This paper presents a new inertial spectral partitioner, called HARP. The main objective of the proposed approach is to quickly partition the meshes at runtime in a manner that works efficiently for real applications in the context of distributed-memory machines. The underlying principle of HARP is to find the eigenvectors of the unpartitioned vertices and then project them onto the eigerivectors of the original mesh. Results for various meshes ranging in size from 1000 to 100,000 vertices indicate that HARP can indeed partition meshes rapidly at runtime. Experimental results show that our largest mesh can be partitioned sequentially in only a few seconds on an SP2 which is several times faster than other spectral partitioners while maintaining the solution quality of the proven RSB method. A parallel WI version of HARP has also been implemented on IBM SP2 and Cray T3E. Parallel HARP, running on 64 processors SP2 and T3E, can partition a mesh containing more than 100,000 vertices into 64 subgrids in about half a second. These results indicate that graph partitioning can now be truly embedded in dynamically-changing real-world applications

NASA Technical Reports Server

A practical approach to dynamic load balancing

Author: J. Watts
S. Taylor
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref