75 research outputs found
An Application Perspective on High-Performance Computing and Communications
We review possible and probable industrial applications of HPCC, focusing on the software and hardware issues. Thirty-three separate categories are illustrated by detailed descriptions of five areas: computational chemistry; Monte Carlo methods from physics to economics; manufacturing and computational fluid dynamics; command and control or crisis management; and multimedia services to client computers and settop boxes. The hardware varies from tightly-coupled parallel supercomputers to heterogeneous distributed systems. The software models span HPF and data parallelism to distributed information systems and object/data flow parallelism on the Web. We find that in each case it is reasonably clear that HPCC works in principle, and postulate that this knowledge can be used in a new generation of software infrastructure based on the WebWindows approach, discussed in an accompanying paper.
GPU Usage for Parallel Functions and Contacts in Modelica
This thesis investigates two ways of incorporating GPUs in Modelica: the first by automatically generating GPU code for Modelica functions, and the second by using GPU-accelerated external code for a contact handling package. Automatic parallelization of functions is desirable, as it can potentially accelerate large simulations significantly. Special patterns of nested for-loops in Modelica code are recognized and translated into CUDA kernel functions. Inline integration allows a broader spectrum of models to take advantage of the parallelization by reducing CPU-GPU transfers. The prototype has been tested and achieved a speed-up factor of up to five compared to the CPU. The contact handling package is capable of handling both complex contact behavior between arbitrarily shaped bodies and large DEM-like simulations, something which Modelica currently lacks. Attempts to accelerate the package with GPUs were made, with partial success for the broad phase. The package uses Morton encoding for the broad phase, and the narrow phase is based on CSG intersection with BSP trees. Contact response is calculated using a volume-dependent method, taking friction, damping and multiple contact points into account. The capability of the package was demonstrated by simulating both complex contact behavior, such as the inversion of the Tippe Top toy, and tens of thousands of colliding spheres.
One of the key components of our modern society is the ability to simulate. Using simulations, industry can design new cars, phones, aircraft, etc., without having to go through prototype after prototype, giving us the cheap, high-tech products most of us rely on in day-to-day life. However, good as simulations are today, there are still severe limitations on what can be simulated. Two of the largest limiting factors are how hard simulations are to design and how long they take to run. The first of these problems is tackled by the Modelica programming language, which is designed for easy set-up of simulations. We have attacked both of these problems by using GPUs to speed up Modelica simulations by a factor of up to five, and by extending the capability of Modelica to handle collisions between objects.
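The abstract above mentions Morton encoding for the broad phase of contact detection. As a rough illustration only (not the thesis's implementation), the following Python sketch interleaves the bits of three quantized grid coordinates into a single Morton code, so that sorting by code tends to place spatially nearby cells close together. The function names `morton3d` and `quantize`, the 10-bit resolution, and the unit-cube domain are all assumptions made for this example.

```python
def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Morton (Z-order) code.

    Bit i of x goes to bit 3*i, bit i of y to bit 3*i + 1,
    and bit i of z to bit 3*i + 2 of the result.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def quantize(value: float, lo: float = 0.0, hi: float = 1.0, bits: int = 10) -> int:
    """Map a coordinate in [lo, hi] to an integer cell index in [0, 2**bits - 1].

    The domain bounds are hypothetical; values outside them are clamped.
    """
    cells = (1 << bits) - 1
    t = (value - lo) / (hi - lo)
    t = min(max(t, 0.0), 1.0)
    return int(t * cells)


# Sort a few body centres by Morton code (illustrative data only).
centres = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.12, 0.1, 0.11)]
ordered = sorted(centres, key=lambda p: morton3d(*(quantize(c) for c in p)))
```

In a broad phase built this way, bodies that end up adjacent in the sorted order are candidates for the more expensive narrow-phase test; how candidates are actually paired in the package is not described in the abstract.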
Activities of the Research Institute for Advanced Computer Science
The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on June 6, 1983. RIACS is privately operated by USRA, a consortium of universities with research programs in the aerospace sciences, under contract with NASA. The primary mission of RIACS is to provide research and expertise in computer science and scientific computing to support the scientific missions of NASA ARC. The research carried out at RIACS must change its emphasis from year to year in response to NASA ARC's changing needs and technological opportunities. Research at RIACS is currently being done in the following areas: (1) parallel computing; (2) advanced methods for scientific computing; (3) high performance networks; and (4) learning systems. RIACS technical reports are usually preprints of manuscripts that have been submitted to research journals or conference proceedings. A list of these reports for the period January 1, 1994 through December 31, 1994 is in the Reports and Abstracts section of this report.
Doctor of Philosophy
Smoothness-increasing accuracy-conserving (SIAC) filters were introduced as a class of postprocessing techniques to improve the quality of numerical solutions of discontinuous Galerkin (DG) simulations. SIAC filtering works to eliminate the oscillations in the error by introducing smoothness back to the DG field, and raises the accuracy in the L2-norm up to its natural superconvergent accuracy in the negative-order norm. The increased smoothness in the filtered DG solutions can then be exploited by simulation postprocessing tools such as streamline integrators, where the absence of continuity in the data can lead to erroneous visualizations. However, the lack of extension of this filtering technique, both theoretically and computationally, to nontrivial mesh structures, along with its expensive core operators, has been a hindrance to the application of SIAC filters to more realistic simulations. In this dissertation, we focus on the numerical solutions of linear hyperbolic equations solved with the discontinuous Galerkin scheme and provide a thorough analysis of SIAC filtering applied to such solution data. In particular, we investigate how the use of different quadrature techniques could mitigate the extensive processing required when filtering over the whole computational field. Moreover, we provide the detailed and efficient algorithms that a numerical practitioner needs in order to implement this filtering technique effectively. In our first attempt to expand the application scope of this filtering technique, we demonstrate both mathematically and through numerical examples that it is indeed possible to observe SIAC filtering characteristics when applied to numerical solutions obtained over structured triangular meshes. We further provide a thorough investigation of the interplay between mesh geometry and filtering.
Building upon these promising results, we present how SIAC filtering could be applied to gain higher accuracy and smoothness when dealing with totally unstructured triangular meshes. Lastly, we provide the extension of our filtering scheme to structured tetrahedral meshes. Guidelines and future work regarding the application of the SIAC filter in the visualization domain are also presented. We further note that throughout this document, the terms postprocessing and filtering are used interchangeably.
Convergence of Intelligent Data Acquisition and Advanced Computing Systems
This book is a collection of published articles from the Sensors Special Issue on "Convergence of Intelligent Data Acquisition and Advanced Computing Systems". It includes extended versions of the conference contributions from the 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS’2019), Metz, France, as well as external contributions.
Parallel generalized Delaunay mesh refinement
The modeling of physical phenomena in computational fracture mechanics, computational fluid dynamics and other fields is based on solving systems of partial differential equations (PDEs). When PDEs are defined over geometrically complex domains, they often do not admit closed form solutions. In such cases, they are solved approximately using discretizations of domains into simple elements like triangles and quadrilaterals in two dimensions (2D), and tetrahedra and hexahedra in three dimensions (3D). These discretizations are called finite element meshes. Many applications, for example real-time computer assisted surgery or crack propagation from fracture mechanics, impose time and/or mesh size constraints that cannot be met on a single sequential machine. As a result, the development of parallel mesh generation algorithms is required.
In this dissertation, we describe a complete solution for both sequential and parallel construction of guaranteed quality Delaunay meshes for 2D and 3D geometries. First, we generalize the existing 2D and 3D Delaunay refinement algorithms along with theoretical proofs of mesh quality in terms of element shape and mesh gradation. Existing algorithms are constrained to just one or two specific positions for the insertion of a Steiner point inside the circumscribed disk of a poorly shaped element. We derive an entire 2D or 3D region for the selection of a Steiner point (i.e., infinitely many choices) inside the circumscribed disk. Second, we develop a novel theory which extends both the 2D and the 3D Generalized Delaunay Refinement methods for the concurrent and mathematically guaranteed independent insertion of Steiner points. Previous parallel algorithms are either reactive, relying on implementation heuristics to resolve dependencies in parallel mesh generation computations, or require the solution of a very difficult geometric optimization problem (the domain decomposition problem) which is still open for general 3D geometries. Our theory overcomes both of these drawbacks. Third, using our generalization of both the sequential and the parallel algorithms, we implemented prototypes of practical and efficient parallel generalized guaranteed quality Delaunay refinement codes for both 2D and 3D geometries using existing state-of-the-art sequential codes for traditional Delaunay refinement methods. On a heterogeneous cluster of more than 100 processors, our implementation can generate a uniform mesh with about a billion elements in less than 5 minutes. Even on a workstation with a few cores, we achieve a significant performance improvement over the corresponding state-of-the-art sequential 3D code for graded meshes.
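To make the terms "circumscribed disk" and "poorly shaped element" in the abstract above concrete, here is a minimal Python sketch of the two quantities that traditional 2D Delaunay refinement typically works with: the circumcenter of a triangle (the classic Steiner point position) and the circumradius-to-shortest-edge ratio used as a shape quality measure. This illustrates the traditional setting only, not the generalized selection regions derived in the dissertation; the function names and the quality bound of sqrt(2) used in the example are assumptions.

```python
import math


def circumcircle(a, b, c):
    """Return ((ux, uy), r): circumcenter and circumradius of triangle abc."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0.0:
        raise ValueError("degenerate (collinear) triangle")
    a2 = ax * ax + ay * ay
    b2 = bx * bx + by * by
    c2 = cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)


def radius_edge_ratio(a, b, c):
    """Circumradius divided by the shortest edge length; smaller is better."""
    _, r = circumcircle(a, b, c)
    shortest = min(math.dist(a, b), math.dist(b, c), math.dist(c, a))
    return r / shortest


def is_poorly_shaped(a, b, c, bound=math.sqrt(2.0)):
    """Flag a triangle whose ratio exceeds a chosen bound (assumed value)."""
    return radius_edge_ratio(a, b, c) > bound


# A thin sliver is flagged; a right isosceles triangle is not.
sliver = ((0.0, 0.0), (1.0, 0.0), (0.5, 0.01))
right = ((0.0, 0.0), (2.0, 0.0), (0.0, 2.0))
```

A refinement loop would insert a Steiner point for each flagged triangle and retriangulate; the generalization described in the abstract concerns where inside the circumscribed disk that point may be placed.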
Coupling, Conservation, and Performance in Numerical Simulations
This thesis considers three aspects of numerical simulations: coupling, conservation, and performance. We conduct a project and address one challenge from each of these aspects. We propose a novel penalty force to enforce contacts with accurate Coulomb friction. The force is compatible with fully-implicit time integration and the use of optimization-based integration. In addition to processing collisions between deformable objects, the force can be used to couple rigid bodies to deformable objects or the material point method. The force naturally leads to stable stacking without drift over time, even when solvers are not run to convergence. The force leads to an asymmetrical system, and we provide a practical solution for handling these. Next we present a new technique for transferring momentum and velocity between particles and MAC grids based on the Affine-Particle-In-Cell (APIC) framework previously developed for co-located grids. We extend the original APIC paper and show that the proposed transfers preserve linear and angular momentum and also satisfy all of the original APIC properties. Early indications in the original APIC paper suggested that APIC might be suitable for simulating high Reynolds fluids due to favorable retention of vortices, but these properties were not studied further. We use two dimensional Fourier analysis to investigate dissipation in the limit Δt = 0. We investigate dissipation and vortex retention numerically to quantify the effectiveness of APIC compared with other transfer algorithms. Finally we present an efficient solver for problems typically seen in microfluidic applications. Microfluidic "lab on a chip" devices are small devices that operate on small length scales on small volumes of fluid. Designs for microfluidic chips are generally composed of standardized and often repeated components connected by long, thin, straight fluid channels.
We propose a novel discretization algorithm for simulating the Stokes equations on geometry with these features, which produces sparse linear systems with many repeated matrix blocks. The discretization is formally third order accurate for velocity and second order accurate for pressure in the norm. We also propose a novel linear system solver based on cyclic reduction, reordered sparse Gaussian elimination, and operation caching that is designed to efficiently solve systems with repeated matrix blocks.
Methods and Distributed Software for Visualization of Cracks Propagating in Discrete Particle Systems
Scientific visualization is becoming increasingly important in analyzing and interpreting numerical and experimental data sets. Parallel computations of discrete particle systems lead to large data sets that can be produced, stored and visualized on distributed IT infrastructures. However, this leads to very complicated environments that must handle complex simulation and interactive visualization on remote heterogeneous architectures. In the microstructure of a continuum, broken connections between neighbouring particles can form complex cracks of unknown geometrical shape. The complex disjoint surfaces of cracks with holes and the unavailability of a suitable scalar field defining the crack surfaces limit the application of the common surface extraction methods. The main visualization task is to extract the surfaces of cracks according to the connectivity of the broken connections and the geometry of the neighbouring particles. The research aims at enhancing the visualization methods of discrete particle systems and increasing the speed of distributed visualization software.
The dissertation consists of an introduction, three main chapters and general conclusions. In the first chapter, a literature review on visualization software, distributed environments, discrete element simulation of particle systems and crack visualization methods is presented. In the second chapter, novel visualization methods are proposed for extraction of crack surfaces from monodispersed particle systems modelled by the discrete element method. The cell cut-based method, the Voronoi-based method and the cell centre-based method explicitly define the geometry of propagating cracks in fractured regions. The proposed visualization methods were implemented in the grid visualization e-service VizLitG and the distributed visualization software VisPartDEM. Partial data set transfer from the grid storage element was developed to reduce the data transfer and visualization time.
In the third chapter, the results of experimental research are presented. The performance of the e-service VizLitG was evaluated in a geographically distributed grid. Different types of software were employed for data transfer in order to provide a quantitative comparison. The performance of the developed visualization methods was investigated. A quantitative comparison of the execution time of the local Voronoi-based method with that of global Voronoi diagrams generated by the Voro++ library is presented. The accuracy of the developed methods was evaluated by computing the total depth of cuts made in particles by the extracted crack surfaces. The research confirmed that the proposed visualization methods and the developed distributed software are capable of visualizing crack propagation modelled by the discrete element method in monodispersed particulate media.