145 research outputs found

    A unifying mathematical definition enables the theoretical study of the algorithmic class of particle methods.

    Mathematical definitions provide a precise, unambiguous way to formulate concepts. They also provide a common language between disciplines. Thus, they are the basis for a well-founded scientific discussion. In addition, mathematical definitions allow for deeper insights into the defined subject based on mathematical theorems that are incontrovertible under the given definition. Besides their value in mathematics, mathematical definitions are indispensable in other sciences like physics, chemistry, and computer science. In computer science, they help to derive the expected behavior of a computer program and provide guidance for the design and testing of software. Therefore, mathematical definitions can be used to design and implement advanced algorithms. One class of widely used algorithms in computer science is the class of particle-based algorithms, also known as particle methods. Particle methods can solve complex problems in various fields, such as fluid dynamics, plasma physics, or granular flows, using diverse simulation methods, including Discrete Element Methods (DEM), Molecular Dynamics (MD), Reproducing Kernel Particle Methods (RKPM), Particle Strength Exchange (PSE), and Smoothed Particle Hydrodynamics (SPH). Despite the increasing use of particle methods driven by improved computing performance, the relation between these algorithms remains formally unclear. In particular, particle methods lack a unifying mathematical definition and precisely defined terminology. This prevents determining whether an algorithm belongs to the class and what distinguishes the class. Here we present a rigorous mathematical definition of particle methods and demonstrate its importance by applying it to several canonical algorithms, as well as to algorithms not previously recognized as particle methods. Furthermore, we base proofs of theorems about parallelizability and computational power on it and use it to develop scientific computing software. Our definition unifies, for the first time, the previously loosely connected notions of particle methods.
    Thus, it marks the necessary starting point for a broad range of joint formal investigations and applications across fields.
    Contents:
    1 Introduction
        1.1 The Role of Mathematical Definitions
        1.2 Particle Methods
        1.3 Scope and Contributions of this Thesis
    2 Terminology and Notation
    3 A Formal Definition of Particle Methods
        3.1 Introduction
        3.2 Definition of Particle Methods
            3.2.1 Particle Method Algorithm
            3.2.2 Particle Method Instance
            3.2.3 Particle State Transition Function
        3.3 Explanation of the Definition of Particle Methods
            3.3.1 Illustrative Example
            3.3.2 Explanation of the Particle Method Algorithm
            3.3.3 Explanation of the Particle Method Instance
            3.3.4 Explanation of the State Transition Function
        3.4 Conclusion
    4 Algorithms as Particle Methods
        4.1 Introduction
        4.2 Perfectly Elastic Collision in Arbitrary Dimensions
        4.3 Particle Strength Exchange
        4.4 Smoothed Particle Hydrodynamics
        4.5 Lennard-Jones Molecular Dynamics
        4.6 Triangulation Refinement
        4.7 Conway's Game of Life
        4.8 Gaussian Elimination
        4.9 Conclusion
    5 Parallelizability of Particle Methods
        5.1 Introduction
        5.2 Particle Methods on Shared Memory Systems
            5.2.1 Parallelization Scheme
            5.2.2 Lemmata
            5.2.3 Parallelizability
            5.2.4 Time Complexity
            5.2.5 Application
        5.3 Particle Methods on Distributed Memory Systems
            5.3.1 Parallelization Scheme
            5.3.2 Lemmata
            5.3.3 Parallelizability
            5.3.4 Bounds on Time Complexity and Parallel Scalability
        5.4 Conclusion
    6 Turing Powerfulness and Halting Decidability
        6.1 Introduction
        6.2 Turing Machine
        6.3 Turing Powerfulness of Particle Methods Under a First Set of Constraints
        6.4 Turing Powerfulness of Particle Methods Under a Second Set of Constraints
        6.5 Halting Decidability of Particle Methods
        6.6 Conclusion
    7 Particle Methods as a Basis for Scientific Software Engineering
        7.1 Introduction
        7.2 Design of the Prototype
        7.3 Applications, Comparisons, Convergence Study, and Run-time Evaluations
        7.4 Conclusion
    8 Results, Discussion, Outlook, and Conclusion
        8.1 Problem
        8.2 Results
        8.3 Discussion
        8.4 Outlook
        8.5 Conclusion
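    To make the shape of the defined class concrete, here is a minimal Python sketch of the generic state-transition loop that particle methods share, assuming an interact/evolve decomposition; the names step, interact, evolve, and neighbourhood are illustrative stand-ins, not the thesis's formal tuple definition.

        # Minimal sketch of a generic particle-method state transition
        # (illustrative names, not the thesis's formal definition).
        def step(particles, neighbourhood, interact, evolve):
            # Interaction: neighbouring pairs exchange information
            # (forces in MD/DEM, kernel contributions in SPH/PSE).
            for i in range(len(particles)):
                for j in neighbourhood(particles, i):
                    particles[i], particles[j] = interact(particles[i], particles[j])
            # Evolution: each particle updates its own state; returning a
            # list lets a particle vanish ([]) or spawn new ones ([p, q]).
            updated = []
            for p in particles:
                updated.extend(evolve(p))
            return updated

        # Toy usage: non-interacting particles as (position, velocity) pairs.
        ps = [(0.0, 1.0), (1.0, -1.0)]
        ps = step(ps,
                  neighbourhood=lambda ps, i: [],
                  interact=lambda p, q: (p, q),
                  evolve=lambda p: [(p[0] + 0.1 * p[1], p[1])])
        print(ps)  # [(0.1, 1.0), (0.9, -1.0)]

    Under such a reading, SPH, MD, DEM, and PSE differ only in their particle states and in the concrete interact and evolve functions they plug into this loop.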

    Machine Learning and Its Application to Reacting Flows

    This open access book introduces and explains machine learning (ML) algorithms and techniques developed for statistical inference on complex processes or systems, and their application to simulations of chemically reacting turbulent flows. These two fields, ML and turbulent combustion, each have a large body of work and knowledge of their own; this book brings them together and explains the complexities and challenges involved in applying ML techniques to simulate and study reacting flows. This matters for the world’s total primary energy supply (TPES): more than 90% of this supply comes from combustion technologies, and combustion has non-negligible effects on the environment. Although alternative technologies based on renewable energies are emerging, their share of the TPES is currently less than 5%, and a complete paradigm shift would be needed to replace combustion sources. Whether this is practical is an entirely different question, and the answer depends on the respondent. However, a pragmatic analysis suggests that the combustion share of TPES is likely to remain above 70% even by 2070. Hence, it is prudent to take advantage of ML techniques to improve combustion science and technology, so that efficient and “greener” combustion systems that are friendlier to the environment can be designed. The book covers the current state of the art in these two topics and outlines the challenges involved, along with the merits and drawbacks of using ML for turbulent combustion simulations, including avenues that can be explored to overcome the challenges. The required mathematical equations and background are discussed, with ample references for readers who wish to find further detail. This book is unique in its coverage of topics, ranging from big data analysis and machine learning algorithms to their applications in combustion science and system design for energy generation.
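    As a toy illustration of statistical inference on a reacting process, the sketch below recovers Arrhenius rate parameters from noisy synthetic data by linear least squares; the constants and the data-generation step are our own illustrative assumptions, standing in for the far richer ML models the book covers.

        import numpy as np

        # Fit k(T) = A * exp(-Ea / (R*T)) via ln k = ln A - (Ea/R) * (1/T).
        R = 8.314                                  # gas constant, J/(mol K)
        A_true, Ea_true = 1e10, 1.2e5              # illustrative values only
        T = np.linspace(800.0, 2000.0, 50)         # temperatures, K
        rng = np.random.default_rng(0)
        k = A_true * np.exp(-Ea_true / (R * T)) * rng.lognormal(0.0, 0.05, T.size)

        X = np.column_stack([np.ones_like(T), 1.0 / T])   # regression design
        coef, *_ = np.linalg.lstsq(X, np.log(k), rcond=None)
        A_fit, Ea_fit = np.exp(coef[0]), -coef[1] * R
        print(f"A = {A_fit:.3e}, Ea = {Ea_fit:.3e} J/mol")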

    Simulating the nonlinear QED vacuum


    Corporate influence and the academic computer science discipline. [4: CMU]

    Prosopographical work on the four major centers for computer research in the United States has now been conducted, raising big questions about the independence of so-called computer science.

    Materials and Molecular Modelling at the Exascale

    Progression of computational resources towards exascale computing makes possible simulations of unprecedented accuracy and complexity in the fields of materials and molecular modelling (MMM), allowing high-fidelity in silico experiments on complex materials of real technological interest. However, this presents demanding challenges for the software used, especially the exploitation of the huge degree of parallelism available on exascale hardware and the associated problems of developing effective workflows and data management on such platforms. As part of the UK's ExCALIBUR exascale computing initiative, the UK-led MMM Design and Development Working Group has worked with the broad MMM community to identify a set of high-priority application case studies that will drive future exascale software developments. We present an overview of these case studies, categorized by the methodological challenges that must be met to realize them on exascale platforms, and discuss the exascale requirements, software challenges, and impact of each application area.

    Simulation Intelligence: Towards a New Generation of Scientific Methods

    The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offers immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use-cases for human-machine teaming and automated science

    Novel parallel approaches to efficiently solve spatial problems on heterogeneous CPU-GPU systems

    In recent years, approaches that seek to extract valuable information from large datasets have become particularly relevant in today's society. In this category, we can highlight those problems that comprise data analysis distributed across two-dimensional scenarios, called spatial problems. These usually involve processing (i) a series of features distributed across a given plane or (ii) a matrix of values where each cell corresponds to a point on the plane. Spatial problems are thus open-ended and complex, which leaves room for imagination in the search for new solutions. One of the main complications we encounter when dealing with spatial problems is that they are very computationally intensive, typically taking a long time to produce the desired result. This drawback is also an opportunity to use heterogeneous systems to address spatial problems more efficiently. Heterogeneous systems give the developer greater freedom to speed up suitable algorithms by increasing the parallel programming options available, making it possible for different parts of a program to run on the dedicated hardware that suits them best. Several of the spatial problems that have not yet been optimised for heterogeneous systems cover very diverse areas that seem vastly different at first sight. However, they are closely related due to common data processing requirements, making them suitable for dedicated hardware. In particular, this thesis provides new parallel approaches to tackle three crucial spatial problems: latent fingerprint identification, total viewshed computation, and path planning based on maximising visibility in large regions.

    Latent fingerprint identification is one of the essential identification procedures in criminal investigations. Addressing this task is difficult as (i) it requires analysing large databases in a short time, and (ii) it is commonly addressed by combining different methods with complex data dependencies, making it challenging to exploit parallelism on heterogeneous CPU-GPU systems. Moreover, most efforts in this context focus on improving the accuracy of the approaches and neglect reducing the processing time; notably, the most accurate algorithm was designed to process fingerprints using a single thread. We developed a new methodology to address the latent fingerprint identification problem, called “Asynchronous processing for Latent Fingerprint Identification” (ALFI), that speeds up processing while maintaining high accuracy. ALFI exploits all the resources of CPU-GPU systems, using asynchronous processing and fine-coarse parallelism to analyse massive fingerprint databases. We assessed the performance of ALFI on Linux and Windows operating systems using the well-known NIST/FVC databases. Experimental results revealed that ALFI is on average 22x faster than the state-of-the-art identification algorithm, reaching a speed-up of 44.7x in the best-studied case.

    In terrain analysis, Digital Elevation Models (DEMs) are relevant datasets used as input to algorithms that sweep the terrain to analyse its main topological features, such as visibility, elevation, and slope. The most challenging computation in this area is the total viewshed problem: computing the viewshed, i.e. the visible area of the terrain, for each of the points in the DEM. The algorithms intended to solve this problem require many memory accesses to 2D arrays, which, despite being regular, lead to poor data locality in memory. We proposed a methodology called “skewed Digital Elevation Model” (sDEM) that substantially improves the locality of memory accesses and exploits the inherent parallelism of rotational sweep-based algorithms. In particular, sDEM applies a data relocation technique before accessing the memory and computing the viewshed, thus significantly reducing the execution time. Different implementations are provided for single-core, multi-core, single-GPU, and multi-GPU platforms. We carried out two experiments to compare sDEM with (i) the most widely used geographic information systems (GIS) software and (ii) the state-of-the-art algorithm for solving the total viewshed problem. In the first experiment, sDEM is on average 8.8x faster than current GIS software, despite considering only a few points because of the limitations of the GIS software. In the second experiment, sDEM is 827.3x faster than the state-of-the-art algorithm in the best case.

    The use of Unmanned Aerial Vehicles (UAVs) with multiple onboard sensors has grown enormously in tasks involving terrain coverage, such as environmental and civil monitoring, disaster management, and forest fire fighting. Many of these tasks require a quick and early response, which makes maximising the land covered from the flight path an essential goal, especially when the area to be monitored is irregular, large, and includes many blind spots. In this regard, state-of-the-art total viewshed algorithms can help analyse large areas and find new paths providing all-round visibility. We designed a new heuristic called “Visibility-based Path Planning” (VPP) to solve the path planning problem in large areas based on a thorough visibility analysis. VPP generates flyable paths that provide high visual coverage to monitor forest regions using the onboard camera of a single UAV. For this purpose, the hidden areas of the target territory are identified and considered when generating the path. Simulation results showed that VPP covers up to 98.7% of the Montes de Malaga Natural Park and 94.5% of the Sierra de las Nieves National Park, both located in the province of Malaga (Spain). In addition, a real flight test confirmed the high visibility achieved using VPP. Our methodology and analysis can be easily applied to enhance monitoring in other large outdoor areas.
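    The locality idea behind sDEM can be illustrated with a small sketch: shear the elevation grid so that the cells visited along a diagonal line of sight become contiguous in memory. The integer shear below is our own illustrative reading of the abstract, not the paper's exact transform.

        import numpy as np

        def skew(dem):
            """Shift column j down by j cells so anti-diagonals become rows."""
            n, m = dem.shape
            out = np.zeros((n + m, m))
            for j in range(m):
                out[j:j + n, j] = dem[:, j]
            return out

        dem = np.arange(16, dtype=float).reshape(4, 4)
        sdem = skew(dem)
        # The anti-diagonal dem[3,0], dem[2,1], dem[1,2], dem[0,3] is now the
        # contiguous row sdem[3], so a sweep along it walks memory linearly.
        print(sdem[3])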

    Efficient Algorithms for Solving Structured Eigenvalue Problems Arising in the Description of Electronic Excitations

    Matrices arising in linear-response time-dependent density functional theory and many-body perturbation theory, in particular in the Bethe-Salpeter approach, show a 2 × 2 block structure. The motivation to devise new algorithms, instead of using general-purpose eigenvalue solvers, comes from the need to solve large problems on high-performance computers. This requires parallelizable and communication-avoiding algorithms and implementations. We point out various novel directions for diagonalizing structured matrices. These include the solution of skew-symmetric eigenvalue problems in ELPA, as well as structure-preserving spectral divide-and-conquer schemes employing generalized polar decompositions.
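    To see why structure pays off, consider the real block form H = [[A, B], [-B, -A]] with symmetric A and B, one common instance of such 2 × 2 block matrices (our illustrative assumption); its spectrum is symmetric about zero, a property structure-preserving solvers can exploit instead of calling a general nonsymmetric eigensolver.

        import numpy as np

        # Build a Bethe-Salpeter-like block matrix and check that its
        # eigenvalues pair up as (-x, +x).
        rng = np.random.default_rng(2)
        n = 4
        A = rng.standard_normal((n, n)); A = (A + A.T) / 2 + 2 * n * np.eye(n)
        B = rng.standard_normal((n, n)); B = (B + B.T) / 2

        H = np.block([[A, B], [-B, -A]])
        ev = np.sort(np.linalg.eigvals(H).real)
        print(np.allclose(ev, -ev[::-1]))   # spectrum symmetric about zero: True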