Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices, where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current successes and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
Hydrolink 2015/3. SPH (Smoothed Particle Hydrodynamics) in Hydraulics
Topic: SPH (Smoothed Particle Hydrodynamics) in Hydraulics
Efficient algebraic multigrid preconditioners on clusters of GPUs
Many scientific applications require the solution of large and sparse linear systems of equations using Krylov subspace methods; in this case, the choice of an effective preconditioner may be crucial for the convergence of the Krylov solver. Algebraic MultiGrid (AMG) methods are widely used as preconditioners, because of their optimal computational cost and their algorithmic scalability. The wide availability of GPUs, now found in many of the fastest supercomputers, poses the problem of implementing these methods efficiently on high-throughput processors. In this work we focus on the application phase of AMG preconditioners, and in particular on the choice and implementation of smoothers and coarsest-level solvers capable of exploiting the computational power of clusters of GPUs. We consider block-Jacobi smoothers using sparse approximate inverses in the solve phase associated with the local blocks. The choice of approximate inverses instead of sparse matrix factorizations is driven by the large amount of parallelism exposed by the matrix-vector product as compared to the solution of large triangular systems on GPUs. The selected smoothers and solvers are implemented within the AMG preconditioning framework provided by the MLD2P4 library, using suitable sparse matrix data structures from the PSBLAS library. Their behaviour is illustrated in terms of execution speed and scalability, on a test case concerning groundwater modelling, provided by the Jülich Supercomputing Centre within the Horizon 2020 Project EoCoE.
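The smoother choice described above trades triangular solves for matrix-vector products. A minimal NumPy sketch of one damped block-Jacobi sweep illustrates the structure; the exact block inverses, damping factor, and test matrix below are illustrative stand-ins, not the MLD2P4 implementation (which applies a *sparse approximate* inverse per block):

```python
import numpy as np

def block_jacobi_sweep(A, b, x, block_starts, omega=0.8):
    """One damped block-Jacobi sweep: x <- x + omega * M^{-1} (b - A x),
    where M^{-1} acts block-wise via a precomputed inverse of each
    diagonal block. Applying it is a matrix-vector product per block,
    which maps to GPUs far better than a triangular solve."""
    r = b - A @ x                    # one global residual (Jacobi-style)
    x_new = x.copy()
    for i in range(len(block_starts) - 1):
        s, e = block_starts[i], block_starts[i + 1]
        # MLD2P4 would use a sparse approximate inverse here; for the
        # sketch we invert the small dense block exactly.
        Minv = np.linalg.inv(A[s:e, s:e])
        x_new[s:e] += omega * (Minv @ r[s:e])
    return x_new

# Usage: a small SPD, diagonally dominant system; a few sweeps
# drive the residual down by several orders of magnitude.
rng = np.random.default_rng(0)
n = 8
B = rng.random((n, n))
A = 0.1 * (B + B.T)                  # small symmetric off-diagonal part
np.fill_diagonal(A, float(n))        # strong diagonal -> SPD, convergent
b = rng.random(n)
x = np.zeros(n)
for _ in range(20):
    x = block_jacobi_sweep(A, b, x, block_starts=[0, 4, 8])
print(np.linalg.norm(b - A @ x) < 1e-3)
```

In a real AMG application phase this sweep runs on each level's system as pre- and post-smoothing around the coarse-grid correction.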
Parallel computing 2011, ParCo 2011: book of abstracts
This book contains the abstracts of the presentations at the conference Parallel Computing 2011, 30 August - 2 September 2011, Ghent, Belgium.
High-performance simulation technologies for water-related natural hazards
PhD Thesis. Water-related natural hazards, such as flash floods, landslides and debris flows, usually happen
in chains. In order to better understand the underlying physical processes and more reliably
quantify the associated risk, it is essential to develop a physically-based multi-hazard modelling
system to simulate these hazards at a catchment scale. An effective multi-hazard modelling
system may be developed by solving a set of depth-averaged dynamic equations incorporating
adaptive basal resistance terms. High-performance computing achieved through implementation
on modern graphic processing units (GPUs) can be used to accelerate the model to support
efficient large-scale simulations. This thesis presents the key simulation technologies for developing
such a novel high-performance water-related natural hazards modelling system.
A new well-balanced smoothed particle hydrodynamic (SPH) model is first presented for
solving the shallow water equations (SWEs) in the context of flood inundation modelling. The
performance of the SPH model is compared with an alternative flood inundation model based
on a finite volume (FV) method in order to select a better numerical method for the current
study. The FV model performs favourably for practical applications and therefore is adopted
to develop the proposed multi-hazard model. In order to more accurately describe the rainfall-runoff
and overland flow process that often initiates a hazard chain, a first-order FV Godunov-type
model is developed to solve the SWEs, implemented with novel source term discretisation
schemes. The new model overcomes the limitations of the current prevailing numerical
schemes such as inaccurate calculations of bed slope or friction source terms and provides
much improved numerical accuracy, efficiency and stability for simulating overland flows and
surface flooding. To support large-scale simulation of flow-like landslides or debris flows, a
new formulation of depth-averaged governing equations is derived on the Cartesian coordinate
system. The new governing equations take into account the effects of non-hydrostatic pressure
and centrifugal force, which may become significant over terrains with steep and curved
topography. These equations are compatible with various basal resistance terms, effectively leading to a unified mathematical framework for describing different types of water-related natural
hazards including surface flooding, flow-like landslides and debris flows. The new depth-averaged
governing equations are then solved using an FV Godunov-type framework based on
the second-order accurate scheme. A flexible and GPU-based software framework is further
designed to provide much improved computational efficiency for large-scale simulations and
ease the future implementation of new functionalities. This provides an effective codebase
for the proposed multi-hazard modelling system and its potential is confirmed by successfully
applying it to simulate flow-like landslides and dam-break floods.
Funded by Newcastle University and China Scholarship Council, Henry Lester Trust and Great Britain China Education Trust.
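The first-order FV Godunov-type solution of the SWEs described in the abstract can be sketched in one dimension. The HLL flux, grid, and dam-break test below are a minimal illustration of the scheme's structure, not the thesis code (which is two-dimensional, includes the novel source term discretisation, and runs on GPUs):

```python
import numpy as np

g = 9.81  # gravitational acceleration (m/s^2)

def hll_flux(hL, qL, hR, qR):
    """HLL approximate Riemann flux for the 1D shallow water equations
    with conserved variables (h, q = h*u) over a flat bed."""
    uL = qL / hL if hL > 0 else 0.0
    uR = qR / hR if hR > 0 else 0.0
    cL, cR = np.sqrt(g * hL), np.sqrt(g * hR)
    sL = min(uL - cL, uR - cR)          # left wave speed estimate
    sR = max(uL + cL, uR + cR)          # right wave speed estimate
    FL = np.array([qL, qL * uL + 0.5 * g * hL**2])
    FR = np.array([qR, qR * uR + 0.5 * g * hR**2])
    if sL >= 0:
        return FL
    if sR <= 0:
        return FR
    UL, UR = np.array([hL, qL]), np.array([hR, qR])
    return (sR * FL - sL * FR + sL * sR * (UR - UL)) / (sR - sL)

def step(h, q, dx, dt):
    """One first-order Godunov update with reflective wall boundaries."""
    H = np.concatenate(([h[0]], h, [h[-1]]))    # ghost cells: mirror h,
    Q = np.concatenate(([-q[0]], q, [-q[-1]]))  # negate q at the walls
    F = np.array([hll_flux(H[i], Q[i], H[i + 1], Q[i + 1])
                  for i in range(len(H) - 1)])
    h_new = h - dt / dx * (F[1:, 0] - F[:-1, 0])
    q_new = q - dt / dx * (F[1:, 1] - F[:-1, 1])
    return h_new, q_new

# Usage: idealised dam break over a flat bed; the conservative update
# preserves total water volume to machine precision.
n, dx = 100, 0.1
h = np.where(np.arange(n) < n // 2, 2.0, 1.0)
q = np.zeros(n)
vol0 = h.sum() * dx
for _ in range(50):
    h, q = step(h, q, dx, dt=0.01)      # CFL number about 0.5 here
print(abs(h.sum() * dx - vol0) < 1e-8)
```

The thesis extends this pattern with well-balanced bed slope and friction source terms, second-order reconstruction, and a GPU kernel per flux/update stage.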
Development of a modular agricultural robotic sprayer
Precision Agriculture (PA) increases farm productivity, reduces pollution, and minimizes input costs. However, the wide adoption of existing PA technologies for complex field operations, such as spraying, is slow due to high acquisition costs, low adaptability, and slow operating speed. In this study, we designed, built, optimized, and tested a Modular Agrochemical Precision Sprayer (MAPS), a robotic sprayer with an intelligent machine vision system (MVS). Our work focused on identifying and spraying the targeted plants with low cost, high speed, and high accuracy in a remote, dynamic, and rugged environment. We first researched and benchmarked combinations of one-stage convolutional neural network (CNN) architectures with embedded or mobile hardware systems. Our analysis revealed that TensorRT-optimized SSD-MobilenetV1 on an NVIDIA Jetson Nano provided sufficient plant detection performance with low cost and power consumption. We also developed an algorithm to determine the maximum operating velocity of a chosen CNN and hardware configuration through modeling and simulation. Based on these results, we developed a CNN-based MVS for real-time plant detection and velocity estimation. We implemented Robot Operating System (ROS) to integrate each module for easy expansion. We also developed a robust dynamic targeting algorithm to synchronize the spray operation with the robot motion, which significantly increases productivity. The research proved to be successful. We built a MAPS with three independent vision and spray modules. In the lab test, the sprayer recognized and hit all targets with only a 2% incorrect spray rate. In the field test with an unstructured crop layout, such as a broadcast-seeded soybean field, the MAPS also successfully sprayed all targets with only a 7% incorrect spray rate.
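The dynamic targeting idea, synchronizing valve triggering with robot motion, reduces to simple timing arithmetic. A toy sketch; the function, parameter names, and numbers are invented for illustration and are not from the MAPS code:

```python
def trigger_delay(d_cam_to_nozzle, plant_offset, speed, proc_latency):
    """Seconds to wait before opening the spray valve so the nozzle
    passes over a plant detected `plant_offset` m ahead of the camera
    line, with the nozzle mounted `d_cam_to_nozzle` m behind the camera,
    the robot moving at `speed` m/s, and `proc_latency` s already spent
    in the vision pipeline. All names here are illustrative."""
    travel_time = (d_cam_to_nozzle + plant_offset) / speed
    delay = travel_time - proc_latency
    if delay < 0:
        # The detection arrived too late: the nozzle already passed the
        # plant. This bound is what caps the maximum operating velocity.
        raise ValueError("robot too fast for this pipeline latency")
    return delay

# Usage: nozzle 0.4 m behind the camera, plant seen 0.1 m ahead,
# robot at 1 m/s, 0.1 s of detection latency.
print(round(trigger_delay(0.4, 0.1, 1.0, 0.1), 3))
```

The same inequality (`travel_time >= proc_latency`) gives the maximum operating velocity for a chosen CNN and hardware configuration: `speed <= d_cam_to_nozzle / proc_latency` in the worst case of a plant detected at the camera line.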
Classification of the difficulty in accelerating problems using GPUs
Scientists continually require additional processing power, as this enables them to compute larger problem sizes, use more complex models and algorithms, and solve problems previously thought computationally impractical. General-purpose computation on graphics processing units (GPGPU) can help in this regard, as there is great potential in using graphics processors to accelerate many scientific models and algorithms. However, some problems are considerably harder to accelerate than others, and it may be challenging for those new to GPGPU to ascertain the difficulty of accelerating a particular problem or seek appropriate optimisation guidance. Through what was learned in the acceleration of a hydrological uncertainty ensemble model, large numbers of k-difference string comparisons, and a radix sort, problem attributes have been identified that can assist in the evaluation of the difficulty in accelerating a problem using GPUs. The identified attributes are inherent parallelism, branch divergence, problem size, required computational parallelism, memory access pattern regularity, data transfer overhead, and thread cooperation. Using these attributes as difficulty indicators, an initial problem difficulty classification framework has been created that aids in GPU acceleration difficulty evaluation. This framework further facilitates directed guidance on suggested optimisations and required knowledge based on problem classification, which has been demonstrated for the aforementioned accelerated problems. It is anticipated that this framework, or a derivative thereof, will prove to be a useful resource for new or novice GPGPU developers in the evaluation of potential problems for GPU acceleration.
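The seven attributes listed above lend themselves to a simple scoring scheme. A toy sketch; the rating scale, weights, and thresholds are invented for illustration and are not the framework's actual classification rules:

```python
# The seven difficulty attributes identified in the abstract, each rated
# here from 0 (favourable for GPU acceleration) to 3 (unfavourable).
ATTRIBUTES = [
    "inherent parallelism",
    "branch divergence",
    "problem size",
    "required computational parallelism",
    "memory access pattern regularity",
    "data transfer overhead",
    "thread cooperation",
]

def classify_difficulty(ratings):
    """Map per-attribute ratings to a coarse difficulty class.
    Thresholds are illustrative, not from the published framework."""
    missing = set(ATTRIBUTES) - set(ratings)
    if missing:
        raise ValueError(f"unrated attributes: {sorted(missing)}")
    score = sum(ratings[a] for a in ATTRIBUTES)   # total in 0..21
    if score <= 5:
        return "straightforward"
    if score <= 12:
        return "moderate"
    return "hard"

# Usage: a radix sort is highly parallel but needs heavy thread
# cooperation and has scattered memory accesses.
radix_sort = {a: 0 for a in ATTRIBUTES}
radix_sort["thread cooperation"] = 3
radix_sort["memory access pattern regularity"] = 2
print(classify_difficulty(radix_sort))
```

The value of such a scheme is less the final label than the per-attribute breakdown, which points a novice developer at the specific optimisation topics (e.g. shared-memory cooperation, coalesced access) worth studying first.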
Using GPU acceleration and a novel artificial neural networks approach for ultra-fast fluorescence lifetime imaging microscopy analysis
Fluorescence lifetime imaging microscopy (FLIM), which is capable of visualizing local molecular and physiological parameters in living cells, plays a significant role in the biological sciences, chemistry, and medical research. In order to unveil dynamic cellular processes, it is necessary to develop high-speed FLIM technology. Thanks to the development of highly parallel time-to-digital converter (TDC) arrays, especially when integrated with single-photon avalanche diodes (SPADs), the acquisition rate of high-resolution fluorescence lifetime imaging has been dramatically improved.
At the same time, these technological advances and advanced data acquisition systems have generated massive data volumes, which significantly increase the difficulty of FLIM analysis. Traditional FLIM systems rely on time-consuming iterative algorithms to retrieve the FLIM parameters. Therefore, lifetime analysis has become a bottleneck for high-speed FLIM applications, let alone real-time or video-rate FLIM systems. Although some simple algorithms have been proposed, most of them are only able to resolve a simple FLIM decay model. Moreover, existing FLIM systems based on CPU processing do not make use of available parallel acceleration.
In order to tackle these problems, my study focused on introducing state-of-the-art general-purpose graphics processing units (GPUs) to FLIM analysis, and on building a data processing system based on both CPU and GPUs. With a large number of parallel cores, GPUs are able to significantly speed up lifetime analysis compared to CPU-only processing. In addition to porting the existing algorithms to GPU computing, I have developed a new high-speed and GPU-friendly algorithm based on an artificial neural network (ANN). The proposed GPU-ANN-FLIM method has dramatically improved the efficiency of FLIM analysis: it is at least 1000-fold faster than some traditional algorithms, meaning that it has great potential to fuel current revolutions in high-speed, high-resolution FLIM applications.
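The bottleneck described above comes from per-pixel iterative curve fitting. As a point of contrast, noniterative estimators reduce lifetime recovery to batched array arithmetic that maps directly onto a GPU. The sketch below uses the classical centre-of-mass estimator on noise-free mono-exponential decays; it is an illustration of this noniterative, batch-parallel style, not the ANN method of the thesis, and all numbers are invented:

```python
import numpy as np

# Noise-free mono-exponential decays I(t) = exp(-t / tau) sampled on a
# time grid much longer than tau, so the centre-of-mass estimator
# tau_hat = sum(t * I) / sum(I) is nearly exact.
dt, n_bins = 0.05, 2000                 # ns per bin; 100 ns window
t = (np.arange(n_bins) + 0.5) * dt      # bin-centre times
taus = np.array([0.5, 1.0, 2.0, 3.5])   # ground-truth lifetimes (ns)
decays = np.exp(-t[None, :] / taus[:, None])   # shape (n_decays, n_bins)

# One vectorised pass over the whole batch: no per-pixel loop, so the
# same expression runs unchanged on a GPU array library.
tau_hat = (decays @ t) / decays.sum(axis=1)
print(np.max(np.abs(tau_hat - taus)) < 0.05)
```

An ANN-based estimator generalises this idea: the mapping from histogram to lifetime parameters is learned rather than closed-form, but inference is still a few batched matrix products, which is why it is GPU-friendly where iterative fitting is not.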
Computational methods and software for the design of inertial microfluidic flow sculpting devices
The ability to sculpt inertially flowing fluid via bluff body obstacles has enormous promise for applications in bioengineering, chemistry, and manufacturing within microfluidic devices. However, the computational difficulty inherent to full-scale 3-dimensional fluid flow simulations makes designing and optimizing such systems tedious, costly, and generally tasked to computational experts with access to high-performance resources. The goal of this work is to construct efficient models for the design of inertial microfluidic flow sculpting devices, and implement these models in freely available, user-friendly software for the broader microfluidics community. Two software packages were developed to accomplish this: uFlow and FlowSculpt. uFlow solves the forward problem in flow sculpting, that of predicting the net deformation from an arbitrary sequence of obstacles (pillars), and includes estimations of transverse mass diffusion and of particles formed by optical lithography. FlowSculpt solves the more difficult inverse problem in flow sculpting, which is to design a flow sculpting device that produces a target flow shape. Each piece of software uses efficient, experimentally validated forward models developed within this work, which are also applied within deep learning techniques to explore other routes to solving the inverse problem. The models are highly modular, capable of incorporating new microfluidic components and flow physics into the design process. It is anticipated that the microfluidics community will integrate the tools developed here into their own research, and bring new designs, components, and applications to the inertial flow sculpting platform.
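The efficiency of the forward model comes from treating each pillar's effect as a precomputed deformation map on the channel cross-section, so a pillar sequence reduces to composing maps instead of re-running a 3D flow simulation. A 1D toy sketch of that composition idea; the displacement map itself is invented for illustration and is not uFlow's precomputed advection data:

```python
import numpy as np

# Toy forward model: each pillar induces a lateral displacement map on
# cross-sectional positions y in [0, 1]; the net deformation of a pillar
# sequence is the composition of the per-pillar maps.
y = np.linspace(0.0, 1.0, 201)

def pillar_map(offset):
    """Hypothetical displacement map for a pillar at lateral `offset`:
    fluid is pushed away from the pillar, decaying with distance.
    Purely illustrative; real maps come from precomputed flow solves."""
    push = 0.08 * np.sign(y - offset) * np.exp(-((y - offset) / 0.15) ** 2)
    return np.clip(y + push, 0.0, 1.0)

def net_deformation(offsets):
    """Compose the per-pillar maps by interpolating each one at the
    positions produced by the previous pillars."""
    pos = y.copy()
    for off in offsets:
        pos = np.interp(pos, y, pillar_map(off))
    return pos

# Usage: track where every starting position ends up after three pillars.
final = net_deformation([0.3, 0.6, 0.4])
print(final.shape == y.shape and 0.0 <= final.min() <= final.max() <= 1.0)
```

Because each composed map is cheap to evaluate, an inverse-design search (as in FlowSculpt) can score thousands of candidate pillar sequences per second against a target shape.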