MeshfreeFlowNet: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework

Author: Anandkumar Anima
Azizzadenesheli Kamyar
Esmaeilzadeh Soheil
Jiang Chiyu Max
Kashinath Karthik
Marcus Philip
Mustafa Mustafa
Prabhat
Tchelepi Hamdi A.
Publication date: 01/05/2020

We propose MeshfreeFlowNet, a novel deep learning-based super-resolution framework to generate continuous (grid-free) spatio-temporal solutions from low-resolution inputs. While being computationally efficient, MeshfreeFlowNet accurately recovers the fine-scale quantities of interest. MeshfreeFlowNet allows for: (i) the output to be sampled at all spatio-temporal resolutions, (ii) a set of Partial Differential Equation (PDE) constraints to be imposed, and (iii) training on fixed-size inputs on arbitrarily sized spatio-temporal domains owing to its fully convolutional encoder. We empirically study the performance of MeshfreeFlowNet on the task of super-resolution of turbulent flows in the Rayleigh-Benard convection problem. Across a diverse set of evaluation metrics, we show that MeshfreeFlowNet significantly outperforms existing baselines. Furthermore, we provide a large-scale implementation of MeshfreeFlowNet and show that it efficiently scales across large clusters, achieving 96.80% scaling efficiency on up to 128 GPUs and a training time of less than 4 minutes.
Comment: Supplementary Video: https://youtu.be/mjqwPch9gDo. Accepted to SC2
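The abstract quotes a scaling efficiency but does not say how it is computed. One common definition, assumed here and not confirmed by the source, is measured throughput on N devices divided by N times the single-device throughput. The sketch below illustrates that definition only; the function name and the throughput numbers are hypothetical and are not measurements from the paper.

```python
def scaling_efficiency(throughput_1: float, throughput_n: float, n: int) -> float:
    """Return throughput-based scaling efficiency on n devices.

    Assumed definition (not taken from the paper):
        efficiency = throughput_n / (n * throughput_1)
    A value of 1.0 means perfectly linear scaling.
    """
    if n < 1 or throughput_1 <= 0:
        raise ValueError("n must be >= 1 and throughput_1 must be positive")
    return throughput_n / (n * throughput_1)


# Hypothetical numbers for illustration: 100 samples/s on 1 device,
# 720 samples/s on 8 devices.
eff = scaling_efficiency(100.0, 720.0, 8)
print(f"{eff:.2%}")  # prints 90.00%
```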

arXiv.org e-Print Archive

Caltech Authors

Using Graph Properties to Speed-up GPU-based Graph Traversal: A Model-driven Approach

Author: de Laat Cees
Varbanescu Ana Lucia
Verstraaten Merijn
Publication date: 03/08/2017

While it is well known and acknowledged that the performance of graph algorithms is heavily dependent on the input data, there has been surprisingly little research to quantify and predict the impact the graph structure has on performance. Parallel graph algorithms, running on many-core systems such as GPUs, are no exception: most research has focused on how to efficiently implement and tune different graph operations on a specific GPU. However, the performance impact of the input graph has only been taken into account indirectly as a result of the graphs used to benchmark the system. In this work, we present a case study investigating how to use the properties of the input graph to improve the performance of the breadth-first search (BFS) graph traversal. To do so, we first study the performance variation of 15 different BFS implementations across 248 graphs. Using this performance data, we show that significant speed-up can be achieved by combining the best implementation for each level of the traversal. To make use of this data-dependent optimization, we must correctly predict the relative performance of algorithms per graph level, and enable dynamic switching to the optimal algorithm for each level at runtime. We use the collected performance data to train a binary decision tree, to enable high-accuracy predictions and fast switching. We demonstrate empirically that our decision tree is both fast enough to allow dynamic switching between implementations, without noticeable overhead, and accurate enough in its prediction to enable significant BFS speed-up. We conclude that our model-driven approach (1) enables BFS to outperform state-of-the-art GPU algorithms, and (2) can be adapted for other BFS variants, other algorithms, or more specific datasets.
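The idea of choosing an implementation per traversal level can be sketched on the CPU as follows. This is a minimal illustration, not the authors' GPU code: it uses only two textbook step functions (top-down and bottom-up) instead of 15 implementations, and the `choose_step` rule with its 0.25 threshold is a hand-written, hypothetical stand-in for the trained decision tree. The graph is assumed to be undirected and stored as an adjacency list.

```python
from typing import Callable, Dict, List, Set

Graph = Dict[int, List[int]]  # undirected adjacency list (assumption)
Step = Callable[[Graph, Set[int], Set[int]], Set[int]]


def top_down_step(graph: Graph, frontier: Set[int], visited: Set[int]) -> Set[int]:
    """Scan the edges of every frontier vertex and collect unvisited neighbours."""
    next_frontier: Set[int] = set()
    for u in frontier:
        for v in graph[u]:
            if v not in visited:
                next_frontier.add(v)
    return next_frontier


def bottom_up_step(graph: Graph, frontier: Set[int], visited: Set[int]) -> Set[int]:
    """Let every unvisited vertex check whether any neighbour is in the frontier."""
    next_frontier: Set[int] = set()
    for v in graph:
        if v not in visited and any(u in frontier for u in graph[v]):
            next_frontier.add(v)
    return next_frontier


def choose_step(frontier_size: int, num_vertices: int) -> Step:
    """Hypothetical predictor: a fixed threshold standing in for a decision tree."""
    if frontier_size > 0.25 * num_vertices:
        return bottom_up_step
    return top_down_step


def bfs_depths(graph: Graph, source: int,
               chooser: Callable[[int, int], Step] = choose_step) -> Dict[int, int]:
    """Level-synchronous BFS that picks a step implementation at every level."""
    depth = {source: 0}
    visited = {source}
    frontier = {source}
    level = 0
    while frontier:
        step = chooser(len(frontier), len(graph))  # switch per level
        frontier = step(graph, frontier, visited)
        level += 1
        for v in frontier:
            depth[v] = level
        visited |= frontier
    return depth
```

Both step functions return the same next frontier for an undirected graph, so the choice affects only the cost of a level and never the resulting depths. That property is what makes per-level switching safe.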

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

The ESCAPE project : Energy-efficient Scalable Algorithms for Weather Prediction at Exascale

Author: Baldauf Michael
Bauer Peter
Berg Per
Bosak Bartosz
Bénard Pierre
Błażewicz Marek
Ciesielski Sebastian
Ciznicki Milosz
Clement Valentin
Colavolpe Charles
Deconinck Willem
Degrauwe Daan
Diamantakis Michail
Douriez Louis
Fuhrer Oliver
Gillard Mike
Glinton Michael
Gray Alan
Guibert David
Hamrud Mats
Kulczewski Michał
Kurowski Krzysztof
Kühnlein Christian
Lange Michael
Lock Sarah-Jane
Lysaght Michael
Macfaden Alexander J
Marguinaud Philippe
Mazauric Cyril
McKinstry Alastair
Mengaldo Gianmarco
Messmer Peter
Mozdzynski George
Müller Andreas
New Nick
Nielsen Kristian P
O'Brien Enda
Osuna Carlos
Piotrowski Zbigniew P
Piątek Wojciech
Poulsen Jacob W
Procyk Marcin
Raffin Erwan
Robinson Oisín
Saarinen Sami
Sass Bent H
Shukla Parijat
Smet Geert
Smolarkiewicz Piotr K
Spychala Pawel
Szmelter Joanna
Termonia Piet
Thiemert Daniel
Van Bever Joris
Vigouroux Xavier
Voitus Fabrice
Wedi Nils
Wyszogrodzki Andrzej
Zheng Yongjun
Publication venue: 'Copernicus GmbH'
Publication date: 01/01/2019

In the simulation of complex multi-scale flows arising in weather and climate modelling, one of the biggest challenges is to satisfy strict service requirements in terms of time to solution and to satisfy budgetary constraints in terms of energy to solution, without compromising the accuracy and stability of the application. These simulations require algorithms that minimise the energy footprint along with the time required to produce a solution, maintain the physically required level of accuracy, are numerically stable, and are resilient in case of hardware failure. The European Centre for Medium-Range Weather Forecasts (ECMWF) led the ESCAPE (Energy-efficient Scalable Algorithms for Weather Prediction at Exascale) project, funded by Horizon 2020 (H2020) under the FET-HPC (Future and Emerging Technologies in High Performance Computing) initiative. The goal of ESCAPE was to develop a sustainable strategy to evolve weather and climate prediction models to next-generation computing technologies. The project partners incorporate the expertise of leading European regional forecasting consortia, university research, experienced high-performance computing centres, and hardware vendors. This paper presents an overview of the ESCAPE strategy: (i) identify domain-specific key algorithmic motifs in weather prediction and climate models (which we term Weather & Climate Dwarfs), (ii) categorise them in terms of computational and communication patterns while (iii) adapting them to different hardware architectures with alternative programming models, (iv) analyse the challenges in optimising, and (v) find alternative algorithms for the same scheme. 
The participating weather prediction models are the following: IFS (Integrated Forecasting System); ALARO, a combination of AROME (Application de la Recherche a l'Operationnel a Meso-Echelle) and ALADIN (Aire Limitee Adaptation Dynamique Developpement International); and COSMO-EULAG, a combination of COSMO (Consortium for Small-scale Modeling) and EULAG (Eulerian and semi-Lagrangian fluid solver). For many of the weather and climate dwarfs, ESCAPE provides prototype implementations on different hardware architectures (mainly Intel Skylake CPUs, NVIDIA GPUs, Intel Xeon Phi, Optalysys optical processor) with different programming models. The spectral transform dwarf represents a detailed example of the co-design cycle of an ESCAPE dwarf. The dwarf concept has proven to be extremely useful for the rapid prototyping of alternative algorithms and their interaction with hardware, e.g. the use of a domain-specific language (DSL). Manual adaptations have led to substantial accelerations of key algorithms in numerical weather prediction (NWP) but are not a general recipe for the performance portability of complex NWP models. Existing DSLs are found to require further evolution but are promising tools for achieving the latter. Measurements of energy and time to solution suggest that a future focus needs to be on exploiting the simultaneous use of all available resources in hybrid CPU-GPU arrangements.

Loughborough University Institutional Repository

Ghent University Academic Bibliography