15 research outputs found
PyCOOL - a Cosmological Object-Oriented Lattice code written in Python
There are a number of different phenomena in the early universe that have to
be studied numerically with lattice simulations. This paper presents a graphics
processing unit (GPU) accelerated Python program called PyCOOL that solves the
evolution of scalar fields in a lattice with very precise symplectic
integrators. The program has been written with the intention to hit a sweet
spot of speed, accuracy and user friendliness. This has been achieved by using
the Python language with the PyCUDA interface to make a program that is easy to
adapt to different scalar field models. In this paper we derive the symplectic
dynamics that govern the evolution of the system and then present the
implementation of the program in Python and PyCUDA. The functionality of the
program is tested in a chaotic inflation preheating model, a single field
oscillon case and in a supersymmetric curvaton model which leads to Q-ball
production. We have also compared the performance of a consumer graphics card
to a professional Tesla compute card in these simulations. We find that the
program is not only accurate but also very fast. To further increase the
usefulness of the program we have equipped it with numerous post-processing
functions that provide useful information about the cosmological model. These
include various spectra and statistics of the fields. The program can be
additionally used to calculate the generated curvature perturbation. The
program is publicly available under GNU General Public License at
https://github.com/jtksai/PyCOOL. Some additional information can be found at http://www.physics.utu.fi/tiedostot/theory/particlecosmology/pycool/.
Comment: 23 pages, 12 figures; some typos corrected
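The core numerical idea — a symplectic integrator for lattice scalar fields — can be illustrated with a minimal kick-drift-kick (leapfrog) step. This is a toy 1D flat-space sketch for intuition only, not PyCOOL's actual GPU implementation (PyCOOL evolves 3D fields in an expanding background via PyCUDA); all names here are illustrative.

```python
import numpy as np

def leapfrog_step(phi, pi, dt, dx, m2):
    """One kick-drift-kick (second-order symplectic) step for a real scalar
    field on a periodic 1D lattice with potential V = m2 * phi**2 / 2.
    Toy flat-space model for illustration."""
    def laplacian(f):
        return (np.roll(f, 1) - 2.0 * f + np.roll(f, -1)) / dx**2
    pi = pi + 0.5 * dt * (laplacian(phi) - m2 * phi)   # half kick
    phi = phi + dt * pi                                # full drift
    pi = pi + 0.5 * dt * (laplacian(phi) - m2 * phi)   # half kick
    return phi, pi

def total_energy(phi, pi, dx, m2):
    """Lattice energy; a symplectic scheme keeps its drift bounded."""
    grad = (np.roll(phi, -1) - phi) / dx
    return dx * np.sum(0.5 * pi**2 + 0.5 * grad**2 + 0.5 * m2 * phi**2)
```

The appeal of symplectic schemes for long cosmological runs is exactly this bounded energy drift: the error oscillates instead of accumulating secularly.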
A Full-Depth Amalgamated Parallel 3D Geometric Multigrid Solver for GPU Clusters
Numerical computations of incompressible flow equations with pressure-based algorithms necessitate the solution of an elliptic Poisson equation, for which multigrid methods are known to be very efficient. In our previous work we presented a dual-level (MPI-CUDA) parallel implementation of the Navier-Stokes equations to simulate buoyancy-driven incompressible fluid flows on GPU clusters with simple iterative methods while focusing on the scalability of the overall solver. In the present study we describe the implementation and performance of a multigrid method to solve the pressure Poisson equation within our MPI-CUDA parallel incompressible flow solver. Various design decisions and algorithmic choices for multigrid methods are explored in light of NVIDIA's recent Fermi architecture. We discuss how unique aspects of an MPI-CUDA implementation for GPU clusters are related to the software choices made to implement the multigrid method. We propose a new coarse grid solution method of embedded multigrid with amalgamation and show that the parallel implementation retains the numerical efficiency of the multigrid method. Performance measurements on the NCSA Lincoln and TACC Longhorn clusters are presented for up to 64 GPUs.
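The multigrid idea the paper builds on can be sketched with a minimal serial V-cycle for a 1D Poisson problem. This is a didactic sketch under simplifying assumptions (1D, serial, Dirichlet boundaries, weighted-Jacobi smoothing), not the paper's 3D MPI-CUDA solver with amalgamated coarse grids; all function names are illustrative.

```python
import numpy as np

def smooth(u, f, h, iters=3):
    # Weighted-Jacobi smoother (omega = 2/3) for -u'' = f;
    # Dirichlet boundary values stay fixed in u[0] and u[-1].
    for _ in range(iters):
        u[1:-1] = (1/3) * u[1:-1] + (2/3) * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1])
    return u

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def v_cycle(u, f, h):
    n = len(u) - 1                   # number of intervals; assumed a power of two
    if n == 2:                       # coarsest grid: solve the single unknown exactly
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    u = smooth(u, f, h)              # pre-smoothing damps high-frequency error
    r = residual(u, f, h)
    rc = np.zeros(n // 2 + 1)        # full-weighting restriction to the coarse grid
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    ec = v_cycle(np.zeros_like(rc), rc, 2 * h)
    e = np.zeros_like(u)             # linear-interpolation prolongation
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return smooth(u + e, f, h)       # apply coarse correction, then post-smooth
```

The parallel questions the paper addresses start where this sketch ends: once the coarse grid holds fewer points than there are GPUs, the work must be amalgamated onto fewer devices, which motivates their embedded-multigrid coarse solve.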
Interactive Forest Walk-through
Interactive rendering of a forest containing a large number of unique trees and other vegetation is a challenging and important problem in computer graphics and visual simulation. While methods for rendering near photo-realistic vegetation scenes have been described in the literature, they require tens of minutes or even hours of computation. In order to support interactive forest walk-throughs, we propose a hierarchical method for computing levels of detail for trees, as well as a framework for traversing scenes of arbitrary size. The proposed framework selects levels of detail based on a combination of visibility and projected size metrics, rather than projected size alone. Dynamic scene modification is possible since visibility is determined at run-time and requires no preprocessing step.
Keywords: L-systems, level-of-detail, occlusion, walk-through
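The selection criterion — projected size modulated by visibility — can be sketched as a small scoring function. This is a hypothetical illustration of the general idea, not the paper's algorithm; the threshold values and all names are invented for the example.

```python
def select_lod(distance, diameter, fov_scale, occlusion,
               thresholds=(0.05, 0.015, 0.004)):
    """Pick a level of detail (0 = most detailed) for one tree.

    projected  : approximate screen-space size of the tree
    occlusion  : run-time visibility estimate in [0, 1]
                 (1 = fully visible, 0 = fully occluded)
    thresholds : illustrative cutoffs on the combined metric
    """
    projected = (diameter / distance) * fov_scale
    metric = projected * occlusion        # visibility AND size, not size alone
    for lod, t in enumerate(thresholds):
        if metric >= t:
            return lod
    return len(thresholds)                # coarsest representation / impostor
```

Because `occlusion` is measured at run-time, a heavily occluded tree drops to a coarse level even when it is close, which is the key difference from distance-only schemes.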
A survey of convolutional neural networks on edge with reconfigurable computing
The convolutional neural network (CNN) is one of the most used deep learning models for image detection and classification, due to its high accuracy when compared to other machine learning algorithms. CNNs achieve better results at the cost of higher computing and memory requirements. Inference of convolutional neural networks is therefore usually done in centralized high-performance platforms. However, many applications based on CNNs are migrating to edge devices near the source of data, due to the unreliability of a transmission channel in exchanging data with a central server, channel latency that many applications cannot tolerate, security and data privacy, etc. While advantageous, deep learning on edge is quite challenging because edge devices are usually limited in terms of performance, cost, and energy. Reconfigurable computing is being considered for inference on edge due to its high performance and energy efficiency while keeping a high hardware flexibility that allows for the easy adaptation of the target computing platform to the CNN model. In this paper, we describe the features of the most common CNNs, the capabilities of reconfigurable computing for running CNNs, the state-of-the-art of reconfigurable computing implementations proposed to run CNN models, as well as the trends and challenges for future edge reconfigurable platforms.
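The computational core that edge accelerators map to hardware is the multiply-accumulate loop of the convolution, usually in reduced precision. The sketch below shows a minimal quantized 2D convolution with int8 operands, int32 accumulation, and a per-tensor rescale — a common edge-inference scheme, shown here as a generic illustration rather than any specific surveyed implementation.

```python
import numpy as np

def conv2d_int8(x, w, x_scale, w_scale):
    """Minimal quantized 2D convolution, valid padding, single channel.

    x, w            : int8 activation and weight arrays
    x_scale, w_scale: per-tensor dequantization scales (real = int * scale)
    Accumulates in int32 (avoiding overflow of int8 products), then
    rescales to floating point, mirroring common fixed-point pipelines.
    """
    h, wd = x.shape
    kh, kw = w.shape
    out = np.zeros((h - kh + 1, wd - kw + 1), dtype=np.int32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(
                x[i:i + kh, j:j + kw].astype(np.int32) * w.astype(np.int32)
            )
    return out.astype(np.float32) * (x_scale * w_scale)
```

On reconfigurable fabric, the two inner loops unroll into a spatial array of MAC units, and the narrow int8 datapath is what delivers the performance and energy advantage over general-purpose platforms.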