Search CORE

67,491 research outputs found

Teaching Parallel Programming Using Java

Author: Akhtar Aleem
Carpenter Bryan
Javed Ansar
Shafi Aamir
Publication venue
Publication date: 01/01/2014
Field of study

This paper presents an overview of the "Applied Parallel Computing" course taught to final year Software Engineering undergraduate students in Spring 2014 at NUST, Pakistan. The main objective of the course was to introduce practical parallel programming tools and techniques for shared and distributed memory concurrent systems. A unique aspect of the course was that Java was used as the principle programming language. The course was divided into three sections. The first section covered parallel programming techniques for shared memory systems that include multicore and Symmetric Multi-Processor (SMP) systems. In this section, Java threads was taught as a viable programming API for such systems. The second section was dedicated to parallel programming tools meant for distributed memory systems including clusters and network of computers. We used MPJ Express-a Java MPI library-for conducting programming assignments and lab work for this section. The third and the final section covered advanced topics including the MapReduce programming model using Hadoop and the General Purpose Computing on Graphics Processing Units (GPGPU).Comment: 8 Pages, 6 figures, MPJ Express, MPI Java, Teaching Parallel Programmin

arXiv.org e-Print Archive

Crossref

Portsmouth University Research Portal (Pure)

Grid enabling legacy applications for scalability – Experiences of a production application on the UK NGS

Author: Fowler R
Pakhira A
Perring T
Sastry L
Publication venue
Publication date: 01/01/2005
Field of study

ePubs: the open archive for STFC research publications

A Block Minorization--Maximization Algorithm for Heteroscedastic Regression

Author: Lloyd-Jones Luke R.
McLachlan Geoffrey J.
Nguyen Hien D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/05/2016
Field of study

The computation of the maximum likelihood (ML) estimator for heteroscedastic regression models is considered. The traditional Newton algorithms for the problem require matrix multiplications and inversions, which are bottlenecks in modern Big Data contexts. A new Big Data-appropriate minorization--maximization (MM) algorithm is considered for the computation of the ML estimator. The MM algorithm is proved to generate monotonically increasing sequences of likelihood values and to be convergent to a stationary point of the log-likelihood function. A distributed and parallel implementation of the MM algorithm is presented and the MM algorithm is shown to have differing time complexity to the Newton algorithm. Simulation studies demonstrate that the MM algorithm improves upon the computation time of the Newton algorithm in some practical scenarios where the number of observations is large

arXiv.org e-Print Archive

University of Queensland eSpace

Computing Room Acoustics Using 3D FDTD: A Cuda Approach.

Author: Bilbao Stefan
Webb C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2011
Field of study

Crossref

Edinburgh Research Explorer

A Parallel Monte Carlo Code for Simulating Collisional N-body Systems

Author: Alok Choudhary
Bharath Pattabiraman
Binney
Böker
Chatterjee
Collins
Frederic A. Rasio
Fregeau
Fregeau
Giersz
Gokhan Memik
Goswami
Gropp
Heggie
Heggie
Heggie
Joshi
Joshi
L'Ecuyer
Li
Lightman
Lusk
McLaughlin
Merritt
Miller
Nvidia.
Spitzer
Stefan Umbreit
Stodolkiewicz
Trenti
Umbreit
Vassiliki Kalogera
Wei-keng Liao
Publication venue: 'IOP Publishing'
Publication date: 15/11/2012
Field of study

We present a new parallel code for computing the dynamical evolution of collisional N-body systems with up to N~10^7 particles. Our code is based on the the Henon Monte Carlo method for solving the Fokker-Planck equation, and makes assumptions of spherical symmetry and dynamical equilibrium. The principal algorithmic developments involve optimizing data structures, and the introduction of a parallel random number generation scheme, as well as a parallel sorting algorithm, required to find nearest neighbors for interactions and to compute the gravitational potential. The new algorithms we introduce along with our choice of decomposition scheme minimize communication costs and ensure optimal distribution of data and workload among the processing units. The implementation uses the Message Passing Interface (MPI) library for communication, which makes it portable to many different supercomputing architectures. We validate the code by calculating the evolution of clusters with initial Plummer distribution functions up to core collapse with the number of stars, N, spanning three orders of magnitude, from 10^5 to 10^7. We find that our results are in good agreement with self-similar core-collapse solutions, and the core collapse times generally agree with expectations from the literature. Also, we observe good total energy conservation, within less than 0.04% throughout all simulations. We analyze the performance of the code, and demonstrate near-linear scaling of the runtime with the number of processors up to 64 processors for N=10^5, 128 for N=10^6 and 256 for N=10^7. The runtime reaches a saturation with the addition of more processors beyond these limits which is a characteristic of the parallel sorting algorithm. The resulting maximum speedups we achieve are approximately 60x, 100x, and 220x, respectively.Comment: 53 pages, 13 figures, accepted for publication in ApJ Supplement

arXiv.org e-Print Archive

Crossref

GADGET: A code for collisionless and gasdynamical cosmological simulations

Author: Aarseth
Aarseth
Appel
Athanassoula
Athanassoula
Balsara
Barnes
Barnes
Barnes
Barnes
Bertschinger
Bode
Brieu
Carraro
Couchman
Couchman
Davé
Dubinski
Dubinski
Eastwood
Ebisuzaki
Efstathiou
Evrard
Frenk
Fukushige
Fukushige
Gingold
Greengard
Hernquist
Hernquist
Hiotelis
Hockney
Hohl
Holmberg
Hultman
Hut
Hut
Ito
Jernigan
Kang
Katz
Katz
Kawai
Klein
Kravtsov
Lia
Lombardi
Lucy
MacFarland
Makino
Makino
Makino
Makino
Makino
Makino
Makino
McMillan
Monaghan
Monaghan
Monaghan
Naoki Yoshida
Navarro
Navarro
Navarro
Nelson
Norman
Okumura
Pacheco
Pearce
Peebles
Press
Press
Press
Romeo
Salmon
Simon D.M. White
Snir
Splinter
Springel
Springel
Steinmetz
Steinmetz
Thacker
Theuns
Viturro
Volker Springel
Warren
White
White
Wirth
Xu
Yahagi
Yoshida
Yoshida
Publication venue: 'Elsevier BV'
Publication date: 01/01/2001
Field of study

We describe the newly written code GADGET which is suitable both for cosmological simulations of structure formation and for the simulation of interacting galaxies. GADGET evolves self-gravitating collisionless fluids with the traditional N-body approach, and a collisional gas by smoothed particle hydrodynamics. Along with the serial version of the code, we discuss a parallel version that has been designed to run on massively parallel supercomputers with distributed memory. While both versions use a tree algorithm to compute gravitational forces, the serial version of GADGET can optionally employ the special-purpose hardware GRAPE instead of the tree. Periodic boundary conditions are supported by means of an Ewald summation technique. The code uses individual and adaptive timesteps for all particles, and it combines this with a scheme for dynamic tree updates. Due to its Lagrangian nature, GADGET thus allows a very large dynamic range to be bridged, both in space and time. So far, GADGET has been successfully used to run simulations with up to 7.5e7 particles, including cosmological studies of large-scale structure formation, high-resolution simulations of the formation of clusters of galaxies, as well as workstation-sized problems of interacting galaxies. In this study, we detail the numerical algorithms employed, and show various tests of the code. We publically release both the serial and the massively parallel version of the code.Comment: 32 pages, 14 figures, replaced to match published version in New Astronomy. For download of the code, see http://www.mpa-garching.mpg.de/gadget (new version 1.1 available

arXiv.org e-Print Archive

CiteSeerX

Crossref