10,247 research outputs found

A Parallel Monte Carlo Code for Simulating Collisional N-body Systems

Author: Alok Choudhary
Bharath Pattabiraman
Binney
Böker
Chatterjee
Collins
Frederic A. Rasio
Fregeau
Giersz
Gokhan Memik
Goswami
Gropp
Heggie
Joshi
L'Ecuyer
Li
Lightman
Lusk
McLaughlin
Merritt
Miller
Nvidia.
Spitzer
Stefan Umbreit
Stodolkiewicz
Trenti
Umbreit
Vassiliki Kalogera
Wei-keng Liao
Publication venue: 'IOP Publishing'
Publication date: 15/11/2012
We present a new parallel code for computing the dynamical evolution of collisional N-body systems with up to N~10^7 particles. Our code is based on the Hénon Monte Carlo method for solving the Fokker-Planck equation, and assumes spherical symmetry and dynamical equilibrium. The principal algorithmic developments are optimized data structures, a parallel random number generation scheme, and a parallel sorting algorithm, which is required to find nearest neighbors for interactions and to compute the gravitational potential. The new algorithms, together with our choice of decomposition scheme, minimize communication costs and ensure optimal distribution of data and workload among the processing units. The implementation uses the Message Passing Interface (MPI) library for communication, which makes it portable to many different supercomputing architectures. We validate the code by calculating the evolution of clusters with initial Plummer distribution functions up to core collapse, with the number of stars, N, spanning three orders of magnitude, from 10^5 to 10^7. We find that our results are in good agreement with self-similar core-collapse solutions, and the core-collapse times generally agree with expectations from the literature. We also observe good total energy conservation, to better than 0.04% throughout all simulations. We analyze the performance of the code and demonstrate near-linear scaling of the runtime with the number of processors up to 64 processors for N=10^5, 128 for N=10^6, and 256 for N=10^7. Beyond these limits the runtime saturates as more processors are added, which is a characteristic of the parallel sorting algorithm. The resulting maximum speedups are approximately 60x, 100x, and 220x, respectively.
Comment: 53 pages, 13 figures, accepted for publication in ApJ Supplement
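As a quick reading aid for the scaling figures quoted in this abstract, the sketch below converts each approximate maximum speedup into a parallel efficiency (speedup divided by processor count). The function name `parallel_efficiency` and the `runs` table are illustrative, not taken from the paper; the numbers are the rounded values from the abstract.

```python
# Parallel efficiency = speedup / number of processors.
# Each tuple is (N, processors, approximate maximum speedup) as quoted in the abstract.
runs = [
    (10**5, 64, 60.0),
    (10**6, 128, 100.0),
    (10**7, 256, 220.0),
]


def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / processors


efficiencies = {n: parallel_efficiency(s, p) for n, p, s in runs}

for n, eff in efficiencies.items():
    print(f"N = {n:.0e}: efficiency ~ {eff:.2f}")
```

Because the quoted speedups are approximate, the resulting efficiencies should be read as rough indications only.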

arXiv.org e-Print Archive

Crossref

Reducing phase error in long numerical binary black hole evolutions with sixth order finite differencing

Author: Bruegmann Bernd
Gonzalez Jose A.
Hannam Mark
Husa Sascha
Sperhake Ulrich
Publication venue: 'IOP Publishing'
Publication date: 05/06/2007
We describe a modification of a fourth-order accurate "moving puncture" evolution code in which the fourth-order accurate spatial differencing operators in the bulk of the grid are replaced by a specific choice of sixth-order accurate stencils, yielding significant improvements in accuracy. We illustrate the performance of the modified algorithm with an equal-mass simulation covering nine orbits.
Comment: 13 pages, 6 figures
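To illustrate what moving from fourth-order to sixth-order differencing buys, here is a minimal sketch using the standard centered stencils for a first derivative. These textbook stencils are an assumption for illustration; the abstract does not say which specific sixth-order stencils the authors chose, and the function names are hypothetical.

```python
import math


def d1_fourth_order(f, x, h):
    """Standard centered fourth-order stencil for f'(x); truncation error O(h^4)."""
    return (f(x - 2 * h) - 8 * f(x - h) + 8 * f(x + h) - f(x + 2 * h)) / (12 * h)


def d1_sixth_order(f, x, h):
    """Standard centered sixth-order stencil for f'(x); truncation error O(h^6)."""
    return (
        -f(x - 3 * h) + 9 * f(x - 2 * h) - 45 * f(x - h)
        + 45 * f(x + h) - 9 * f(x + 2 * h) + f(x + 3 * h)
    ) / (60 * h)


# Compare both stencils on a function with a known derivative: d/dx sin(x) = cos(x).
x0, h = 1.0, 0.1
exact = math.cos(x0)
err4 = abs(d1_fourth_order(math.sin, x0, h) - exact)
err6 = abs(d1_sixth_order(math.sin, x0, h) - exact)
print(f"4th-order error: {err4:.2e}, 6th-order error: {err6:.2e}")
```

At the same grid spacing the sixth-order stencil uses two more points but gives a much smaller truncation error for smooth functions, which is the trade-off the abstract exploits.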

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

Author: Blazewicz Marek
Brandt Steven R.
Ciznicki Milosz
Hinder Ian
Kierzynka Michal
Koppelman David M.
Löffler Frank
Schnetter Erik
Tao Jian
Publication venue: 'IOS Press'
Publication date: 01/01/2013
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the efficient use of large-scale CPU/GPU systems for complex applications without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Louisiana State University

MPG.PuRe

Searching for periodic sources with LIGO. II: Hierarchical searches

Author: A. P. Cowley
B. J. Owen
B. Owen
D. V. Gal’tsov
E. W. Gottlieb
J. L. Friedman
K. C. B. New
L. Bildsten
L. Lindblom
M. van der Klis
M. Zimmermann
N. Andersson
P. Jaranowski
P. R. Brady
Patrick R. Brady
R. B. Tully
R. Talbot
R. V. Wagoner
S. Bonazzola
S. Chandrasekhar
S. J. Curran
S. L. Shapiro
S. R. Kulkarni
S. van den Bergh
T. M. Niebauer
Teviet Creighton
Publication venue: 'American Physical Society (APS)'
Publication date: 03/12/1998
The detection of quasi-periodic sources of gravitational waves requires the accumulation of signal-to-noise over long observation times. If not removed, Earth-motion induced Doppler modulations and intrinsic variations of the gravitational-wave frequency make the signals impossible to detect. These effects can be corrected (removed) using a parameterized model for the frequency evolution. We compute the number of independent corrections $N_p(\Delta T, N)$ required for incoherent search strategies which use stacked power spectra: a demodulated time series is divided into $N$ segments of length $\Delta T$, each segment is FFTed, the power is computed, and the $N$ spectra are summed up. We estimate that the sensitivity of an all-sky search that uses incoherent stacks is a factor of 2-4 better than would be achieved using coherent Fourier transforms; incoherent methods are computationally efficient at exploring large parameter spaces. A two-stage hierarchical search yields another 20-60% improvement in sensitivity in all-sky searches for old ($\geq 1000$ yr) slow ($\leq 200$ Hz) pulsars, and for young ($\geq 40$ yr) fast ($\leq 1000$ Hz) pulsars. Assuming $10^{12}$ flops of effective computing power for data analysis, enhanced LIGO interferometers should be sensitive to: (i) Galactic core pulsars with gravitational ellipticities of $\epsilon \gtrsim 5 \times 10^{-6}$ at 200 Hz, (ii) gravitational waves emitted by the unstable r-modes of newborn neutron stars out to distances of ~8 Mpc, and (iii) neutron stars in LMXBs with x-ray fluxes which exceed $2 \times 10^{-8}$ erg/(cm$^2$ s). Moreover, gravitational waves from the neutron star in Sco X-1 should be detectable if the interferometer is operated in a signal-recycled, narrow-band configuration.
Comment: 22 pages, 13 figures
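The stacked power spectrum described in this abstract (split the series into segments, FFT each, take the power, sum the spectra) can be sketched as follows. This is a toy under stated assumptions: the input is taken to be already demodulated, the test signal is a fixed-frequency sinusoid in white noise, and names such as `stacked_power_spectrum` and `signal_bin` are hypothetical rather than from the paper.

```python
import numpy as np


def stacked_power_spectrum(series, n_segments):
    """Split `series` into n_segments equal pieces, FFT each, and sum the power spectra."""
    seg_len = len(series) // n_segments
    # Drop any leftover samples so the series reshapes cleanly into segments.
    segments = np.reshape(series[: seg_len * n_segments], (n_segments, seg_len))
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    return power.sum(axis=0)


# Toy data: a weak sinusoid buried in unit-variance white noise (assumed values).
rng = np.random.default_rng(0)
n_segments, seg_len = 16, 256
t = np.arange(n_segments * seg_len)
signal_bin = 40  # frequency bin of the injected signal within each segment
series = 0.3 * np.sin(2 * np.pi * signal_bin * t / seg_len) + rng.normal(size=t.size)

stacked = stacked_power_spectrum(series, n_segments)
peak_bin = int(np.argmax(stacked[1:])) + 1  # skip the zero-frequency bin
print("loudest bin in the stacked spectrum:", peak_bin)
```

Summing power discards the relative phase between segments, which is why the method is called incoherent; the sketch does not include the frequency-evolution corrections that the paper counts.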

arXiv.org e-Print Archive

Crossref

Coherent Bayesian inference on compact binary inspirals using a network of interferometric gravitational wave detectors

Author: A. Brillet
A. Gelman
A. Sandage
B. C. Barish
Christian Röver
D. E. Goldberg
D. W. Scott
E. T. Jaynes
F. Marion
F. J. Harris
J. Hough
K. Tsubono
K. R. Lang
K. S. Thorne
K. V. Mardia
L. Blanchet
Nelson Christensen
P. C. Gregory
P. D. Welch
R. E. Crochiere
Renate Meyer
T. J. Loredo
W. R. Gilks
Publication venue: 'American Physical Society (APS)'
Publication date: 28/09/2006
Presented in this paper is a Markov chain Monte Carlo (MCMC) routine for conducting coherent parameter estimation for interferometric gravitational wave observations of an inspiral of binary compact objects using data from multiple detectors. The MCMC technique uses data from several interferometers and infers all nine of the parameters (ignoring spin) associated with the binary system, including the distance to the source, the masses, and the location on the sky. The Metropolis algorithm utilises advanced MCMC techniques, such as importance resampling and parallel tempering. The data is compared with time-domain inspiral templates that are 2.5 post-Newtonian (PN) in phase and 2.0 PN in amplitude. Our routine could be implemented as part of an inspiral detection pipeline for a worldwide network of detectors. Examples are given for simulated signals and data as seen by the LIGO and Virgo detectors operating at their design sensitivity.
Comment: 10 pages, 4 figures
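For readers unfamiliar with the Metropolis algorithm named in this abstract, the following is a minimal random-walk Metropolis sketch on a one-parameter toy problem (estimating the mean of Gaussian data under a flat prior). It is not the paper's nine-parameter sampler and omits importance resampling and parallel tempering; every name and numerical value here is an illustrative assumption.

```python
import math
import random


def make_log_posterior(data, sigma=1.0):
    """Log posterior for the mean of Gaussian data with known sigma and a flat prior."""
    def log_post(theta):
        return -0.5 * sum((d - theta) ** 2 for d in data) / sigma**2
    return log_post


def metropolis(log_post, start, n_steps, step_size, rng):
    """Random-walk Metropolis: propose a Gaussian step, accept with prob min(1, ratio)."""
    chain = [start]
    current_lp = log_post(start)
    for _ in range(n_steps):
        proposal = chain[-1] + rng.gauss(0.0, step_size)
        proposal_lp = log_post(proposal)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain


rng = random.Random(1)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]  # toy "observations"
sample_mean = sum(data) / len(data)

n_steps, burn_in = 20000, 2000
chain = metropolis(make_log_posterior(data), start=0.0,
                   n_steps=n_steps, step_size=0.3, rng=rng)
estimate = sum(chain[burn_in:]) / len(chain[burn_in:])
print(f"posterior mean estimate: {estimate:.3f} (sample mean {sample_mean:.3f})")
```

For this toy posterior the exact mean equals the sample mean, which gives a simple check that the chain has converged.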

arXiv.org e-Print Archive

Crossref

Carleton College: Digital Commons

CERN Document Server

MPG.PuRe