
    A Parallel Monte Carlo Code for Simulating Collisional N-body Systems

    We present a new parallel code for computing the dynamical evolution of collisional N-body systems with up to N~10^7 particles. Our code is based on the Hénon Monte Carlo method for solving the Fokker-Planck equation, and assumes spherical symmetry and dynamical equilibrium. The principal algorithmic developments are optimized data structures, a parallel random number generation scheme, and a parallel sorting algorithm, which is required to find nearest neighbors for interactions and to compute the gravitational potential. The new algorithms we introduce, along with our choice of decomposition scheme, minimize communication costs and ensure optimal distribution of data and workload among the processing units. The implementation uses the Message Passing Interface (MPI) library for communication, which makes it portable to many different supercomputing architectures. We validate the code by calculating the evolution of clusters with initial Plummer distribution functions up to core collapse, with the number of stars, N, spanning three orders of magnitude, from 10^5 to 10^7. We find that our results are in good agreement with self-similar core-collapse solutions, and the core collapse times generally agree with expectations from the literature. We also observe good total energy conservation, to within 0.04% throughout all simulations. We analyze the performance of the code and demonstrate near-linear scaling of the runtime with the number of processors up to 64 processors for N=10^5, 128 for N=10^6, and 256 for N=10^7. Beyond these limits the runtime saturates as more processors are added, which is a characteristic of the parallel sorting algorithm. The resulting maximum speedups we achieve are approximately 60x, 100x, and 220x, respectively. Comment: 53 pages, 13 figures, accepted for publication in ApJ Supplement
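
The speedups quoted above can be turned into parallel efficiencies (speedup divided by processor count). The sketch below only does that arithmetic on the numbers stated in the abstract; the names `reported` and `parallel_efficiency` are illustrative, and pairing each maximum speedup with the processor count at which scaling saturates is an assumption about how the abstract should be read.

```python
# Parallel efficiency = speedup / processor count.
# Each tuple: (N, processor count where near-linear scaling ends,
#              approximate maximum speedup), all taken from the abstract.
# Assumption: the maximum speedup is reached at that processor count.
reported = [
    (10**5, 64, 60.0),
    (10**6, 128, 100.0),
    (10**7, 256, 220.0),
]


def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that is actually achieved."""
    return speedup / processors


efficiencies = {n: parallel_efficiency(s, p) for n, p, s in reported}

for n, p, s in reported:
    print(f"N = {n:.0e}: {s:.0f}x on {p} processors "
          f"-> efficiency {efficiencies[n]:.2f}")
```

Under that reading, the efficiencies come out at roughly 0.94, 0.78, and 0.86 for the three problem sizes.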

    Reducing phase error in long numerical binary black hole evolutions with sixth order finite differencing

    We describe a modification of a fourth-order accurate "moving puncture" evolution code in which the fourth-order accurate spatial differencing operators in the bulk of the grid are replaced by a specific choice of sixth-order accurate stencils, yielding significant improvements in accuracy. We illustrate the performance of the modified algorithm with an equal-mass simulation covering nine orbits. Comment: 13 pages, 6 figures
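
To illustrate why moving from fourth-order to sixth-order differencing improves accuracy, here is a minimal sketch using the standard centered stencils for a first derivative. These textbook stencils are an assumption chosen for illustration; the abstract does not say which "specific choice" of sixth-order stencils the code uses, and the function names are hypothetical.

```python
import math


def d1_fourth_order(f, x, h):
    """Standard centered fourth-order stencil for f'(x) (5 points)."""
    return (f(x - 2 * h) - 8 * f(x - h)
            + 8 * f(x + h) - f(x + 2 * h)) / (12 * h)


def d1_sixth_order(f, x, h):
    """Standard centered sixth-order stencil for f'(x) (7 points)."""
    return (-f(x - 3 * h) + 9 * f(x - 2 * h) - 45 * f(x - h)
            + 45 * f(x + h) - 9 * f(x + 2 * h) + f(x + 3 * h)) / (60 * h)


# Test function with a known derivative: d/dx sin(x) = cos(x).
x0, h = 1.0, 0.1
exact = math.cos(x0)
err4 = abs(d1_fourth_order(math.sin, x0, h) - exact)
err6 = abs(d1_sixth_order(math.sin, x0, h) - exact)

print(f"fourth-order error: {err4:.3e}")
print(f"sixth-order error:  {err6:.3e}")
```

At the same grid spacing, the sixth-order truncation error scales as h^6 rather than h^4, at the cost of a wider stencil.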

    From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

    Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the efficient use of large-scale CPU/GPU systems for complex applications without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms. Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming

    Searching for periodic sources with LIGO. II: Hierarchical searches

    The detection of quasi-periodic sources of gravitational waves requires the accumulation of signal-to-noise over long observation times. If not removed, Earth-motion induced Doppler modulations and intrinsic variations of the gravitational-wave frequency make the signals impossible to detect. These effects can be corrected (removed) using a parameterized model for the frequency evolution. We compute the number of independent corrections $N_p(\Delta T,N)$ required for incoherent search strategies which use stacked power spectra: a demodulated time series is divided into $N$ segments of length $\Delta T$, each segment is FFTed, the power is computed, and the $N$ spectra are summed up. We estimate that the sensitivity of an all-sky search that uses incoherent stacks is a factor of 2–4 better than would be achieved using coherent Fourier transforms; incoherent methods are computationally efficient at exploring large parameter spaces. A two-stage hierarchical search yields another 20–60% improvement in sensitivity in all-sky searches for old (>= 1000 yr) slow (= 40 yr) fast (<= 1000 Hz) pulsars. Assuming $10^{12}$ flops of effective computing power for data analysis, enhanced LIGO interferometers should be sensitive to: (i) Galactic core pulsars with gravitational ellipticities of $\epsilon \gtrsim 5\times 10^{-6}$ at 200 Hz, (ii) gravitational waves emitted by the unstable r-modes of newborn neutron stars out to distances of ~8 Mpc, and (iii) neutron stars in LMXBs with x-ray fluxes which exceed $2 \times 10^{-8}$ erg/(cm^2 s). Moreover, gravitational waves from the neutron star in Sco X-1 should be detectable if the interferometer is operated in a signal-recycled, narrow-band configuration. Comment: 22 pages, 13 figures
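
The stacked power spectrum described above (split into segments, transform each, take the power, sum) can be sketched as follows. This is a toy under stated assumptions: the input is taken to be already demodulated, a direct O(n^2) discrete Fourier transform stands in for the FFT to keep the example dependency-free, and the signal, noise level, and all function names are invented for illustration.

```python
import cmath
import math
import random


def power_spectrum(segment):
    """Power |X_k|^2 for non-negative frequency bins, via a direct DFT."""
    n = len(segment)
    spectrum = []
    for k in range(n // 2 + 1):
        amp = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(segment))
        spectrum.append(abs(amp) ** 2)
    return spectrum


def stacked_power(series, num_segments):
    """Split the series into equal segments and sum their power spectra."""
    seg_len = len(series) // num_segments
    stacked = [0.0] * (seg_len // 2 + 1)
    for i in range(num_segments):
        segment = series[i * seg_len:(i + 1) * seg_len]
        for k, p in enumerate(power_spectrum(segment)):
            stacked[k] += p
    return stacked


# Hypothetical test data: a sinusoid sitting exactly in frequency bin 5
# of each 32-sample segment, plus Gaussian noise.
random.seed(0)
seg_len, num_segments, signal_bin = 32, 4, 5
series = [math.cos(2 * math.pi * signal_bin * t / seg_len)
          + random.gauss(0.0, 0.5)
          for t in range(seg_len * num_segments)]

stacked = stacked_power(series, num_segments)
peak_bin = max(range(len(stacked)), key=stacked.__getitem__)
print("bin with the most stacked power:", peak_bin)
```

Because only the power is summed, phase information between segments is discarded, which is what makes the method incoherent.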

    Coherent Bayesian inference on compact binary inspirals using a network of interferometric gravitational wave detectors

    Presented in this paper is a Markov chain Monte Carlo (MCMC) routine for conducting coherent parameter estimation for interferometric gravitational wave observations of an inspiral of binary compact objects using data from multiple detectors. The MCMC technique uses data from several interferometers and infers all nine of the parameters (ignoring spin) associated with the binary system, including the distance to the source, the masses, and the location on the sky. The Metropolis algorithm utilises advanced MCMC techniques, such as importance resampling and parallel tempering. The data is compared with time-domain inspiral templates that are 2.5 post-Newtonian (PN) in phase and 2.0 PN in amplitude. Our routine could be implemented as part of an inspiral detection pipeline for a worldwide network of detectors. Examples are given for simulated signals and data as seen by the LIGO and Virgo detectors operating at their design sensitivity. Comment: 10 pages, 4 figures
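
As a reminder of what the basic Metropolis step looks like, here is a minimal one-parameter sketch. It is not the paper's routine: the nine-parameter inspiral model, the detector network, importance resampling, and parallel tempering are all omitted, and the Gaussian toy likelihood, flat prior, and every name below are assumptions made for illustration.

```python
import math
import random


def log_posterior(theta, data, sigma=1.0):
    """Toy log-posterior: flat prior, Gaussian likelihood with known sigma."""
    return -0.5 * sum((d - theta) ** 2 for d in data) / sigma ** 2


def metropolis(log_prob, start, step, n_steps, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    chain = [start]
    current, current_lp = start, log_prob(start)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


rng = random.Random(42)
# Hypothetical data: 50 noisy measurements of a parameter whose true value is 3.
data = [rng.gauss(3.0, 1.0) for _ in range(50)]
sample_mean = sum(data) / len(data)

chain = metropolis(lambda th: log_posterior(th, data),
                   start=0.0, step=0.3, n_steps=5000, rng=rng)
kept = chain[500:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(f"sample mean {sample_mean:.3f}, posterior mean {posterior_mean:.3f}")
```

For this toy model the posterior mean should sit close to the sample mean of the data, which gives a quick sanity check on the sampler.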