
    Massively parallel approximate Gaussian process regression

    We explore how the big-three computing paradigms -- symmetric multi-processor (SMP), graphics processing units (GPUs), and cluster computing -- can together be brought to bear on large-data Gaussian process (GP) regression problems via a careful implementation of a newly developed local approximation scheme. Our methodological contribution focuses primarily on GPU computation, as this requires the most care and also provides the largest performance boost. However, in our empirical work we study the relative merits of all three paradigms to determine how best to combine them. The paper concludes with two case studies. One is a real-data fluid-dynamics computer experiment that benefits from the local nature of our approximation; the second is a synthetic-data example designed to find the largest design for which (accurate) GP emulation can be performed on a commensurate predictive set in under an hour. Comment: 24 pages, 6 figures, 1 table

    Speeding up neighborhood search in local Gaussian process prediction

    Recent implementations of local approximate Gaussian process models have pushed computational boundaries for non-linear, non-parametric prediction problems, particularly when deployed as emulators for computer experiments. Their flavor of spatially independent computation accommodates massive parallelization, meaning that they can handle designs two or more orders of magnitude larger than was previously possible. However, accomplishing that feat can still require massive supercomputing resources. Here we aim to ease that burden. We study how predictive variance is reduced as local designs are built up for prediction. We then observe how the exhaustive and discrete nature of an important search subroutine involved in building such local designs may be overly conservative. Instead, we suggest that searching the space radially, i.e., continuously along rays emanating from the predictive location of interest, is a far thriftier alternative. Our empirical work demonstrates that ray-based search yields predictors with accuracy comparable to exhaustive search, but in a fraction of the time, bringing a supercomputer implementation back onto the desktop. Comment: 24 pages, 5 figures, 4 tables
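The idea of predicting from a small local design, rather than from the full data set, can be illustrated with a minimal sketch. It is not the scheme from either abstract above: the local design here is simply the `n_local` nearest neighbours of the predictive location, a deliberately simpler stand-in for the variance-reducing search the abstract describes. The function names, the Gaussian kernel, and the `lengthscale` and `nugget` values are all assumptions chosen for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kernel(a, b, lengthscale):
    """Gaussian (squared-exponential) correlation; an assumed kernel choice."""
    return math.exp(-sq_dist(a, b) / lengthscale)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def local_gp_predict(X, y, x_star, n_local=8, lengthscale=0.1, nugget=1e-6):
    """Zero-mean, unit-scale GP prediction at x_star from its nearest neighbours.

    Nearest-neighbour selection is an assumption made for brevity; it is not
    the greedy or ray-based design search discussed in the abstract.
    """
    order = sorted(range(len(X)), key=lambda i: sq_dist(X[i], x_star))[:n_local]
    Xn = [X[i] for i in order]
    yn = [y[i] for i in order]
    # Local covariance matrix with a small nugget on the diagonal for stability.
    K = [[kernel(a, b, lengthscale) + (nugget if i == j else 0.0)
          for j, b in enumerate(Xn)] for i, a in enumerate(Xn)]
    k_star = [kernel(a, x_star, lengthscale) for a in Xn]
    L = cholesky(K)
    mean = sum(ks * w for ks, w in zip(k_star, chol_solve(L, yn)))
    var = 1.0 + nugget - sum(ks * v for ks, v in zip(k_star, chol_solve(L, k_star)))
    return mean, max(var, 0.0)

# Usage: a 1-d toy design; each predictive location is handled independently,
# which is the property that makes this style of computation easy to parallelize.
X = [[i / 50.0] for i in range(51)]
y = [math.sin(2.0 * math.pi * x[0]) for x in X]
for n in (2, 4, 8):
    print(n, local_gp_predict(X, y, [0.51], n_local=n))
```

Because every call to `local_gp_predict` touches only a handful of design points and shares no state with other calls, predictions at different locations could in principle be farmed out to separate cores, GPU threads, or cluster nodes.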

    High-performance solution of hierarchical equations of motions for studying energy-transfer in light-harvesting complexes

    Excitonic models of light-harvesting complexes, where the vibrational degrees of freedom are treated as a bath, are commonly used to describe the motion of the electronic excitation through a molecule. Recent experiments point toward the possibility of memory effects in this process and require the consideration of time non-local propagation techniques. The hierarchical equations of motion (HEOM) were proposed by Ishizaki and Fleming to describe the site-dependent reorganization dynamics of protein environments (J. Chem. Phys., 130, p. 234111, 2009), which plays a significant role in photosynthetic electronic energy transfer. HEOM are often used as a reference for other approximate methods, but have been implemented only for small systems due to their adverse computational scaling with system size. Here, we show that HEOM are also solvable for larger systems, since the underlying algorithm is ideally suited to graphics processing units (GPUs). The tremendous reduction in computational time due to the GPU allows us to perform a systematic study of the energy-transfer efficiency in the Fenna-Matthews-Olson (FMO) light-harvesting complex at physiological temperature under full consideration of memory effects. We find that approximate methods differ qualitatively and quantitatively from the HEOM results and discuss the importance of finite temperature for achieving high energy-transfer efficiencies. Comment: 14 pages; Journal of Chemical Theory and Computation (2011)

    GPU accelerated Monte Carlo simulation of Brownian motors dynamics with CUDA

    This work presents an updated and extended guide to methods for properly accelerating the Monte Carlo integration of stochastic differential equations on commonly available NVIDIA Graphics Processing Units using the CUDA programming environment. We outline the general aspects of scientific computing on graphics cards and demonstrate them with two models of a well-known phenomenon: the noise-induced transport of Brownian motors in periodic structures. As a source of fluctuations in the considered systems we selected the three most commonly occurring noises: Gaussian white noise, white Poissonian noise, and the dichotomous process, also known as a random telegraph signal. A detailed discussion of various aspects of the applied numerical schemes is also presented. The measured speedup can be of the astonishing order of about 3000 when compared to a typical CPU. This number significantly expands the range of problems solvable by stochastic simulation, even allowing interactive research in some cases. Comment: 21 pages, 5 figures; Comput. Phys. Commun., accepted, 201
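What "Monte Carlo integration of stochastic differential equations" amounts to can be sketched on the CPU in a few lines. The sketch below is an assumption-laden toy, not the model or scheme from the paper: it integrates an overdamped particle in a tilted periodic potential, dx = (-V'(x) + F) dt + sqrt(2D) dW with V(x) = A sin(2πx), using the Euler-Maruyama scheme with Gaussian white noise only. The function name and all parameter names and defaults are hypothetical.

```python
import math
import random

def simulate_mean_velocity(n_paths=200, n_steps=2000, dt=1e-3,
                           amplitude=1.0, force=0.5, noise_d=0.1, seed=0):
    """Ensemble-averaged velocity of an overdamped particle in a tilted
    periodic potential V(x) = amplitude * sin(2*pi*x), driven by Gaussian
    white noise of intensity noise_d (Euler-Maruyama scheme).

    The model and every parameter here are illustrative assumptions.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * noise_d * dt)  # std. dev. of one noise increment
    total_displacement = 0.0
    for _ in range(n_paths):
        x = 0.0
        for _ in range(n_steps):
            # Deterministic drift: -V'(x) plus the constant tilt.
            drift = -amplitude * 2.0 * math.pi * math.cos(2.0 * math.pi * x) + force
            x += drift * dt + noise_scale * rng.gauss(0.0, 1.0)
        total_displacement += x
    # Average displacement per unit time over the whole ensemble.
    return total_displacement / (n_paths * n_steps * dt)

# Usage: estimate the mean velocity for the default parameters.
print(simulate_mean_velocity())
```

The outer loop over paths carries no dependence between iterations, so in a GPU version each trajectory would typically be assigned to its own thread with its own random-number stream; that independence is where speedups of the kind reported in the abstract would come from.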