    Bayesian Methods for Analysis and Adaptive Scheduling of Exoplanet Observations

    We describe work in progress by a collaboration of astronomers and statisticians developing a suite of Bayesian data analysis tools for extrasolar planet (exoplanet) detection, planetary orbit estimation, and adaptive scheduling of observations. Our work addresses analysis of stellar reflex motion data, where a planet is detected by observing the "wobble" of its host star as it responds to the gravitational tug of the orbiting planet. Newtonian mechanics specifies an analytical model for the resulting time series, but it is strongly nonlinear, yielding complex, multimodal likelihood functions; it is even more complex when multiple planets are present. The parameter spaces range in size from few-dimensional to dozens of dimensions, depending on the number of planets in the system, and the type of motion measured (line-of-sight velocity, or position on the sky). Since orbits are periodic, Bayesian generalizations of periodogram methods facilitate the analysis. This relies on the model being linearly separable, enabling partial analytical marginalization, reducing the dimension of the parameter space. Subsequent analysis uses adaptive Markov chain Monte Carlo methods and adaptive importance sampling to perform the integrals required for both inference (planet detection and orbit measurement), and information-maximizing sequential design (for adaptive scheduling of observations). We present an overview of our current techniques and highlight directions being explored by ongoing research. Comment: 29 pages, 11 figures. An abridged version is accepted for publication in Statistical Methodology for a special issue on astrostatistics, with selected (refereed) papers presented at the Astronomical Data Analysis Conference (ADA VI) held in Monastir, Tunisia, in May 2010. Update corrects equation (3
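
The linear separability mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' code. It assumes a single planet on a circular orbit, known Gaussian noise, and flat priors on the two linear amplitudes, so that those amplitudes can be integrated out in closed form and only the period has to be scanned. All names and the synthetic data are hypothetical.

```python
import math
import random


def log_marginal_likelihood(times, values, omega, sigma):
    """Log marginal likelihood (up to an additive constant) at angular frequency omega.

    Assumed model: v(t) = A*cos(omega*t) + B*sin(omega*t) + Gaussian noise,
    with flat priors on the linear amplitudes A and B, which are integrated
    out analytically.
    """
    cos_t = [math.cos(omega * t) for t in times]
    sin_t = [math.sin(omega * t) for t in times]
    cc = sum(c * c for c in cos_t)
    ss = sum(s * s for s in sin_t)
    cs = sum(c * s for c, s in zip(cos_t, sin_t))
    dc = sum(v * c for v, c in zip(values, cos_t))
    ds = sum(v * s for v, s in zip(values, sin_t))
    det = cc * ss - cs * cs  # determinant of the 2x2 matrix G^T G
    # Power of the data projected onto the cos/sin basis: d^T G (G^T G)^-1 G^T d
    projected = (ss * dc * dc - 2.0 * cs * dc * ds + cc * ds * ds) / det
    return projected / (2.0 * sigma**2) - 0.5 * math.log(det)


# Hypothetical synthetic data: one planet on a circular orbit.
rng = random.Random(0)
TRUE_PERIOD, AMPLITUDE, SIGMA = 7.3, 10.0, 2.0
times = sorted(rng.uniform(0.0, 100.0) for _ in range(40))
values = [
    AMPLITUDE * math.cos(2.0 * math.pi * t / TRUE_PERIOD + 0.4) + rng.gauss(0.0, SIGMA)
    for t in times
]

# Scan a grid of trial periods (2 to 20 days in 0.01-day steps).
periods = [2.0 + 0.01 * k for k in range(1801)]
best_period = max(
    periods,
    key=lambda p: log_marginal_likelihood(times, values, 2.0 * math.pi / p, SIGMA),
)
```

The nonlinear parameter (the period) is handled by a one-dimensional scan, while the amplitude and phase never appear explicitly. A full Keplerian model would add eccentricity and further nonlinear parameters, which this sketch leaves out.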

    Fourier Analysis of Stochastic Sampling Strategies for Assessing Bias and Variance in Integration

    Each pixel in a photorealistic, computer generated picture is calculated by approximately integrating all the light arriving at the pixel, from the virtual scene. A common strategy to calculate these high-dimensional integrals is to average the estimates at stochastically sampled locations. The strategy with which the sampled locations are chosen is of utmost importance in deciding the quality of the approximation, and hence rendered image. We derive connections between the spectral properties of stochastic sampling patterns and the first and second order statistics of estimates of integration using the samples. Our equations provide insight into the assessment of stochastic sampling strategies for integration. We show that the amplitude of the expected Fourier spectrum of sampling patterns is a useful indicator of the bias when used in numerical integration. We deduce that estimator variance is directly dependent on the variance of the sampling spectrum over multiple realizations of the sampling pattern. We then analyse Gaussian jittered sampling, a simple variant of jittered sampling, that allows a smooth trade-off of bias for variance in uniform (regular grid) sampling. We verify our predictions using spectral measurement, quantitative integration experiments and qualitative comparisons of rendered images.
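
The bias-for-variance trade-off described in the abstract above can be sketched in one dimension. This is an illustration under stated assumptions, not the paper's experiment. The integrand is a hypothetical worst case that oscillates exactly once per grid cell, the jitter width is given in units of the cell width, and samples that leave the unit interval are wrapped around.

```python
import math
import random
import statistics

N = 16  # number of samples, one per grid cell on [0, 1)


def integrand(x):
    # Worst case for a regular grid: oscillates exactly once per cell.
    return 0.5 + 0.5 * math.cos(2.0 * math.pi * N * x)  # true integral: 0.5


def estimate(jitter_std, rng):
    """Average the integrand at cell centres displaced by Gaussian noise.

    jitter_std is in units of the cell width; 0 gives the regular grid.
    Samples leaving [0, 1) are wrapped around (an assumption of this sketch).
    """
    total = 0.0
    for i in range(N):
        x = ((i + 0.5) / N + rng.gauss(0.0, jitter_std / N)) % 1.0
        total += integrand(x)
    return total / N


def bias_and_variance(jitter_std, trials=2000, seed=1):
    """Empirical bias and variance of the estimator over many realizations."""
    rng = random.Random(seed)
    estimates = [estimate(jitter_std, rng) for _ in range(trials)]
    return statistics.fmean(estimates) - 0.5, statistics.pvariance(estimates)


bias_regular, var_regular = bias_and_variance(0.0)
bias_jittered, var_jittered = bias_and_variance(0.5)
```

With no jitter the estimator is deterministic, so its variance is zero but the aliasing bias is large. Adding Gaussian jitter removes most of the bias at the cost of nonzero variance, which is the trade-off the abstract refers to.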

    Free energy reconstruction from steered dynamics without post-processing

    Various methods that achieve importance sampling in ensembles of nonequilibrium trajectories make it possible to estimate free energy differences and, by maximum-likelihood post-processing, to reconstruct free energy landscapes. Here, based on Bayes' theorem, we propose a more direct method in which a posterior likelihood function is used both to construct the steered dynamics and to infer the contribution to equilibrium of all the sampled states. The method is implemented with two steering schedules. First, using non-autonomous steering, we calculate the migration barrier of the vacancy in Fe-alpha. Second, using an autonomous scheduling related to metadynamics and equivalent to temperature-accelerated molecular dynamics, we accurately reconstruct the two-dimensional free energy landscape of the 38-atom Lennard-Jones cluster as a function of an orientational bond-order parameter and energy, down to the solid-solid structural transition temperature of the cluster and without maximum-likelihood post-processing. Comment: Accepted manuscript in Journal of Computational Physics, 7 figure

    Computational Particle Physics for Event Generators and Data Analysis

    High-energy physics data analysis relies heavily on the comparison between experimental and simulated data, as stressed lately by the Higgs search at LHC and the recent identification of a Higgs-like new boson. The first link in the full simulation chain is the event generation, both for background and for expected signals. Nowadays event generators are based on the automatic computation of the matrix element or amplitude for each process of interest. Moreover, recent analysis techniques based on the matrix element likelihood method assign probabilities for every event to belong to any of a given set of possible processes. This method, originally used for the top mass measurement, is computationally intensive but has shown its power at LHC to extract the new boson signal from the background. Serving both needs, the automatic calculation of matrix elements is therefore more than ever of prime importance for particle physics. Initiated in the eighties, the techniques have matured for the lowest order calculations (tree-level), but become complex and CPU-time consuming when higher order calculations involving loop diagrams are necessary, as for QCD processes at LHC. New calculation techniques for next-to-leading order (NLO) have surfaced, making possible the generation of processes with many final state particles (up to 6). While NLO calculations are in many cases under control, although not yet fully automatic, even higher precision calculations involving processes at 2-loops or more remain a big challenge. After a short introduction to particle physics and to the related theoretical framework, we will review some of the computing techniques that have been developed to make these calculations automatic. The main available packages and some of the most important applications for simulation and data analysis, in particular at LHC, will also be summarized. Comment: 19 pages, 11 figures, Proceedings of CCP (Conference on Computational Physics) Oct. 2012, Osaka (Japan) in IOP Journal of Physics: Conference Serie

    Efficient Bayesian inference via Monte Carlo and machine learning algorithms

    International Mention in the doctoral degree. In many fields of science and engineering, we are faced with an inverse problem where we aim to recover an unobserved parameter or variable of interest from a set of observed variables. Bayesian inference is a probabilistic approach for inferring this unknown parameter that has become extremely popular, finding application in myriad problems in fields such as machine learning, signal processing, remote sensing and astronomy. In Bayesian inference, all the information about the parameter is summarized by the posterior distribution. Unfortunately, the study of the posterior distribution requires the computation of complicated integrals that are analytically intractable and need to be approximated. Monte Carlo is a huge family of sampling algorithms for performing optimization and numerical integration that has become the main workhorse for carrying out Bayesian inference. The main idea of Monte Carlo is that we can approximate the posterior distribution by a set of samples, obtained by an iterative process that involves sampling from a known distribution. Markov chain Monte Carlo (MCMC) and importance sampling (IS) are two important groups of Monte Carlo algorithms. This thesis focuses on developing and analyzing Monte Carlo algorithms (either MCMC, IS or a combination of both) under different challenging scenarios presented below. In summary, in this thesis we address several important points, enumerated (a)–(f), that currently represent a challenge in Bayesian inference via Monte Carlo. A first challenge that we address is the problematic exploration of the parameter space by off-the-shelf MCMC algorithms when there is (a) multimodality, or with (b) highly concentrated posteriors. Another challenge that we address is the (c) proposal construction in IS. Furthermore, in recent applications we need to deal with (d) expensive posteriors, and/or we need to handle (e) noisy posteriors.
Finally, the Bayesian framework also offers a way of comparing competing hypotheses (models) in a principled way by means of marginal likelihoods. Hence, a task of fundamental importance is (f) marginal likelihood computation. Chapters 2 and 3 deal with (a), (b), and (c). In Chapter 2, we propose a novel population MCMC algorithm called Parallel Metropolis-Hastings Coupler (PMHC). PMHC is well suited to multimodal scenarios since it works with a population of states, instead of a single one, hence allowing for sharing information. PMHC combines independent exploration by the use of parallel Metropolis-Hastings algorithms, with cooperative exploration by the use of a population MCMC technique called Normal Kernel Coupler. In Chapter 3, population MCMC algorithms are combined with IS within the layered adaptive IS (LAIS) framework. The combination of MCMC and IS serves two purposes. First, it provides automatic proposal construction. Second, it aims to increase robustness, since the MCMC samples are not used directly to form the sample approximation of the posterior. The use of minibatches of data is proposed to deal with highly concentrated posteriors. Other extensions for reducing the costs with respect to the vanilla LAIS framework, based on recycling and clustering, are discussed and analyzed. Chapters 4, 5 and 6 deal with (c), (d) and (e). The use of nonparametric approximations of the posterior plays an important role in the design of efficient Monte Carlo algorithms. Nonparametric approximations of the posterior can be obtained using machine learning algorithms for nonparametric regression, such as Gaussian Processes and Nearest Neighbors. Then, they can serve as cheap surrogate models, or for building efficient proposal distributions.
In Chapter 4, in the context of expensive posteriors, we propose adaptive quadratures of posterior expectations and the marginal likelihood using a sequential algorithm that builds and refines a nonparametric approximation of the posterior. In Chapter 5, we propose Regression-based Adaptive Deep Importance Sampling (RADIS), an adaptive IS algorithm that uses a nonparametric approximation of the posterior as the proposal distribution. We illustrate the proposed algorithms in applications of astronomy and remote sensing. Chapters 4 and 5 consider noiseless posterior evaluations for building the nonparametric approximations. More generally, in Chapter 6 we give an overview and classification of MCMC and IS schemes using surrogates built with noisy evaluations. The motivation here is the study of posteriors that are both costly and noisy. The classification reveals a connection between algorithms that use the posterior approximation as a cheap surrogate, and algorithms that use it for building an efficient proposal. We illustrate specific instances of the classified schemes in an application of reinforcement learning. Finally, in Chapter 7 we study noisy IS, namely, IS when the posterior evaluations are noisy, and derive optimal proposal distributions for the different estimators in this setting. Chapter 8 deals with (f). In Chapter 8, we provide an exhaustive review of methods for marginal likelihood computation, with special focus on the ones based on Monte Carlo. We derive many connections among the methods and compare them in several simulation setups. Finally, in Chapter 9 we summarize the contributions of this thesis and discuss some potential avenues of future research. Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid. President: Valero Laparra Pérez-Muelas. Secretary: Michael Peter Wiper. Member: Omer Deniz Akyildi
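
As a rough illustration of point (f), marginal likelihood computation by importance sampling can be sketched on a toy problem. This is plain, non-adaptive IS, not PMHC, LAIS or RADIS from the thesis. The conjugate Gaussian model, the proposal and all names are assumptions chosen so that the exact answer is available for comparison.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def is_marginal_likelihood(y, n_samples, rng):
    """Importance sampling estimate of Z = integral of prior(theta) * likelihood(y | theta).

    Toy model (an assumption of this sketch): theta ~ N(0, 1), y | theta ~ N(theta, 1).
    Proposal: N(y/2, 1), centred on the posterior mean and wider than the
    posterior (variance 1/2), so the importance weights stay bounded.
    """
    total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(y / 2.0, 1.0)
        target = normal_pdf(theta, 0.0, 1.0) * normal_pdf(y, theta, 1.0)
        total += target / normal_pdf(theta, y / 2.0, 1.0)
    return total / n_samples


y_obs = 1.5
z_estimate = is_marginal_likelihood(y_obs, 20000, random.Random(2))
z_exact = normal_pdf(y_obs, 0.0, 2.0)  # closed form for this conjugate model
```

The quality of the estimate depends almost entirely on how well the proposal covers the posterior, which is why proposal construction, point (c), is treated as a challenge in its own right.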

    The Iray Light Transport Simulation and Rendering System

    While ray tracing has become increasingly common and path tracing is well understood by now, a major challenge lies in crafting an easy-to-use and efficient system implementing these technologies. Following a purely physically-based paradigm while still allowing for artistic workflows, the Iray light transport simulation and rendering system allows for rendering complex scenes at the push of a button and thus makes accurate light transport simulation widely available. In this document we discuss the challenges and implementation choices that follow from our primary design decisions, demonstrating that such a rendering system can be made a practical, scalable, and efficient real-world application that has been adopted by various companies across many fields and is in use by many industry professionals today.