
    Hamiltonian Monte Carlo Acceleration Using Surrogate Functions with Random Bases

    For big data analysis, the high computational cost of Bayesian methods often limits their application in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo (MCMC) method, namely Hamiltonian Monte Carlo (HMC). The key idea is to explore and exploit the structure and regularity in the parameter space of the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function that approximates the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm that converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions, such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms than existing state-of-the-art methods.
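
    The idea of pairing a cheap surrogate gradient with an exact Metropolis correction can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the random cosine basis, the least-squares fit, and all tuning parameters (D, eps, n_leap) are assumptions made for the example, and the target is a standard 2D Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exact target: standard 2D Gaussian (used only in the Metropolis test).
def logp(x):
    return -0.5 * np.sum(x ** 2)

# Surrogate log-density from random cosine bases, fitted by least squares.
D = 60
W = rng.normal(size=(D, 2))             # random frequencies
b = rng.uniform(0, 2 * np.pi, size=D)   # random phases

def features(x):
    return np.cos(W @ x + b)

X_train = rng.normal(scale=1.5, size=(300, 2))
Phi = np.array([features(xi) for xi in X_train])
y = np.array([logp(xi) for xi in X_train])
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)

def surrogate_grad(x):
    # d/dx sum_i c_i cos(w_i . x + b_i) = -sum_i c_i sin(w_i . x + b_i) w_i
    return (-np.sin(W @ x + b) * coef) @ W

def hmc_step(x, eps=0.1, n_leap=20):
    """Leapfrog driven by the cheap surrogate gradient; the accept/reject
    test uses the exact log-density, so the chain still targets logp."""
    p = rng.normal(size=x.shape)
    x_new, p_new = x.copy(), p + 0.5 * eps * surrogate_grad(x)
    for i in range(n_leap):
        x_new = x_new + eps * p_new
        if i < n_leap - 1:
            p_new = p_new + eps * surrogate_grad(x_new)
    p_new = p_new + 0.5 * eps * surrogate_grad(x_new)
    log_acc = (logp(x_new) - 0.5 * p_new @ p_new) - (logp(x) - 0.5 * p @ p)
    return x_new if np.log(rng.uniform()) < log_acc else x

x = np.zeros(2)
chain = []
for _ in range(2000):
    x = hmc_step(x)
    chain.append(x)
samples = np.array(chain)
```

    Because only the accept/reject test touches the exact density, the expensive gradient is never evaluated inside the trajectory, which is where the speedup of surrogate-based HMC comes from.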

    Randomized GCUR decompositions

    By exploiting random sampling techniques, this paper derives an efficient randomized algorithm for computing a generalized CUR (GCUR) decomposition, which provides low-rank approximations of a pair of matrices simultaneously in terms of some of their rows and columns. For large-scale data sets that are expensive to store and manipulate, a new variant of the discrete empirical interpolation method known as L-DEIM, which has much lower cost and provides a significant acceleration in practice, is also combined with the random sampling approach to further improve the efficiency of our algorithm. Moreover, by adopting the randomized algorithm to implement the truncation step of the restricted singular value decomposition (RSVD), combined with the L-DEIM procedure, we propose a fast algorithm for computing an RSVD-based CUR decomposition, which provides a coordinated low-rank approximation of three matrices simultaneously in a CUR-type format and offers advantages over the standard CUR approximation for some applications. We establish detailed probabilistic error analyses for the algorithms and provide numerical results that show the promise of our approaches.
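
    A bare-bones version of a randomized CUR approximation can be sketched in a few lines. Here the rows and columns are drawn uniformly at random with oversampling, which is a deliberate simplification of the sampling and L-DEIM index-selection schemes discussed above; all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Test matrix of exact rank r, much smaller than its dimensions.
m, n, r = 120, 80, 5
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))

def randomized_cur(A, k, oversample=5):
    """CUR approximation from uniformly sampled rows and columns;
    the middle factor U = C^+ A R^+ couples the two index sets."""
    m, n = A.shape
    cols = rng.choice(n, size=k + oversample, replace=False)
    rows = rng.choice(m, size=k + oversample, replace=False)
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

C, U, R = randomized_cur(A, k=5)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
```

    Because the test matrix has exact rank 5, ten random columns almost surely span its column space and the reconstruction is exact to rounding error; for noisy data the choice of indices (e.g., via DEIM-type procedures) matters far more, which is what motivates the L-DEIM variant.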

    Randomized oversampling for generalized multiscale finite element methods

    In this paper, we develop efficient multiscale methods for flows in heterogeneous media within the generalized multiscale finite element method (GMsFEM) framework. GMsFEM approximates the solution space locally using a few multiscale basis functions. This approximation requires selecting an appropriate snapshot space and a local spectral decomposition, e.g., over oversampled regions, in order to achieve an efficient model reduction. However, the construction of snapshot spaces may be costly if too many local problems need to be solved to obtain these spaces. We instead use a moderate number of local solutions (snapshot vectors) computed with random boundary conditions on oversampled regions with zero forcing, which delivers an efficient methodology. Motivated by the randomized algorithm presented in [P. G. Martinsson, V. Rokhlin, and M. Tygert, A Randomized Algorithm for the Approximation of Matrices, YALEU/DCS/TR-1361, Yale University, 2006], we consider a snapshot space consisting of harmonic extensions of random boundary conditions defined on a domain larger than the target region. Furthermore, we perform an eigenvalue decomposition in this small space. We also study the application of randomized sampling for GMsFEM in conjunction with adaptivity, where local multiscale spaces are adaptively enriched. Convergence analysis is provided, and we present representative numerical results to validate the proposed method.
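
    The snapshot construction described above (harmonic extensions of random boundary data followed by a local spectral decomposition) can be illustrated on a small discrete Laplace problem. The grid size, patch location, and number of snapshots are arbitrary choices for the sketch, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

N = 20  # interior grid points per direction of the oversampled region

def harmonic_extension(bottom, top, left, right):
    """Solve the discrete Laplace equation (zero forcing) on an N x N
    interior grid with the given Dirichlet boundary values."""
    I = np.eye(N)
    T = -4 * np.eye(N) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
    S = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
    L = np.kron(I, T) + np.kron(S, I)     # 5-point stencil, row-major ordering
    rhs = np.zeros((N, N))
    rhs[0, :] -= bottom                   # boundary values move to the RHS
    rhs[-1, :] -= top
    rhs[:, 0] -= left
    rhs[:, -1] -= right
    return np.linalg.solve(L, rhs.ravel()).reshape(N, N)

# Snapshots: harmonic extensions of random boundary data, restricted
# to a smaller target patch inside the oversampled region.
snaps = []
for _ in range(30):
    u = harmonic_extension(*(rng.normal(size=N) for _ in range(4)))
    snaps.append(u[5:15, 5:15].ravel())
S = np.array(snaps).T                     # columns are local snapshots

# Local spectral decomposition (POD): keep the dominant modes.
w, V = np.linalg.eigh(S.T @ S)
k = 5
basis = (S @ V[:, -k:]) / np.sqrt(w[-k:])
```

    The resulting `basis` columns are an orthonormal local reduced basis on the target patch; in GMsFEM the analogous local bases would be glued into a coarse multiscale space.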

    Sampling in Potts Model on Sparse Random Graphs

    We study the problem of sampling almost-uniform proper q-colorings in sparse Erdos-Renyi random graphs G(n,d/n), a line of research initiated by Dyer, Flaxman, Frieze and Vigoda [Dyer et al., RANDOM STRUCT ALGOR, 2006]. We obtain a fully polynomial-time almost-uniform sampler (FPAUS) for the problem provided q>3d+4, improving the current best bound q>5.5d [Efthymiou, SODA, 2014]. Our sampling algorithm works for more general models and a broader family of sparse graphs: it is an efficient sampler (in the same sense as an FPAUS) for the anti-ferromagnetic Potts model with activity 0<b<1 provided q>3(1-b)d+4. We further identify a family of sparse graphs to which all these results can be extended. This family of graphs is characterized by the notion of a contraction function, which is a new measure of the average degree in graphs.
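
    A heat-bath (Glauber) sampler for proper colorings, the kind of dynamics underlying such results, can be sketched as follows. The parameters n, d, and the number of updates are illustrative only, and no claim about mixing time is made here.

```python
import random

random.seed(3)

n, d = 100, 3
q = 3 * d + 5               # q > 3d + 4, the regime from the abstract
p = d / n

# Sparse Erdos-Renyi graph G(n, d/n) as adjacency sets.
adj = [set() for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if random.random() < p:
            adj[u].add(v)
            adj[v].add(u)

# Initial proper coloring by greedy assignment (q far exceeds the
# maximum degree here, so a free color always exists).
color = [0] * n
for u in range(n):
    used = {color[v] for v in adj[u] if v < u}
    color[u] = min(c for c in range(q) if c not in used)

def glauber_step():
    """Heat-bath update: recolor a uniformly random vertex with a
    uniformly random color not present in its neighborhood."""
    u = random.randrange(n)
    used = {color[v] for v in adj[u]}
    color[u] = random.choice([c for c in range(q) if c not in used])

for _ in range(20000):
    glauber_step()

is_proper = all(color[u] != color[v] for u in range(n) for v in adj[u])
```

    Each update preserves properness by construction; the content of results like the one above is that, for q large enough relative to d, such chains (or related samplers) approach the uniform distribution over proper colorings in polynomial time.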

    Generalized-ensemble simulations and cluster algorithms

    The importance-sampling Monte Carlo algorithm appears to be the universally optimal solution to the problem of sampling the state space of statistical mechanical systems according to the relative importance of configurations for the partition function or thermal averages of interest. While this is true in terms of its simplicity and universal applicability, the resulting approach suffers from the temporal correlations of successive samples naturally implied by the Markov chain underlying the importance-sampling simulation. In many situations, these autocorrelations are moderate and can be easily accounted for by an appropriately adapted analysis of the simulation data. They turn out to be a major hurdle, however, in the vicinity of phase transitions or for systems with complex free-energy landscapes. The critical slowing down close to continuous transitions is most efficiently reduced by the application of cluster algorithms, where they are available. For first-order transitions and disordered systems, on the other hand, macroscopic energy barriers need to be overcome to prevent dynamic ergodicity breaking. In this situation, generalized-ensemble techniques such as the multicanonical simulation method can effect impressive speedups, allowing one to sample the full free-energy landscape. The Potts model features continuous as well as first-order phase transitions and is thus a prototypical example for studying phase transitions and new algorithmic approaches. I discuss the possibilities of bringing together cluster and generalized-ensemble methods to combine the benefits of both techniques. The resulting algorithm allows for the efficient estimation of the random-cluster partition function, which encodes the information of all Potts models, even with a non-integer number of states, for all temperatures in a single simulation run per system size.
    Comment: 15 pages, 6 figures, proceedings of the 2009 Workshop of the Center of Simulational Physics, Athens, G
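
    As an illustration of a cluster update, here is a minimal Wolff-type single-cluster algorithm for the q-state Potts model on a periodic 2D lattice. It is a textbook sketch, not the combined cluster/generalized-ensemble algorithm described above, and the values of L, q, and beta are arbitrary.

```python
import math
import random

random.seed(4)

L, q, beta = 16, 3, 1.2          # lattice size, Potts states, inverse temperature
p_add = 1.0 - math.exp(-beta)    # Potts bond-activation probability
spin = [[random.randrange(q) for _ in range(L)] for _ in range(L)]

def neighbors(i, j):
    yield (i + 1) % L, j
    yield (i - 1) % L, j
    yield i, (j + 1) % L
    yield i, (j - 1) % L

def wolff_step():
    """Grow one cluster of equal spins with bond probability p_add and
    flip the whole cluster to a different random state."""
    i, j = random.randrange(L), random.randrange(L)
    s_old = spin[i][j]
    s_new = random.choice([s for s in range(q) if s != s_old])
    spin[i][j] = s_new
    cluster, stack = {(i, j)}, [(i, j)]
    while stack:
        a, b = stack.pop()
        for x, y in neighbors(a, b):
            if (x, y) not in cluster and spin[x][y] == s_old \
                    and random.random() < p_add:
                cluster.add((x, y))
                spin[x][y] = s_new
                stack.append((x, y))
    return len(cluster)

sizes = [wolff_step() for _ in range(200)]
```

    Flipping whole clusters rather than single spins is what defeats critical slowing down near continuous transitions; near first-order transitions it does not help, which is exactly where the generalized-ensemble ingredient comes in.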

    Penalized likelihood estimation and iterative Kalman smoothing for non-Gaussian dynamic regression models

    Dynamic regression or state space models provide a flexible framework for analyzing non-Gaussian time series and longitudinal data, covering, for example, models for discrete longitudinal observations. As with non-Gaussian random coefficient models, a direct Bayesian approach leads to numerical integration problems that are often intractable for more complicated data sets. Recent Markov chain Monte Carlo methods avoid this by repeated sampling from approximate posterior distributions, but there are still open questions about sampling schemes and convergence. In this article we consider simpler methods of inference based on posterior modes or, equivalently, maximum penalized likelihood estimation. From the latter point of view, the approach can also be interpreted as a nonparametric method for smoothing time-varying coefficients. Efficient smoothing algorithms are obtained by iterating common linear Kalman filtering and smoothing, in the same way that estimation in generalized linear models with fixed effects can be performed by iteratively weighted least squares. The algorithm can be combined with an EM-type method or cross-validation to estimate unknown hyper- or smoothing parameters. The approach is illustrated by applications to a binary time series and a multicategorical longitudinal data set.
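
    The linear Kalman filter and Rauch-Tung-Striebel (RTS) smoother that form the inner step of such an iteration can be sketched for a scalar local-level model; in the non-Gaussian case this pass would be repeated with working observations and weights, analogous to IWLS. The model and its variances below are illustrative assumptions, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(5)

# Local-level model: x_t = x_{t-1} + w_t,  y_t = x_t + v_t.
T, q_var, r_var = 200, 0.1, 1.0
x_true = np.cumsum(rng.normal(scale=np.sqrt(q_var), size=T))
y = x_true + rng.normal(scale=np.sqrt(r_var), size=T)

def kalman_smooth(y, q, r, m0=0.0, P0=10.0):
    """Scalar Kalman filter followed by the RTS smoother."""
    T = len(y)
    m_p = np.zeros(T); P_p = np.zeros(T)   # one-step predictions
    m_f = np.zeros(T); P_f = np.zeros(T)   # filtered estimates
    m, P = m0, P0
    for t in range(T):
        m_p[t], P_p[t] = m, P + q          # predict
        K = P_p[t] / (P_p[t] + r)          # Kalman gain
        m = m_p[t] + K * (y[t] - m_p[t])   # update
        P = (1.0 - K) * P_p[t]
        m_f[t], P_f[t] = m, P
    m_s = m_f.copy()                       # backward (smoothing) pass
    for t in range(T - 2, -1, -1):
        G = P_f[t] / P_p[t + 1]
        m_s[t] = m_f[t] + G * (m_s[t + 1] - m_p[t + 1])
    return m_s

m_s = kalman_smooth(y, q_var, r_var)
mse_raw = float(np.mean((y - x_true) ** 2))
mse_smooth = float(np.mean((m_s - x_true) ** 2))
```

    The smoothed trajectory has substantially lower error than the raw observations; iterating this linear pass with updated working quantities is what extends it to non-Gaussian observations.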

    Enhancing Sampling in Computational Statistics Using Modified Hamiltonians

    The Hamiltonian Monte Carlo (HMC) method has been recognized as a powerful sampling tool in computational statistics. In this thesis, we show that the performance of HMC can be dramatically improved by replacing the Hamiltonians in the Metropolis test with modified Hamiltonians, and a complete momentum update with a partial momentum refreshment. The resulting generalized HMC importance sampler, which we call Mix & Match Hamiltonian Monte Carlo (MMHMC), arose as an extension of the Generalized Shadow Hybrid Monte Carlo (GSHMC) method previously proposed for molecular simulation. The MMHMC method adapts GSHMC specifically to computational statistics and enriches it with new essential features: (i) efficient algorithms for the computation of modified Hamiltonians; (ii) an implicit momentum update procedure; and (iii) two-stage splitting integration schemes specially derived for methods sampling with modified Hamiltonians. In addition, different optional strategies for momentum update and flipping are introduced, and algorithms for adaptive tuning of parameters and efficient sampling of multimodal distributions are developed. MMHMC has been implemented in the in-house software package HaiCS (Hamiltonians in Computational Statistics), written in C, tested on popular statistical models, and compared in sampling efficiency with HMC, Generalized Hybrid Monte Carlo, Riemann Manifold Hamiltonian Monte Carlo, the Metropolis Adjusted Langevin Algorithm, and Random Walk Metropolis-Hastings. The analysis of time-normalized effective sample size reveals the superiority of MMHMC over popular sampling techniques, especially in solving high-dimensional problems.
    FPU12/05209, MTM2013–46553–C3–1–
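
    The partial momentum refreshment and momentum flip that distinguish generalized HMC from plain HMC can be sketched as follows. This toy sampler targets a standard 2D Gaussian with the standard (unmodified) Hamiltonian, so it illustrates only the momentum treatment, not the modified-Hamiltonian importance sampling of MMHMC; all parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)

def logp(x):            # standard 2D Gaussian target
    return -0.5 * x @ x

def grad_logp(x):
    return -x

def ghmc_step(x, p, eps=0.2, n_leap=10, phi=0.3):
    """One generalized-HMC step: partial momentum refreshment, leapfrog,
    Metropolis test, and momentum flip on rejection. phi = pi/2 would
    fully refresh the momentum and recover plain HMC."""
    u = rng.normal(size=p.shape)
    p = np.cos(phi) * p + np.sin(phi) * u      # partial refreshment
    x_n, p_n = x.copy(), p + 0.5 * eps * grad_logp(x)
    for i in range(n_leap):
        x_n = x_n + eps * p_n
        if i < n_leap - 1:
            p_n = p_n + eps * grad_logp(x_n)
    p_n = p_n + 0.5 * eps * grad_logp(x_n)
    log_acc = (logp(x_n) - 0.5 * p_n @ p_n) - (logp(x) - 0.5 * p @ p)
    if np.log(rng.uniform()) < log_acc:
        return x_n, p_n
    return x, -p                               # flip keeps reversibility

x, p = np.zeros(2), rng.normal(size=2)
chain = []
for _ in range(5000):
    x, p = ghmc_step(x, p)
    chain.append(x)
samples = np.array(chain)
```

    Retaining part of the previous momentum lets successive trajectories keep moving in a consistent direction, which is one of the sources of the efficiency gains reported above.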