176,338 research outputs found

    Galaxy Modeling with Compound Elliptical Shapelets

    Full text link
    Gauss-Hermite and Gauss-Laguerre ("shapelet") decompositions of images have become important tools in galaxy modeling, particularly for the purpose of extracting ellipticity and morphological information from astronomical data. However, the standard shapelet basis functions cannot compactly represent galaxies with high ellipticity or large Sersic index, and the resulting underfitting bias has been shown to present a serious challenge for weak-lensing methods based on shapelets. We present here a new convolution relation and a compound "multi-scale" shapelet basis to address these problems, and provide a proof-of-concept demonstration using a small sample of nearby galaxies.Comment: 14 pages, 7 figure

    High-Dimensional Bayesian Geostatistics

    Full text link
    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ∼n\sim n floating point operations (flops), where nn the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings

    Data-Driven Estimation in Equilibrium Using Inverse Optimization

    Get PDF
    Equilibrium modeling is common in a variety of fields such as game theory and transportation science. The inputs for these models, however, are often difficult to estimate, while their outputs, i.e., the equilibria they are meant to describe, are often directly observable. By combining ideas from inverse optimization with the theory of variational inequalities, we develop an efficient, data-driven technique for estimating the parameters of these models from observed equilibria. We use this technique to estimate the utility functions of players in a game from their observed actions and to estimate the congestion function on a road network from traffic count data. A distinguishing feature of our approach is that it supports both parametric and \emph{nonparametric} estimation by leveraging ideas from statistical learning (kernel methods and regularization operators). In computational experiments involving Nash and Wardrop equilibria in a nonparametric setting, we find that a) we effectively estimate the unknown demand or congestion function, respectively, and b) our proposed regularization technique substantially improves the out-of-sample performance of our estimators.Comment: 36 pages, 5 figures Additional theorems for generalization guarantees and statistical analysis adde

    Iterative Updating of Model Error for Bayesian Inversion

    Get PDF
    In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.Comment: 39 pages, 9 figure

    Bayesian Approximate Kernel Regression with Variable Selection

    Full text link
    Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size analog of each explanatory variable for Bayesian kernel regression models when the kernel is shift-invariant --- for example, the Gaussian kernel. We use function analytic properties of shift-invariant reproducing kernel Hilbert spaces (RKHS) to define a linear vector space that: (i) captures nonlinear structure, and (ii) can be projected onto the original explanatory variables. The projection onto the original explanatory variables serves as an analog of effect sizes. The specific function analytic property we use is that shift-invariant kernel functions can be approximated via random Fourier bases. Based on the random Fourier expansion we propose a computationally efficient class of Bayesian approximate kernel regression (BAKR) models for both nonlinear regression and binary classification for which one can compute an analog of effect sizes. We illustrate the utility of BAKR by examining two important problems in statistical genetics: genomic selection (i.e. phenotypic prediction) and association mapping (i.e. inference of significant variants or loci). State-of-the-art methods for genomic selection and association mapping are based on kernel regression and linear models, respectively. BAKR is the first method that is competitive in both settings.Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations presented; references adde
    • …
    corecore