Search CORE

8,776 research outputs found

Bayesian Approximate Kernel Regression with Variable Selection

Author: Crawford Lorin
Mukherjee Sayan
Wood Kris C.
Zhou Xiang
Publication venue
Publication date: 09/06/2017
Field of study

Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size analog of each explanatory variable for Bayesian kernel regression models when the kernel is shift-invariant --- for example, the Gaussian kernel. We use function analytic properties of shift-invariant reproducing kernel Hilbert spaces (RKHS) to define a linear vector space that: (i) captures nonlinear structure, and (ii) can be projected onto the original explanatory variables. The projection onto the original explanatory variables serves as an analog of effect sizes. The specific function analytic property we use is that shift-invariant kernel functions can be approximated via random Fourier bases. Based on the random Fourier expansion we propose a computationally efficient class of Bayesian approximate kernel regression (BAKR) models for both nonlinear regression and binary classification for which one can compute an analog of effect sizes. We illustrate the utility of BAKR by examining two important problems in statistical genetics: genomic selection (i.e. phenotypic prediction) and association mapping (i.e. inference of significant variants or loci). State-of-the-art methods for genomic selection and association mapping are based on kernel regression and linear models, respectively. BAKR is the first method that is competitive in both settings.Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations presented; references adde

arXiv.org e-Print Archive

FigShare

Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments

Author: Haaland Benjamin
Plumlee Matthew
Sung Chih-Li
Wang Wenjia
Publication venue
Publication date: 08/01/2019
Field of study

The Gaussian process is a standard tool for building emulators for both deterministic and stochastic computer experiments. However, application of Gaussian process models is greatly limited in practice, particularly for large-scale and many-input computer experiments that have become typical. We propose a multi-resolution functional ANOVA model as a computationally feasible emulation alternative. More generally, this model can be used for large-scale and many-input non-linear regression problems. An overlapping group lasso approach is used for estimation, ensuring computational feasibility in a large-scale and many-input setting. New results on consistency and inference for the (potentially overlapping) group lasso in a high-dimensional setting are developed and applied to the proposed multi-resolution functional ANOVA model. Importantly, these results allow us to quantify the uncertainty in our predictions. Numerical examples demonstrate that the proposed model enjoys marked computational advantages. Data capabilities, both in terms of sample size and dimension, meet or exceed best available emulation tools while meeting or exceeding emulation accuracy

arXiv.org e-Print Archive

FigShare

Compressive Measurement Designs for Estimating Structured Signals in Structured Clutter: A Bayesian Experimental Design Approach

Author: Haupt Jarvis
Jain Swayambhoo
Soni Akshay
Publication venue
Publication date: 21/11/2013
Field of study

This work considers an estimation task in compressive sensing, where the goal is to estimate an unknown signal from compressive measurements that are corrupted by additive pre-measurement noise (interference, or clutter) as well as post-measurement noise, in the specific setting where some (perhaps limited) prior knowledge on the signal, interference, and noise is available. The specific aim here is to devise a strategy for incorporating this prior information into the design of an appropriate compressive measurement strategy. Here, the prior information is interpreted as statistics of a prior distribution on the relevant quantities, and an approach based on Bayesian Experimental Design is proposed. Experimental results on synthetic data demonstrate that the proposed approach outperforms traditional random compressive measurement designs, which are agnostic to the prior information, as well as several other knowledge-enhanced sensing matrix designs based on more heuristic notions.Comment: 5 pages, 4 figures. Accepted for publication at The Asilomar Conference on Signals, Systems, and Computers 201

arXiv.org e-Print Archive

Crossref

High-Dimensional Bayesian Optimization via Tree-Structured Additive Models

Author: Arora Ishank
Han Eric
Scarlett Jonathan
Publication venue
Publication date: 23/12/2020
Field of study

Bayesian Optimization (BO) has shown significant success in tackling expensive low-dimensional black-box optimization problems. Many optimization problems of interest are high-dimensional, and scaling BO to such settings remains an important challenge. In this paper, we consider generalized additive models in which low-dimensional functions with overlapping subsets of variables are composed to model a high-dimensional target function. Our goal is to lower the computational resources required and facilitate faster model learning by reducing the model complexity while retaining the sample-efficiency of existing methods. Specifically, we constrain the underlying dependency graphs to tree structures in order to facilitate both the structure learning and optimization of the acquisition function. For the former, we propose a hybrid graph learning algorithm based on Gibbs sampling and mutation. In addition, we propose a novel zooming-based algorithm that permits generalized additive models to be employed more efficiently in the case of continuous domains. We demonstrate and discuss the efficacy of our approach via a range of experiments on synthetic functions and real-world datasets.Comment: To appear in AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications