Bayesian Approximate Kernel Regression with Variable Selection
Nonlinear kernel regression models are often used in statistics and machine
learning because they are more accurate than linear models. Variable selection
for kernel regression models is a challenge partly because, unlike the linear
regression setting, there is no clear concept of an effect size for regression
coefficients. In this paper, we propose a novel framework that provides an
effect size analog of each explanatory variable for Bayesian kernel regression
models when the kernel is shift-invariant (for example, the Gaussian kernel).
We use function analytic properties of shift-invariant reproducing kernel
Hilbert spaces (RKHS) to define a linear vector space that: (i) captures
nonlinear structure, and (ii) can be projected onto the original explanatory
variables. The projection onto the original explanatory variables serves as an
analog of effect sizes. The specific function analytic property we use is that
shift-invariant kernel functions can be approximated via random Fourier bases.
Based on the random Fourier expansion we propose a computationally efficient
class of Bayesian approximate kernel regression (BAKR) models for both
nonlinear regression and binary classification for which one can compute an
analog of effect sizes. We illustrate the utility of BAKR by examining two
important problems in statistical genetics: genomic selection (i.e. phenotypic
prediction) and association mapping (i.e. inference of significant variants or
loci). State-of-the-art methods for genomic selection and association mapping
are based on kernel regression and linear models, respectively. BAKR is the
first method that is competitive in both settings.
Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations
presented; references added
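The abstract above relies on the fact that shift-invariant kernels can be approximated via random Fourier bases. The following is a minimal sketch of that approximation for the Gaussian kernel only, not the BAKR model itself. The function names, the bandwidth parameter `sigma`, the number of features, and the test points are all illustrative assumptions, not taken from the paper.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def make_rff_map(dim, num_features, sigma=1.0, seed=0):
    """Build a random Fourier feature map phi with phi(x).phi(y) ~ k(x, y).

    Frequencies are drawn from N(0, sigma^-2 I), the spectral density of the
    Gaussian kernel above; phases are uniform on [0, 2*pi].
    """
    rng = random.Random(seed)
    omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [scale * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
                for w, b in zip(omegas, phases)]

    return phi


# Illustrative inputs (assumed, not from the paper).
x = [0.3, -0.5, 1.0]
y = [0.1, 0.2, 0.7]

phi = make_rff_map(dim=3, num_features=5000, sigma=1.0, seed=0)
approx = sum(a * b for a, b in zip(phi(x), phi(y)))  # inner product of features
exact = gaussian_kernel(x, y, sigma=1.0)

print("exact kernel value:  ", exact)
print("random Fourier value:", approx)
```

Because the feature map is explicit and finite-dimensional, a model that is linear in `phi(x)` can stand in for the kernel model, which is the kind of linear vector space the abstract describes projecting back onto the original variables.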
Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments
The Gaussian process is a standard tool for building emulators for both
deterministic and stochastic computer experiments. However, application of
Gaussian process models is greatly limited in practice, particularly for
large-scale and many-input computer experiments that have become typical. We
propose a multi-resolution functional ANOVA model as a computationally feasible
emulation alternative. More generally, this model can be used for large-scale
and many-input non-linear regression problems. An overlapping group lasso
approach is used for estimation, ensuring computational feasibility in a
large-scale and many-input setting. New results on consistency and inference
for the (potentially overlapping) group lasso in a high-dimensional setting are
developed and applied to the proposed multi-resolution functional ANOVA model.
Importantly, these results allow us to quantify the uncertainty in our
predictions. Numerical examples demonstrate that the proposed model enjoys
marked computational advantages. In terms of both sample size and dimension,
its data capacity meets or exceeds that of the best available emulation tools,
while its emulation accuracy matches or exceeds theirs.
Compressive Measurement Designs for Estimating Structured Signals in Structured Clutter: A Bayesian Experimental Design Approach
This work considers an estimation task in compressive sensing, where the goal
is to estimate an unknown signal from compressive measurements that are
corrupted by additive pre-measurement noise (interference, or clutter) as well
as post-measurement noise, in the specific setting where some (perhaps limited)
prior knowledge on the signal, interference, and noise is available. The
specific aim here is to devise a strategy for incorporating this prior
information into the design of an appropriate compressive measurement strategy.
Here, the prior information is interpreted as statistics of a prior
distribution on the relevant quantities, and an approach based on Bayesian
Experimental Design is proposed. Experimental results on synthetic data
demonstrate that the proposed approach outperforms traditional random
compressive measurement designs, which are agnostic to the prior information,
as well as several other knowledge-enhanced sensing matrix designs based on
more heuristic notions.
Comment: 5 pages, 4 figures. Accepted for publication at The Asilomar
Conference on Signals, Systems, and Computers 201
High-Dimensional Bayesian Optimization via Tree-Structured Additive Models
Bayesian Optimization (BO) has shown significant success in tackling
expensive low-dimensional black-box optimization problems. Many optimization
problems of interest are high-dimensional, and scaling BO to such settings
remains an important challenge. In this paper, we consider generalized additive
models in which low-dimensional functions with overlapping subsets of variables
are composed to model a high-dimensional target function. Our goal is to lower
the computational resources required and facilitate faster model learning by
reducing the model complexity while retaining the sample-efficiency of existing
methods. Specifically, we constrain the underlying dependency graphs to tree
structures in order to facilitate both the structure learning and optimization
of the acquisition function. For the former, we propose a hybrid graph learning
algorithm based on Gibbs sampling and mutation. In addition, we propose a novel
zooming-based algorithm that permits generalized additive models to be employed
more efficiently in the case of continuous domains. We demonstrate and discuss
the efficacy of our approach via a range of experiments on synthetic functions
and real-world datasets.
Comment: To appear in AAAI 202
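To illustrate why a tree-structured dependency graph can make optimization of an additive function easier, here is a minimal sketch that maximizes a sum of pairwise terms over a small discrete grid using max-sum dynamic programming on the tree. This is not the paper's algorithm (which uses Gibbs sampling, mutation, and a zooming scheme for continuous domains). The function `maximize_tree_additive`, the edge functions, and the grid are hypothetical and chosen only for illustration.

```python
import itertools


def maximize_tree_additive(n_vars, edges, edge_fn, domain):
    """Maximize sum over (i, j) in edges of edge_fn[(i, j)](x_i, x_j).

    Assumes `edges` forms a connected tree on variables 0..n_vars-1 and that
    every variable takes values in the same finite `domain`.
    """
    adjacency = {i: [] for i in range(n_vars)}
    for i, j in edges:
        adjacency[i].append(j)
        adjacency[j].append(i)

    # Root the tree at variable 0; `order` lists parents before children.
    parent, order, stack = {0: None}, [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adjacency[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)

    def pair(u, v, xu, xv):
        if (u, v) in edge_fn:
            return edge_fn[(u, v)](xu, xv)
        return edge_fn[(v, u)](xv, xu)

    # Upward pass: score[u][xu] is the best value of u's subtree given x_u = xu.
    score = {u: {x: 0.0 for x in domain} for u in range(n_vars)}
    best_child = {}
    for v in reversed(order):  # children are processed before their parents
        u = parent[v]
        if u is None:
            continue
        best_child[v] = {}
        for xu in domain:
            best_xv = max(domain,
                          key=lambda xv: pair(u, v, xu, xv) + score[v][xv])
            best_child[v][xu] = best_xv
            score[u][xu] += pair(u, v, xu, best_xv) + score[v][best_xv]

    # Downward pass: read off the maximizing assignment from the root.
    assignment = {0: max(domain, key=lambda x0: score[0][x0])}
    for v in order[1:]:
        assignment[v] = best_child[v][assignment[parent[v]]]
    best = [assignment[i] for i in range(n_vars)]
    return best, score[0][assignment[0]]


# Hypothetical tree 0 - 1, 1 - 2, 1 - 3 with made-up pairwise terms.
domain = [0.0, 0.5, 1.0]
edge_fn = {
    (0, 1): lambda a, b: -(a - b) ** 2 + a,
    (1, 2): lambda a, b: a * b - 0.3 * b,
    (1, 3): lambda a, b: -abs(a - 0.5) + 0.2 * b,
}
best_x, best_value = maximize_tree_additive(4, list(edge_fn), edge_fn, domain)


def total(x):
    return sum(f(x[i], x[j]) for (i, j), f in edge_fn.items())


# Brute-force check over all 3^4 grid points.
brute_value = max(total(x) for x in itertools.product(domain, repeat=4))
print(best_x, best_value, brute_value)
```

The dynamic program evaluates each edge term on the order of `len(domain) ** 2` times, whereas brute force enumerates `len(domain) ** n_vars` joint assignments. That gap is one reason a tree constraint can help when the acquisition function shares the additive structure.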