Submodularity in Batch Active Learning and Survey Problems on Gaussian Random Fields
Many real-world datasets can be represented in the form of a graph whose edge
weights designate similarities between instances. A discrete Gaussian random
field (GRF) model is a finite-dimensional Gaussian process (GP) whose prior
covariance is the inverse of a graph Laplacian. Minimizing the trace of the
predictive covariance $\Sigma$ (V-optimality) on GRFs has proven successful in
batch active learning classification problems with budget constraints. However,
its worst-case bound has been missing. We show that the V-optimality on GRFs as
a function of the batch query set is submodular and hence its greedy selection
algorithm guarantees a (1-1/e) approximation ratio. Moreover, GRF models
satisfy the absence-of-suppressor (AofS) condition. For active survey
problems, we propose a similar survey criterion which minimizes
$\mathbf{1}^\top \Sigma \mathbf{1}$. In practice, the V-optimality criterion
performs better than mutual-information-gain criteria on GPs and allows
nonuniform costs for different nodes.
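
The greedy rule is simple to state. Below is a minimal numpy sketch of it,
under our own illustrative choices (the function name, the regularizer delta
that makes the Laplacian invertible, and the brute-force trace evaluation are
not from the paper):

import numpy as np

def greedy_v_optimal(L, budget, delta=1e-2):
    # GRF prior covariance: inverse of the (regularized) graph Laplacian.
    n = L.shape[0]
    Sigma = np.linalg.inv(L + delta * np.eye(n))
    selected = []
    for _ in range(budget):
        best_node, best_trace = None, np.inf
        for v in range(n):
            if v in selected:
                continue
            A = selected + [v]
            U = [u for u in range(n) if u not in A]
            # Predictive covariance of the unlabeled nodes U given queries A
            S_AA = Sigma[np.ix_(A, A)]
            S_UA = Sigma[np.ix_(U, A)]
            post = Sigma[np.ix_(U, U)] - S_UA @ np.linalg.solve(S_AA, S_UA.T)
            t = np.trace(post)  # V-optimality objective
            if t < best_trace:
                best_node, best_trace = v, t
        selected.append(best_node)
    return selected

Because the reduction in trace is submodular, this greedy loop comes within a
(1-1/e) factor of the best batch of the same size; replacing np.trace(post)
with post.sum() gives the survey criterion $\mathbf{1}^\top \Sigma \mathbf{1}$.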
A Probabilistic Interpretation of Sampling Theory of Graph Signals
We give a probabilistic interpretation of sampling theory of graph signals.
To do this, we first define a generative model for the data using a pairwise
Gaussian random field (GRF) which depends on the graph. We show that, under
certain conditions, reconstructing a graph signal from a subset of its samples
by least squares is equivalent to performing MAP inference on an approximation
of this GRF which has a low rank covariance matrix. We then show that a
sampling set of given size with the largest associated cut-off frequency, which
is optimal from a sampling theoretic point of view, minimizes the worst case
predictive covariance of the MAP estimate on the GRF. This interpretation also
gives an intuitive explanation for the superior performance of the sampling
theoretic approach to active semi-supervised classification.
Comment: 5 pages, 2 figures. To appear in the International Conference on
Acoustics, Speech, and Signal Processing (ICASSP) 2015.
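
For concreteness, the least-squares reconstruction discussed above can be
sketched as follows under the usual bandlimited model (the function name and
the dense eigendecomposition are our illustrative choices; the paper's
contribution is the GRF/MAP interpretation of this procedure, not this code):

import numpy as np

def reconstruct_bandlimited(L, sampled_idx, y, K):
    # Graph Fourier basis: eigenvectors of the Laplacian, ordered by
    # eigenvalue, i.e. by graph frequency.
    eigvals, eigvecs = np.linalg.eigh(L)
    U_K = eigvecs[:, :K]  # span of the K lowest frequencies
    # Least-squares fit of the K spectral coefficients to the samples y
    # observed at the nodes in sampled_idx.
    coeffs, *_ = np.linalg.lstsq(U_K[sampled_idx, :], y, rcond=None)
    return U_K @ coeffs  # reconstructed signal on all nodes

In the paper's reading, this least-squares solution coincides with MAP
inference under a low-rank approximation of the GRF, and sampling sets that
maximize the cut-off frequency control the worst-case predictive covariance
of that estimate.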
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
We present a tutorial on Bayesian optimization, a method of finding the
maximum of expensive cost functions. Bayesian optimization employs the Bayesian
technique of setting a prior over the objective function and combining it with
evidence to get a posterior function. This permits a utility-based selection of
the next observation to make on the objective function, which must take into
account both exploration (sampling from areas of high uncertainty) and
exploitation (sampling areas likely to offer improvement over the current best
observation). We also present two detailed extensions of Bayesian optimization,
with experiments---active user modeling with preferences, and hierarchical
reinforcement learning---and a discussion of the pros and cons of Bayesian
optimization based on our experiences.
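
To make the exploration/exploitation trade-off concrete, here is a toy
Bayesian-optimization loop; the RBF kernel, its lengthscale, the fixed
candidate grid, and the expected-improvement acquisition are our illustrative
choices (the tutorial itself surveys several acquisition functions):

import numpy as np
from scipy.stats import norm

def rbf(A, B, ls=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def bayes_opt(f, candidates, n_init=3, n_iter=20, noise=1e-6):
    rng = np.random.default_rng(0)
    X = candidates[rng.choice(len(candidates), n_init, replace=False)]
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        # GP posterior at the candidate points given evaluations so far
        K = rbf(X, X) + noise * np.eye(len(X))
        Ks = rbf(candidates, X)
        mu = Ks @ np.linalg.solve(K, y)
        var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
        sd = np.sqrt(np.maximum(var, 1e-12))
        # Expected improvement over the incumbent: large where the mean is
        # promising (exploitation) or the uncertainty is high (exploration)
        best = y.max()
        z = (mu - best) / sd
        ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)
        x_next = candidates[np.argmax(ei)]
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next))
    return X[np.argmax(y)], y.max()

Each iteration conditions the GP on all evaluations so far and queries the
point with the largest expected improvement over the incumbent.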
Metamodel-based importance sampling for structural reliability analysis
Structural reliability methods aim at computing the probability of failure of
systems with respect to some prescribed performance functions. In modern
engineering such functions usually resort to running an expensive-to-evaluate
computational model (e.g. a finite element model). In this respect simulation
methods, which may require $10^3$ to $10^6$ runs, cannot be used directly.
Surrogate
models such as quadratic response surfaces, polynomial chaos expansions or
kriging (which are built from a limited number of runs of the original model)
are then introduced as a substitute of the original model to cope with the
computational cost. In practice, however, it is almost impossible to quantify
the error made by this substitution. In this paper we propose to use a kriging
surrogate of the performance function as a means to build a quasi-optimal
importance sampling density. The probability of failure is eventually obtained
as the product of an augmented probability computed by substituting the
meta-model for the original performance function and a correction term which
ensures that there is no bias in the estimation even if the meta-model is not
fully accurate. The approach is applied to analytical and finite element
reliability problems and proves efficient for up to 100 random variables.
Comment: 20 pages, 7 figures, 2 tables. Preprint submitted to Probabilistic
Engineering Mechanics.
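
The "augmented probability times correction" structure can be sketched
compactly. The helper names below (mu_hat and sigma_hat for the kriging mean
and standard deviation, g_true for the expensive performance function,
sample_prior for the input distribution) are hypothetical stand-ins, and
sampling the quasi-optimal density by simple rejection is our own
simplification, not the paper's sampler:

import numpy as np
from scipy.stats import norm

def metamodel_is(g_true, mu_hat, sigma_hat, sample_prior,
                 n_aug=100_000, n_corr=200):
    # pi(x) = Phi(-mu/sigma): the kriging-predicted probability that the
    # performance function is <= 0 (failure) at x.
    # 1) Augmented failure probability: cheap, surrogate evaluations only.
    X = sample_prior(n_aug)
    pi = norm.cdf(-mu_hat(X) / sigma_hat(X))
    P_eps = pi.mean()
    # 2) Sample the quasi-optimal density h(x) proportional to pi(x) f(x)
    #    by rejection: propose from the prior f, accept with probability pi.
    accepted = []
    while len(accepted) < n_corr:
        Xc = sample_prior(n_corr)
        keep = np.random.rand(n_corr) < norm.cdf(-mu_hat(Xc) / sigma_hat(Xc))
        accepted.extend(Xc[keep])
    Xh = np.array(accepted[:n_corr])
    # 3) Correction term from a few true-model runs; this removes the bias
    #    of the surrogate even when the kriging model is inaccurate.
    pih = norm.cdf(-mu_hat(Xh) / sigma_hat(Xh))
    alpha = np.mean((g_true(Xh) <= 0) / pih)
    return P_eps * alpha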
Revisiting loss-specific training of filter-based MRFs for image restoration
It is now well known that Markov random fields (MRFs) are particularly
effective for modeling image priors in low-level vision. Recent years have seen
the emergence of two main approaches for learning the parameters in MRFs: (1)
probabilistic learning using sampling-based algorithms and (2) loss-specific
training based on MAP estimation. Investigating existing training approaches,
we find that the performance of loss-specific training has been significantly
underestimated in existing work. In this paper, we revisit
this approach and use techniques from bi-level optimization to solve it. We
show that we can get a substantial gain in the final performance by solving the
lower-level problem in the bi-level framework with high accuracy using our
newly proposed algorithm. As a result, our trained model is on par with highly
specialized image denoising algorithms and clearly outperforms
probabilistically trained MRF models. Our findings suggest that for the
loss-specific training scheme, solving the lower-level problem with higher
accuracy is beneficial. Our trained model comes with the additional advantage
that inference is extremely efficient: our GPU-based implementation takes less
than 1s to produce state-of-the-art performance.
Comment: 10 pages, 2 figures. To appear at the 35th German Conference on
Pattern Recognition (GCPR 2013), Saarbrücken, Germany, September 3-6, 2013
(proceedings).
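
The bi-level structure is easiest to see in a toy quadratic instance, where
the lower-level restoration problem is solvable exactly and its solution can
be differentiated in closed form via the implicit function theorem. The
single filter F and scalar weight theta below are our simplification; the
paper trains many non-quadratic filter potentials, where solving the lower
level accurately is the hard part:

import numpy as np

def train_filter_weight(y, gt, F, theta=1.0, lr=0.1, iters=100):
    # Lower level: x*(theta) = argmin_x ||x - y||^2 + theta * ||F x||^2,
    # solved exactly as (I + theta F^T F)^{-1} y.
    # Upper level: minimize ||x*(theta) - gt||^2 over theta.
    n = y.size
    FtF = F.T @ F
    for _ in range(iters):
        H = np.eye(n) + theta * FtF          # lower-level Hessian
        x = np.linalg.solve(H, y)            # exact lower-level solution
        dx = -np.linalg.solve(H, FtF @ x)    # implicit differentiation
        grad = 2.0 * (x - gt) @ dx           # chain rule to the upper loss
        theta -= lr * grad
    return theta

In this miniature, the paper's message corresponds to carrying out the
lower-level solve accurately: an inexact x makes the implicit gradient, and
hence the learned parameter, unreliable.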
High-dimensional Sparse Inverse Covariance Estimation using Greedy Methods
In this paper we consider the task of estimating the non-zero pattern of the
sparse inverse covariance matrix of a zero-mean Gaussian random vector from a
set of iid samples. Note that this is also equivalent to recovering the
underlying graph structure of a sparse Gaussian Markov Random Field (GMRF). We
present two novel greedy approaches to solving this problem. The first
estimates the non-zero covariates of the overall inverse covariance matrix
using a series of global forward and backward greedy steps. The second
estimates the neighborhood of each node in the graph separately, again using
greedy forward and backward steps, and combines the intermediate neighborhoods
to form an overall estimate. The principal contribution of this paper is a
rigorous analysis of the sparsistency, or consistency in recovering the
sparsity pattern of the inverse covariance matrix. Surprisingly, we show that
both the local and global greedy methods learn the full structure of the model
with high probability given just $O(d \log p)$ samples (where $p$ is the
number of variables and $d$ the maximum node degree), which is a
\emph{significant} improvement over the state-of-the-art $\ell_1$-regularized
Gaussian MLE (Graphical Lasso), which requires $O(d^2 \log p)$ samples.
Moreover,
the restricted eigenvalue and smoothness conditions imposed by our greedy
methods are much weaker than the strong irrepresentable conditions required by
the $\ell_1$-regularization based methods. We corroborate our results with
extensive simulations and examples, comparing our local and global greedy
methods to the $\ell_1$-regularized Gaussian MLE as well as the Neighborhood
Greedy method to nodewise $\ell_1$-regularized linear regression
(Neighborhood Lasso).
Comment: Accepted to AISTATS 2012 for oral presentation.
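
A minimal sketch of the neighborhood-wise variant follows; the fixed
thresholds (eps for the forward gain, eps/2 for the backward check) are our
own simplification of the paper's coupled forward-backward thresholds:

import numpy as np

def neighborhood_greedy(X, node, max_deg, eps=1e-3):
    # Estimate one node's GMRF neighborhood by forward-backward greedy
    # regression of X[:, node] on the other columns.
    n, p = X.shape
    y = X[:, node]

    def rss(idx):
        if not idx:
            return y @ y
        beta, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        r = y - X[:, idx] @ beta
        return r @ r

    S = []
    while len(S) < max_deg:
        # Forward step: add the variable with the largest error reduction.
        gains = {j: rss(S) - rss(S + [j])
                 for j in range(p) if j != node and j not in S}
        j_best = max(gains, key=gains.get)
        if gains[j_best] < eps:
            break
        S.append(j_best)
        # Backward step: drop variables that no longer pay for themselves.
        for j in list(S):
            if rss([k for k in S if k != j]) - rss(S) < eps / 2:
                S.remove(j)
    return S

Running this for every node and joining the estimated neighborhoods (e.g.
keeping an edge when either endpoint selects the other) gives the combined
graph estimate described above.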
Sequential Design with Mutual Information for Computer Experiments (MICE): Emulation of a Tsunami Model
Computer simulators can be computationally intensive to run over a large
number of input values, as required for optimization and various uncertainty
quantification tasks. The standard paradigm for the design and analysis of
computer experiments is to employ Gaussian random fields to model computer
simulators. Gaussian process models are trained on input-output data obtained
from simulation runs at various input values. Following this approach, we
propose a sequential design algorithm, MICE (Mutual Information for Computer
Experiments), that adaptively selects the input values at which to run the
computer simulator, in order to maximize the expected information gain (mutual
information) over the input space. The superior computational efficiency of
the MICE algorithm compared to other algorithms is demonstrated on test
functions and on a tsunami simulator, with overall gains of up to 20% in the
latter case.
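
The selection rule can be sketched as follows; the RBF kernel, unit prior
variance, and the particular nugget value are our illustrative assumptions,
with the nugget on the candidate set keeping the denominator well behaved, as
we understand the MICE criterion:

import numpy as np

def rbf(A, B, ls=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def posterior_var(x, X_obs, nugget=1e-6):
    # GP posterior variance at x given observations at X_obs
    # (unit prior variance, RBF kernel).
    K = rbf(X_obs, X_obs) + nugget * np.eye(len(X_obs))
    k = rbf(x[None, :], X_obs)[0]
    return 1.0 - k @ np.linalg.solve(K, k)

def mice_next_point(X_design, X_cand, nugget=1.0):
    # Mutual-information ratio: prefer points that are uncertain given the
    # current design yet informative about the remaining candidates.
    scores = []
    for i, x in enumerate(X_cand):
        num = posterior_var(x, X_design)
        rest = np.delete(X_cand, i, axis=0)
        den = posterior_var(x, rest, nugget=nugget)
        scores.append(num / den)
    return X_cand[int(np.argmax(scores))]

The chosen point is run through the simulator, appended to the design, and
the criterion is re-evaluated: the adaptive loop the abstract describes.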