Differentiating the multipoint Expected Improvement for optimal batch design
This work deals with parallel optimization of expensive objective functions
which are modeled as sample realizations of Gaussian processes. The study is
formalized as a Bayesian optimization problem, or continuous multi-armed bandit
problem, where a batch of q > 0 arms is pulled in parallel at each iteration.
Several algorithms have been developed for choosing batches by trading off
exploitation and exploration. As of today, the maximum Expected Improvement
(EI) and Upper Confidence Bound (UCB) selection rules appear as the most
prominent approaches for batch selection. Here, we build upon recent work on
the multipoint Expected Improvement criterion, for which an analytic expansion
relying on Tallis' formula was recently established. Because the computational
burden of this selection rule remains an issue in applications, we derive a
closed-form expression for the gradient of the multipoint Expected Improvement,
which facilitates its maximization using gradient-based ascent algorithms.
Substantial computational savings are shown in application. In
addition, our algorithms are tested numerically and compared to
state-of-the-art UCB-based batch-sequential algorithms. Combining starting
designs relying on UCB with gradient-based EI local optimization finally
appears as a sound option for batch design in distributed Gaussian Process
optimization.
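
As a rough illustration of the criterion discussed in this abstract, the sketch below estimates the multipoint Expected Improvement of a batch by Monte Carlo and compares it with the classical single-point closed form. It is not the Tallis-based expansion or the gradient derived in the paper. The minimization convention, the function names, and the inputs (`mean`, `cov`, `best`, standing for the Gaussian process posterior at the batch points and the current best observed value) are assumptions made for the example.

```python
import numpy as np
from math import erf, exp, pi, sqrt


def ei_closed_form(mean, std, best):
    """Single-point EI under a minimization convention (assumed here):
    E[max(best - Y, 0)] with Y ~ N(mean, std^2)."""
    if std <= 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + erf(z / sqrt(2.0)))      # standard normal CDF at z
    pdf = exp(-0.5 * z * z) / sqrt(2.0 * pi)    # standard normal PDF at z
    return (best - mean) * cdf + std * pdf


def qei_monte_carlo(mean, cov, best, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the multipoint EI of a batch of q points:
    E[max(best - min_j Y_j, 0)] with Y ~ N(mean, cov)."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mean, cov, size=n_samples)  # (n_samples, q)
    improvement = np.maximum(best - samples.min(axis=1), 0.0)
    return float(improvement.mean())
```

For a batch of size one the two functions should agree up to sampling noise. A sampling estimator of this kind is noisy and awkward to differentiate, which is one reason an analytic gradient is attractive for gradient-based maximization.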
Generalized power method for sparse principal component analysis
In this paper we develop a new approach to sparse principal component
analysis (sparse PCA). We propose two single-unit and two block optimization
formulations of the sparse PCA problem, aimed at extracting a single sparse
dominant principal component of a data matrix, or more components at once,
respectively. While the initial formulations involve nonconvex functions, and
are therefore computationally intractable, we rewrite them into the form of an
optimization program involving maximization of a convex function on a compact
set. The dimension of the search space is decreased enormously if the data
matrix has many more columns (variables) than rows. We then propose and analyze
a simple gradient method suited for the task. Our algorithm appears to have the
best convergence properties when either the objective function or the feasible
set is strongly convex, which is the case with our single-unit formulations and
can be enforced in the block case. Finally, we demonstrate
numerically on a set of random and gene expression test problems that our
approach outperforms existing algorithms both in quality of the obtained
solution and in computational speed. Comment: Submitte
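
The idea of maximizing a convex function over a compact set with a simple gradient scheme can be sketched as follows. The abstract does not give the formulations, so the objective used here, f(x) = sum_i max(|a_i^T x| - gamma, 0)^2 over the unit sphere (an l1-penalized single-unit form), is an assumption, as are the function name `gpower_l1`, the initialization heuristic, and the stopping rule. The iteration is a power-method-like update x <- grad f(x) / ||grad f(x)||.

```python
import numpy as np


def gpower_l1(A, gamma, n_iter=200, tol=1e-10):
    """Sketch of a power-type ascent for an assumed l1-penalized objective.

    A is a (p, n) data matrix whose n columns are the variables.
    Returns a unit-norm sparse loading vector z of length n
    (all zeros if gamma is too large for any variable to survive).
    """
    col_norms = np.linalg.norm(A, axis=0)
    # Start from the normalized column of largest norm (heuristic, assumed).
    x = A[:, np.argmax(col_norms)] / col_norms.max()
    prev = -np.inf
    for _ in range(n_iter):
        c = A.T @ x                                   # correlations a_i^T x
        t = np.maximum(np.abs(c) - gamma, 0.0)        # soft-thresholded magnitudes
        value = float(np.sum(t ** 2))                 # objective at the current x
        grad = A @ (np.sign(c) * t)                   # gradient up to a factor 2
        g_norm = np.linalg.norm(grad)
        if g_norm == 0.0:
            break                                     # every variable thresholded out
        x = grad / g_norm                             # maximizer of <grad, y> on the sphere
        if value - prev <= tol * max(1.0, abs(value)):
            break
        prev = value
    c = A.T @ x
    z = np.sign(c) * np.maximum(np.abs(c) - gamma, 0.0)
    z_norm = np.linalg.norm(z)
    return z / z_norm if z_norm > 0 else z
```

Note that the iterate x lives in the row space of dimension p, not in the n-dimensional space of variables, which matches the remark that the search space shrinks when the data matrix has many more columns than rows.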
EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis
Data clustering has received a lot of attention and numerous methods,
algorithms and software packages are available. Among these techniques,
parametric finite-mixture models play a central role due to their interesting
mathematical properties and to the existence of maximum-likelihood estimators
based on expectation-maximization (EM). In this paper we propose a new mixture
model that associates a weight with each observed point. We introduce the
weighted-data Gaussian mixture and we derive two EM algorithms. The first one
considers a fixed weight for each observation. The second one treats each
weight as a random variable following a gamma distribution. We propose a model
selection method based on a minimum message length criterion, provide a weight
initialization strategy, and validate the proposed algorithms by comparing them
with several state-of-the-art parametric and non-parametric clustering
techniques. We also demonstrate the effectiveness and robustness of the
proposed clustering technique in the presence of heterogeneous data, namely
audio-visual scene analysis. Comment: 14 pages, 4 figures, 4 table
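
The fixed-weight variant described in this abstract can be sketched as an EM loop. The abstract does not state the model, so this example assumes that an observation x_i with weight w_i is drawn from component k as N(mu_k, Sigma_k / w_i), so that a larger weight means a more reliable point. The updates below follow from that assumption. The function name, the `init_means` and `reg` parameters, and the initialization are hypothetical, and the gamma-distributed-weight variant and the model selection step are not covered.

```python
import numpy as np


def weighted_gmm_em(X, w, K, n_iter=100, init_means=None, seed=0, reg=1e-6):
    """EM sketch for a Gaussian mixture with one fixed weight per point.

    Assumed model: x_i | component k ~ N(mu_k, Sigma_k / w_i).
    X: (n, d) data, w: (n,) positive weights, K: number of components.
    Returns mixing proportions, means, covariances, and responsibilities.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    if init_means is None:
        means = X[rng.choice(n, size=K, replace=False)].copy()
    else:
        means = np.array(init_means, dtype=float)
    base_cov = np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)
    covs = np.stack([base_cov.copy() for _ in range(K)])
    pis = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: log of pi_k * N(x_i; mu_k, Sigma_k / w_i) for every i, k.
        log_r = np.empty((n, K))
        for k in range(K):
            diff = X - means[k]
            maha = np.sum(diff * np.linalg.solve(covs[k], diff.T).T, axis=1)
            _, logdet = np.linalg.slogdet(covs[k])
            log_r[:, k] = np.log(pis[k]) - 0.5 * (
                d * np.log(2.0 * np.pi) + logdet - d * np.log(w) + w * maha
            )
        log_r -= log_r.max(axis=1, keepdims=True)     # stabilize before exponentiating
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: the weights enter the mean and the covariance numerator.
        for k in range(K):
            rk = r[:, k]
            wr = w * rk
            means[k] = wr @ X / wr.sum()
            diff = X - means[k]
            covs[k] = (diff * wr[:, None]).T @ diff / rk.sum() + reg * np.eye(d)
        pis = r.mean(axis=0)

    return pis, means, covs, r
```

With all weights equal to one, these updates reduce to the standard Gaussian mixture EM, which gives a simple sanity check for the sketch.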