109 research outputs found
On-line PCA with Optimal Regrets
We carefully investigate the on-line version of PCA, where in each trial a
learning algorithm plays a k-dimensional subspace, and suffers the compression
loss on the next instance when projected into the chosen subspace. In this
setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and
Exponentiated Gradient (EG). We show that both algorithms are essentially
optimal in the worst case. This comes as a surprise, since EG is known to
perform sub-optimally when the instances are sparse. This different behavior of
EG for PCA is mainly related to the non-negativity of the loss in this case,
which makes the PCA setting qualitatively different from other settings studied
in the literature. Furthermore, we show that when considering regret bounds as
a function of a loss budget, EG remains optimal and strictly outperforms GD.
Next, we study an extension of the PCA setting in which Nature is allowed
to play dense instances, which are positive matrices with bounded largest
eigenvalue. Again we can show that EG is optimal and strictly better than GD in
this setting.
Removing batch effects for prediction problems with frozen surrogate variable analysis
Batch effects are responsible for the failure of promising genomic prognostic
signatures, major ambiguities in published genomic results, and retractions
of widely publicized findings. Batch effect corrections have been developed to
remove these artifacts, but they are designed to be used in population
studies. Genomic technologies, however, are beginning to be used in clinical
applications where samples are analyzed one at a time for diagnostic,
prognostic, and predictive applications. There are currently no batch
correction methods that have been developed specifically for prediction. In
this paper, we propose a new method called frozen surrogate variable analysis
(fSVA) that borrows strength from a training set for individual sample batch
correction. We show that fSVA improves prediction accuracy in simulations and
in public genomic studies. fSVA is available as part of the sva Bioconductor
package.
Second-order Quantile Methods for Experts and Combinatorial Games
We aim to design strategies for sequential decision making that adjust to the
difficulty of the learning problem. We study this question both in the setting
of prediction with expert advice, and for more general combinatorial decision
tasks. We are not satisfied with just guaranteeing minimax regret rates, but we
want our algorithms to perform significantly better on easy data. Two popular
ways to formalize such adaptivity are second-order regret bounds and quantile
bounds. The underlying notions of 'easy data', which may be paraphrased as "the
learning problem has small variance" and "multiple decisions are useful", are
synergetic. But even though there are sophisticated algorithms that exploit one
of the two, no existing algorithm is able to adapt to both.
In this paper we outline a new method for obtaining such adaptive algorithms,
based on a potential function that aggregates a range of learning rates (which
are essential tuning parameters). By choosing the right prior we construct
efficient algorithms and show that they reap both benefits by proving the first
bounds that are both second-order and incorporate quantiles.