Online Isotonic Regression
We consider the online version of the isotonic regression problem. Given a
set of linearly ordered points (e.g., on the real line), the learner must
predict labels sequentially at adversarially chosen positions and is evaluated
by her total squared loss compared against the best isotonic (non-decreasing)
function in hindsight. We survey several standard online learning algorithms
and show that none of them achieve the optimal regret exponent; in fact, most
of them (including Online Gradient Descent, Follow the Leader and Exponential
Weights) incur linear regret. We then prove that the Exponential Weights
algorithm played over a covering net of isotonic functions has a regret bounded
by O(T^{1/3} log^{2/3}(T)) and present a matching Omega(T^{1/3}) lower
bound on regret. We provide a computationally efficient version of this
algorithm. We also analyze the noise-free case, in which the revealed labels
are isotonic, and show that the bound can be improved to O(log T) or even to
O(1) (when the labels are revealed in isotonic order). Finally, we extend the
analysis beyond squared loss and give bounds for entropic loss and absolute
loss.
Comment: 25 pages
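As context for the comparator class in this abstract, here is a minimal sketch of the pool-adjacent-violators algorithm, which computes the best isotonic (non-decreasing) fit in hindsight under squared loss. This is illustrative only, not the paper's online algorithm, and the function name is ours:

```python
def isotonic_fit(y):
    """Pool Adjacent Violators: best non-decreasing fit to y in squared loss."""
    # Maintain a stack of blocks, each storing (sum of values, count).
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Merge the last two blocks while their means violate the ordering,
        # comparing t1/c1 > t2/c2 via cross-multiplication.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back to one value (its mean) per position.
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit
```

For example, `isotonic_fit([3, 1, 2])` pools all three points into a single block with mean 2, since no non-decreasing function can fit the initial violation more closely.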
Defensive forecasting for optimal prediction with expert advice
The method of defensive forecasting is applied to the problem of prediction
with expert advice for binary outcomes. It turns out that defensive forecasting
is not only competitive with the Aggregating Algorithm but also handles the
case of "second-guessing" experts, whose advice depends on the learner's
prediction; this paper assumes that the dependence on the learner's prediction
is continuous.
Comment: 14 pages
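As background for the comparison above, here is a minimal sketch of an exponential-weights mixture forecaster for binary outcomes, a standard baseline in the spirit of the Aggregating Algorithm under log loss. This is not defensive forecasting itself, and the function name is an assumption of ours:

```python
def mixture_forecaster(expert_preds, outcomes):
    """Exponential weights (eta = 1, log loss) over experts' probability
    forecasts for binary outcomes; equivalent to a Bayes mixture."""
    n_experts = len(expert_preds[0])
    weights = [1.0 / n_experts] * n_experts
    learner = []
    for advice, y in zip(expert_preds, outcomes):
        # Predict with the weighted average of the experts' probabilities.
        p = sum(w * q for w, q in zip(weights, advice))
        learner.append(p)
        # Multiplicative update: reweight each expert by its likelihood of y.
        likes = [q if y == 1 else 1.0 - q for q in advice]
        weights = [w * l for w, l in zip(weights, likes)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return learner
```

With two experts forecasting 0.9 and 0.1 on a stream of 1s, the learner starts at 0.5 and moves toward the better expert's forecast as the weights concentrate.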
Flexible and accurate inference and learning for deep generative models
We introduce a new approach to learning in hierarchical latent-variable
generative models called the "distributed distributional code Helmholtz
machine", which emphasises flexibility and accuracy in the inferential process.
In common with the original Helmholtz machine and later variational autoencoder
algorithms (but unlike adversarial methods) our approach learns an explicit
inference or "recognition" model to approximate the posterior distribution over
the latent variables. Unlike in these earlier methods, the posterior
representation is not limited to a narrow tractable parameterised form (nor is
it represented by samples). To train the generative and recognition models we
develop an extended wake-sleep algorithm inspired by the original Helmholtz
Machine. This makes it possible to learn hierarchical latent models with both
discrete and continuous variables, where an accurate posterior representation
is essential. We demonstrate that the new algorithm outperforms current
state-of-the-art methods on synthetic data, natural image patches, and the
MNIST data set.
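A toy illustration of the classic wake-sleep loop that the extended algorithm builds on, with a single binary latent and binary observable. This is a pure-Python sketch of the original alternating scheme, not the distributed distributional code Helmholtz machine; all names and hyperparameters here are assumptions:

```python
import random

def wake_sleep_toy(data, steps=3000, lr=0.05, seed=0):
    """Toy wake-sleep with binary latent z and binary observable x.
    Generative model: p(z=1) = pz, p(x=1|z) = px[z].
    Recognition model: q(z=1|x) = qz[x]."""
    rng = random.Random(seed)
    pz, px, qz = 0.5, [0.5, 0.5], [0.5, 0.5]
    for _ in range(steps):
        # Wake phase: take a data point, infer z with the recognition
        # model, and nudge the generative parameters toward (z, x).
        x = rng.choice(data)
        z = 1 if rng.random() < qz[x] else 0
        pz += lr * (z - pz)
        px[z] += lr * (x - px[z])
        # Sleep phase: dream (z, x) from the generative model and nudge
        # the recognition parameters toward the dreamed latent.
        z = 1 if rng.random() < pz else 0
        x = 1 if rng.random() < px[z] else 0
        qz[x] += lr * (z - qz[x])
    return pz, px, qz
```

Run on data that is mostly 1s, the learned generative marginal p(x=1) = pz*px[1] + (1-pz)*px[0] drifts toward the empirical frequency, while the recognition model is fitted to invert the generative model's own dreams.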
- …