Generative and Discriminative Text Classification with Recurrent Neural Networks
We empirically characterize the performance of discriminative and generative
LSTM models for text classification. We find that although RNN-based generative
models are more powerful than their bag-of-words ancestors (e.g., they account
for conditional dependencies across words in a document), they have higher
asymptotic error rates than discriminatively trained RNN models. However, we
also find that generative models approach their asymptotic error rate more
rapidly than their discriminative counterparts, the same pattern that Ng &
Jordan (2001) proved holds for linear classification models that make more
naive conditional independence assumptions. Building on this finding, we
hypothesize that RNN-based generative classification models will be more robust
to shifts in the data distribution. This hypothesis is confirmed in a series of
experiments in zero-shot and continual learning settings that show that
generative models substantially outperform discriminative models.
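
To make the abstract's contrast concrete, the following is a minimal sketch of the two model families being compared (assuming PyTorch; the vocabulary size, dimensions, and data are toy placeholders, not the paper's setup). The generative classifier fits one LSTM language model per class and scores a document by log p(x | y) + log p(y); the discriminative classifier encodes the document with a single LSTM and predicts p(y | x) directly.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, EMB, HID, CLASSES = 1000, 32, 64, 2

    class LSTMLanguageModel(nn.Module):
        """Per-class generative model: predicts the next token at each step."""
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.lstm = nn.LSTM(EMB, HID, batch_first=True)
            self.out = nn.Linear(HID, VOCAB)

        def log_prob(self, x):  # x: (batch, time) token ids
            h, _ = self.lstm(self.emb(x[:, :-1]))
            logp = F.log_softmax(self.out(h), dim=-1)        # next-token dist.
            tok_logp = logp.gather(-1, x[:, 1:].unsqueeze(-1)).squeeze(-1)
            return tok_logp.sum(dim=1)                       # log p(x | y)

    class DiscriminativeLSTM(nn.Module):
        """Single LSTM encoder followed by a softmax over class labels."""
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.lstm = nn.LSTM(EMB, HID, batch_first=True)
            self.out = nn.Linear(HID, CLASSES)

        def forward(self, x):
            _, (h_n, _) = self.lstm(self.emb(x))
            return self.out(h_n[-1])                         # logits of p(y | x)

    # Generative classification: argmax_y  log p(x | y) + log p(y),
    # with one language model per class (training loops omitted).
    lms = [LSTMLanguageModel() for _ in range(CLASSES)]
    log_prior = torch.log(torch.full((CLASSES,), 1.0 / CLASSES))
    x = torch.randint(0, VOCAB, (4, 20))                     # toy documents
    scores = torch.stack([lm.log_prob(x) for lm in lms], dim=1) + log_prior
    gen_pred = scores.argmax(dim=1)

    # Discriminative classification: argmax_y p(y | x) from a single model.
    disc_pred = DiscriminativeLSTM()(x).argmax(dim=1)

Note that in the generative route each per-class language model is trained only on documents of its own class, which is one intuition for why it can approach its (higher) asymptotic error rate with fewer labeled examples, as the abstract reports.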
Preferential Batch Bayesian Optimization
Most research in Bayesian optimization (BO) has focused on direct feedback
scenarios, where one has access to exact, or perturbed, values of some
expensive-to-evaluate objective. This direction has been driven mainly by the
use of BO in machine learning hyper-parameter configuration problems. However,
in domains such as modelling human preferences, A/B tests, or recommender
systems, there is a need for methods that can replace direct feedback with
preferential feedback, obtained via rankings or pairwise
comparisons. In this work, we present Preferential Batch Bayesian Optimization
(PBBO), a new framework for finding the optimum of a latent function of
interest, given any type of parallel preferential feedback for a group of two
or more points. We do so by using a Gaussian process model with a likelihood
specially designed to enable parallel and efficient data collection mechanisms,
which are key in modern machine learning. We show how the acquisitions
developed under this framework generalize and augment previous approaches in
Bayesian optimization, expanding the use of these techniques to a wider range
of domains. An extensive simulation study shows the benefits of this approach,
on both simulated functions and four real data sets.
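
As a concrete illustration of preferential feedback, here is a minimal sketch (assuming NumPy/SciPy; it uses a classic pairwise probit preference likelihood in the style of Chu & Ghahramani, not PBBO's batch likelihood, and all names and constants are illustrative) that recovers a MAP latent utility under a Gaussian process prior from winner/loser comparisons alone.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def rbf_kernel(X, lengthscale=0.3, variance=1.0):
        """Squared-exponential kernel for the GP prior over utilities."""
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return variance * np.exp(-0.5 * d2 / lengthscale**2)

    def neg_log_posterior(f, K_inv, pairs, sigma=0.1):
        """-log p(pairs | f) - log p(f), up to constants: each
        (winner, loser) pair contributes log Phi((f[w] - f[l]) / (sqrt(2)*sigma))."""
        z = (f[pairs[:, 0]] - f[pairs[:, 1]]) / (np.sqrt(2) * sigma)
        return -(norm.logcdf(z).sum() - 0.5 * f @ K_inv @ f)

    # Toy data: 5 candidate points; comparisons induced by a hidden utility.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(5, 1))
    utility = np.sin(3 * X[:, 0])
    pairs = np.array([(i, j) for i in range(5) for j in range(5)
                      if utility[i] > utility[j]])

    K = rbf_kernel(X) + 1e-6 * np.eye(len(X))        # jitter for stability
    K_inv = np.linalg.inv(K)
    res = minimize(neg_log_posterior, np.zeros(len(X)), args=(K_inv, pairs))
    print("inferred ranking:", np.argsort(-res.x))
    print("true ranking:    ", np.argsort(-utility))

A batch framework such as PBBO generalizes this one-pair-at-a-time setup by designing the likelihood so that a whole group of two or more points can be compared in parallel, which is what enables the efficient data collection the abstract describes.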