Generative and Discriminative Text Classification with Recurrent Neural Networks
We empirically characterize the performance of discriminative and generative
LSTM models for text classification. We find that although RNN-based generative
models are more powerful than their bag-of-words ancestors (e.g., they account
for conditional dependencies across words in a document), they have higher
asymptotic error rates than discriminatively trained RNN models. However, we
also find that generative models approach their asymptotic error rate more
rapidly than their discriminative counterparts---the same pattern that Ng &
Jordan (2001) proved holds for linear classification models that make more
naive conditional independence assumptions. Building on this finding, we
hypothesize that RNN-based generative classification models will be more robust
to shifts in the data distribution. This hypothesis is confirmed in a series of
experiments in zero-shot and continual learning settings that show that
generative models substantially outperform discriminative models.
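
The contrast the abstract draws can be written as two decision rules; this is a minimal sketch in our own notation (x_{1:T} is the token sequence of a document and y its label), not necessarily the paper's exact formulation:

\[
\hat{y}_{\mathrm{disc}} = \arg\max_{y}\; p_\theta(y \mid x_{1:T}),
\qquad
\hat{y}_{\mathrm{gen}} = \arg\max_{y}\; p(y) \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t}, y).
\]

The discriminative LSTM reads the document and predicts the label directly; the generative LSTM is a class-conditional language model, and the label is recovered via Bayes' rule, which is what lets it capture conditional dependencies across words rather than treating the document as a bag of words.
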
Continual Classification Learning Using Generative Models
Continual learning is the ability to sequentially learn over time by
accommodating new knowledge while retaining previously learned experiences. Neural
networks can learn multiple tasks when trained on them jointly, but cannot
maintain performance on previously learned tasks when tasks are presented one
at a time. This problem is called catastrophic forgetting. In this work, we
propose a classification model that learns continuously from sequentially
observed tasks, while preventing catastrophic forgetting. We build on the
lifelong generative capabilities of [10] and extend it to the classification
setting by deriving a new variational bound on the joint log likelihood, log p(x, y).
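
The abstract leaves the bound implicit; one standard variational bound on the joint log likelihood log p(x, y), using a latent code z and assuming x and y are conditionally independent given z, is (the paper's derivation, which builds on [10], may differ in detail):

\[
\log p_\theta(x, y) \;\geq\; \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z) + \log p_\theta(y \mid z)\bigr] \;-\; \mathrm{KL}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr).
\]

The bound couples a reconstruction term, a classification term, and a KL regularizer, so a single objective trains the generative model and the classifier together.
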