4 research outputs found
MedLDA: A General Framework of Maximum Margin Supervised Topic Models
Supervised topic models utilize documents' side information for discovering
predictive low-dimensional representations of documents. Existing models apply
likelihood-based estimation. In this paper, we present a general framework
of max-margin supervised topic models for both continuous and categorical
response variables. Our approach, the maximum entropy discrimination latent
Dirichlet allocation (MedLDA), utilizes the max-margin principle to train
supervised topic models and estimate predictive topic representations that are
arguably more suitable for prediction tasks. The general principle of MedLDA
can be applied to perform joint max-margin learning and maximum likelihood
estimation for arbitrary topic models, directed or undirected, and supervised
or unsupervised, when the supervised side information is available. We develop
efficient variational methods for posterior inference and parameter estimation,
and demonstrate qualitatively and quantitatively the advantages of MedLDA over
likelihood-based topic models on movie review and 20 Newsgroups data sets.
Comment: 27 pages
Max-margin Deep Generative Models
Deep generative models (DGMs) are effective at learning multilayered
representations of complex data and performing inference of input data by
exploring the generative ability. However, little work has been done on
examining or empowering the discriminative ability of DGMs in making accurate
predictions. This paper presents max-margin deep generative models (mmDGMs),
which explore the strongly discriminative principle of max-margin learning to
improve the discriminative power of DGMs, while retaining the generative
capability. We develop an efficient doubly stochastic subgradient algorithm for
the piecewise linear objective. Empirical results on MNIST and SVHN datasets
demonstrate that (1) max-margin learning can significantly improve the
prediction performance of DGMs while retaining the generative ability; and
(2) mmDGMs are competitive with the state-of-the-art fully discriminative
networks by employing deep convolutional neural networks (CNNs) as both
recognition and generative models.
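As a rough illustration of what a stochastic subgradient step on a piecewise linear (hinge) objective can look like, here is a minimal sketch in plain Python. It is not the mmDGM algorithm: the linear classifier, the toy data, and the `sample_latent` helper (a hypothetical stand-in for drawing a latent code from a recognition model) are all assumptions made for illustration. The two sources of randomness, mini-batch sampling and latent sampling, are what "doubly stochastic" is taken to mean here.

```python
import random

def hinge_subgradient(w, z, y):
    """Subgradient of max(0, 1 - y * <w, z>) with respect to w."""
    margin = y * sum(wi * zi for wi, zi in zip(w, z))
    if margin >= 1.0:
        return [0.0] * len(w)       # flat piece of the loss
    return [-y * zi for zi in z]    # active linear piece

def sample_latent(x, rng, noise=0.1):
    """Hypothetical stand-in for sampling a latent code given input x."""
    return [xi + rng.gauss(0.0, noise) for xi in x]

def train(data, dim, steps=2000, batch=4, lr=0.05, reg=0.01, seed=0):
    """Stochastic subgradient descent on an L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        grad = [reg * wi for wi in w]            # gradient of (reg / 2) * ||w||^2
        for x, y in rng.sample(data, batch):     # randomness 1: mini-batch
            z = sample_latent(x, rng)            # randomness 2: latent sample
            g = hinge_subgradient(w, z, y)
            grad = [a + b / batch for a, b in zip(grad, g)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Toy, linearly separable data with labels in {+1, -1} (assumed for illustration).
TOY_DATA = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([-2.0, -1.0], -1), ([-1.0, -2.5], -1)]
weights = train(TOY_DATA, dim=2)
```

Because the hinge loss is not differentiable at the margin, the update uses a subgradient: zero on the flat piece and the negative signed feature vector on the active piece.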
Maximum Entropy Discrimination Markov Networks
In this paper, we present a novel and general framework called Maximum
Entropy Discrimination Markov Networks (MaxEnDNet), which integrates the
max-margin structured learning and Bayesian-style estimation and combines and
extends their merits. Major innovations of this model include: 1) It
generalizes the extant Markov network prediction rule based on a point
estimator of weights to a Bayesian-style estimator that integrates over a
learned distribution of the weights. 2) It extends the conventional max-entropy
discrimination learning of classification rule to a new structural max-entropy
discrimination paradigm of learning the distribution of Markov networks. 3) It
subsumes the well-known and powerful Maximum Margin Markov network (M3N) as
a special case, and leads to a model similar to an L1-regularized M3N
that is simultaneously primal and dual sparse, or other types of Markov network
by plugging in different prior distributions of the weights. 4) It offers a
simple inference algorithm that combines existing variational inference and
convex-optimization based M3N solvers as subroutines. 5) It offers a
PAC-Bayesian style generalization bound. This work represents the first
successful attempt to combine Bayesian-style learning (based on generative
models) with structured maximum margin learning (based on a discriminative
model), and outperforms a wide array of competing methods for structured
input/output learning on both synthetic and real data sets.
Comment: 39 pages
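To make the contrast in point 1) concrete, the following minimal sketch compares a point-estimate prediction rule with one that averages scores over a distribution of weights. It is an illustration under stated assumptions, not the MaxEnDNet implementation: the independent Gaussian weight distribution, the Monte Carlo approximation of the integral, the flat two-label toy problem, and all function names are hypothetical.

```python
import random

def score(w, features):
    """Linear score <w, f(x, y)> for one candidate label."""
    return sum(wi * fi for wi, fi in zip(w, features))

def point_predict(feature_map, labels, w):
    """Point-estimate rule: pick the label with the highest score under one w."""
    return max(labels, key=lambda y: score(w, feature_map[y]))

def averaged_predict(feature_map, labels, mean, std, n_samples=500, seed=0):
    """Averaging rule: pick the label with the highest expected score, where
    the expectation over weights is approximated by Monte Carlo sampling
    from an assumed independent Gaussian."""
    rng = random.Random(seed)
    totals = {y: 0.0 for y in labels}
    for _ in range(n_samples):
        w = [rng.gauss(m, std) for m in mean]
        for y in labels:
            totals[y] += score(w, feature_map[y])
    return max(labels, key=lambda y: totals[y] / n_samples)

# Toy joint features f(x, y) for a single input x and two candidate labels.
FEATURES = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
LABELS = ["A", "B"]
MEAN_W = [2.0, -1.0]

avg_label = averaged_predict(FEATURES, LABELS, MEAN_W, std=0.5)
point_label = point_predict(FEATURES, LABELS, MEAN_W)
```

For a score that is linear in the weights, the expected score equals the score under the mean weights, so the two rules agree in this toy case up to sampling noise. The averaging rule only behaves differently once the score is nonlinear in the weights or the learned weight distribution is chosen to induce sparsity or other structure.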
Partially Observed Maximum Entropy Discrimination Markov Networks
Learning graphical models with hidden variables can offer semantic insights to complex data and lead to salient structured predictors without relying on expensive, sometimes unattainable, fully annotated training data. While likelihood-based methods have been extensively explored, to our knowledge, learning structured prediction models with latent variables based on the max-margin principle remains largely an open problem. In this paper, we present a partially observed Maximum Entropy Discrimination Markov Network (PoMEN) model that attempts to combine the advantages of Bayesian and margin-based paradigms for learning Markov networks from partially labeled data. PoMEN leads to an averaging prediction rule that resembles a Bayes predictor and is more robust to overfitting, but is also built on desirable discriminative laws that resemble those of the M3N. We develop an EM-style algorithm utilizing existing convex optimization algorithms for M3N as a subroutine. We demonstrate competent performance of PoMEN over existing methods on a real-world web data extraction task.