2 research outputs found
Large Margin Boltzmann Machines and Large Margin Sigmoid Belief Networks
Current statistical models for structured prediction make simplifying
assumptions about the underlying output graph structure, such as assuming a
low-order Markov chain, because exact inference becomes intractable as the
tree-width of the underlying graph increases. Approximate inference algorithms,
on the other hand, force one to trade off representational power with
computational efficiency. In this paper, we propose two new types of
probabilistic graphical models, large margin Boltzmann machines (LMBMs) and
large margin sigmoid belief networks (LMSBNs), for structured prediction.
LMSBNs in particular allow a very fast inference algorithm for arbitrary graph
structures that runs in polynomial time with a high probability. This
probability is data-distribution dependent and is maximized in learning. The
new approach overcomes the representation-efficiency trade-off in previous
models and allows fast structured prediction with complicated graph structures.
We present results from applying a fully connected model to multi-label scene
classification and demonstrate that the proposed approach can yield significant
performance gains over current state-of-the-art methods
Large Margin Boltzmann Machines β
Boltzmann Machines are a powerful class of undirected graphical models. Originally proposed as artificial neural networks, they can be regarded as a type of Markov Random Field in which the connection weights between nodes are symmetric and learned from data. They are also closely related to recent models such as Markov logic networks and Conditional Random Fields. A major challenge for Boltzmann machines (as well as other graphical models) is speeding up learning for large-scale problems. The heart of the problem lies in efficiently and effectively approximating the partition function. In this paper, we propose a new efficient learning algorithm for Boltzmann machines that allows them to be applied to problems with large numbers of random variables. We introduce a new large-margin variational approximation to the partition function that allows Boltzmann machines to be trained using a support vector machine (SVM) style learning algorithm. For discriminative learning tasks, these large margin Boltzmann machines provide an alternative approach to structural SVMs. We show that these machines have low sample complexity and derive a generalization bound. Our results demonstrate that on multilabel classification problems, large margin Boltzmann machines achieve orders of magnitude faster performance than structural SVMs and also outperform structural SVMs on problems with large numbers of labels.