Learning as MAP Inference in Discrete Graphical Models

Abstract

We present a new formulation for binary classification. Instead of relying on convex losses and regularizers such as in SVMs, logistic regression and boosting, or on non-convex but continuous formulations such as those encountered in neural networks and deep belief networks, our framework entails a non-convex but discrete formulation, where estimation amounts to finding a MAP configuration in a graphical model whose potential functions are low-dimensional discrete surrogates for the misclassification loss. We argue that such a discrete formulation can naturally account for a number of issues that are typically encountered in either the convex or the continuous non-convex approaches, or both. By reducing the learning problem to a MAP inference problem, we can immediately translate the guarantees available for many inference settings to the learning problem itself. We empirically demonstrate in a number of experiments that this approach is promising in dealing with issues such as severe label noise, while still having global optimality guarantees. Due to the discrete nature of the formulation, it also allows for direct regularization through cardinality-based penalties, such as the ℓ0 pseudo-norm, thus providing the ability to perform feature selection and trade off interpretability and predictability in a principled manner. We also outline a number of open problems arising from the formulation.
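To make the general idea concrete, the following is a minimal, hypothetical sketch rather than the paper's actual model: weights are restricted to a small discrete set, the objective is the 0-1 misclassification loss plus an ℓ0 (cardinality) penalty, and "learning" is finding the minimum-energy (MAP) configuration. The function names, the value set {-1, 0, 1}, and the penalty weight are illustrative assumptions, and brute-force enumeration stands in for proper graphical-model inference, so it is only feasible for a handful of features.

```python
# Hypothetical illustration: learning as MAP over a discrete weight space.
import itertools
import numpy as np

def zero_one_loss(w, X, y):
    """Number of points misclassified by sign(X @ w)."""
    preds = np.sign(X @ w)
    preds[preds == 0] = 1            # break ties toward +1
    return int(np.sum(preds != y))

def map_learn(X, y, values=(-1, 0, 1), lam=0.5):
    """Brute-force MAP over discrete weight vectors: loss + lam * ||w||_0."""
    d = X.shape[1]
    best_w, best_energy = None, np.inf
    for w in itertools.product(values, repeat=d):
        w = np.array(w, dtype=float)
        energy = zero_one_loss(w, X, y) + lam * np.count_nonzero(w)
        if energy < best_energy:
            best_w, best_energy = w, energy
    return best_w, best_energy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))
    y = np.sign(X[:, 0] - X[:, 1])           # only two features are relevant
    y[rng.random(100) < 0.2] *= -1           # severe label noise
    w, e = map_learn(X, y)
    print("MAP weights:", w, "energy:", e)
```

The ℓ0 term directly prices each nonzero weight, so sparse (feature-selecting) solutions emerge without the surrogate relaxations a continuous formulation would require.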
