Statistical signal processing with nonnegativity constraints
Nonnegativity constraints arise frequently in statistical learning and pattern recognition. Multiplicative updates provide natural solutions to optimizations involving these constraints. One well-known set of multiplicative updates is given by the Expectation-Maximization algorithm for hidden Markov models, as used in automatic speech recognition. Recently, we have derived similar algorithms for nonnegative deconvolution and nonnegative quadratic programming. These algorithms have applications to low-level problems in voice processing, such as fundamental frequency estimation, as well as high-level problems, such as the training of large margin classifiers. In this paper, we describe these algorithms and the ideas that connect them.
Multiplicative Updates for Nonnegative Quadratic Programming
Many problems in neural computation and statistical learning involve optimizations with nonnegativity constraints. In this article, we study convex problems in quadratic programming where the optimization is confined to an axis-aligned region in the nonnegative orthant. For these problems, we derive multiplicative updates that improve the value of the objective function at each iteration and converge monotonically to the global minimum. The updates have a simple closed form and do not involve any heuristics or free parameters that must be tuned to ensure convergence. Despite their simplicity, they differ strikingly in form from other multiplicative updates used in machine learning. We provide complete proofs of convergence for these updates and describe their application to problems in signal processing and pattern recognition.
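The closed-form updates described above can be sketched as follows. This is a hedged reconstruction of the style of multiplicative update for nonnegative quadratic programming, in which the matrix A is split elementwise into positive and negative parts; the function name and the toy problem in the usage note are illustrative assumptions, not taken from the article:

```python
import numpy as np

def nqp_multiplicative(A, b, v0, iters=500):
    """Multiplicative updates for min_{v >= 0} 0.5 v^T A v + b^T v.

    Sketch under the assumption that A is split elementwise into a
    positive part A+ and the magnitudes A- of its negative part, and
    each coordinate of v is rescaled by a nonnegative factor.
    """
    Ap = np.maximum(A, 0.0)    # A+ : positive elements of A
    Am = np.maximum(-A, 0.0)   # A- : magnitudes of negative elements
    v = np.asarray(v0, dtype=float).copy()
    for _ in range(iters):
        a = Ap @ v             # (A+ v)_i, assumed positive here
        c = Am @ v             # (A- v)_i
        # The rescaling factor is nonnegative, so the iterates never
        # leave the nonnegative orthant -- no projection step is needed.
        v = v * (-b + np.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)
    return v
```

On a small convex instance such as A = [[2, -1], [-1, 2]], b = (-1, -1), the iterates reach the minimizer (1, 1), which for this instance happens to lie inside the nonnegative orthant. Note the absence of a step size: as the abstract emphasizes, there are no free parameters to tune.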
A Winnow-Based Approach to Context-Sensitive Spelling Correction
A large class of machine-learning problems in natural language require the
characterization of linguistic context. Two characteristic properties of such
problems are that their feature space is of very high dimensionality, and their
target concepts refer to only a small subset of the features in the space.
Under such conditions, multiplicative weight-update algorithms such as Winnow
have been shown to have exceptionally good theoretical properties. We present
an algorithm combining variants of Winnow and weighted-majority voting, and
apply it to a problem in the aforementioned class: context-sensitive spelling
correction. This is the task of fixing spelling errors that happen to result in
valid words, such as substituting "to" for "too", "casual" for "causal", etc.
We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a
statistics-based method representing the state of the art for this task. We
find: (1) When run with a full (unpruned) set of features, WinSpell achieves
accuracies significantly higher than BaySpell was able to achieve in either the
pruned or unpruned condition; (2) When compared with other systems in the
literature, WinSpell exhibits the highest performance; (3) The primary reason
that WinSpell outperforms BaySpell is that WinSpell learns a better linear
separator; (4) When run on a test set drawn from a different corpus than the
training set was drawn from, WinSpell is better able than BaySpell to adapt,
using a strategy we will present that combines supervised learning on the
training set with unsupervised learning on the (noisy) test set.Comment: To appear in Machine Learning, Special Issue on Natural Language
Learning, 1999. 25 page
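For readers unfamiliar with Winnow, the multiplicative weight update at its core can be sketched in a few lines. The threshold choice, promotion factor, and toy disjunction data below are illustrative assumptions; WinSpell itself combines Winnow variants with weighted-majority voting, which is not shown:

```python
def winnow_train(examples, n, alpha=2.0):
    """Winnow over n Boolean features.

    examples: iterable of (active_feature_indices, label), label in {0, 1}.
    Mistake-driven multiplicative updates: promote active weights by alpha
    on a false negative, demote them by 1/alpha on a false positive.
    """
    w = [1.0] * n          # weights start uniform at 1
    theta = n / 2.0        # one common threshold choice (an assumption here)
    for active, y in examples:
        yhat = 1 if sum(w[i] for i in active) >= theta else 0
        if yhat != y:
            factor = alpha if y == 1 else 1.0 / alpha
            for i in active:
                w[i] *= factor
    return w, theta

def winnow_predict(w, theta, active):
    return 1 if sum(w[i] for i in active) >= theta else 0
```

Because only the weights of *active* features are touched on a mistake, and the mistake bound grows only logarithmically with the total number of features, the algorithm suits exactly the setting the abstract describes: very high-dimensional feature spaces in which the target concept depends on a small subset of features.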
Robust Large-Margin Learning in Hyperbolic Space
Recently, there has been a surge of interest in representation learning in
hyperbolic spaces, driven by their ability to represent hierarchical data with
significantly fewer dimensions than standard Euclidean spaces. However, the
viability and benefits of hyperbolic spaces for downstream machine learning
tasks have received less attention. In this paper, we present, to our
knowledge, the first theoretical guarantees for learning a classifier in
hyperbolic rather than Euclidean space. Specifically, we consider the problem
of learning a large-margin classifier for data possessing a hierarchical
structure. Our first contribution is a hyperbolic perceptron algorithm, which
provably converges to a separating hyperplane. We then provide an algorithm to
efficiently learn a large-margin hyperplane, relying on the careful injection
of adversarial examples. Finally, we prove that for hierarchical data that
embeds well into hyperbolic space, the low embedding dimension ensures superior
guarantees when learning the classifier directly in hyperbolic space.
Comment: Accepted to NeurIPS 202
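As context for the hyperbolic perceptron, here is the classical Euclidean perceptron it generalizes: a mistake-driven update that provably converges to a separating hyperplane when the data are linearly separable. The hyperbolic version replaces the Euclidean inner product and update with hyperboloid-model counterparts; the toy data and names below are illustrative, not from the paper:

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Classical Euclidean perceptron through the origin.

    X: (n, d) array of points; y: labels in {-1, +1}.
    On each mistake (y_i <w, x_i> <= 0), move the weight vector
    toward the misclassified point: w <- w + y_i x_i.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:   # mistake-driven update
                w += yi * xi
                mistakes += 1
        if mistakes == 0:            # converged: all points separated
            break
    return w
```

For separable data with margin gamma and radius R, this makes at most (R/gamma)^2 mistakes, a bound that depends on the margin rather than the ambient dimension; the paper's point is that embedding hierarchical data in a low-dimensional hyperbolic space can make the analogous guarantee sharper than its Euclidean counterpart.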