A Winnow-Based Approach to Context-Sensitive Spelling Correction
A large class of machine-learning problems in natural language require the
characterization of linguistic context. Two characteristic properties of such
problems are that their feature space is of very high dimensionality, and their
target concepts refer to only a small subset of the features in the space.
Under such conditions, multiplicative weight-update algorithms such as Winnow
have been shown to have exceptionally good theoretical properties. We present
an algorithm combining variants of Winnow and weighted-majority voting, and
apply it to a problem in the aforementioned class: context-sensitive spelling
correction. This is the task of fixing spelling errors that happen to result in
valid words, such as substituting "to" for "too", "casual" for "causal", etc.
We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a
statistics-based method representing the state of the art for this task. We
find: (1) When run with a full (unpruned) set of features, WinSpell achieves
accuracies significantly higher than BaySpell was able to achieve in either the
pruned or unpruned condition; (2) When compared with other systems in the
literature, WinSpell exhibits the highest performance; (3) The primary reason
that WinSpell outperforms BaySpell is that WinSpell learns a better linear
separator; (4) When run on a test set drawn from a different corpus than the
training set was drawn from, WinSpell is better able than BaySpell to adapt,
using a strategy we will present that combines supervised learning on the
training set with unsupervised learning on the (noisy) test set.Comment: To appear in Machine Learning, Special Issue on Natural Language
Learning, 1999. 25 page
Applying Winnow to Context-Sensitive Spelling Correction
Multiplicative weight-updating algorithms such as Winnow have been studied
extensively in the COLT literature, but only recently have people started to
use them in applications. In this paper, we apply a Winnow-based algorithm to a
task in natural language: context-sensitive spelling correction. This is the
task of fixing spelling errors that happen to result in valid words, such as
substituting {\it to\/} for {\it too}, {\it casual\/} for {\it causal}, and so
on. Previous approaches to this problem have been statistics-based; we compare
Winnow to one of the more successful such approaches, which uses Bayesian
classifiers. We find that: (1)~When the standard (heavily-pruned) set of
features is used to describe problem instances, Winnow performs comparably to
the Bayesian method; (2)~When the full (unpruned) set of features is used,
Winnow is able to exploit the new features and convincingly outperform Bayes;
and (3)~When a test set is encountered that is dissimilar to the training set,
Winnow is better than Bayes at adapting to the unfamiliar test set, using a
strategy we will present for combining learning on the training set with
unsupervised learning on the (noisy) test set.
Comment: 9 pages
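The multiplicative promotion/demotion update at the heart of Winnow can be sketched as follows. This is a minimal illustrative single classifier over binary features, not the papers' full systems (which combine Winnow variants with weighted-majority voting); the function name and parameters are hypothetical.

```python
# Minimal sketch of Winnow's multiplicative update for binary features.
# Illustrative only: a single classifier, hypothetical names/parameters.

def winnow_train(examples, n_features, alpha=1.5, theta=None):
    """examples: list of (active_feature_indices, label), label in {0, 1}."""
    if theta is None:
        theta = n_features / 2.0        # a common threshold choice
    w = [1.0] * n_features              # positive init for multiplicative updates
    for active, label in examples:
        score = sum(w[i] for i in active)
        pred = 1 if score >= theta else 0
        if pred == 0 and label == 1:    # promotion: boost active features
            for i in active:
                w[i] *= alpha
        elif pred == 1 and label == 0:  # demotion: shrink active features
            for i in active:
                w[i] /= alpha
    return w, theta
```

For context-sensitive spelling correction, the active features would be indicators of the linguistic context, e.g. "word W appears within k words of the confusable target" or collocations of surrounding tags; since each example activates only a few of a very large feature set, the multiplicative update leaves inactive weights untouched, which is exactly the sparse, high-dimensional regime the abstracts describe.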
Compositional Vector Space Models for Knowledge Base Completion
Knowledge base (KB) completion adds new facts to a KB by making inferences
from existing facts, for example by inferring with high likelihood
nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop
relational synonyms like this, or use as evidence a multi-hop relational path
treated as an atomic feature, like bornIn(X,Z) -> containedIn(Z,Y). This paper
presents an approach that reasons about conjunctions of multi-hop relations
non-atomically, composing the implications of a path using a recursive neural
network (RNN) that takes as inputs vector embeddings of the binary relations in
the path. Not only does this allow us to generalize to paths unseen at training
time, but also, with a single high-capacity RNN, to predict new relation types
not seen when the compositional model was trained (zero-shot learning). We
assemble a new dataset of over 52M relational triples, and show that our method
improves over a traditional classifier by 11%, and a method leveraging
pre-trained embeddings by 7%.
Comment: The 53rd Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference of the Asian
Federation of Natural Language Processing, 201
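The recurrent composition step can be sketched as below; the tanh recurrence, weight shapes, and dot-product scoring are illustrative assumptions, not the published architecture.

```python
import numpy as np

# Hedged sketch of composing a multi-hop relation path with a recurrent
# step, in the spirit of the compositional model above; the tanh
# recurrence, weight shapes, and dot-product scoring are assumptions.

def compose_path(relation_vecs, W, h0):
    """Fold the embeddings of the relations along a path into one vector."""
    h = h0
    for r in relation_vecs:
        h = np.tanh(W @ h + r)          # one recurrent composition step
    return h

def score(path_vec, target_rel_vec):
    """Higher dot product = the path is better evidence for the relation."""
    return float(path_vec @ target_rel_vec)
```

Because the same weight matrix is shared across steps, a path of any length, including paths unseen at training time, maps to a fixed-size vector that can be scored against any target relation embedding; this is what makes the zero-shot prediction described above possible.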
Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models
This paper presents a new Markov chain Monte Carlo method to sample from the
posterior distribution of conjugate mixture models. This algorithm relies on a
flexible split-merge procedure built using the particle Gibbs sampler. Contrary
to available split-merge procedures, the resulting so-called Particle Gibbs
Split-Merge sampler does not require the computation of a complex acceptance
ratio, is simple to implement using existing sequential Monte Carlo libraries
and can be parallelized. We investigate its performance experimentally on
synthetic problems as well as on geolocation and cancer genomics data. In all
these examples, the particle Gibbs split-merge sampler outperforms
state-of-the-art split-merge methods by up to an order of magnitude at a fixed
computational cost.
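For context, a plain collapsed Gibbs sampler for a conjugate mixture, the kind of baseline whose slow mixing between partition modes motivates split-merge moves in the first place, can be sketched as follows. The 1-D known-variance Gaussian model and all hyperparameters are illustrative choices, not the paper's setup.

```python
import math
import random

# Hedged sketch: collapsed Gibbs for a finite symmetric-Dirichlet mixture
# of 1-D Gaussians with known variance. Illustrative baseline only; it
# resamples one assignment at a time, which is why it can get stuck in
# local partition modes that split-merge samplers escape.

def collapsed_gibbs(x, K=2, sweeps=50, alpha=1.0,
                    sigma2=1.0, mu0=0.0, tau2=100.0, seed=0):
    rng = random.Random(seed)
    z = [rng.randrange(K) for _ in x]       # random initial assignments
    n = [0.0] * K                            # per-cluster counts
    s = [0.0] * K                            # per-cluster sums
    for xi, zi in zip(x, z):
        n[zi] += 1
        s[zi] += xi

    def predictive(xi, k):
        # Posterior-predictive density of xi given cluster k's members
        # (conjugate Normal prior on the mean, known variance sigma2).
        prec = 1.0 / tau2 + n[k] / sigma2
        m = (mu0 / tau2 + s[k] / sigma2) / prec
        v = 1.0 / prec + sigma2
        return math.exp(-(xi - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

    for _ in range(sweeps):
        for i, xi in enumerate(x):
            n[z[i]] -= 1                     # remove xi from its cluster
            s[z[i]] -= xi
            p = [(n[k] + alpha / K) * predictive(xi, k) for k in range(K)]
            r = rng.random() * sum(p)
            acc, k = 0.0, K - 1
            for j in range(K):               # sample a new cluster for xi
                acc += p[j]
                if r < acc:
                    k = j
                    break
            z[i] = k
            n[k] += 1
            s[k] += xi
    return z
```

Conjugacy is what makes each predictive density available in closed form; the split-merge procedure in the paper instead proposes coordinated reassignments of whole blocks of points, built with a particle Gibbs sampler.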
Selectivity in binary fluid mixtures: static and dynamical properties
Selectivity of particles in a region of space can be achieved by applying
external potentials to influence the particles in that region. We investigate
static and dynamical properties of size selectivity in binary fluid mixtures of
two particle sizes. We find that applying an external potential that attracts
both kinds of particles can, through crowding effects, expel one species of
particles from that region whilst attracting the other species into the region
where the potential is applied. This
selectivity of one species of particle over the other in a localized region of
space depends on the density and composition of the fluid mixture. Applying an
external potential that repels both kinds of particles leads to selectivity of
the opposite species of particles to the selectivity with attractive
potentials. We use equilibrium and dynamical density functional theory to
describe and understand the static and dynamical properties of this striking
phenomenon. Selectivity by some ion channels is believed to be due to this
effect.
Comment: 11 pages, 9 figures
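As a point of contrast, for non-interacting particles the density response to an external potential is just the Boltzmann factor, so an attractive well enhances both species and no selectivity arises; the expulsion described above requires interparticle interactions (crowding). A minimal sketch, with hypothetical parameters:

```python
import math

# Contrast case: for non-interacting (ideal-gas) particles the density
# response to an external potential V(x) is the Boltzmann factor,
#   rho(x) = rho_bulk * exp(-beta * V(x)),
# so an attractive well enhances *both* species and cannot select one.
# The crowding-driven expulsion in the abstract needs interactions
# (treated there via density functional theory). Parameters hypothetical.

def ideal_gas_density(rho_bulk, beta, V):
    """Return the equilibrium density profile x -> rho(x) of an ideal gas."""
    return lambda x: rho_bulk * math.exp(-beta * V(x))

# A localized square well, attractive to both species:
V = lambda x: -1.0 if abs(x) < 1.0 else 0.0
rho_small = ideal_gas_density(0.3, 1.0, V)   # higher-bulk-density species
rho_large = ideal_gas_density(0.1, 1.0, V)   # lower-bulk-density species
```

Both profiles are enhanced inside the well and revert to their bulk values outside it, which is why the species-dependent expulsion reported above is a genuinely interaction-driven effect.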
Inverted initial conditions: exploring the growth of cosmic structure and voids
We introduce and explore "paired" cosmological simulations. A pair consists
of an A and B simulation with initial conditions related by the inversion
(underdensities substituted for overdensities and vice versa). We argue that
the technique is valuable for
improving our understanding of cosmic structure formation. The A and B fields
are by definition equally likely draws from {\Lambda}CDM initial conditions,
and in the linear regime evolve identically up to the overall sign. As
non-linear evolution takes hold, a region that collapses to form a halo in
simulation A will tend to expand to create a void in simulation B. Applications
include (i) contrasting the growth of A-halos and B-voids to test excursion-set
theories of structure formation; (ii) cross-correlating the density field of
the A and B universes as a novel test for perturbation theory; and (iii)
canceling error terms by averaging power spectra between the two boxes.
Generalizations of the method to more elaborate field transformations are
suggested.
Comment: 10 pages (including appendix), 6 figures. To be submitted to PR
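The pairing construction can be sketched on a toy Gaussian field; the small grid and white-noise spectrum (in place of a LambdaCDM one) are simplifications for illustration.

```python
import numpy as np

# Hedged sketch of "paired" initial conditions: a Gaussian random field
# and its sign-inverted partner are equally likely draws, share the same
# power spectrum, and are perfectly anti-correlated at the linear level.
# Small grid + white noise are simplifications, not a LambdaCDM spectrum.

rng = np.random.default_rng(42)
delta_A = rng.normal(size=(32, 32, 32))   # linear overdensity field A
delta_B = -delta_A                        # the inverted partner B

def power(f, g):
    """Cross-spectrum estimate; power(f, f) is the auto power spectrum."""
    fk, gk = np.fft.fftn(f), np.fft.fftn(g)
    return (fk * np.conj(gk)).real / f.size

P_A = power(delta_A, delta_A)
P_B = power(delta_B, delta_B)             # identical to P_A by construction
P_AB = power(delta_A, delta_B)            # exactly -P_A at the linear level
```

Only once non-linear gravitational evolution breaks the sign symmetry do the A and B fields decorrelate, which is what makes their cross-correlation a probe of perturbation theory, as in application (ii) above.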
Pathways of genetic adaptation: multistep origin of mutants under selection without induced mutagenesis in Salmonella enterica.
In several bacterial systems, mutant cell populations plated on
growth-restricting medium give rise to revertant colonies that accumulate
over several days. One model suggests that nongrowing parent cells
mutagenize their own genome and thereby create beneficial mutations
(stress-induced mutagenesis). By this model, the first-order induction of
new mutations in a nongrowing parent cell population leads to the delayed
accumulation of visible colonies. In an alternative model (selection only),
selective conditions allow preexisting small-effect mutants to initiate
clones that grow and give rise to faster-growing mutants. By the
selection-only model, the delay in appearance of revertant colonies
reflects (1) the time required for initial clones to reach a size
sufficient to allow the second mutation plus (2) the time required for
growth of the improved subclone. We previously characterized a system in
which revertant colonies accumulate slowly and contain cells with two
mutations, one formed before plating and one after. This left open the
question of whether mutation rates increase under selection. Here we
measure the unselected formation rate and the growth contribution of each
mutant type. When these parameters are used in a graphic model of revertant
colony development, they demonstrate that no increase in mutation rate is
required to explain the number and delayed appearance of two of the
revertant types.
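The selection-only timing argument, that the delay is the time for the initial clone to reach a size where a second mutation becomes likely plus the growth time of the improved subclone, can be put in a back-of-the-envelope sketch. All rates and sizes below are hypothetical placeholders, not the measured parameters reported in the paper.

```python
import math

# Back-of-the-envelope version of the selection-only timing argument.
# All rates and sizes are hypothetical, not the paper's measurements.

def time_to_size(n_target, growth_rate, n0=1.0):
    """Exponential growth: time for a clone to grow from n0 to n_target."""
    return math.log(n_target / n0) / growth_rate

def colony_delay(mu2, r_slow, r_fast, visible=1e8):
    """Delay before a visible revertant colony appears.

    mu2:    per-cell rate of the second (improving) mutation
    r_slow: growth rate of the initial small-effect clone
    r_fast: growth rate of the improved subclone
    """
    n_crit = 1.0 / mu2   # clone size at which ~1 second mutation is expected
    return time_to_size(n_crit, r_slow) + time_to_size(visible, r_fast)
```

Under this model the delayed appearance of revertant colonies follows from clone growth kinetics alone: the delay shrinks if either the second-mutation rate or the growth rates increase, with no stress-induced increase in the per-cell mutation rate required.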