A Winnow-Based Approach to Context-Sensitive Spelling Correction
A large class of machine-learning problems in natural language require the
characterization of linguistic context. Two characteristic properties of such
problems are that their feature space is of very high dimensionality, and their
target concepts refer to only a small subset of the features in the space.
Under such conditions, multiplicative weight-update algorithms such as Winnow
have been shown to have exceptionally good theoretical properties. We present
an algorithm combining variants of Winnow and weighted-majority voting, and
apply it to a problem in the aforementioned class: context-sensitive spelling
correction. This is the task of fixing spelling errors that happen to result in
valid words, such as substituting "to" for "too", "casual" for "causal", etc.
We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a
statistics-based method representing the state of the art for this task. We
find: (1) When run with a full (unpruned) set of features, WinSpell achieves
accuracies significantly higher than BaySpell was able to achieve in either the
pruned or unpruned condition; (2) When compared with other systems in the
literature, WinSpell exhibits the highest performance; (3) The primary reason
that WinSpell outperforms BaySpell is that WinSpell learns a better linear
separator; (4) When run on a test set drawn from a different corpus than the
training set was drawn from, WinSpell is better able than BaySpell to adapt,
using a strategy we will present that combines supervised learning on the
training set with unsupervised learning on the (noisy) test set.
Comment: To appear in Machine Learning, Special Issue on Natural Language
Learning, 1999. 25 pages
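As a rough illustration of the multiplicative weight-update idea mentioned in the abstract above, here is a minimal sketch of a basic Winnow learner over sparse binary features. It is not WinSpell itself: the function names, the promotion factor alpha = 2, the threshold of n/2, and the representation of an example as a set of active feature indices are all assumptions made for this example.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Example = Tuple[Set[int], bool]  # (indices of active features, label)


def winnow_predict(weights: Sequence[float], threshold: float,
                   active: Iterable[int]) -> bool:
    """Predict True when the summed weight of the active features reaches the threshold."""
    return sum(weights[i] for i in active) >= threshold


def winnow_train(examples: Sequence[Example], n_features: int,
                 alpha: float = 2.0, epochs: int = 10) -> Tuple[List[float], float]:
    """Train a basic Winnow classifier (assumed settings, not the paper's).

    Weights start at 1. On a mistake, only the weights of the features that
    are active in the example change: multiplied by alpha after a false
    negative (promotion), divided by alpha after a false positive (demotion).
    """
    weights = [1.0] * n_features
    threshold = n_features / 2.0  # assumed threshold choice
    for _ in range(epochs):
        mistakes = 0
        for active, label in examples:
            if winnow_predict(weights, threshold, active) == label:
                continue
            mistakes += 1
            factor = alpha if label else 1.0 / alpha
            for i in active:
                weights[i] *= factor
        if mistakes == 0:  # stop once a full pass makes no mistakes
            break
    return weights, threshold
```

Because an update touches only the active features, the cost of a mistake scales with the number of features present in the example rather than with the full dimensionality, which is what makes this family attractive when the target depends on a small subset of a very large feature space.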
Applying Winnow to Context-Sensitive Spelling Correction
Multiplicative weight-updating algorithms such as Winnow have been studied
extensively in the COLT literature, but only recently have people started to
use them in applications. In this paper, we apply a Winnow-based algorithm to a
task in natural language: context-sensitive spelling correction. This is the
task of fixing spelling errors that happen to result in valid words, such as
substituting "to" for "too", "casual" for "causal", and so
on. Previous approaches to this problem have been statistics-based; we compare
Winnow to one of the more successful such approaches, which uses Bayesian
classifiers. We find that: (1)~When the standard (heavily-pruned) set of
features is used to describe problem instances, Winnow performs comparably to
the Bayesian method; (2)~When the full (unpruned) set of features is used,
Winnow is able to exploit the new features and convincingly outperform Bayes;
and (3)~When a test set is encountered that is dissimilar to the training set,
Winnow is better than Bayes at adapting to the unfamiliar test set, using a
strategy we will present for combining learning on the training set with
unsupervised learning on the (noisy) test set.
Comment: 9 pages
A survey of techniques for refrigeration, reliquefaction, and production of slush for hydrogen
Several techniques were surveyed for the refrigeration, reliquefaction and production of slush from hydrogen. The techniques included auger; bubbling helium gas; Simon desorption; the Peltier effect; Joule-Kelvin expansion using Stirling, Brayton, and Vuilleumier approaches; rotary reciprocating; a dilution refrigerator; adiabatic demagnetization of a paramagnetic salt; and adiabatic magnetization of a superconductor.
Spatiotemporal dynamics in 2D Kolmogorov flow over large domains
Kolmogorov flow in two dimensions - the two-dimensional Navier-Stokes
equations with a sinusoidal body force - is considered over extended periodic
domains to reveal localised spatiotemporal complexity. The flow response
mimics the forcing at small forcing amplitudes but beyond a critical value
develops a long wavelength instability. The ensuing state is described by a
Cahn-Hilliard-type equation and as a result coarsening dynamics are observed
for random initial data. After further bifurcations, this regime gives way to
multiple attractors, some of which possess spatially-localised time dependence.
Co-existence of such attractors in a large domain gives rise to interesting
collisional dynamics which is captured by a system of 5 (1-space and 1-time)
PDEs based on a long wavelength limit. The coarsening regime reinstates itself
at yet higher forcing amplitudes in the sense that only longest-wavelength
solutions remain attractors. Eventually, there is one global longest-wavelength
attractor which possesses two localised chaotic regions - a kink and antikink -
which connect two steady one-dimensional flow regions of essentially half the
domain width each. The wealth of spatiotemporal complexity uncovered presents a
bountiful arena in which to study the existence of simple invariant localised
solutions which presumably underpin all of the observed behaviour.
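For orientation, the setup described in the abstract above (two-dimensional Navier-Stokes with a sinusoidal body force on a periodic domain) is often written in a nondimensional form like the one below. This is a sketch under assumptions: the Reynolds number Re, the forcing wavenumber n, the forcing direction, and the choice of scaling are not given in the abstract and are introduced here only for illustration.

```latex
\begin{align}
\frac{\partial \mathbf{u}}{\partial t}
  + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla p
  &= \frac{1}{Re}\,\nabla^{2}\mathbf{u} + \sin(n y)\,\hat{\mathbf{x}}, \\
\nabla\cdot\mathbf{u} &= 0 .
\end{align}
```

Here u = (u, v) is the velocity, p the pressure, and both are taken to be periodic in x and y; in this scaling, raising the forcing amplitude corresponds to raising Re.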
Two Emission Mechanisms in the Fermi Bubbles: A Possible Signal of Annihilating Dark Matter
We study the variation of the spectrum of the Fermi Bubbles with Galactic
latitude. Far from the Galactic plane (|b| > 30 degrees), the observed
gamma-ray emission is nearly invariant with latitude, and is consistent with
arising from inverse Compton scattering of the interstellar radiation field by
cosmic-ray electrons with an approximately power-law spectrum. The same
electrons in the presence of microgauss-scale magnetic fields can also generate
the observed microwave "haze". At lower latitudes (b < 20 degrees), in
contrast, the spectrum of the emission correlated with the Bubbles possesses a
pronounced spectral feature peaking at 1-4 GeV (in E^2 dN/dE) which cannot be
generated by any realistic spectrum of electrons. Instead, we conclude that a
second (non-inverse-Compton) emission mechanism must be responsible for the
bulk of the low-energy, low-latitude emission. This second component is
spectrally similar to the excess GeV emission previously reported from the
Galactic Center (GC), and also appears spatially consistent with a luminosity
per volume falling approximately as r^-2.4, where r is the distance from the
GC. We argue that the spectral feature visible in the low-latitude Bubbles is
the extended counterpart of the GC excess, now detected out to at least 2-3 kpc
from the GC. The spectrum and angular distribution of the signal are consistent
with that predicted from ~10 GeV dark matter particles annihilating to leptons,
or from ~50 GeV dark matter particles annihilating to quarks, following a
distribution similar to the canonical Navarro-Frenk-White (NFW) profile. We
also consider millisecond pulsars as a possible astrophysical explanation for
the signal, as observed millisecond pulsars possess a spectral cutoff at
approximately the required energy. Any such scenario would require a large
population of unresolved millisecond pulsars extending at least 2-3 kpc from
the GC.
Comment: 26 pages, 20 figures