
    Do not forget: Full memory in memory-based learning of word pronunciation

    Memory-based learning, which keeps full memory of the learning material, appears to be a viable approach to learning NLP tasks, and is often superior in generalisation accuracy to eager learning approaches that abstract from the learning material. Here we investigate three partial memory-based learning approaches, which remove from memory specific task instance types estimated to be exceptional. The three approaches each implement one heuristic function for estimating the exceptionality of instance types: (i) typicality, (ii) class prediction strength, and (iii) friendly-neighbourhood size. Experiments are performed with the memory-based learning algorithm IB1-IG trained on English word pronunciation. We find that removing instance types with low class prediction strength (ii) is the only tested method that does not seriously harm generalisation accuracy. We conclude that keeping full memory of types rather than tokens, and excluding minority ambiguities, appear to be the only performance-preserving optimisations of memory-based learning.
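
    As a concrete illustration of heuristic (ii), the sketch below edits a full instance-type memory by class prediction strength, here simplified to the majority-class frequency of each stored type; the data structure, threshold, and toy example are illustrative assumptions, not the paper's IB1-IG implementation.

        # Minimal sketch of partial-memory editing by class prediction
        # strength (CPS); not the paper's code. CPS is simplified here to
        # the majority-class frequency of each stored instance type.
        from collections import Counter

        def class_prediction_strength(instance_types):
            """instance_types maps a feature tuple (an instance type) to a
            Counter of class labels over its tokens."""
            return {feats: labels.most_common(1)[0][1] / sum(labels.values())
                    for feats, labels in instance_types.items()}

        def edit_memory(instance_types, threshold=1.0):
            """Keep only types whose CPS reaches the threshold; with
            threshold=1.0 only fully unambiguous types survive."""
            cps = class_prediction_strength(instance_types)
            return {f: c for f, c in instance_types.items() if cps[f] >= threshold}

        # Toy usage: letter-window types for word pronunciation.
        memory = {
            ("_", "a", "b"): Counter({"AE": 10}),          # unambiguous type
            ("e", "a", "d"): Counter({"EH": 6, "IY": 4}),  # ambiguous type
        }
        print(edit_memory(memory))  # keeps only the unambiguous type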

    Morphological Analysis as Classification: an Inductive-Learning Approach

    Morphological analysis is an important subtask in text-to-speech conversion, hyphenation, and other language engineering tasks. The traditional approach to performing morphological analysis is to combine a morpheme lexicon, sets of (linguistic) rules, and heuristics to find the most probable analysis. In contrast, we present an inductive learning approach in which morphological analysis is reformulated as a segmentation task. We report on a number of experiments in which five inductive learning algorithms are applied to three variations of the task of morphological analysis. Results show (i) that the generalisation performance of the algorithms is good, and (ii) that the lazy learning algorithm IB1-IG performs best on all three tasks. We conclude that lazy learning of morphological analysis as a classification task is indeed a viable approach; moreover, it has strong advantages over the traditional approach: it avoids the knowledge-acquisition bottleneck, is fast and deterministic in learning and processing, and is language-independent.
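
    The reformulation of analysis as segmentation can be made concrete with a small sketch: each letter position becomes a fixed-width window classified as boundary or non-boundary. The window width, data format, and plain-overlap 1-NN classifier below are illustrative stand-ins (IB1-IG additionally weights features by information gain).

        # Minimal sketch of morphological analysis as per-letter
        # boundary classification; not the paper's code.
        def windows(word, width=3):
            """Fixed-width letter windows around each position, '_'-padded."""
            w = "_" * width + word + "_" * width
            return [tuple(w[i:i + 2 * width + 1]) for i in range(len(word))]

        def train(words_with_boundaries):
            """Input like ('abnormality', {2, 8}): morpheme boundaries after
            positions 2 and 8. Stores every window as an exemplar."""
            memory = {}
            for word, bounds in words_with_boundaries:
                for i, win in enumerate(windows(word)):
                    memory[win] = 1 if i in bounds else 0
            return memory

        def segment(word, memory):
            """Label each position by its 1-nearest neighbour, measured by
            plain feature overlap."""
            overlap = lambda a, b: sum(x == y for x, y in zip(a, b))
            return [memory[max(memory, key=lambda m: overlap(m, win))]
                    for win in windows(word)]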

    Electrodynamics from Noncommutative Geometry

    Within the framework of Connes' noncommutative geometry, the notion of an almost commutative manifold can be used to describe field theories on compact Riemannian spin manifolds. The most notable example is the derivation of the Standard Model of high energy physics from a suitably chosen almost commutative manifold. In contrast to such a non-abelian gauge theory, it has long been thought impossible to describe an abelian gauge theory within this framework. The purpose of this paper is to improve on this point. We provide a simple example of a commutative spectral triple based on the two-point space, and show that it yields a U(1) gauge theory. Then, we slightly modify the spectral triple such that we obtain the full classical theory of electrodynamics on a curved background manifold.
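
    For orientation, the standard definitions the abstract relies on can be stated compactly; the formulas below are textbook noncommutative geometry, not the paper's specific two-point construction.

        A spectral triple $(\mathcal{A}, \mathcal{H}, D)$ consists of a
        $*$-algebra $\mathcal{A}$ represented on a Hilbert space
        $\mathcal{H}$ and a self-adjoint operator $D$ on $\mathcal{H}$
        with bounded commutators $[D, a]$ for all $a \in \mathcal{A}$.
        Gauge fields arise as inner fluctuations of the Dirac operator,
        \[
          D \;\longmapsto\; D_A = D + A + J A J^{-1},
          \qquad
          A = \sum_j a_j \,[D, b_j], \quad a_j, b_j \in \mathcal{A},
        \]
        with $J$ the real structure and $A = A^*$. For the canonical
        commutative triple the two fluctuation terms cancel, which is why
        an abelian gauge theory was long thought unreachable this way.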

    Supersymmetric QCD and noncommutative geometry

    We derive supersymmetric quantum chromodynamics from a noncommutative manifold, using the spectral action principle of Chamseddine and Connes. After a review of the Einstein-Yang-Mills system in noncommutative geometry, we establish in full detail that it possesses supersymmetry. This noncommutative model is then extended to give a theory of quarks, squarks, gluons and gluinos by constructing a suitable noncommutative spin manifold (i.e. a spectral triple). The particles are found at their natural place in a spectral triple: the quarks and gluinos as fermions in the Hilbert space, and the gluons and squarks as bosons arising from the inner fluctuations of a (generalized) Dirac operator by the algebra of matrix-valued functions on a manifold. The spectral action principle applied to this spectral triple gives the Lagrangian of supersymmetric QCD, including soft supersymmetry-breaking mass terms for the squarks. We find that these results are in good agreement with the physics literature.
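
    The spectral action invoked here has a compact standard form, reproduced below for orientation; this is the general Chamseddine-Connes formula, not the paper's worked-out QCD Lagrangian.

        \[
          S \;=\; \operatorname{Tr}\, f\!\left(\frac{D_A}{\Lambda}\right)
          \;+\; \tfrac{1}{2}\,\bigl\langle J\psi,\, D_A \psi \bigr\rangle,
          \qquad \psi \in \mathcal{H},
        \]
        where $f$ is a positive even cutoff function, $\Lambda$ a cutoff
        scale, and $D_A$ the fluctuated Dirac operator. A heat-kernel
        expansion of the trace yields the bosonic (gauge and gravitational)
        terms, while the fermionic pairing produces the quark and gluino
        terms.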

    Forgetting Exceptions is Harmful in Language Learning

    We show that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy. We investigate this phenomenon empirically on a selection of benchmark natural language processing tasks: grapheme-to-phoneme conversion, part-of-speech tagging, prepositional-phrase attachment, and base noun phrase chunking. In a first series of experiments we combine memory-based learning with training-set editing techniques, in which instances are edited based on their typicality and class prediction strength. Results show that editing exceptional instances (with low typicality or low class prediction strength) tends to harm generalization accuracy. In a second series of experiments we compare memory-based learning and decision-tree learning methods on the same selection of tasks, and find that decision-tree learning often performs worse than memory-based learning. Moreover, the decrease in performance can be linked to the degree of abstraction from exceptions (i.e., pruning or eagerness). We provide explanations for both results in terms of the properties of the natural language processing tasks and the learning algorithms. Pre-print version of an article to appear in Machine Learning 11:1-3, Special Issue on Natural Language Learning.
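
    The contrast between full-memory lazy learning and abstraction by pruning is easy to reproduce in miniature; the sketch below uses scikit-learn and synthetic data as illustrative stand-ins for the paper's algorithms and benchmark tasks.

        # Illustrative only: contrast a full-memory 1-NN learner with a
        # pruned decision tree on synthetic data containing a small pocket
        # of systematic exceptions; not the paper's experimental setup.
        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.integers(0, 5, size=(500, 7))       # 7 symbolic features as ints
        y = (X.sum(axis=1) % 3 == 0).astype(int)    # regular rule
        exc = (X[:, 0] == 4) & (X[:, 1] == 4)       # small exceptional disjunct
        y[exc] ^= 1                                 # flip its labels

        knn = KNeighborsClassifier(n_neighbors=1)      # keeps full memory
        tree = DecisionTreeClassifier(ccp_alpha=0.01)  # pruning abstracts away

        print("1-NN:", cross_val_score(knn, X, y, cv=10).mean())
        print("tree:", cross_val_score(tree, X, y, cv=10).mean())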