
    Learning probability distributions generated by finite-state machines

    We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods that can be proved to learn under the inference-in-the-limit and PAC formal models. The methods we review are state merging and state splitting methods for probabilistic deterministic automata, and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample.
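    The high-level Hankel-matrix algorithm mentioned above corresponds, in its spectral form, to taking a truncated SVD of a finite Hankel block and reading off the operators of a weighted automaton. Below is a minimal numpy sketch of that recipe, assuming the relevant Hankel sub-blocks over chosen prefix and suffix sets have already been estimated from a sample; the function and argument names are illustrative, not taken from the paper.

        import numpy as np

        def spectral_wfa(H, H_sigma, h_prefix, h_suffix, rank):
            # H        : (P x S) Hankel block, H[u, v] = f(uv)
            # H_sigma  : dict mapping symbol a -> (P x S) block with entries f(u a v)
            # h_prefix : (P,) vector of f(u) values (empty-suffix column)
            # h_suffix : (S,) vector of f(v) values (empty-prefix row)
            U, s, Vt = np.linalg.svd(H, full_matrices=False)
            V = Vt[:rank].T                  # (S x rank) top right singular vectors
            Pinv = np.linalg.pinv(H @ V)     # (rank x P) pseudo-inverse of the forward factor
            A = {a: Pinv @ Ha @ V for a, Ha in H_sigma.items()}   # transition operators
            alpha0 = h_suffix @ V            # initial weights
            alpha_inf = Pinv @ h_prefix      # final weights
            return alpha0, A, alpha_inf

        def evaluate(alpha0, A, alpha_inf, word):
            # Value the learned automaton assigns to a sequence of symbols.
            x = alpha0
            for a in word:
                x = x @ A[a]
            return float(x @ alpha_inf)

    With exact Hankel blocks of sufficient size this factorisation recovers the target; with empirical estimates, the error analysis discussed in the abstract governs how the truncation rank and block choice should be set.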

    Some improvements of the spectral learning approach for probabilistic grammatical inference

    Spectral methods offer new and elegant solutions in probabilistic grammatical inference. We propose two ways to improve them. First, we show how a linear representation, or equivalently a weighted automaton, output by the spectral learning algorithm can be taken as an initial point for the Baum-Welch algorithm, in order to increase the likelihood of the observed data. Secondly, we show how the inference problem can naturally be expressed in the framework of Structured Low-Rank Approximation. Both ideas are tested on a benchmark extracted from the PAutomaC challenge.
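    As an illustration of the first idea, the sketch below runs plain Baum-Welch (EM) on a discrete HMM whose parameters are initialised from some prior estimate, for instance one derived from a spectral solution projected onto valid stochastic parameters. The paper works with the weighted-automaton representation directly; this numpy version is only meant to show the refinement loop, and its names are hypothetical.

        import numpy as np

        def baum_welch(seqs, pi, T, E, n_iter=20):
            # Refine discrete-HMM parameters by EM on a list of integer-coded sequences.
            # pi : (k,)   initial state distribution
            # T  : (k, k) state transition matrix
            # E  : (k, m) per-state emission distribution over m symbols
            k, m = E.shape
            for _ in range(n_iter):
                pi_new, T_new, E_new = np.zeros(k), np.zeros((k, k)), np.zeros((k, m))
                for x in seqs:
                    n = len(x)
                    # Scaled forward-backward pass for numerical stability.
                    alpha, beta, c = np.zeros((n, k)), np.zeros((n, k)), np.zeros(n)
                    alpha[0] = pi * E[:, x[0]]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
                    for t in range(1, n):
                        alpha[t] = (alpha[t - 1] @ T) * E[:, x[t]]
                        c[t] = alpha[t].sum(); alpha[t] /= c[t]
                    beta[-1] = 1.0
                    for t in range(n - 2, -1, -1):
                        beta[t] = (T @ (E[:, x[t + 1]] * beta[t + 1])) / c[t + 1]
                    gamma = alpha * beta               # per-step state posteriors
                    pi_new += gamma[0]
                    for t in range(n - 1):             # expected transition counts
                        T_new += (alpha[t][:, None] * T * E[:, x[t + 1]] * beta[t + 1]) / c[t + 1]
                    for t in range(n):                 # expected emission counts
                        E_new[:, x[t]] += gamma[t]
                pi = pi_new / pi_new.sum()
                T = T_new / T_new.sum(axis=1, keepdims=True)
                E = E_new / E_new.sum(axis=1, keepdims=True)
            return pi, T, E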

    Complexity of Equivalence and Learning for Multiplicity Tree Automata

    We consider the complexity of equivalence and learning for multiplicity tree automata, i.e., weighted tree automata over a field. We first show that the equivalence problem is logspace equivalent to polynomial identity testing, the complexity of which is a longstanding open problem. Secondly, we derive lower bounds on the number of queries needed to learn multiplicity tree automata in Angluin's exact learning model, over both arbitrary and fixed fields. Habrard and Oncina (2006) give an exact learning algorithm for multiplicity tree automata, in which the number of queries is proportional to the size of the target automaton and the size of a largest counterexample, represented as a tree, that is returned by the Teacher. However, the smallest tree-counterexample may be exponential in the size of the target automaton. Thus the above algorithm does not run in time polynomial in the size of the target automaton, and has query complexity exponential in the lower bound. Assuming a Teacher that returns minimal DAG representations of counterexamples, we give a new exact learning algorithm whose query complexity is quadratic in the target automaton size, almost matching the lower bound, and improving the best previously known algorithm by an exponential factor.

    Relevant Representations for the Inference of Rational Stochastic Tree Languages

    Recently, an algorithm, DEES, was proposed for learning rational stochastic tree languages. Given an independently and identically distributed sample of trees, drawn according to a rational stochastic language, DEES outputs a linear representation of a rational series which converges to the target. DEES can then be used to identify rational stochastic tree languages in the limit with probability one. However, when DEES deals with finite samples, it often outputs a rational tree series which does not define a stochastic language. Moreover, the linear representation cannot be directly used as a generative model. In this paper, we show that any representation of a rational stochastic tree language can be transformed into a reduced normalised representation that can be used to generate trees from the underlying distribution. We also study some consistency properties of rational stochastic tree languages and discuss their implications for inference. We finally consider the applicability of DEES to trees built over an unranked alphabet.

    Statistical methods in language processing

    The term statistical methods here refers to a methodology that has been dominant in computational linguistics since about 1990. It is characterized by the use of stochastic models, substantial data sets, machine learning, and rigorous experimental evaluation. The shift to statistical methods in computational linguistics parallels a movement in artificial intelligence more broadly. Statistical methods have so thoroughly permeated computational linguistics that almost all work in the field draws on them in some way. There has, however, been little penetration of the methods into general linguistics. The methods themselves are largely borrowed from machine learning and information theory. We limit attention to that which has direct applicability to language processing, though the methods are quite general and have many nonlinguistic applications. Not every use of statistics in language processing falls under statistical methods as we use the term; standard hypothesis testing and experimental design, for example, are not covered in this article. WIREs Cogn Sci 2011, 2:315–322. DOI: 10.1002/wcs.111

    Unsupervised spectral learning of WCFG as low-rank matrix completion

    We derive a spectral method for unsupervised learning of Weighted Context-Free Grammars. We frame WCFG induction as finding a Hankel matrix that has low rank and is linearly constrained to represent a function computed by inside-outside recursions. The proposed algorithm picks the grammar that agrees with a sample and is the simplest with respect to the nuclear norm of the Hankel matrix.
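    The completion step can be pictured with a toy cvxpy program: minimise the nuclear norm (a convex surrogate for rank) of a partially observed Hankel-like matrix subject to matching the sampled entries. This is only a sketch of the relaxation; the linear inside-outside constraints that tie the matrix to a WCFG, which are central to the paper, are omitted, and the observed entries below are made up.

        import cvxpy as cp
        import numpy as np

        rows, cols = 6, 6
        # Hypothetical observed Hankel entries estimated from a sample.
        observed = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.25, (2, 2): 0.1}

        H = cp.Variable((rows, cols))
        constraints = [H[i, j] == v for (i, j), v in observed.items()]
        # The paper additionally constrains H to be realisable by the inside-outside
        # recursions of a WCFG; those linear constraints are not shown here.
        problem = cp.Problem(cp.Minimize(cp.normNuc(H)), constraints)
        problem.solve()

        print("numerical rank of the completion:", np.linalg.matrix_rank(H.value, tol=1e-6))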

    Backdoors in Neural Models of Source Code

    Deep neural networks are vulnerable to a range of adversaries. A particularly pernicious class of vulnerabilities are backdoors, where model predictions diverge in the presence of subtle triggers in inputs. An attacker can implant a backdoor by poisoning the training data to yield a desired target prediction on triggered inputs. We study backdoors in the context of deep learning for source code. (1) We define a range of backdoor classes for source-code tasks and show how to poison a dataset to install such backdoors. (2) We adapt and improve recent algorithms from robust statistics for our setting, showing that backdoors leave a spectral signature in the learned representation of source code, thus enabling detection of poisoned data. (3) We conduct a thorough evaluation on different architectures and languages, showing the ease of injecting backdoors and our ability to eliminate them.
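    The detection step alluded to in point (2) is in the spirit of spectral signatures from robust statistics: centre the learned representations, project them onto the top singular direction, and flag the highest-scoring examples as likely poisoned. A hypothetical numpy sketch follows; the removal fraction is illustrative, and the paper's adapted algorithm may differ in detail.

        import numpy as np

        def spectral_signature_scores(reps):
            # reps : (n, d) array of learned representations (e.g. pooled encoder states).
            # Returns one outlier score per example; poisoned examples tend to score highest.
            centred = reps - reps.mean(axis=0)
            _, _, Vt = np.linalg.svd(centred, full_matrices=False)
            top = Vt[0]                          # top right singular direction
            return (centred @ top) ** 2          # squared projection as the score

        def filter_suspicious(reps, remove_frac=0.05):
            # Keep everything except the highest-scoring fraction (assumed poison budget).
            scores = spectral_signature_scores(reps)
            keep = scores.argsort()[: int(len(scores) * (1 - remove_frac))]
            return keep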