
Correction of Uniformly Noisy Distributions to Improve Probabilistic Grammatical Inference Algorithms

Abstract

In this paper, we aim to correct distributions of noisy samples in order to improve the inference of probabilistic automata. Rather than definitively removing corrupted examples before the learning process, we propose a technique, based on statistical estimates and linear regression, for correcting the probabilistic prefix tree automaton (PPTA). It requires human expertise to correct only a small sample of data, selected in order to estimate the noise level. This statistical information permits us to automatically correct the whole PPTA and then to infer models that generalize better. After a theoretical analysis of the noise impact, we present a large experimental study on several datasets.
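For readers unfamiliar with the PPTA, the following is a minimal sketch of how such an automaton can be estimated from a sample of strings by relative frequencies. It illustrates only the standard prefix-tree construction, not the noise-correction procedure of the paper, whose details the abstract does not give. The function names `build_ppta` and `string_probability`, and the choice to represent states as prefixes, are assumptions made for illustration.

```python
from collections import defaultdict


def build_ppta(sample):
    """Estimate a probabilistic prefix tree automaton from a list of strings.

    States are identified with prefixes of the sample (hypothetical encoding).
    Returns (trans_prob, final_prob):
      trans_prob[(prefix, symbol)] = estimated probability of reading `symbol`
                                     in state `prefix`
      final_prob[prefix]           = estimated probability of stopping in `prefix`
    """
    count = defaultdict(int)  # strings passing through each prefix
    end = defaultdict(int)    # strings ending exactly at each prefix
    for word in sample:
        for i in range(len(word) + 1):
            count[word[:i]] += 1
        end[word] += 1

    trans_prob = {}
    for prefix, n in count.items():
        if prefix:  # the empty prefix is the root and has no incoming edge
            parent = prefix[:-1]
            trans_prob[(parent, prefix[-1])] = n / count[parent]

    final_prob = {prefix: end[prefix] / n for prefix, n in count.items()}
    return trans_prob, final_prob


def string_probability(word, trans_prob, final_prob):
    """Probability the PPTA assigns to `word` (0.0 if it leaves the tree)."""
    prob = 1.0
    for i, symbol in enumerate(word):
        prob *= trans_prob.get((word[:i], symbol), 0.0)
        if prob == 0.0:
            return 0.0
    return prob * final_prob.get(word, 0.0)


# Toy sample (made up for illustration, not taken from the paper's datasets).
sample = ["ab", "ab", "a", "b"]
trans_prob, final_prob = build_ppta(sample)
```

On this toy sample the automaton reproduces the empirical distribution of the strings, which is why noise in the sample translates directly into noisy transition and stopping probabilities.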
