Correction of Uniformly Noisy Distributions to Improve Probabilistic Grammatical Inference Algorithms

Bernard, Marc; Habrard, Amaury; Sebban, Marc

Authors: Marc Bernard, Amaury Habrard, Marc Sebban
Publication date: 15 May 2005
Publisher: HAL CCSD

Abstract

In this paper, we aim to correct the distributions of noisy samples in order to improve the inference of probabilistic automata. Rather than permanently removing corrupted examples before learning, we propose a technique, based on statistical estimates and linear regression, for correcting the probabilistic prefix tree automaton (PPTA). It requires a human expert to correct only a small sample of the data, selected in order to estimate the noise level. This statistical information allows us to automatically correct the whole PPTA and then to infer models that generalize better. After a theoretical analysis of the impact of noise, we present a large experimental study on several datasets.

Available Versions

HAL AMU: oai:HAL:ujm-00378062v1 (last updated on 11/11/2016)
HAL-UJM: oai:HAL:ujm-00378062v1 (last updated on 12/11/2016)