Unsupervised Statistical Learning of Context-free Grammar

Abstract

In this paper, we address the problem of inducing (weighted) context-free grammar (WCFG) on data given. The induction is performed by using a new model of grammatical inference, i.e., weighted Grammar-based Classifier System (wGCS). wGCS derives from learning classifier systems and searches grammar structure using a genetic algorithm and covering. Weights of rules are estimated by using a novelty Inside-Outside Contrastive Estimation algorithm. The proposed method employs direct negative evidence and learns WCFG both form positive and negative samples. Results of experiments on three synthetic context-free languages show that wGCS is competitive with other statistical-based method for unsupervised CFG learning

    Similar works