7 research outputs found

    Learning Syntactic Rules and Tags with Genetic Algorithms for Information Retrieval and Filtering: An Empirical Basis for Grammatical Rules

    The grammars of natural languages may be learned by using genetic algorithms that reproduce and mutate grammatical rules and part-of-speech tags, improving the quality of later generations of grammatical components. Syntactic rules are randomly generated and then evolve; those rules resulting in improved parsing and occasionally improved retrieval and filtering performance are allowed to further propagate. The LUST system learns the characteristics of the language or sublanguage used in document abstracts by learning from the document rankings obtained from the parsed abstracts. Unlike the application of traditional linguistic rules to retrieval and filtering applications, LUST develops grammatical structures and tags without the prior imposition of some common grammatical assumptions (e.g., part-of-speech assumptions), producing grammars that are empirically based and are optimized for this particular application. Accepted for publication in Information Processing and Management.

    Grammatical inference with a genetic algorithm

    Breeding Grammars: Grammatical Inference with a Genetic Algorithm

    This paper presents a genetic algorithm used to infer context-free grammars from legal and illegal examples of a language. It discusses the representation of grammar rules as bitstrings by way of an interval coding scheme, genetic operators for the reproduction of grammars, and the method of evaluating the fitness of grammars with respect to the training examples. Results are reported on the inference of several grammars: the language of correctly balanced and nested brackets, the language of sentences containing an equal number of a's and b's, a set of regular languages, and a micro-NL language. Furthermore, some possible improvements and extensions of the algorithm are discussed.
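    To make the idea of scoring a candidate grammar against legal and illegal examples concrete, here is a minimal sketch. It is not the paper's method: the abstract does not give the fitness function or the details of the bitstring interval coding, so this sketch assumes a grammar already decoded into Chomsky normal form, uses a standard CYK parser, and takes fitness to be the fraction of examples classified correctly. The names `cyk_accepts`, `fitness`, `BRACKETS`, and `WEAKER` are hypothetical.

```python
from itertools import product

def cyk_accepts(grammar, start, s):
    """Return True if the CNF grammar derives the non-empty string s from start.

    grammar["unary"] maps a terminal to the set of nonterminals producing it;
    grammar["binary"] maps a pair (B, C) to the set of nonterminals A with A -> B C.
    """
    n = len(s)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive the substring s[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(s):
        table[i][i] = set(grammar["unary"].get(ch, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= grammar["binary"].get((b, c), set())
    return start in table[0][n - 1]

def fitness(grammar, start, positives, negatives):
    """Assumed fitness: fraction of training examples classified correctly."""
    correct = sum(cyk_accepts(grammar, start, s) for s in positives)
    correct += sum(not cyk_accepts(grammar, start, s) for s in negatives)
    return correct / (len(positives) + len(negatives))

# Hand-written CNF grammar for non-empty balanced brackets:
#   S -> S S | L R | L A,  A -> S R,  L -> '(',  R -> ')'
BRACKETS = {
    "unary": {"(": {"L"}, ")": {"R"}},
    "binary": {
        ("S", "S"): {"S"},
        ("L", "R"): {"S"},
        ("L", "A"): {"S"},
        ("S", "R"): {"A"},
    },
}

# The same grammar without S -> S S, standing in for a less fit individual
WEAKER = {
    "unary": BRACKETS["unary"],
    "binary": {k: v for k, v in BRACKETS["binary"].items() if k != ("S", "S")},
}

POSITIVES = ["()", "(())", "()()", "(()())"]
NEGATIVES = [")(", "(", "(()", "())("]

print(fitness(BRACKETS, "S", POSITIVES, NEGATIVES))
print(fitness(WEAKER, "S", POSITIVES, NEGATIVES))
```

    In a genetic algorithm of the kind described, a score like this would rank the individuals in a population, so that grammars covering more legal examples while rejecting illegal ones are more likely to reproduce.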