25,660 research outputs found

    Message-Passing Protocols for Real-World Parsing -- An Object-Oriented Model and its Preliminary Evaluation

    We argue for a performance-based design of natural language grammars and their associated parsers in order to meet the constraints imposed by real-world NLP. Our approach incorporates declarative and procedural knowledge about language and language use within an object-oriented specification framework. We discuss several message-passing protocols for parsing and, based on a preliminary empirical evaluation, provide reasons for sacrificing completeness of the parse in favor of efficiency. Comment: 12 pages, uses epsfig.st

    Generation of folk song melodies using Bayes transforms

    The paper introduces the "Bayes transform", a mathematical procedure for putting data into a hierarchical representation. Applicable to any type of data, the procedure yields interesting results when applied to sequences. In this case, the representation obtained implicitly models the repetition hierarchy of the source, which leads to natural applications in music. Deriving Bayes transforms can be a means of determining the repetition hierarchy of note sequences (melodies) in an empirical and domain-general way. The paper investigates the application of this approach to folk song, examining the results that can be obtained by treating such transforms as generative models.

    A Data-Oriented Model of Literary Language

    We consider the task of predicting how literary a text is, with a gold standard from human ratings. Aside from a standard bigram baseline, we apply rich syntactic tree fragments, mined from the training set, and a series of hand-picked features. Our model is the first to distinguish degrees of highly and less literary novels using a variety of lexical and syntactic features, and explains 76.0% of the variation in literary ratings. Comment: To be published in EACL 2017, 11 pages

    Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure

    It has been established that incorporating word cluster features derived from large unlabeled corpora can significantly improve prediction of linguistic structure. While previous work has focused primarily on English, we extend these results to other languages along two dimensions. First, we show that these results hold true for a number of languages across families. Second, and more interestingly, we provide an algorithm for inducing cross-lingual clusters and we show that features derived from these clusters significantly improve the accuracy of cross-lingual structure prediction. Specifically, we show that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%. When applying the same method to direct transfer of named-entity recognizers, we observe relative improvements of up to 26%.