3,974 research outputs found

    A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars

    The paper gives a brief review of the expectation-maximization algorithm (Dempster 1977) in the comprehensible framework of discrete mathematics. In Section 2, two prominent estimation methods, relative-frequency estimation and maximum-likelihood estimation, are presented. Section 3 is dedicated to the expectation-maximization algorithm and a simpler variant, the generalized expectation-maximization algorithm. In Section 4, two loaded dice are rolled. A more interesting example is presented in Section 5: the estimation of probabilistic context-free grammars. Comment: Presented at the 15th European Summer School in Logic, Language and Information (ESSLLI 2003). Example 5 extended (and partially corrected

    Learning to Resolve Natural Language Ambiguities: A Unified Approach

    We analyze a few of the commonly used statistics-based and machine learning algorithms for natural language disambiguation tasks and observe that they can be re-cast as learning linear separators in the feature space. Each of the methods makes a priori assumptions, which it employs, given the data, when searching for its hypothesis. Nevertheless, as we show, it searches a space that is as rich as the space of all linear separators. We use this to build an argument for a data-driven approach which merely searches for a good linear separator in the feature space, without further assumptions on the domain or a specific problem. We present such an approach - a sparse network of linear separators, utilizing the Winnow learning algorithm - and show how to use it in a variety of ambiguity resolution problems. The learning approach presented is attribute-efficient and, therefore, appropriate for domains having a very large number of attributes. In particular, we present an extensive experimental comparison of our approach with other methods on several well-studied lexical disambiguation tasks such as context-sensitive spelling correction, prepositional phrase attachment and part of speech tagging. In all cases we show that our approach either outperforms other methods tried for these tasks or performs comparably to the best

    The processing of ambiguous sentences by first and second language learners of English

    This study compares the way English-speaking children and adult second language learners of English resolve relative clause attachment ambiguities in sentences such as "The dean liked the secretary of the professor who was reading a letter." Two groups of advanced L2 learners of English with Greek or German as their L1 participated in a set of off-line and on-line tasks. While the participants' disambiguation preferences were influenced by lexical-semantic properties of the preposition linking the two potential antecedent NPs (of vs. with), there was no evidence that they were applying any structure-based ambiguity resolution strategies of the type that have been claimed to influence sentence processing in monolingual adults. These findings differ markedly from those obtained from 6- to 7-year-old monolingual English children in a parallel auditory study (Felser, Marinis, & Clahsen, submitted) in that the children's attachment preferences were not affected by the type of preposition at all. We argue that whereas children primarily rely on structure-based parsing principles during processing, adult L2 learners are guided mainly by non-structural information

    Semantic indeterminacy in object relative clauses

    This article examined whether semantic indeterminacy plays a role in comprehension of complex structures such as object relative clauses. Study 1 used a gated sentence completion task to assess which alternative interpretations are dominant as the relative clause unfolds; Study 2 compared reading times in object relative clauses containing different animacy configurations to unambiguous passive controls; and Study 3 related completion data and reading data. The results showed that comprehension difficulty was modulated by animacy configuration and voice (active vs. passive). These differences were well correlated with the availability of alternative interpretations as the relative clause unfolds, as revealed by the completion data. In contrast to approaches arguing that comprehension difficulty stems from syntactic complexity, these results suggest that semantic indeterminacy is a major source of comprehension difficulty in object relative clauses. Results are consistent with constraint-based approaches to ambiguity resolution and bring new insights into previously identified sources of difficulty. (C) 2007 Elsevier Inc. All rights reserved

    How do treebank annotation schemes influence parsing results? : or how not to compare apples and oranges

    In the last decade, the Penn treebank has become the standard data set for evaluating parsers. The fact that most parsers are solely evaluated on this specific data set leaves unanswered the question of how much these results depend on the annotation scheme of the treebank. In this paper, we investigate the influence which different decisions in the annotation schemes of treebanks have on parsing. The investigation compares two similar treebanks of German, NEGRA and TüBa-D/Z, which are subsequently modified to allow a comparison of the differences. The results show that deleted unary nodes and a flat phrase structure have a negative influence on parsing quality while a flat clause structure has a positive influence

    Constraint-Based Models of Sentence Processing
