154 research outputs found

    Use of Discourse Cues During Garden-Path Resolution is Modulated by Verb Argument Structure

    Studies on garden-path sentences such as “While the man hunted the deer ran into the woods” have shown that comprehenders face processing difficulties due to the locally ambiguous noun phrase “the deer”. This critical noun phrase tends to be initially interpreted as the object of the preceding verb, but it must ultimately be interpreted as the subject of the following clause. This grammatical role ambiguity is of particular interest because in English (and other languages) discourse information tends to be packaged in such a way that objects are typically indefinite, new information, and subjects are most often previously mentioned, definite information (e.g., Comrie 1989, Prince 1992). We hypothesized that, if discourse information is at play, noun phrases that are more subject-like might facilitate garden-path resolution relative to more object-like noun phrases. However, in order to better understand when the discourse level is engaged in processing these constructions, we examined garden-paths with two verb types: reflexive absolute verbs (RATs, e.g., “wash”) and optionally transitive verbs (OPTs, e.g., “hunt”). Due to their reflexive nature, RATs were expected to operate mainly through a structural route (syntax-only). On the other hand, OPT verbs can introduce implicit arguments and are more likely to engage in operations beyond the domain of syntax (e.g., syntax+discourse). In this study, we discuss data from a self-paced reading experiment (see Besserman and Kaiser 2016) to shed light on the syntax/discourse division of labor: we found effects of information status related to subject-/object-hood in the processing of garden-paths with OPT, but not RAT verbs. These findings suggest that engagement of discourse representations was modulated by the verb’s argument structure.

    Unsupervised learning of contextual role knowledge for coreference resolution

    Journal Article. We present a coreference resolver called BABAR that uses contextual role knowledge to evaluate possible antecedents for an anaphor. BABAR uses information extraction patterns to identify contextual roles and creates four contextual role knowledge sources using unsupervised learning. These knowledge sources determine whether the contexts surrounding an anaphor and antecedent are compatible. BABAR applies a Dempster-Shafer probabilistic model to make resolutions based on evidence from the contextual role knowledge sources as well as general knowledge sources. Experiments in two domains showed that the contextual role knowledge improved coreference performance, especially on pronouns.

    Anaphora resolution for Arabic machine translation: a case study of nafs

    PhD Thesis. In the age of the internet, email, and social media there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria. Egyptian Government

    Inter-sentential anaphora and coherence relations in discourse: a perfect match

    International audience. Hobbs (1979) ('Coherence and Coreference', Cognitive Science 3, 67-90) claims that the interpretation of inter-sentential anaphors 'falls out' as a 'by-product' of using a particular coherence relation to integrate two discourse units. The article argues that this is only partly true. Taking the reader's perspective, I suggest that there are three stages in invoking and implementing a given coherence relation to integrate two discourse units when updating a given discourse context. Interleaved with these are two distinguishable levels in the assignment of reference to the anaphor(s) in the second unit: first, through a search for evidence for the appropriateness of a given anticipated relation, the reader will provisionally assign a referent to the anaphor(s) in the second unit via the semantic structure within the relation's definition (this would correspond to Hobbs's original thesis); and second, in coming to a final decision as to the applicability of the coherence relation(s), the anaphor(s) will receive a full, expanded interpretation. This in turn will serve to actually implement the coherence relation initially assumed. In more general terms, the article aims to pinpoint the precise nature of the interactions between the invocation and implementation of given coherence relations and the functioning of anaphors in non-initial units, in processing multi-propositional texts.

    The pronoun interpretation problem in Italian complex predicates

    This thesis explores the syntactic and pragmatic factors involved in the interpretation of clitic pronouns in Principle B contexts from both a theoretical and an acquisition perspective. The Pronoun Interpretation Problem (PIP), i.e. children’s apparent difficulty with the application of Principle B, defines a stage lasting up to about age 6: (1) Mama Bear_i is washing her_i (50% correct at age 5;6); (2) Italian: Lo gnomo_i lo_*i lava, ‘The gnome him.washes’ (85% correct at age 4;8). It is assumed that clitic pronouns like lo are exempted from interpretation problems because they can only be interpreted via binding. Children acquiring Romance languages, however, show interpretation problems in complex sentences like (3): (3) La niña_i la_i ve bailar (64% correct at age 5;6). Sentences like the above, which involve Exceptional Case Marking, are the main focus of the present research. We maintain that (3) can only be explained if Principle B does not apply to these structures, as also proposed by Reinhart and Reuland’s (1993) and Reuland’s (2001) alternative binding theories. In order to explain (i) why clitics can only be interpreted via binding in simple sentences like (2) and (ii) why binding does not apply to (3), we draw on two fundamental assumptions: (i) binding effects in object cliticization are the output of the narrow syntactic derivation, specifically, of movement to the left edge of v*P; (ii) under a phase‐based model of syntactic derivations (Chomsky 2001), the binding domain is not the sentence, but the vP phase. We argue that the derivation in (3) contains an unbound occurrence of the pronoun, which allows children to covalue the matrix subject and the pronoun in pragmatics; this hypothesis is supported by our experimental finding that another complex predicate in Italian, causative faire‐par, triggers PIP. Ultimately, we suggest that the PIP can be ascribed to a unitary cause across languages, namely, the delayed pragmatic acquisition of local coreference.

    Linguistics parameters for zero anaphora resolution

    Master's dissertation, Natural Language Processing and Human Language Technology, Univ. do Algarve, 2009. This dissertation describes and proposes a set of linguistically motivated rules for zero anaphora resolution in the context of a natural language processing chain developed for Portuguese. Some languages, like Portuguese, allow noun phrase (NP) deletion (or zeroing) in several syntactic contexts in order to avoid the redundancy that would result from repetition of previously mentioned words. The co-reference relation between the zeroed element and its antecedent (or previous mention) in the discourse is here called zero anaphora (Mitkov, 2002). In Computational Linguistics, zero anaphora resolution may be viewed as a subtask of anaphora resolution and has an essential role in various Natural Language Processing applications such as information extraction, automatic abstracting, dialog systems, machine translation and question answering. The main goal of this dissertation is to describe the grammatical rules imposing subject NP deletion and referential constraints in Brazilian Portuguese, in order to allow a correct identification of the antecedent of the deleted subject NP. Some of these rules were then formalized into the Xerox Incremental Parser or XIP (Ait-Mokhtar et al., 2002: 121-144) in order to constitute a module of the Portuguese grammar (Mamede et al. 2010) developed at Spoken Language Laboratory (L2F). Using this rule-based approach we expected to improve the performance of the Portuguese grammar, namely by producing better dependency structures with (reconstructed) zeroed NPs for the syntactic-semantic interface. Because of the complexity of the task, the scope of this dissertation had to be limited to: (a) subject NP deletion; (b) within sentence boundaries; and (c) with an explicit antecedent; in addition, (d) rules were formalized based solely on the results of the shallow parser (or chunks), that is, with minimal syntactic (and no semantic) knowledge. A corpus of different text genres was manually annotated for zero anaphors and other zero-shaped, usually indefinite, subjects. The rule-based approach is evaluated and results are presented and discussed.

    Processing for relevance: a pragmatically based account of how we process natural language

    This thesis presents an account of some of the mental mechanisms and processes that take the addressee from a linguistic input to the interpretation of that input. Because on-line interpretation involves our knowledge of language, the relation between input processing and grammar is evaluated. The full interpretation of a linguistic input also involves pragmatic, i.e. central cognitive, processes, but these processes are the least well understood within psycholinguistics. Relevance theory (Sperber & Wilson, 1986) gives us a way of making our understanding of these processes more explicit. However, the claims of Relevance theory turn out to be incompatible with psycholinguistic models which postulate an autonomous syntactic parser, such as the 'Garden-path' model. A review of the experimental literature reveals that the findings claimed to support the 'Garden-path' model do not in fact support it. Likewise, the principle of Lexical Preference, proposed to account for how verb subcategorization frames are accessed, turns out not to be supported by the experimental evidence. Full interpretation involves computing a conceptual representation, and an account is given of what constitutes conceptual structure. This leads to the proposal that verbs are represented as structured concepts. This view of verb representation together with Relevance theory can account for when arguments of verbs can be left implicit. Finally, an account is given of how the addressee computes the propositional form communicated by an utterance, by building hypotheses about the conceptual structure of the proposition on-line. These hypotheses are based on structural information stored under the concepts referred to by the utterance. This proposal can account for psycholinguistic research findings, with pragmatics playing an integral role in the explanations: it is no longer grafted onto the model as a psycholinguistic afterthought.

    An authoring tool for decision support systems in context questions of ecological knowledge

    Decision support systems (DSS) support business or organizational decision-making activities, which require access to information that is stored internally in databases or data warehouses, and externally on the Web, accessed by Information Retrieval (IR) or Question Answering (QA) systems. Graphical interfaces for querying these sources of information make it easy to constrain query formulation dynamically based on user selections, but they lack flexibility in query formulation, since their expressive power is limited by the user interface design. Natural language interfaces (NLI) are expected to be the optimal solution. However, especially for non-expert users, truly natural communication is the most difficult to realize effectively. In this paper, we propose an NLI that improves the interaction between the user and the DSS by means of referencing previous questions or their answers (i.e. anaphora such as the pronoun reference in “What traits are affected by them?”), or by eliding parts of the question (i.e. ellipsis such as “And to glume colour?” after the question “Tell me the QTLs related to awn colour in wheat”). Moreover, in order to overcome one of the main problems of NLIs, namely the difficulty of adapting an NLI to a new domain, our proposal is based on ontologies that are obtained semi-automatically from a framework that allows the integration of internal and external, structured and unstructured information. Therefore, our proposal can interface with databases, data warehouses, QA and IR systems. Because of the high NL ambiguity of the resolution process, our proposal is presented as an authoring tool that helps the user to query efficiently in natural language. Finally, our proposal is tested on a DSS case scenario about Biotechnology and Agriculture, whose knowledge base is the CEREALAB database as internal structured data, and the Web (e.g. PubMed) as external unstructured information. This paper has been partially supported by the MESOLAP (TIN2010-14860), GEODAS-BI (TIN2012-37493-C03-03), LEGOLANGUAGE (TIN2012-31224) and DIIM2.0 (PROMETEOII/2014/001) projects from the Spanish Ministry of Education and Competitivity. Alejandro Maté is funded by the Generalitat Valenciana under an ACIF grant (ACIF/2010/298).

    Sentence Processing in a Second Language: Ambiguity Resolution in German Learners of English

    The dissertation argues against fundamental differences between first and second language processing with regard to access to deep syntactic structures and phrase structure heuristics. This claim is supported by empirical data from off-line and on-line syntactic ambiguity resolution in non-immersed German learners of English. Although non-proficient learners’ processing deviates from that of native speakers because of L1 interference, less automatic processing, and difficulty recovering from misanalyses as a result of processing capacity limitations, these effects were found to attenuate gradually with increasing second language proficiency. Moreover, it depends on the demands of the specific tasks and materials whether the learners’ limited processing resources actually lead to non-native-like performance. In structures involving complex syntactic movement via an intermediate gap, the learners showed native-like intermediate gap and filler integration effects. Hence they have access to deep syntactic structures during on-line sentence processing. Moreover, participants did not show a general preference for ambiguous over disambiguated structures, which indicates that they have access to native-like syntactic processing principles. Taken together, the findings show that it is possible even for non-immersed learners to develop native-like syntactic processing in their second language.