415 research outputs found

    Text as scene: discourse deixis and bridging relations

    This paper presents a new framework, “text as scene”, which lays the foundations for the annotation of two coreferential links: discourse deixis and bridging relations. The incorporation of what we call textual and contextual scenes provides more flexible annotation guidelines, broad type categories being clearly differentiated. Such a framework that is capable of dealing with discourse deixis and bridging relations from a common perspective aims at improving the poor reliability scores obtained by previous annotation schemes, which fail to capture the vague references inherent in both these links. The guidelines presented here complete the annotation scheme designed to enrich the Spanish CESS-ECE corpus with coreference information, thus building the CESS-Ancora corpus. This paper has been supported by the FPU grant (AP2006-00994) from the Spanish Ministry of Education and Science. It is based on work supported by the CESS-ECE (HUM2004-21127), Lang2World (TIN2006-15265-C06-06), and Praxem (HUM2006-27378-E) projects.

    Discourse Deixis and Coreference: Evidence from AnCora

    Proceedings of the Second Workshop on Anaphora Resolution (WAR II). Editor: Christer Johansson. NEALT Proceedings Series, Vol. 2 (2008), 73-82. © 2008 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/7129

    Text as Scene: Discourse Deixis and Bridging Relations

    This paper presents a new framework, "text as scene", which lays the foundations for the annotation of two coreferential links: discourse deixis and bridging relations. The incorporation of what we call textual and contextual scenes provides more flexible annotation guidelines, broad type categories being clearly differentiated. Such a framework that is capable of dealing with discourse deixis and bridging relations from a common perspective aims at improving the poor reliability scores obtained by previous annotation schemes, which fail to capture the vague references inherent in both these links. The guidelines presented here complete the annotation scheme designed to enrich the Spanish CESS-ECE corpus with coreference information, thus building the CESS-Ancora corpus.

    Coreference and anaphoric annotations for spontaneous speech corpora in French

    This paper presents a corpus-based analysis of coreference and anaphoric relations in French spontaneous conversational speech. It presents the annotation task and two experiments on this corpus (gender and number agreement, definite descriptions as first mention of new discourse entities) which aim at assessing the relevance of current anaphora solvers to spontaneous speech.

    Corpora for Computational Linguistics

    Since the mid-1990s, corpora have become very important for computational linguistics. This paper offers a survey of how they are currently used in different fields of the discipline, with particular emphasis on anaphora and coreference resolution, automatic summarisation and term extraction. Their influence on other fields is also briefly discussed.

    Abstract pronominal anaphors and label nouns in German and English: Selected case studies and quantitative investigations

    Abstract anaphors refer to abstract referents, such as facts or events. This paper presents a corpus-based comparative study of German and English abstract anaphors. Parallel bi-directional texts from the Europarl Corpus were annotated with functional and morpho-syntactic information, focusing on the pronouns ‘it’, ‘this’, and ‘that’, as well as demonstrative noun phrases headed by “label nouns”, such as ‘this event’, ‘that issue’, etc., and their German counterparts. We induce information about the cross-linguistic realization of abstract anaphors from the parallel texts. The contrastive findings are then controlled for translation-specific characteristics by examination of the differences between the original text and the translated text in each of the languages. In selected case studies, we investigate in detail “translation mismatches”, including changes in grammatical category (from pronouns to full noun phrases, and vice versa), grammatical function, or clausal position, addition or omission of modifying adjectives, changes in the lexical realization of head nouns, and transpositions of the demonstrative determiner. In some of these cases, the specificity of the abstract noun phrase is altered by the translation process

    From grammar to reading : a study on referential dependencies

    In this article, we present a study that investigated relations between grammar teaching and reading in Portuguese as a mother tongue. The study, in which 91 students participated, aimed at (i) pre-assessing students' ability to comprehend referential dependencies in reading at different stages (grade 4, 9-10 years old; grade 6, 11-12 years old; and grade 8, 12-13 years old), (ii) proposing a teaching intervention to develop language awareness about referential dependencies and, more specifically, to develop strategies to identify antecedents of pronouns (grade 4), and (iii) assessing the effects of the teaching intervention (grade 4). The study was based on a quasi-experimental methodology, with pre- and post-tests and a teaching intervention developed in the classroom, based on discovery-learning methods. The results, which show positive effects of the teaching intervention, reinforce the benefits of grammar teaching as language awareness development. The study also contributes to the discussion of the role of grammar teaching in the development of late-acquired structures, such as particular types of referential dependencies.

    AnaPro, Tool for Identification and Resolution of Direct Anaphora in Spanish

    Introduction. Anaphora is a relation of coreference between linguistic terms. According to Webster’s dictionary: “It is the use of a grammatical substitute (as a pronoun or a pro-verb) to refer to the denotation of a preceding word or group of words; also: the relation between a grammatical substitute and its antecedent.” Therefore, anaphora is a discourse relation. Anaphora resolution is very important in Natural Language Processing (NLP). This work is part of Project OM* (Ontology Merging), which seeks to build a large ontology by fusing smaller ontologies extracted from textual documents. An important part of the project is to analyze the sentences in a document with the goal of transforming that text into an ontology that comprises its contents. A brief description of Project OM* follows.
    AnaPro is software that resolves direct anaphora in Spanish, specifically pronouns: it finds the noun or group of words to which the pronoun refers. It locates in the previous sentences the referent or antecedent which the pronoun replaces. An example of a direct anaphor it resolves is the pronoun “he” in the sentence “He is sad.” Much of the work on anaphora has been done for texts in English; thus, we specifically focus on Spanish documents. AnaPro directly supports text analysis (understanding what a document says), a nontrivial task since there are different writing styles, references, idiomatic expressions, etc. The problem grows if the analyzer is a computer, because computers lack the “common sense” that people possess. Hence, text must be preprocessed before analysis in order to assign tags (noun, verb, ...) to each word, find the stems, disambiguate nouns, verbs, and prepositions, identify colloquial expressions, and identify and resolve anaphora, among other chores. AnaPro works on Spanish sentences. It is a novel procedure, since it is automatic (no user intervenes during the resolution) and it does not need dictionaries. It employs heuristic procedures to discover the semantics and help in the decisions; they are rather easy to implement and use limited knowledge. Nevertheless, its results are good (at least 81% correct answers). However, more tests will give a better idea of its quality.
    Authors I.T. and E.V. would like to acknowledge ESCOM-IPN, where they defended their thesis, #20110083, which gives a more detailed description of AnaPro. The work reported herein was partially sponsored by CONACYT Grant #128163 (Project OM*), by IPN, and by SNI and UAEM.

    Coreference resolution for Portuguese using parallel corpora word alignment

    The field of Information Extraction essentially aims to investigate methods and techniques for transforming the unstructured information present in natural language texts into structured data. An important step in this process is coreference resolution, the task of identifying different noun phrases that refer to the same entity in the discourse. Coreference resolution has been extensively researched for English (Ng, 2010, lists a number of studies in the area), but it has received less attention for other languages. This is because the great majority of the approaches used in this research are based on machine learning and therefore require a large amount of annotated data.

    Paths through meaning and form: Festschrift offered to Klaus von Heusinger on the occasion of his 60th birthday

    “Paths through meaning and form. Festschrift offered to Klaus von Heusinger on the occasion of his 60th birthday” comprises 60 contributions by colleagues who have worked with Klaus von Heusinger over the course of his academic career. The topics addressed in the individual contributions include prominence, referentiality, quantification, case, language acquisition, and experimental psycholinguistics.