415 research outputs found
Text as scene: discourse deixis and bridging relations
This paper presents a new framework, “text as scene”, which lays the foundations for the annotation of two coreferential links: discourse deixis and bridging relations. The incorporation of what we call textual and contextual scenes provides more flexible annotation guidelines that clearly differentiate between broad type categories. By dealing with discourse deixis and bridging relations from a common perspective, the framework aims at improving the poor reliability scores obtained by previous annotation schemes, which fail to capture the vague references inherent in both these links. The guidelines presented here complete the annotation scheme designed to enrich the Spanish CESS-ECE corpus with coreference information, thus building the CESS-Ancora corpus.
This paper has been supported by the FPU grant (AP2006-00994) from the Spanish Ministry of Education and Science. It is based on work supported by the CESS-ECE (HUM2004-21127), Lang2World (TIN2006-15265-C06-06), and Praxem (HUM2006-27378-E) projects.
Discourse Deixis and Coreference: Evidence from AnCora
Proceedings of the Second Workshop on Anaphora Resolution (WAR II). Editor: Christer Johansson. NEALT Proceedings Series, Vol. 2 (2008), 73-82. © 2008 The editors and contributors.
Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt. Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/7129
Coreference and anaphoric annotations for spontaneous speech corpora in French
This paper presents a corpus-based analysis of coreference and anaphoric relations in French spontaneous conversational speech. It describes the annotation task and two experiments on this corpus (gender and number agreement, and definite descriptions as first mentions of new discourse entities), which aim at assessing the relevance of current anaphora solvers to spontaneous speech.
Corpora for Computational Linguistics
Since the mid-90s, corpora have become very important for computational linguistics. This paper offers a survey of how they are currently used in different fields of the discipline, with particular emphasis on anaphora and coreference resolution, automatic summarisation and term extraction. Their influence on other fields is also briefly discussed.
Abstract pronominal anaphors and label nouns in German and English: Selected case studies and quantitative investigations
Abstract anaphors refer to abstract referents, such as facts or events. This paper presents a corpus-based comparative study of German and English abstract
anaphors. Parallel bi-directional texts from the Europarl Corpus were annotated
with functional and morpho-syntactic information, focusing on the pronouns ‘it’,
‘this’, and ‘that’, as well as demonstrative noun phrases headed by “label nouns”,
such as ‘this event’, ‘that issue’, etc., and their German counterparts. We induce
information about the cross-linguistic realization of abstract anaphors from the
parallel texts. The contrastive findings are then controlled for translation-specific
characteristics by examination of the differences between the original text and the
translated text in each of the languages. In selected case studies, we investigate in
detail “translation mismatches”, including changes in grammatical category (from
pronouns to full noun phrases, and vice versa), grammatical function, or clausal
position, addition or omission of modifying adjectives, changes in the lexical realization of head nouns, and transpositions of the demonstrative determiner. In
some of these cases, the specificity of the abstract noun phrase is altered by the
translation process.
From grammar to reading: a study on referential dependencies
In this article, we present a study that investigated relations between grammar teaching and reading in Portuguese as a mother tongue. The study, in which 91 students participated, aimed at (i) pre-assessing students' ability to comprehend referential dependencies in reading at different stages (grade 4, 9-10 years old; grade 6, 11-12 years old; and grade 8, 12-13 years old), (ii) proposing a teaching intervention to develop language awareness about referential dependencies and, more specifically, to develop strategies to identify antecedents of pronouns (grade 4) and (iii) assessing the effects of the teaching intervention (grade 4). The study was based on a quasi-experimental methodology, with pre- and post-tests and a teaching intervention developed in the classroom, based on discovery-learning methods. Results of the study, which show positive effects of the teaching intervention, reinforce the benefits of grammar teaching as language awareness development. The study also contributes to the discussion of the role of grammar teaching in the development of late-acquired structures, such as particular types of referential dependencies.
AnaPro, Tool for Identification and Resolution of Direct Anaphora in Spanish
Anaphora is a relation of coreference between linguistic terms. According to Webster’s dictionary, it is “the use of a grammatical substitute (as a pronoun or a pro-verb) to refer to the denotation of a preceding word or group of words; also: the relation between a grammatical substitute and its antecedent.” Anaphora is therefore a discourse relation, and anaphora resolution is very important in Natural Language Processing (NLP). This work is part of Project OM* (Ontology Merging), which seeks to build a large ontology by fusing smaller ontologies extracted from textual documents. An important part of the project is to analyze the sentences in a document with the goal of transforming that text into an ontology that comprises its contents. A brief description of Project OM* follows.
AnaPro is software that resolves direct anaphora in Spanish, specifically pronouns: it finds the noun or group of words to which the pronoun refers, locating in the previous sentences the referent or antecedent that the pronoun replaces. An example of a resolved direct anaphor is the pronoun “he” in the sentence “He is sad.” Much of the work on anaphora has been done for texts in English; thus, we specifically focus on Spanish documents. AnaPro directly supports text analysis (understanding what a document says), a nontrivial task since there are different writing styles, references, idiomatic expressions, etc. The problem grows if the analyzer is a computer, because computers lack the “common sense” that people possess. Hence, text analysis must be preceded by preprocessing in order to assign tags (noun, verb, ...) to each word, find the stems, disambiguate nouns, verbs and prepositions, identify colloquial expressions, and identify and resolve anaphora, among other chores. AnaPro works on Spanish sentences. It is a novel procedure, since it is automatic (no user intervenes during the resolution) and it does not need dictionaries. It employs heuristic procedures to discover the semantics and help in the decisions; they are rather easy to implement and use limited knowledge. Nevertheless, its results are good (at least 81% correct answers), although more tests will give a better idea of its quality.
Authors I.T. and E.V. would like to acknowledge ESCOM-IPN, where they defended their thesis, #20110083, which gives a more detailed description of AnaPro. The work reported herein was partially sponsored by CONACYT Grant #128163 (Project OM*), by IPN, by SNI and by UAEM.
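As a rough illustration of what heuristic pronoun resolution over preceding sentences can look like, the following Python sketch picks the closest earlier mention that agrees with the pronoun in gender and number. It is not AnaPro's procedure, which the abstract does not specify: the `Mention` class, the `resolve_pronoun` function, the hand-assigned gender and number labels, and the example sentence are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    """A hypothetical, hand-annotated mention (not an AnaPro data structure)."""
    text: str
    gender: str    # "m" or "f"
    number: str    # "sg" or "pl"
    sentence: int  # index of the sentence containing the mention
    position: int  # token offset inside that sentence


def resolve_pronoun(pronoun: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Return the closest preceding candidate agreeing in gender and number."""
    preceding = [
        c for c in candidates
        if (c.sentence, c.position) < (pronoun.sentence, pronoun.position)
        and c.gender == pronoun.gender
        and c.number == pronoun.number
    ]
    if not preceding:
        return None
    # The latest (sentence, position) pair is the nearest antecedent candidate.
    return max(preceding, key=lambda c: (c.sentence, c.position))


# Invented example: "Juan compró manzanas. Él está triste."
candidates = [
    Mention("Juan", "m", "sg", 0, 0),
    Mention("manzanas", "f", "pl", 0, 2),
]
pronoun = Mention("Él", "m", "sg", 1, 0)
antecedent = resolve_pronoun(pronoun, candidates)
print(antecedent.text if antecedent else None)
```

In this toy setting the agreement labels are supplied by hand; a real system would have to derive them from the text, and agreement plus recency is only one of many possible heuristics.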
Coreference resolution for Portuguese using parallel corpora word alignment
The essential goal of the field of Information Extraction is to investigate methods and techniques for transforming the unstructured information present in natural-language texts into structured data. An important step in this process is coreference resolution, the task of identifying different noun phrases that refer to the same entity in the discourse. Coreference resolution has been extensively researched for English (Ng, 2010, lists a number of studies in the area), but it has received less attention in other languages. This is because the great majority of the approaches used in this research are based on machine learning and therefore require a large amount of annotated data.
Paths through meaning and form: Festschrift offered to Klaus von Heusinger on the occasion of his 60th birthday
“Paths through meaning and form. Festschrift offered to Klaus von Heusinger on the occasion of his 60th birthday” comprises 60 contributions by colleagues who have worked with Klaus von Heusinger over the course of his academic career. The topics addressed in the individual contributions include prominence, referentiality, quantification, case, language acquisition and experimental psycholinguistics.