
1,245 research outputs found

Reference resolution in multi-modal interaction: Preliminary observations

Author: Nijholt A.
Publication venue: Universidad de Pinar del Rio "Hermanos Saiz Montes de Oca"
Publication date: 01/01/2002

In this paper we present our research on multimodal interaction in and with virtual environments. The aim of this presentation is to emphasize the need for more research on reference resolution in multimodal contexts. In multi-modal interaction the human conversational partner can use more than one modality to convey his or her message to an environment in which a computer detects and interprets signals from different modalities. We show some naturally arising problems but do not give general solutions. Instead, we choose to perform more detailed research on reference resolution in uni-modal contexts to obtain methods that generalize to multi-modal contexts. Since we try to build applications for a Dutch audience, and since hardly any research has been done on reference resolution for Dutch, we give results on the resolution of anaphoric and deictic references in Dutch texts. We hope to be able to extend these results to our multimodal contexts later.

University of Twente Research Information

Modelling the flow of discourse in a corpus of written academic English

Author: Moore Nicolas

Discourse studies attempt to describe how context affects text, and how text progresses from one sentence to the next. Systemic Functional Linguistics (SFL) offers a model of language to describe how information flow varies according to context and co-text through the Textual metafunction, especially using the functions of Participant Identification and Tracking, Theme and Information Structure. These systems were evaluated by assembling a corpus of academic texts and assessing their information flow. Results of the analysis of the three grammatical systems in the Textual metafunction demonstrate significant patterns, or unmarked choices, where the participant, thematic and information systems combine to powerful effect. Where the systems are not aligned, there is a recognisable effect on the flow of information.

Sheffield Hallam University Research Archive

Corpora for Computational Linguistics

Author: Evans Richard
Ha Le An
Hasler Laura
Mitkov Ruslan
Orăsan Constantin
Publication venue: Universidade Federal de Santa Catarina (UFSC)
Publication date: 01/01/2007

Since the mid-1990s, corpora have become very important for computational linguistics. This paper offers a survey of how they are currently used in different fields of the discipline, with particular emphasis on anaphora and coreference resolution, automatic summarisation and term extraction. Their influence on other fields is also briefly discussed.

Directory of Open Access Journals

Wolverhampton Intellectual Repository and E-theses

A Survey on Semantic Processing Techniques

Author: Cambria Erik
Chen Guanyi
He Kai
Mao Rui
Ni Jinjie
Yang Zonglin
Zhang Xulang
Publication date: 22/10/2023

Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

arXiv.org e-Print Archive

Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation

Author: Ferrandez A.
Peral J.
Publication venue: AI Access Foundation
Publication date: 23/06/2011

This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English, issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish zero pronouns) into the target language has been evaluated on unrestricted corpora. We have obtained a precision of 80.4% and 84.8% in the translation of Spanish and English pronouns, respectively. Although we have only studied the Spanish and English languages, our approach can be easily extended to other languages such as Portuguese, Italian, or Japanese.

arXiv.org e-Print Archive

Crossref

Coreference chains in Czech, English and Russian: Preliminary findings

Author: Nedoluzhko Anna
Novák Michal
Toldova Svetlana
Publication date: 01/01/2015

This article is a pilot comparative study of coreference chains in Czech, English and Russian. We analyzed 16 comparable texts in the three languages. Our motivation was to determine the linguistic structure of coreference chains in these languages and to identify which factors influence that structure.

Biblio at Institute of Formal and Applied Linguistics

Investigating Multilingual Coreference Resolution by Universal Annotations

Author: Chai Haixia
Strube Michael
Publication date: 26/10/2023

Multilingual coreference resolution (MCR) has been a long-standing and challenging task. With the newly proposed multilingual coreference dataset, CorefUD (Nedoluzhko et al., 2022), we conduct an investigation into the task by using its harmonized universal morphosyntactic and coreference annotations. First, we study coreference by examining the ground truth data at different linguistic levels, namely mention, entity and document levels, and across different genres, to gain insights into the characteristics of coreference across multiple languages. Second, we perform an error analysis of the most challenging cases that the SotA system fails to resolve in the CRAC 2022 shared task using the universal annotations. Last, based on this analysis, we extract features from universal morphosyntactic annotations and integrate these features into a baseline system to assess their potential benefits for the MCR task. Our results show that our best configuration of features improves the baseline by 0.9% F1 score. Comment: Accepted at Findings of EMNLP202

arXiv.org e-Print Archive

Towards Multilingual Coreference Resolution

Author: Zhekova Desislava
Publication date: 01/01/2013

The current work investigates the problems that occur when coreference resolution is considered as a multilingual task. We assess the issues that arise when a framework based on the mention-pair coreference resolution model and memory-based learning is used for the resolution process. Along the way, we revise three essential subtasks of coreference resolution: mention detection, mention head detection and feature selection. For each of these aspects we propose various multilingual solutions, including heuristic, rule-based and machine learning methods. We carry out a detailed analysis that includes eight different languages (Arabic, Catalan, Chinese, Dutch, English, German, Italian and Spanish) for which datasets were provided by the only two multilingual shared tasks on coreference resolution held so far: SemEval-2 and CoNLL-2012. Our investigation shows that, although complex, the coreference resolution task can be targeted in a multilingual and even language-independent way. We propose machine learning methods for each of the subtasks affected by the transition, evaluate them, and compare them to the performance of rule-based and heuristic approaches. Our results confirm that machine learning provides the needed flexibility for the multilingual task and that the minimal requirement for a language-independent system is a part-of-speech annotation layer provided for each of the approached languages. We also show that the performance of the system can be improved by introducing other layers of linguistic annotation, such as syntactic parses (in the form of either constituency or dependency parses), named entity information, and predicate-argument structure. Additionally, we discuss the problems occurring in the proposed approaches and suggest possibilities for their improvement.

E-LIB Dokumentserver - Staats und Universitätsbibliothek Bremen