
33 research outputs found

Using Zero Anaphora Resolution to Improve Text Categorization

Author: Chen Yi-Chun
Yeh Ching-Long
Publication venue: COLIPS PUBLICATIONS
Publication date: 01/01/2003

Waseda University Repository

Gender and Animacy Knowledge Discovery from Web-Scale N-Grams for Unsupervised Person Mention Detection

Author: Ji Heng
Lin Dekang
Publication venue: City University of Hong Kong
Publication date: 01/01/2009

PACLIC 23 / City University of Hong Kong / 3-5 December 200

Waseda University Repository

Evaluating anaphora and coreference resolution to improve automatic keyphrase extraction

Author: Basaldella Marco
Chiaradia Giorgia
Tasso Carlo
Publication venue: country:JPN
Publication date: 01/01/2016

In this paper we analyze the effectiveness of using linguistic knowledge from coreference and anaphora resolution to improve the performance of supervised keyphrase extraction. To verify the impact of these features, we define a baseline keyphrase extraction system and evaluate its performance on a standard dataset using different machine learning algorithms. We then consider new feature sets that add combinations of the proposed linguistic features, and we evaluate the resulting performance of the system. We also use anaphora and coreference resolution to transform the documents, trying to simulate the cohesion process performed by the human mind. We found that our approach has a slightly positive impact on the performance of automatic keyphrase extraction, in particular when considering the ranking of the results.

Archivio istituzionale della ricerca - Università degli Studi di Udine

Unsupervised learning of contextual role knowledge for coreference resolution

Author: Bean David
Riloff Ellen M.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2004

We present a coreference resolver called BABAR that uses contextual role knowledge to evaluate possible antecedents for an anaphor. BABAR uses information extraction patterns to identify contextual roles and creates four contextual role knowledge sources using unsupervised learning. These knowledge sources determine whether the contexts surrounding an anaphor and antecedent are compatible. BABAR applies a Dempster-Shafer probabilistic model to make resolutions based on evidence from the contextual role knowledge sources as well as general knowledge sources. Experiments in two domains showed that the contextual role knowledge improved coreference performance, especially on pronouns.

The University of Utah: J. Willard Marriott Digital Library

Detecting Bridge Anaphora

Author: Cioca Lucian-Ionel
Gîfu Daniela
Publication venue: Agora University Press
Publication date: 28/02/2017

The paper addresses one of the most important issues in natural language processing (NLP), namely the automated recognition of semantic relations (in this case, bridge anaphora). We propose to recognize this type of relation automatically, and as accurately as possible, in a literary corpus (the novel Quo Vadis), bearing in mind that the diversity and complexity of relations between entities is considerable. Furthermore, we defined and classified the bridge anaphora relations based on annotation conventions. To achieve the main goal, we developed a computational instrument, BAT (Bridge Anaphora Tool), which is currently still in a test version and therefore open to improvement. This study is intended to help specialists and researchers in natural language processing and linguists in particular, among others.

Agora University Editing House: Journals

Benchmarking natural-language parsers for biological applications using dependency graphs

Author: Andrew B Clegg
Adrian J Shepherd
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. RESULTS: Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. CONCLUSION: Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Review of coreference resolution in English and Persian

Author: Aznaveh Ahmad Mahmoudi
Mohammadi Hassan Haji
Talebpour Alireza
Yazdani Samaneh
Publication date: 08/11/2022

Coreference resolution (CR) is one of the most challenging areas of natural language processing. This task seeks to identify all textual references to the same real-world entity. Research in this field is divided into coreference resolution and anaphora resolution. Due to its application in textual comprehension and its utility in other tasks such as information extraction systems, document summarization, and machine translation, this field has attracted considerable interest. Consequently, it has a significant effect on the quality of these systems. This article reviews the existing corpora and evaluation metrics in this field. Then, an overview of the coreference algorithms, from rule-based methods to the latest deep learning techniques, is provided. Finally, coreference resolution and pronoun resolution systems in Persian are investigated.

Comment: 44 pages, 11 figures, 5 tables

arXiv.org e-Print Archive

Sistem za razreševanje koreferenc pri analizi slovenskih besedil in možnosti njegove uporabe (A coreference resolution system for the analysis of Slovenian texts and its possible uses)

Author: Peter Holozan
Publication venue: University of Ljubljana
Publication date: 01/12/2015

Coreference resolution is an important part of language technologies, but this technology has not yet been developed for Slovenian. There are various types of coreference; the article focuses mainly on anaphora involving personal pronouns. Seven mutually complementary resolution methods were used, the most important of which is based on activation-based methods. The first results are promising, but a more detailed analysis of performance will require a corpus with annotated examples. Coreference resolution was also used in the question answering system Piflar, which can now answer more questions because it manages to replace personal pronouns. At the same time, Piflar was extended with other additions, e.g. answering questions about individual sentence constituents and responding to declarative sentences, and the generation of long answers to yes/no questions was also improved. Coreference resolution also improved the performance of the machine translation system Presis, namely in determining the gender of personal pronouns and in disambiguating attributive clauses.

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana

Pattern Based Information Extraction System in Business News Articles

Author: Wang Yiqi
Publication venue: University of North Carolina at Chapel Hill
Publication date: 01/05/2016

Business news journals provide a rich resource of business events, which enables domain experts to better understand the spatio-temporal changes that occur among a set of firms and people. However, extracting structured data from text-based, unstructured journal resources is a non-trivial challenge. This project designs and implements a Business Information Extraction System, which combines advanced natural language processing (NLP) tools and knowledge-based extraction patterns to automatically process and extract information about target business events from news journals. The performance evaluation of the proposed system suggests that IE techniques work well for business event extraction and that it is promising to apply the technique to extract more types of business events.

Master of Science in Information Science

Carolina Digital Repository