Framing Named Entity Linking Error Types
Named Entity Linking (NEL) and relation extraction form the backbone of Knowledge Base Population tasks. The recent rise of large open-source Knowledge Bases and the continuous focus on improving NEL performance have led to the creation of automated benchmark solutions during the last decade. Benchmarking NEL systems offers a valuable approach to understanding a NEL system's performance quantitatively. However, an in-depth qualitative analysis that helps improve NEL methods by identifying error causes usually requires a more thorough error analysis. This paper proposes a taxonomy to frame common errors and applies this taxonomy in a survey study to assess the performance of four well-known Named Entity Linking systems on three recent gold standards.
Keywords: Named Entity Linking, Linked Data Quality, Corpora, Evaluation, Error Analysis
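As an illustration of how such a taxonomy can be operationalized, the following Python sketch classifies individual gold annotations into coarse error types by comparing them against system output. The categories shown are illustrative assumptions, not the taxonomy proposed in the paper.

```python
# Illustrative sketch: tagging NEL errors by comparing system output against
# a gold standard. Error categories here are hypothetical examples.
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    start: int       # character offset where the mention begins
    end: int         # character offset where the mention ends (exclusive)
    entity_uri: str  # linked Knowledge Base entity

def classify_error(gold: Annotation, predictions: list[Annotation]) -> str:
    """Assign a coarse error type to one gold annotation."""
    for pred in predictions:
        if (pred.start, pred.end) == (gold.start, gold.end):
            # exact span: either correct or linked to the wrong KB entity
            return "correct" if pred.entity_uri == gold.entity_uri else "wrong_link"
    for pred in predictions:
        if pred.start < gold.end and gold.start < pred.end:
            return "boundary_error"   # overlapping but inexact span
    return "missing_annotation"       # mention not detected at all
```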
Same but Different: Distant Supervision for Predicting and Understanding Entity Linking Difficulty
Entity Linking (EL) is the task of automatically identifying entity mentions
in a piece of text and resolving them to a corresponding entity in a reference
knowledge base like Wikipedia. There is a large number of EL tools available
for different types of documents and domains, yet EL remains a challenging task
where the lack of precision on particularly ambiguous mentions often spoils the
usefulness of automated disambiguation results in real applications. A priori
estimates of how difficult a particular entity mention will be to link can
facilitate the flagging of critical cases in semi-automated EL systems,
while detecting latent factors that affect the EL performance, like
corpus-specific features, can provide insights on how to improve a system based
on the special characteristics of the underlying corpus. In this paper, we
first introduce a consensus-based method to generate difficulty labels for
entity mentions on arbitrary corpora. The difficulty labels are then exploited
as training data for a supervised classification task able to predict the EL
difficulty of entity mentions using a variety of features. Experiments over a
corpus of news articles show that EL difficulty can be estimated with high
accuracy, revealing also latent features that affect EL performance. Finally,
evaluation results demonstrate the effectiveness of the proposed method to
inform semi-automated EL pipelines.
Comment: Preprint of paper accepted for publication in the 34th ACM/SIGAPP Symposium On Applied Computing (SAC 2019).
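The consensus idea can be sketched in a few lines: a mention's difficulty label is derived from how many independent EL systems link it correctly. The threshold and the two-class split below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of consensus-based difficulty labeling: a mention is "easy"
# if at least half of the EL systems agree with the gold link.
def difficulty_label(system_links: list[str], gold_link: str) -> str:
    """system_links: the entity each EL system chose for one mention."""
    agreement = sum(link == gold_link for link in system_links)
    return "easy" if agreement / len(system_links) >= 0.5 else "difficult"

print(difficulty_label(
    ["dbr:Paris", "dbr:Paris", "dbr:Paris_(Texas)", "dbr:Paris"],
    gold_link="dbr:Paris"))   # -> "easy" (3 of 4 systems agree with gold)

# The resulting labels can then serve as training data for a supervised
# classifier that predicts difficulty from mention and context features.
```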
Named Entity Recognition -- Is there a glass ceiling?
Recent developments in Named Entity Recognition (NER) have resulted in better
and better models. However, is there a glass ceiling? Do we know which types of
errors are still hard or even impossible to correct? In this paper, we present
a detailed analysis of the types of errors in state-of-the-art machine learning
(ML) methods. Our study reveals the weak and strong points of the Stanford,
CMU, FLAIR, ELMO and BERT models, as well as their shared limitations. We also
introduce new techniques for improving annotation, for training processes and
for checking a model's quality and stability. Presented results are based on
the CoNLL 2003 data set for the English language. A new enriched semantic
annotation of errors for this data set and new diagnostic data sets are
attached in the supplementary materials.
Comment: Accepted to CoNLL 2019.
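One building block of such an error analysis is comparing model outputs entity by entity rather than token by token. The sketch below converts CoNLL-style BIO tags into entity spans for that purpose; the paper's diagnostic categories go well beyond this plain span comparison.

```python
# Sketch: extract entity spans from BIO tags so that model predictions can
# be compared against gold annotations at the entity level.
def bio_to_spans(tags: list[str]) -> set[tuple[int, int, str]]:
    """Convert BIO tags to (start, end, type) spans; end is exclusive."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel flushes the last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:
                spans.add((start, i, etype))
                start = None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans

gold = bio_to_spans(["B-PER", "I-PER", "O", "B-LOC"])
pred = bio_to_spans(["B-PER", "O", "O", "B-LOC"])
print(gold - pred)  # entities the model got wrong: {(0, 2, 'PER')}
```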
Name Variants for Improving Entity Discovery and Linking
Identifying all names that refer to a particular set of named entities is a challenging task, as quite often we need to consider many features that include a lot of variation like abbreviations, aliases, hypocorisms, multilingualism or partial matches. Each entity type can also have specific rules for name variances: people names can include titles, country and branch names are sometimes removed from organization names, while locations are often plagued by the issue of nested entities. The lack of a clear strategy for collecting, processing and computing name variants significantly lowers the recall of tasks such as Named Entity Linking and Knowledge Base Population, since name variances are frequently used in all kinds of textual content.
This paper proposes several strategies to address these issues. Recall can be improved by combining knowledge repositories and by computing additional variances based on algorithmic approaches. Heuristics and machine learning methods then analyze the generated name variances and mark ambiguous names to increase precision. An extensive evaluation demonstrates the effects of integrating these methods into a new Named Entity Linking framework and confirms that systematically considering name variances yields significant performance improvements.
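The algorithmic side of variant generation can be illustrated with a small sketch for person names; the rules shown (title stripping, initials, inversion) are examples of the kind of per-type heuristics described, not the framework's full rule set.

```python
# Hedged sketch of algorithmic name-variant generation for person names.
TITLES = {"dr", "prof", "mr", "mrs", "ms"}

def person_name_variants(name: str) -> set[str]:
    tokens = [t for t in name.replace(".", "").split()
              if t.lower() not in TITLES]      # drop honorific titles
    variants = {" ".join(tokens)}
    if len(tokens) >= 2:
        first, last = tokens[0], tokens[-1]
        variants.add(last)                     # surname only
        variants.add(f"{first[0]}. {last}")    # initial + surname
        variants.add(f"{last}, {first}")       # inverted form
    return variants

print(person_name_variants("Dr. Angela Merkel"))
# {'Angela Merkel', 'Merkel', 'A. Merkel', 'Merkel, Angela'}
```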
A Fair and In-Depth Evaluation of Existing End-to-End Entity Linking Systems
Existing evaluations of entity linking systems often say little about how the
system is going to perform for a particular application. There are four
fundamental reasons for this: many benchmarks focus on named entities; it is
hard to define which other entities to include; there are ambiguities in entity
recognition and entity linking; many benchmarks have errors or artifacts that
invite overfitting or lead to evaluation results of limited meaningfulness.
We provide a more meaningful and fair in-depth evaluation of a variety of
existing end-to-end entity linkers. We characterize the strengths and
weaknesses of these linkers and how well the results from the respective
publications can be reproduced. Our evaluation is based on several widely used
benchmarks, which exhibit the problems mentioned above to various degrees, as
well as on two new benchmarks, which address these problems.
On the Importance of Drill-Down Analysis for Assessing Gold Standards and Named Entity Linking Performance
Rigorous evaluations and analyses of evaluation results are key to improving Named Entity Linking systems. Nevertheless, most current evaluation tools focus on benchmarking and comparative evaluations. They therefore only provide aggregated statistics such as precision, recall and F1-measure to assess system performance, and offer no means for conducting detailed analyses down to the level of individual annotations.
This paper addresses the need for transparent benchmarking and fine-grained error analysis by introducing Orbis, an extensible framework that supports drill-down analysis, multiple annotation tasks and resource versioning. Orbis complements approaches like those deployed through the GERBIL and TAC KBP tools and helps developers to better understand and address shortcomings in their Named Entity Linking tools.
We present three use cases to demonstrate the usefulness of Orbis for both research and production systems: (i) improving Named Entity Linking tools; (ii) detecting gold standard errors; and (iii) performing Named Entity Linking evaluations with multiple versions of the included resources.
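The contrast between aggregated statistics and drill-down analysis can be made concrete with a minimal sketch: alongside precision, recall and F1, it keeps a per-annotation verdict of the kind a tool like Orbis could surface. The data structures are assumptions for illustration.

```python
# Minimal sketch: aggregate metrics plus per-annotation verdicts.
# Annotations are represented as hashable tuples, e.g. (start, end, entity_uri).
def evaluate(gold: set, predicted: set):
    verdicts = (
        [("true_positive", a) for a in gold & predicted] +
        [("missed", a) for a in gold - predicted] +      # false negatives
        [("spurious", a) for a in predicted - gold]      # false positives
    )
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Aggregate numbers answer "how good?"; verdicts answer "what went wrong?"
    return {"precision": precision, "recall": recall, "f1": f1}, verdicts
```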