8,535 research outputs found
Use of Wikipedia Categories in Entity Ranking
Wikipedia is a useful source of knowledge that has many applications in
language processing and knowledge representation. The Wikipedia category graph
can be compared with the class hierarchy in an ontology; it has some
characteristics in common as well as some differences. In this paper, we
present our approach for answering entity ranking queries from the Wikipedia.
In particular, we explore how to make use of Wikipedia categories to improve
entity ranking effectiveness. Our experiments show that using categories of
example entities works significantly better than using loosely defined target
categories
Learning Relatedness Measures for Entity Linking
Entity Linking is the task of detecting, in text documents, relevant mentions to entities of a given knowledge base. To this end, entity-linking algorithms use several signals and features extracted from the input text or from the knowl- edge base. The most important of such features is entity relatedness. Indeed, we argue that these algorithms benefit from maximizing the relatedness among the relevant enti- ties selected for annotation, since this minimizes errors in disambiguating entity-linking.
The definition of an e↵ective relatedness function is thus a crucial point in any entity-linking algorithm. In this paper we address the problem of learning high-quality entity relatedness functions. First, we formalize the problem of learning entity relatedness as a learning-to-rank problem. We propose a methodology to create reference datasets on the basis of manually annotated data. Finally, we show that our machine-learned entity relatedness function performs better than other relatedness functions previously proposed, and, more importantly, improves the overall performance of dif- ferent state-of-the-art entity-linking algorithms
Entity Query Feature Expansion Using Knowledge Base Links
Recent advances in automatic entity linking and knowledge base
construction have resulted in entity annotations for document and
query collections. For example, annotations of entities from large
general purpose knowledge bases, such as Freebase and the Google
Knowledge Graph. Understanding how to leverage these entity
annotations of text to improve ad hoc document retrieval is an open
research area. Query expansion is a commonly used technique to
improve retrieval effectiveness. Most previous query expansion
approaches focus on text, mainly using unigram concepts. In this
paper, we propose a new technique, called entity query feature
expansion (EQFE) which enriches the query with features from
entities and their links to knowledge bases, including structured
attributes and text. We experiment using both explicit query entity
annotations and latent entities. We evaluate our technique on TREC
text collections automatically annotated with knowledge base entity
links, including the Google Freebase Annotations (FACC1) data.
We find that entity-based feature expansion results in significant
improvements in retrieval effectiveness over state-of-the-art text
expansion approaches
On Type-Aware Entity Retrieval
Today, the practice of returning entities from a knowledge base in response
to search queries has become widespread. One of the distinctive characteristics
of entities is that they are typed, i.e., assigned to some hierarchically
organized type system (type taxonomy). The primary objective of this paper is
to gain a better understanding of how entity type information can be utilized
in entity retrieval. We perform this investigation in an idealized "oracle"
setting, assuming that we know the distribution of target types of the relevant
entities for a given query. We perform a thorough analysis of three main
aspects: (i) the choice of type taxonomy, (ii) the representation of
hierarchical type information, and (iii) the combination of type-based and
term-based similarity in the retrieval model. Using a standard entity search
test collection based on DBpedia, we find that type information proves most
useful when using large type taxonomies that provide very specific types. We
provide further insights on the extensional coverage of entities and on the
utility of target types.Comment: Proceedings of the 3rd ACM International Conference on the Theory of
Information Retrieval (ICTIR '17), 201
LODE: Linking Digital Humanities Content to the Web of Data
Numerous digital humanities projects maintain their data collections in the
form of text, images, and metadata. While data may be stored in many formats,
from plain text to XML to relational databases, the use of the resource
description framework (RDF) as a standardized representation has gained
considerable traction during the last five years. Almost every digital
humanities meeting has at least one session concerned with the topic of digital
humanities, RDF, and linked data. While most existing work in linked data has
focused on improving algorithms for entity matching, the aim of the
LinkedHumanities project is to build digital humanities tools that work "out of
the box," enabling their use by humanities scholars, computer scientists,
librarians, and information scientists alike. With this paper, we report on the
Linked Open Data Enhancer (LODE) framework developed as part of the
LinkedHumanities project. With LODE we support non-technical users to enrich a
local RDF repository with high-quality data from the Linked Open Data cloud.
LODE links and enhances the local RDF repository without compromising the
quality of the data. In particular, LODE supports the user in the enhancement
and linking process by providing intuitive user-interfaces and by suggesting
high-quality linking candidates using tailored matching algorithms. We hope
that the LODE framework will be useful to digital humanities scholars
complementing other digital humanities tools
Target Type Identification for Entity-Bearing Queries
Identifying the target types of entity-bearing queries can help improve
retrieval performance as well as the overall search experience. In this work,
we address the problem of automatically detecting the target types of a query
with respect to a type taxonomy. We propose a supervised learning approach with
a rich variety of features. Using a purpose-built test collection, we show that
our approach outperforms existing methods by a remarkable margin. This is an
extended version of the article published with the same title in the
Proceedings of SIGIR'17.Comment: Extended version of SIGIR'17 short paper, 5 page
On the Impact of Entity Linking in Microblog Real-Time Filtering
Microblogging is a model of content sharing in which the temporal locality of
posts with respect to important events, either of foreseeable or unforeseeable
nature, makes applica- tions of real-time filtering of great practical
interest. We propose the use of Entity Linking (EL) in order to improve the
retrieval effectiveness, by enriching the representation of microblog posts and
filtering queries. EL is the process of recognizing in an unstructured text the
mention of relevant entities described in a knowledge base. EL of short pieces
of text is a difficult task, but it is also a scenario in which the information
EL adds to the text can have a substantial impact on the retrieval process. We
implement a start-of-the-art filtering method, based on the best systems from
the TREC Microblog track realtime adhoc retrieval and filtering tasks , and
extend it with a Wikipedia-based EL method. Results show that the use of EL
significantly improves over non-EL based versions of the filtering methods.Comment: 6 pages, 1 figure, 1 table. SAC 2015, Salamanca, Spain - April 13 -
17, 201
- …