15,145 research outputs found
On Type-Aware Entity Retrieval
Today, the practice of returning entities from a knowledge base in response
to search queries has become widespread. One of the distinctive characteristics
of entities is that they are typed, i.e., assigned to some hierarchically
organized type system (type taxonomy). The primary objective of this paper is
to gain a better understanding of how entity type information can be utilized
in entity retrieval. We perform this investigation in an idealized "oracle"
setting, assuming that we know the distribution of target types of the relevant
entities for a given query. We perform a thorough analysis of three main
aspects: (i) the choice of type taxonomy, (ii) the representation of
hierarchical type information, and (iii) the combination of type-based and
term-based similarity in the retrieval model. Using a standard entity search
test collection based on DBpedia, we find that type information proves most
useful when using large type taxonomies that provide very specific types. We
provide further insights on the extensional coverage of entities and on the
utility of target types.Comment: Proceedings of the 3rd ACM International Conference on the Theory of
Information Retrieval (ICTIR '17), 201
Completeness and Consistency Analysis for Evolving Knowledge Bases
Assessing the quality of an evolving knowledge base is a challenging task as
it often requires to identify correct quality assessment procedures.
Since data is often derived from autonomous, and increasingly large data
sources, it is impractical to manually curate the data, and challenging to
continuously and automatically assess their quality.
In this paper, we explore two main areas of quality assessment related to
evolving knowledge bases: (i) identification of completeness issues using
knowledge base evolution analysis, and (ii) identification of consistency
issues based on integrity constraints, such as minimum and maximum cardinality,
and range constraints.
For completeness analysis, we use data profiling information from consecutive
knowledge base releases to estimate completeness measures that allow predicting
quality issues. Then, we perform consistency checks to validate the results of
the completeness analysis using integrity constraints and learning models.
The approach has been tested both quantitatively and qualitatively by using a
subset of datasets from both DBpedia and 3cixty knowledge bases. The
performance of the approach is evaluated using precision, recall, and F1 score.
From completeness analysis, we observe a 94% precision for the English DBpedia
KB and 95% precision for the 3cixty Nice KB. We also assessed the performance
of our consistency analysis by using five learning models over three sub-tasks,
namely minimum cardinality, maximum cardinality, and range constraint. We
observed that the best performing model in our experimental setup is the Random
Forest, reaching an F1 score greater than 90% for minimum and maximum
cardinality and 84% for range constraints.Comment: Accepted for Journal of Web Semantic
mARC: Memory by Association and Reinforcement of Contexts
This paper introduces the memory by Association and Reinforcement of Contexts
(mARC). mARC is a novel data modeling technology rooted in the second
quantization formulation of quantum mechanics. It is an all-purpose incremental
and unsupervised data storage and retrieval system which can be applied to all
types of signal or data, structured or unstructured, textual or not. mARC can
be applied to a wide range of information clas-sification and retrieval
problems like e-Discovery or contextual navigation. It can also for-mulated in
the artificial life framework a.k.a Conway "Game Of Life" Theory. In contrast
to Conway approach, the objects evolve in a massively multidimensional space.
In order to start evaluating the potential of mARC we have built a mARC-based
Internet search en-gine demonstrator with contextual functionality. We compare
the behavior of the mARC demonstrator with Google search both in terms of
performance and relevance. In the study we find that the mARC search engine
demonstrator outperforms Google search by an order of magnitude in response
time while providing more relevant results for some classes of queries
- …