121,077 research outputs found
MinoanER: Schema-Agnostic, Non-Iterative, Massively Parallel Resolution of Web Entities
Entity Resolution (ER) aims to identify different descriptions in various
Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the
Variety, Volume and Veracity of entity descriptions published in the Web of
Data. To address them, we propose the MinoanER framework that simultaneously
fulfills full automation, support of highly heterogeneous entities, and massive
parallelization of the ER process. MinoanER leverages a token-based similarity
of entities to define a new metric that derives the similarity of neighboring
entities from the most important relations, as they are indicated only by
statistics. A composite blocking method is employed to capture different
sources of matching evidence from the content, neighbors, or names of entities.
The search space of candidate pairs for comparison is compactly abstracted by a
novel disjunctive blocking graph and processed by a non-iterative, massively
parallel matching algorithm that consists of four generic, schema-agnostic
matching rules that are quite robust with respect to their internal
configuration. We demonstrate that the effectiveness of MinoanER is comparable
to existing ER tools over real KBs exhibiting low Variety, but it outperforms
them significantly when matching KBs with high Variety.Comment: Presented at EDBT 2001
A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge
We present the architecture and the evaluation of a new system for
recognizing textual entailment (RTE). In RTE we want to identify automatically
the type of a logical relation between two input texts. In particular, we are
interested in proving the existence of an entailment between them. We conceive
our system as a modular environment allowing for a high-coverage syntactic and
semantic text analysis combined with logical inference. For the syntactic and
semantic analysis we combine a deep semantic analysis with a shallow one
supported by statistical models in order to increase the quality and the
accuracy of results. For RTE we use logical inference of first-order employing
model-theoretic techniques and automated reasoning tools. The inference is
supported with problem-relevant background knowledge extracted automatically
and on demand from external sources like, e.g., WordNet, YAGO, and OpenCyc, or
other, more experimental sources with, e.g., manually defined presupposition
resolutions, or with axiomatized general and common sense knowledge. The
results show that fine-grained and consistent knowledge coming from diverse
sources is a necessary condition determining the correctness and traceability
of results.Comment: 25 pages, 10 figure
- …