A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge
We present the architecture and the evaluation of a new system for
recognizing textual entailment (RTE). In RTE we want to identify automatically
the type of a logical relation between two input texts. In particular, we are
interested in proving the existence of an entailment between them. We conceive
our system as a modular environment allowing for a high-coverage syntactic and
semantic text analysis combined with logical inference. For the syntactic and
semantic analysis we combine a deep semantic analysis with a shallow one
supported by statistical models in order to increase the quality and the
accuracy of results. For RTE we use first-order logical inference employing
model-theoretic techniques and automated reasoning tools. The inference is
supported by problem-relevant background knowledge extracted automatically
and on demand from external sources such as WordNet, YAGO, and OpenCyc, or
from other, more experimental sources with, e.g., manually defined
presupposition resolutions or axiomatized general and common-sense knowledge. The
results show that fine-grained and consistent knowledge coming from diverse
sources is a necessary condition determining the correctness and traceability
of results. Comment: 25 pages, 10 figures
SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases
The Internet has enabled the creation of a growing number of large-scale
knowledge bases in a variety of domains containing complementary information.
Tools for automatically aligning these knowledge bases would make it possible
to unify many sources of structured knowledge and answer complex queries.
However, the efficient alignment of large-scale knowledge bases still poses a
considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a
simple algorithm for aligning knowledge bases with millions of entities and
facts. SiGMa is an iterative propagation algorithm which leverages both the
structural information from the relationship graph and flexible
similarity measures between entity properties in a greedy local search, thus
making it scalable. Despite its greedy nature, our experiments indicate that
SiGMa can efficiently match some of the world's largest knowledge bases with
high precision. We provide additional experiments on benchmark datasets which
demonstrate that SiGMa can outperform state-of-the-art approaches both in
accuracy and efficiency. Comment: 10 pages + 2 pages appendix; 5 figures -- initial preprint
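The abstract above describes greedy matching that propagates along the relationship graph and scores candidates with property similarity. The following is a minimal sketch of that general idea, not the authors' implementation. The function name `greedy_align`, the weighting parameter `alpha`, the `threshold` cutoff, the seed pairs, and the use of `difflib` string similarity are all assumptions made for illustration.

```python
import heapq
from difflib import SequenceMatcher


def string_sim(a, b):
    """Illustrative property similarity: case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_align(names1, names2, graph1, graph2, seeds, threshold=0.3, alpha=0.5):
    """Greedily align entities of two knowledge bases (hypothetical sketch).

    names1, names2: dict mapping entity id -> label
    graph1, graph2: dict mapping entity id -> set of neighbouring entity ids
    seeds: list of (entity1, entity2) pairs assumed to be correct matches
    """
    matched = {}   # entity in KB1 -> entity in KB2
    used2 = set()  # entities in KB2 that are already matched
    heap = []      # max-heap emulated with negated scores

    def score(e1, e2):
        # Structural part: share of e1's neighbours already matched to neighbours of e2.
        n1 = graph1.get(e1, set())
        n2 = graph2.get(e2, set())
        shared = sum(1 for n in n1 if matched.get(n) in n2)
        structural = shared / max(len(n1), len(n2), 1)
        return alpha * string_sim(names1[e1], names2[e2]) + (1 - alpha) * structural

    for e1, e2 in seeds:
        heapq.heappush(heap, (-1.0, e1, e2))

    while heap:
        _, e1, e2 = heapq.heappop(heap)
        if e1 in matched or e2 in used2:
            continue  # greedy: a decision is never revisited
        matched[e1] = e2
        used2.add(e2)
        # Propagate: only neighbours of a freshly matched pair become candidates.
        for n1 in graph1.get(e1, ()):
            if n1 in matched:
                continue
            for n2 in graph2.get(e2, ()):
                if n2 in used2:
                    continue
                s = score(n1, n2)  # score is fixed at push time in this sketch
                if s >= threshold:
                    heapq.heappush(heap, (-s, n1, n2))
    return matched


# Toy data, invented for illustration only.
names1 = {"a1": "Paris", "a2": "France", "a3": "Berlin"}
names2 = {"b1": "Paris", "b2": "France (country)", "b3": "Berlin, Germany"}
graph1 = {"a1": {"a2"}, "a2": {"a1", "a3"}, "a3": {"a2"}}
graph2 = {"b1": {"b2"}, "b2": {"b1", "b3"}, "b3": {"b2"}}

alignment = greedy_align(names1, names2, graph1, graph2, seeds=[("a1", "b1")])
print(alignment)
```

Restricting candidates to neighbours of already matched pairs is what keeps the search local in this sketch: the number of scored pairs grows with the graph degree rather than with the product of the two knowledge base sizes.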
Classifying the Wikipedia articles into the OpenCyc taxonomy
This article presents a method for classifying Wikipedia articles into the OpenCyc taxonomy. The method uses several sources of classification information, namely the Wikipedia category system, the infoboxes attached to the articles, the first sentences of the articles (treated as their definitions), and the direct mapping between the articles and the Cyc symbols. The classification decisions made using these methods are reconciled using Cyc's built-in inconsistency
detection mechanism. The combination of the best classification methods yields 1.47 million classified articles with a manually verified precision above 97%, while the combination of all of them yields 2.2 million articles with an estimated precision of 93%.