Enriching Knowledge Bases with Counting Quantifiers
Information extraction traditionally focuses on extracting relations between
identifiable entities. Yet, texts
often also contain counting information, stating that a subject is in a
specific relation with a number of objects, without mentioning the objects
themselves, for example, "California is divided into 58 counties". Such
counting quantifiers can help in a variety of tasks such as query answering or
knowledge base curation, but are neglected by prior work. This paper develops
the first full-fledged system for extracting counting information from text,
called CINEX. We employ distant supervision using fact counts from a knowledge
base as training seeds, and develop novel techniques for dealing with several
challenges: (i) non-maximal training seeds due to the incompleteness of
knowledge bases, (ii) sparse and skewed observations in text sources, and (iii)
high diversity of linguistic patterns. Experiments with five human-evaluated
relations show that CINEX can achieve 60% average precision for extracting
counting information. In a large-scale experiment, we demonstrate the potential
for knowledge base enrichment by applying CINEX to 2,474 frequent relations in
Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct
relations, which is 28% more than the existing Wikidata facts for these
relations. Comment: 16 pages, The 17th International Semantic Web Conference (ISWC 2018)
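The enrichment idea in this abstract, asserting the existence of facts that a knowledge base does not enumerate, can be illustrated with a minimal sketch. It is not the CINEX pipeline: the function name `missing_fact_count` and the toy list of counties are hypothetical, and only the count 58 comes from the example sentence quoted above.

```python
def missing_fact_count(stated_count, known_objects):
    """Number of facts implied by an extracted count but not enumerated in the KB.

    stated_count: integer extracted from text (e.g. 58 counties).
    known_objects: objects the KB already lists for the same subject and relation.
    The result is clamped at zero, since a KB may list more objects than the text states.
    """
    return max(0, stated_count - len(set(known_objects)))


# Hypothetical, deliberately incomplete KB content for (California, contains county, ?).
known_counties = ["Alameda", "Fresno", "Los Angeles"]

# "California is divided into 58 counties" implies 55 facts that are not yet enumerated.
print(missing_fact_count(58, known_counties))
```

Under these assumptions, the extracted count works as a lower bound on how many facts exist, even when the objects themselves are unknown.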
CounQER: A System for Discovering and Linking Count Information in Knowledge Bases
Predicate constraints of general-purpose knowledge bases (KBs) like Wikidata, DBpedia and Freebase are often limited to subproperty, domain and range constraints. In this demo we showcase CounQER, a system that illustrates the alignment of counting predicates, like staffSize, and enumerating predicates, like workInstitution^{-1}. In the demonstration session, attendees can inspect these alignments and learn about their importance for KB question answering and curation. CounQER is available at https://counqer.mpi-inf.mpg.de/spo
On the Limits of Machine Knowledge: Completeness, Recall and Negation in Web-scale Knowledge Bases
Uncovering Hidden Semantics of Set Information in Knowledge Bases
Knowledge Bases (KBs) contain a wealth of structured information about entities and predicates. This paper focuses on set-valued predicates, i.e., the relationship between an entity and a set of entities. In KBs, this information is often represented in two formats: (i) via counting predicates such as numberOfChildren and staffSize, which store aggregated integers, and (ii) via enumerating predicates such as parentOf and worksFor, which store individual set memberships. The two formats are typically complementary: unlike enumerating predicates, counting predicates do not reveal individuals, but are more likely to be informative about the true set size. This coexistence could enable interesting applications in question answering and KB curation. In this paper we aim at uncovering this hidden knowledge. We proceed in two steps. (i) We identify set-valued predicates among a given KB's predicates via statistical and embedding-based features. (ii) We link counting predicates and enumerating predicates by a combination of co-occurrence, correlation and textual relatedness metrics. We analyze the prevalence of count information in four prominent knowledge bases, and show that our linking method achieves up to 0.55 F1 score in set predicate identification versus 0.40 F1 score of a random selection, and normalized discounted gains of up to 0.84 at position 1 and 0.75 at position 3 in relevant predicate alignments. Our predicate alignments are showcased in a demonstration system available at https://counqer.mpi-inf.mpg.de/spo
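As a rough illustration of the correlation signal mentioned in step (ii), the sketch below scores one candidate pair of counting and enumerating predicates by the Pearson correlation between stored counts and the number of enumerated objects per entity. It covers only that one signal, not the co-occurrence or textual relatedness metrics, and it is not the authors' implementation; the function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def alignment_score(count_values, enumerations):
    """Correlate a counting predicate with the set sizes of an enumerating predicate.

    count_values: entity -> stored integer (e.g. numberOfChildren).
    enumerations: entity -> collection of enumerated objects (e.g. parentOf).
    Only entities that carry both predicates are compared.
    """
    shared = sorted(set(count_values) & set(enumerations))
    if len(shared) < 2:
        return 0.0
    counts = [count_values[e] for e in shared]
    set_sizes = [len(set(enumerations[e])) for e in shared]
    return pearson(counts, set_sizes)


# Hypothetical toy KB: three entities with both predicates.
number_of_children = {"e1": 1, "e2": 2, "e3": 3}
parent_of = {"e1": ["a"], "e2": ["b", "c"], "e3": ["d", "e", "f"]}
print(alignment_score(number_of_children, parent_of))
```

A score near 1 suggests that the two predicates describe the same underlying set; in an incomplete KB the enumerated sizes would typically fall short of the counts, so a real system would need to combine this with other signals.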
How Controlled English can Improve Semantic Wikis
The motivation of semantic wikis is to make acquisition, maintenance, and
mining of formal knowledge simpler, faster, and more flexible. However, most
existing semantic wikis have a very technical interface and are restricted to a
relatively low level of expressivity. In this paper, we explain how AceWiki
uses controlled English - concretely Attempto Controlled English (ACE) - to
provide a natural and intuitive interface while supporting a high degree of
expressivity. We introduce recent improvements of the AceWiki system and user
studies that indicate that AceWiki is usable and useful.
Complexity Results for Modal Dependence Logic
Modal dependence logic was introduced recently by Väänänen. It enhances
the basic modal language by an operator =(). For propositional variables
p_1,...,p_n, =(p_1,...,p_(n-1);p_n) intuitively states that the value of p_n is
determined by those of p_1,...,p_(n-1). Sevenster (J. Logic and Computation,
2009) showed that satisfiability for modal dependence logic is complete for
nondeterministic exponential time. In this paper we consider fragments of modal
dependence logic obtained by restricting the set of allowed propositional
connectives. We show that satisfiability for poor man's dependence logic, the
language consisting of formulas built from literals and dependence atoms using
conjunction, necessity and possibility (i.e., disallowing disjunction), remains
NEXPTIME-complete. If we only allow monotone formulas (without negation, but
with disjunction), the complexity drops to PSPACE-completeness. We also extend
Väänänen's language by allowing classical disjunction besides dependence
disjunction and show that the satisfiability problem remains NEXPTIME-complete.
If we then disallow both negation and dependence disjunction, satisfiability is
complete for the second level of the polynomial hierarchy. In this way we
completely classify the computational complexity of the satisfiability problem
for all restrictions of propositional and dependence operators considered by
Väänänen and Sevenster. Comment: 22 pages, full version of CSL 2010 paper
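The intuitive reading of the dependence atom =(p_1,...,p_(n-1);p_n) given in this abstract can be made concrete with a small sketch. It assumes the usual team-style reading, in which the atom is evaluated over a set of worlds and holds when any two worlds that agree on p_1,...,p_(n-1) also agree on p_n. The dictionary representation of worlds and the function name `dependence_atom_holds` are hypothetical, and the sketch checks a single atom only, not satisfiability.

```python
def dependence_atom_holds(team, determining, determined):
    """Check =(determining; determined) over a team of worlds.

    team: iterable of dicts mapping proposition names to truth values.
    determining: names p_1, ..., p_(n-1).
    determined: name p_n.
    Holds iff worlds that agree on all determining propositions agree on p_n.
    """
    seen = {}
    for world in team:
        key = tuple(world[p] for p in determining)
        value = world[determined]
        # Remember the first value of p_n for this key and compare later worlds to it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical team of three worlds over propositions p1, p2, p3.
team = [
    {"p1": True, "p2": False, "p3": True},
    {"p1": True, "p2": False, "p3": True},
    {"p1": False, "p2": False, "p3": False},
]
print(dependence_atom_holds(team, ["p1", "p2"], "p3"))
```

Adding a world that agrees with the first one on p1 and p2 but differs on p3 would make the atom fail. With an empty list of determining propositions, the atom states that p3 is constant across the team.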
A discussion on particle number and quantum indistinguishability
The concept of individuality in quantum mechanics shows radical differences
from the concept of individuality in classical physics, as E. Schroedinger
pointed out in the early steps of the theory. Regarding this fact, some authors
suggested that quantum mechanics does not possess its own language, and
therefore, quantum indistinguishability is not incorporated in the theory from
the beginning. Nevertheless, it is possible to represent the idea of quantum
indistinguishability with a first-order language using quasiset theory (Q). In
this work, we show that Q cannot capture one of the most important features of
quantum non-individuality, which is the fact that there are quantum systems for
which particle number is not well defined. An axiomatic variant of Q, in which
quasicardinal is not a primitive concept (for a kind of quasisets called finite
quasisets), is also given. This result encourages the search for theories in
which the quasicardinal, being a secondary concept, stands undefined for some
quasisets, besides showing explicitly that in a set theory about collections of
truly indistinguishable entities, the quasicardinal need not necessarily be a
primitive concept. Comment: 46 pages, no figures. Accepted by Foundations of Physics