Query Answering in Probabilistic Data and Knowledge Bases
Probabilistic data and knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. The state of the art to store and process such data is founded on probabilistic database systems, which are widely and successfully employed. Beyond all the success stories, however, such systems still lack the fundamental machinery to convey some of the valuable knowledge hidden in them to the end user, which limits their potential applications in practice. In particular, in their classical form, such systems are typically based on strong, unrealistic limitations, such as the closed-world assumption, the closed-domain assumption, the tuple-independence assumption, and the lack of commonsense knowledge. These limitations not only lead to unwanted consequences, but also put such systems on weak footing in important tasks, query answering being a very central one. In this thesis, we enhance probabilistic data and knowledge bases with more realistic data models, thereby allowing for better means for querying them. Building on the long endeavor of unifying logic and probability, we develop different rigorous semantics for probabilistic data and knowledge bases, analyze their computational properties, identify sources of (in)tractability, and design practical, scalable query answering algorithms whenever possible. To achieve this, the current work brings together some recent paradigms from logics, probabilistic inference, and database theory.
Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures
Hakimov S. Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures. Bielefeld: Universität Bielefeld; 2019.
The task of answering natural language questions over structured data has received wide
interest in recent years. Structured data in the form of knowledge bases has been available
for public usage with coverage on multiple domains. DBpedia and Freebase are such knowledge
bases that include encyclopedic data about multiple domains. However, querying such
knowledge bases requires an understanding of a query language and the underlying ontology,
which requires domain expertise. Querying structured data via question answering systems
that understand natural language has gained popularity to bridge the gap between the data
and the end user.
In order to understand a natural language question, a question answering system needs
to map the question into a query representation that can be evaluated given a knowledge base.
An important aspect that we focus on in this thesis is multilinguality. While most research
has focused on building monolingual solutions, mainly for English, this thesis focuses on building
multilingual question answering systems. The main challenge for processing language input
is interpreting the meaning of questions in multiple languages.
In this thesis, we present three different semantic parsing approaches that learn models
to map questions into meaning representations, into a query in particular, in a supervised
fashion. Each approach differs in the way the model is learned, the features of the model, the
way of representing the meaning and how the meaning of questions is composed. The first
approach learns a joint probabilistic model for syntax and semantics simultaneously from the
labeled data. The second method learns a factorized probabilistic graphical model that builds
on a dependency parse of the input question and predicts the meaning representation that is
converted into a query. The last approach presents a number of different neural architectures
that tackle the task of question answering in an end-to-end fashion. We evaluate each approach
using publicly available datasets and compare them with state-of-the-art QA systems.
Structurally Tractable Uncertain Data
Many data management applications must deal with data which is uncertain,
incomplete, or noisy. However, on existing uncertain data representations, we
cannot tractably perform the important query evaluation tasks of determining
query possibility, certainty, or probability: these problems are hard on
arbitrary uncertain input instances. We thus ask whether we could restrict the
structure of uncertain data so as to guarantee the tractability of exact query
evaluation. We present our tractability results for tree and tree-like
uncertain data, and a vision for probabilistic rule reasoning. We also study
uncertainty about order, proposing a suitable representation, and study
uncertain data conditioned by additional observations.
Comment: 11 pages, 1 figure, 1 table. To appear in SIGMOD/PODS PhD Symposium 201
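The three query evaluation tasks named in this abstract (possibility, certainty, and probability) can be illustrated with a minimal sketch. It assumes a tuple-independent instance, which is only one of several uncertain data representations, and it enumerates every possible world by brute force. The facts, probabilities, and function names are hypothetical and chosen for illustration; this is not the method of the abstract. The enumeration is exponential in the number of uncertain facts, which is the kind of cost that structural restrictions such as tree-like data aim to avoid.

```python
from itertools import product

# Hypothetical tuple-independent instance: each fact is present
# independently with the given probability (values are illustrative).
facts = {
    ("edge", "a", "b"): 0.5,
    ("edge", "b", "c"): 0.5,
    ("edge", "a", "c"): 1.0,
}


def worlds(facts):
    """Yield (world, probability) for every possible world of nonzero probability."""
    items = list(facts.items())
    for choice in product([True, False], repeat=len(items)):
        prob = 1.0
        world = set()
        for (fact, p), present in zip(items, choice):
            prob *= p if present else 1.0 - p
            if present:
                world.add(fact)
        if prob > 0.0:
            yield world, prob


def evaluate(query, facts):
    """Return (possible, certain, probability) for a Boolean query."""
    results = [(query(world), prob) for world, prob in worlds(facts)]
    possible = any(holds for holds, _ in results)   # true in some world
    certain = all(holds for holds, _ in results)    # true in every world
    probability = sum(prob for holds, prob in results if holds)
    return possible, certain, probability


def path_a_b_c(world):
    """Boolean query: is there a two-step path a -> b -> c?"""
    return ("edge", "a", "b") in world and ("edge", "b", "c") in world


def direct_a_c(world):
    """Boolean query: is there a direct edge a -> c?"""
    return ("edge", "a", "c") in world


print(evaluate(path_a_b_c, facts))  # possible but not certain
print(evaluate(direct_a_c, facts))  # certain, since the fact has probability 1
```

Here the two-step path is possible but not certain, because it holds only in the world where both uncertain edges are present, while the direct edge is certain.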
Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge
Traditional semantic parsers map language onto compositional, executable
queries in a fixed schema. This mapping allows them to effectively leverage the
information contained in large, formal knowledge bases (KBs, e.g., Freebase) to
answer questions, but it is also fundamentally limiting---these semantic
parsers can only assign meaning to language that falls within the KB's
manually-produced schema. Recently proposed methods for open vocabulary
semantic parsing overcome this limitation by learning execution models for
arbitrary language, essentially using a text corpus as a kind of knowledge
base. However, all prior approaches to open vocabulary semantic parsing replace
a formal KB with textual information, making no use of the KB in their models.
We show how to combine the disparate representations used by these two
approaches, presenting for the first time a semantic parser that (1) produces
compositional, executable representations of language, (2) can successfully
leverage the information contained in both a formal KB and a large corpus, and
(3) is not limited to the schema of the underlying KB. We demonstrate
significantly improved performance over state-of-the-art baselines on an
open-domain natural language question answering task.
Comment: Re-written abstract and intro, other minor changes throughout. This version published at AAAI 201
Towards Log-Linear Logics with Concrete Domains
We present $\mathcal{MEL}^{++}$ (M denotes Markov logic networks), an
extension of the log-linear description logics $\mathcal{EL}^{++}$-LL with
concrete domains, nominals, and instances. We use Markov logic networks (MLNs)
in order to find the most probable, classified and coherent $\mathcal{EL}^{++}$
ontology from an $\mathcal{MEL}^{++}$ knowledge base. In particular, we develop
a novel way to deal with concrete domains (also known as datatypes) by
extending MLN's cutting plane inference (CPI) algorithm.
Comment: StarAI201