Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge
Traditional semantic parsers map language onto compositional, executable
queries in a fixed schema. This mapping allows them to effectively leverage the
information contained in large, formal knowledge bases (KBs, e.g., Freebase) to
answer questions, but it is also fundamentally limiting: these semantic
parsers can only assign meaning to language that falls within the KB's
manually-produced schema. Recently proposed methods for open vocabulary
semantic parsing overcome this limitation by learning execution models for
arbitrary language, essentially using a text corpus as a kind of knowledge
base. However, all prior approaches to open vocabulary semantic parsing replace
a formal KB with textual information, making no use of the KB in their models.
We show how to combine the disparate representations used by these two
approaches, presenting for the first time a semantic parser that (1) produces
compositional, executable representations of language, (2) can successfully
leverage the information contained in both a formal KB and a large corpus, and
(3) is not limited to the schema of the underlying KB. We demonstrate
significantly improved performance over state-of-the-art baselines on an
open-domain natural language question answering task.
Learning structured natural language representations for semantic parsing
We introduce a neural semantic parser that converts natural language
utterances to intermediate representations in the form of predicate-argument
structures, which are induced with a transition system and subsequently mapped
to target domains. The semantic parser is trained end-to-end using annotated
logical forms or their denotations. We obtain competitive results on various
datasets. The induced predicate-argument structures shed light on the types of
representations useful for semantic parsing and how these are different from
linguistically motivated ones.
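The abstract does not show what such a predicate-argument structure looks like. As a rough illustration only, here is a toy Python sketch of an ungrounded predicate-argument structure for a question and a hypothetical grounding into a target schema; the field names, predicate, and KB symbols are invented for this sketch and are not the formalism used in the paper.

# Toy sketch (not the paper's formalism): an ungrounded predicate-argument
# structure induced from an utterance, and a hypothetical grounding of that
# structure into a target-domain schema.

utterance = "Which states border Texas?"

# Ungrounded structure: the predicate is still a natural-language word,
# and the arguments are a typed answer variable and an entity mention.
intermediate = {
    "predicate": "border",
    "arguments": [
        {"variable": "x", "type": "state"},   # answer variable
        {"entity_mention": "Texas"},          # named-entity argument
    ],
}

# Mapping to a target domain would replace the word-level predicate and the
# mention with symbols from the domain schema (the names below are assumed).
grounded = {
    "predicate": "next_to",        # hypothetical KB relation
    "arguments": ["?x", "texas"],  # hypothetical KB entity identifier
}

print(intermediate)
print(grounded)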
KQA Pro: A Large-Scale Dataset with Interpretable Programs and Accurate SPARQLs for Complex Question Answering over Knowledge Base
Complex question answering over knowledge base (Complex KBQA) is challenging
because it requires various compositional reasoning capabilities, such as
multi-hop inference, attribute comparison, set operations, etc. Existing
benchmarks have some shortcomings that limit the development of Complex KBQA:
1) they only provide QA pairs without explicit reasoning processes; 2)
questions are either generated from templates, leading to poor diversity, or
are small in scale. To this end, we introduce KQA Pro, a large-scale dataset for
Complex KBQA. We define a compositional and highly-interpretable formal format,
named Program, to represent the reasoning process of complex questions. We
propose compositional strategies to generate questions, corresponding SPARQLs,
and Programs with a small number of templates, and then paraphrase the
generated questions into natural language questions (NLQ) via crowdsourcing,
yielding around 120K diverse instances. SPARQL and Program offer two
complementary solutions to answer complex questions, which can benefit a large
spectrum of QA methods. Besides the QA task, KQA Pro can also serve the
semantic parsing task. As far as we know, it is currently the largest corpus of
NLQ-to-SPARQL and NLQ-to-Program pairs. We conduct extensive experiments to evaluate
whether machines can learn to answer our complex questions in different cases,
that is, with only QA supervision or with intermediate SPARQL/Program
supervision. We find that state-of-the-art KBQA methods learnt from QA pairs
alone perform very poorly on our dataset, implying that our questions are more
challenging than those of previous datasets. However, pretrained models learnt
from our NLQ-to-SPARQL and NLQ-to-Program annotations surprisingly achieve
about 90% answering accuracy, which is even close to human expert performance.
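Since the abstract contrasts SPARQL and Program annotations without showing either, the following minimal Python sketch (using the SPARQLWrapper library) illustrates what executing a multi-hop NLQ-to-SPARQL pair against the public Wikidata endpoint might look like; the question, query, and Wikidata identifiers are illustrative assumptions and are not drawn from KQA Pro.

# Minimal sketch: execute a hypothetical multi-hop SPARQL query against the
# public Wikidata endpoint. Requires: pip install SPARQLWrapper
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical NLQ: "Who directed the films that won the Academy Award for
# Best Picture?"  (hop 1: award -> winning films, hop 2: film -> director)
query = """
SELECT ?filmLabel ?directorLabel WHERE {
  ?film wdt:P166 wd:Q102427 .   # award received: Best Picture (identifier assumed)
  ?film wdt:P57 ?director .     # director of the winning film
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""

endpoint = SPARQLWrapper("https://query.wikidata.org/sparql",
                         agent="kbqa-sparql-sketch/0.1")
endpoint.setQuery(query)
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["filmLabel"]["value"], "->", row["directorLabel"]["value"])

A Program annotation would express the same two-hop reasoning as a composition of interpretable functions rather than as SPARQL graph patterns.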
Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures
Hakimov S. Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures. Bielefeld: Universität Bielefeld; 2019.
The task of answering natural language questions over structured data has received wide
interest in recent years. Structured data in the form of knowledge bases has been available
for public usage with coverage on multiple domains. DBpedia and Freebase are such knowledge
bases that include encyclopedic data about multiple domains. However, querying such
knowledge bases requires an understanding of a query language and the underlying ontology,
which requires domain expertise. Querying structured data via question answering systems
that understand natural language has gained popularity to bridge the gap between the data
and the end user.
In order to understand a natural language question, a question answering system needs
to map the question into a query representation that can be evaluated against a knowledge base.
An important aspect that we focus on in this thesis is multilinguality. While most research
has focused on building monolingual solutions, mainly for English, this thesis focuses on building
multilingual question answering systems. The main challenge for processing language input
is interpreting the meaning of questions in multiple languages.
In this thesis, we present three different semantic parsing approaches that learn models
to map questions into meaning representations, in particular queries, in a supervised
fashion. Each approach differs in the way the model is learned, the features of the model, the
way the meaning is represented, and how the meaning of questions is composed. The first
approach learns a joint probabilistic model for syntax and semantics simultaneously from the
labeled data. The second method learns a factorized probabilistic graphical model that builds
on a dependency parse of the input question and predicts the meaning representation that is
converted into a query. The last approach presents a number of different neural architectures
that tackle the task of question answering in an end-to-end fashion. We evaluate each approach
using publicly available datasets and compare them with state-of-the-art QA systems.