10 research outputs found
ArcaneQA: Dynamic Program Induction and Contextualized Encoding for Knowledge Base Question Answering
Question answering on knowledge bases (KBQA) poses a unique challenge for
semantic parsing research due to two intertwined factors: large search space
and ambiguities in schema linking. The predominant ranking-based KBQA models,
which rely on a candidate enumeration step to reduce the search space, struggle
with flexibility and have impractical online running times. In this paper, we
present ArcaneQA, a novel generation-based model that addresses both the large
search space and schema linking in a unified framework with two mutually
boosting ingredients: we use dynamic program induction to tackle the large
search space and dynamic contextualized encoding to enhance schema linking.
Experimental results on multiple popular KBQA datasets demonstrate the highly
competitive performance of ArcaneQA in both effectiveness and efficiency.
KQA Pro: A Large-Scale Dataset with Interpretable Programs and Accurate SPARQLs for Complex Question Answering over Knowledge Base
Complex question answering over knowledge base (Complex KBQA) is challenging
because it requires various compositional reasoning capabilities, such as
multi-hop inference, attribute comparison, and set operations. Existing
benchmarks have some shortcomings that limit the development of Complex KBQA:
1) they only provide QA pairs without explicit reasoning processes; 2)
questions are either generated by templates, leading to poor diversity, or
created only at a small scale. To this end, we introduce KQA Pro, a large-scale dataset for
Complex KBQA. We define a compositional and highly-interpretable formal format,
named Program, to represent the reasoning process of complex questions. We
propose compositional strategies to generate questions, corresponding SPARQLs,
and Programs with a small number of templates, and then paraphrase the
generated questions to natural language questions (NLQ) by crowdsourcing,
giving rise to around 120K diverse instances. SPARQL and Program depict two
complementary solutions to answer complex questions, which can benefit a large
spectrum of QA methods. Besides the QA task, KQA Pro can also serve the
semantic parsing task. As far as we know, it is currently the largest corpus of
NLQ-to-SPARQL and NLQ-to-Program. We conduct extensive experiments to evaluate
whether machines can learn to answer our complex questions in different cases,
that is, with only QA supervision or with intermediate SPARQL/Program
supervision. We find that state-of-the-art KBQA methods trained with only QA
pairs perform very poorly on our dataset, implying that our questions are more
challenging than those in previous datasets. However, pretrained models trained on our
NLQ-to-SPARQL and NLQ-to-Program annotations surprisingly achieve about 90\%
answering accuracy, which is even close to human expert performance.
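The compositional Program representation described above can be sketched as a sequence of functions executed step by step over a knowledge base. The following is an illustrative sketch, not the official KQA Pro format: the function names (find, relate, query_attr) and the toy KB are hypothetical stand-ins for the dataset's actual function inventory.

```python
# Illustrative sketch of a compositional "Program" over a toy KB.
# Function names and KB contents are hypothetical, for illustration only.

TOY_KB = {
    "entities": {"Q1": {"name": "Paris", "population": 2148000},
                 "Q2": {"name": "France"}},
    "relations": [("Q1", "capital_of", "Q2")],
}

def find(name):
    """Return entity ids whose name matches."""
    return [e for e, attrs in TOY_KB["entities"].items() if attrs["name"] == name]

def relate(entities, relation):
    """Follow a relation backwards: heads whose tail is in `entities`."""
    return [h for (h, r, t) in TOY_KB["relations"] if r == relation and t in entities]

def query_attr(entities, attr):
    """Read an attribute off each entity."""
    return [TOY_KB["entities"][e][attr] for e in entities]

def execute(program):
    """Run a program: each step consumes the previous step's result."""
    result = None
    for op, arg in program:
        if op == "find":
            result = find(arg)
        elif op == "relate":
            result = relate(result, arg)
        elif op == "query_attr":
            result = query_attr(result, arg)
    return result

# "What is the population of the capital of France?"
program = [("find", "France"), ("relate", "capital_of"), ("query_attr", "population")]
print(execute(program))  # [2148000]
```

Each intermediate result is inspectable, which is what makes this style of representation interpretable compared with an opaque end-to-end answer.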
Complex Knowledge Base Question Answering: A Survey
Knowledge base question answering (KBQA) aims to answer a question over a
knowledge base (KB). Early studies mainly focused on answering simple questions
over KBs and achieved great success. However, their performance on complex
questions is still far from satisfactory. Therefore, in recent years,
researchers have proposed a large number of novel methods that address the
challenges of answering complex questions. In this survey, we review recent
advances on KBQA with the focus on solving complex questions, which usually
contain multiple subjects, express compound relations, or involve numerical
operations. In detail, we begin with introducing the complex KBQA task and
relevant background. Then, we describe benchmark datasets for complex KBQA task
and introduce the construction process of these datasets. Next, we present two
mainstream categories of methods for complex KBQA, namely semantic
parsing-based (SP-based) methods and information retrieval-based (IR-based)
methods. Specifically, we illustrate their procedures with flow designs and
discuss their major differences and similarities. After that, we summarize the
challenges that these two categories of methods encounter when answering
complex questions, and explicate advanced solutions and techniques used in
existing work. Finally, we conclude and discuss several promising directions
related to complex KBQA for future research.
Comment: 20 pages, 4 tables, 7 figures. arXiv admin note: text overlap with arXiv:2105.1164
GNN2R: Weakly-Supervised Rationale-Providing Question Answering over Knowledge Graphs
Most current methods for multi-hop question answering (QA) over knowledge
graphs (KGs) only provide final, conclusive answers without explanations, e.g.,
a set of KG entities that is difficult for ordinary users to review and
comprehend. This issue severely limits the application of KG-based QA in
real-world scenarios. However, it is non-trivial to solve due to two
challenges: First, annotations of reasoning chains of multi-hop questions,
which could serve as supervision for explanation generation, are usually
lacking. Second, it is difficult to maintain high efficiency when explicit KG
triples need to be retrieved to generate explanations. In this paper, we
propose a novel Graph Neural Network-based Two-Step Reasoning model (GNN2R) to
solve this issue. GNN2R can provide both final answers and reasoning subgraphs
as a rationale behind final answers efficiently with only weak supervision that
is available through question-final answer pairs. We extensively evaluated
GNN2R with detailed analyses in experiments. The results demonstrate that, in
terms of effectiveness, efficiency, and quality of generated explanations,
GNN2R outperforms existing state-of-the-art methods that are applicable to this
task. Our code and pre-trained models are available at
https://github.com/ruijie-wang-uzh/GNN2R
Automated Deduction – CADE 28
This open access book constitutes the proceedings of the 28th International Conference on Automated Deduction, CADE 28, held virtually in July 2021. The 29 full papers and 7 system descriptions presented together with 2 invited papers were carefully reviewed and selected from 76 submissions. CADE is the major forum for the presentation of research in all aspects of automated deduction, including foundations, applications, implementations, and practical experience. The papers are organized in the following topics: logical foundations; theory and principles; implementation and application; ATP and AI; and system descriptions.
EFFECTIVENESS OF PROXIMAL POLICY OPTIMIZATION METHODS FOR NEURAL PROGRAM INDUCTION
The Neural Virtual Machine (NVM) is a novel neurocomputational architecture
designed to emulate the functionality of a traditional computer. A version of the
NVM called NVM-RL supports reinforcement learning based on standard policy
gradient methods as a mechanism for performing neural program induction. In
this thesis, I modified NVM-RL using one of the most popular reinforcement
learning algorithms, proximal policy optimization (PPO). Surprisingly, using PPO
with the existing all-or-nothing reward function did not improve its effectiveness.
However, I found that PPO did improve the performance of the existing NVM-RL
if one instead used a reward function that grants partial credit for incorrect outputs
based on how much those incorrect outputs differ from the correct targets. I
conclude that, in some situations, PPO can improve the performance of
reinforcement learning during program induction, but that this improvement is
dependent on the quality of the reward function that is used.
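The contrast drawn above, between an all-or-nothing reward and one granting partial credit by output similarity, can be sketched as follows. The thesis's exact formulation is not given here, so the distance measure (element-wise agreement) is an illustrative assumption.

```python
def all_or_nothing_reward(output, target):
    """Reward 1.0 only when the output exactly matches the target, else 0.0."""
    return 1.0 if output == target else 0.0

def partial_credit_reward(output, target):
    """Grant partial credit by how much the output agrees with the target,
    element by element. Illustrative sketch; the thesis's actual reward
    function may differ."""
    if len(output) != len(target):
        return 0.0
    matches = sum(1 for o, t in zip(output, target) if o == t)
    return matches / len(target)

# An output wrong in one of three positions earns 2/3 credit instead of zero,
# giving the policy-gradient learner a denser training signal.
print(all_or_nothing_reward([1, 2, 3], [1, 2, 9]))  # 0.0
print(partial_credit_reward([1, 2, 3], [1, 2, 9]))
```

The denser signal is what lets PPO make progress on nearly-correct programs that the all-or-nothing reward would score identically to completely wrong ones.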
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics, and Benchmark Datasets
Machine Reading Comprehension (MRC) is a challenging NLP research field with
wide real world applications. The great progress of this field in recent years
is mainly due to the emergence of large-scale datasets and deep learning. At
present, many MRC models have already surpassed human performance on
many datasets despite the obvious, large gap between existing MRC models and
genuine human-level reading comprehension. This shows the need to improve
existing datasets, evaluation metrics, and models to move MRC models toward
'real' understanding. To address the lack of a comprehensive survey of existing
MRC tasks, evaluation metrics, and datasets, (1) we analyze 57 MRC
tasks and datasets and propose a more precise classification of MRC tasks
with 4 different attributes; (2) we summarize 9 evaluation metrics of MRC tasks
and (3) 7 attributes and 10 characteristics of MRC datasets; and (4) we
discuss some open issues in MRC research and highlight future research
directions. In addition, to help the community, we have collected, organized,
and published our data on a companion website(https://mrc-datasets.github.io/)
where MRC researchers could directly access each MRC dataset, papers, baseline
projects and browse the leaderboard.
Comment: 59 pages
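Among the evaluation metrics surveyed, two of the most widely used for span-extraction MRC are exact match (EM) and token-level F1. A minimal sketch follows; the survey's exact metric definitions (and the normalization rules of individual benchmarks) may differ.

```python
from collections import Counter

def exact_match(prediction, gold):
    """1.0 if the lowercased, stripped prediction equals the gold answer."""
    return float(prediction.strip().lower() == gold.strip().lower())

def token_f1(prediction, gold):
    """Token-level F1: harmonic mean of precision and recall over the
    multiset of overlapping tokens."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A prediction with one extra token gets partial F1 credit but zero EM.
print(exact_match("the Eiffel Tower", "Eiffel Tower"))  # 0.0
print(token_f1("the Eiffel Tower", "Eiffel Tower"))     # 0.8
```

EM rewards only exact answers, while F1 credits partial overlap, which is why benchmarks typically report both.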
JURI SAYS: An Automatic Judgement Prediction System for the European Court of Human Rights
In this paper we present the web platform JURI SAYS, which automatically predicts decisions of the European Court of Human Rights based on communicated cases, which are published by the court early in the proceedings and are often available many years before the final decision is made. Our system therefore predicts future judgements of the court. The platform is available at jurisays.com and shows the predictions compared to the actual decisions of the court. It is automatically updated every month by including predictions for the new cases. Additionally, the system highlights the sentences and paragraphs that are most important for the prediction (i.e. violation vs. no violation of human rights).