PowerAqua: fishing the semantic web
The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems which can make use of such semantic information to provide precise, formally derived answers to questions. At the same time, the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe PowerAqua, a QA system designed to exploit semantic markup on the web to provide answers to questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes as input a natural language query and translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.
Analysis of Statistical Question Classification for Fact-based Questions
Question classification systems play an important role in question answering systems and can be used in a wide range of other domains. The goal of question classification is to accurately assign labels to questions based on expected answer type. Most approaches in the past have relied on matching questions against hand-crafted rules. However, rules require laborious effort to create and often suffer from being too specific. Statistical question classification methods overcome these issues by employing machine learning techniques. We empirically show that a statistical approach is robust and achieves good performance on three diverse data sets with little or no hand tuning. Furthermore, we examine the role different syntactic and semantic features have on performance. We find that semantic features tend to increase performance more than purely syntactic features. Finally, we analyze common causes of misclassification error and provide insight into ways they may be overcome.
Taxonomy Induction using Hypernym Subsequences
We propose a novel, semi-supervised approach towards domain taxonomy
induction from an input vocabulary of seed terms. Unlike all previous
approaches, which typically extract direct hypernym edges for terms, our
approach utilizes a novel probabilistic framework to extract hypernym
subsequences. Taxonomy induction from extracted subsequences is cast as an
instance of the minimum-cost flow problem on a carefully designed directed
graph. Through experiments, we demonstrate that our approach outperforms
state-of-the-art taxonomy induction approaches across four languages.
Importantly, we also show that our approach is robust to the presence of noise
in the input vocabulary. To the best of our knowledge, no previous approaches
have been empirically proven to manifest noise-robustness in the input
vocabulary.
Smoothing Entailment Graphs with Language Models
The diversity and Zipfian frequency distribution of natural language
predicates in corpora lead to sparsity in Entailment Graphs (EGs) built by
Open Relation Extraction (ORE). EGs are computationally efficient and
explainable models of natural language inference, but as symbolic models, they
fail if a novel premise or hypothesis vertex is missing at test-time. We
present theory and methodology for overcoming such sparsity in symbolic models.
First, we introduce a theory of optimal smoothing of EGs by constructing
transitive chains. We then demonstrate an efficient, open-domain, and
unsupervised smoothing method using an off-the-shelf Language Model to find
approximations of missing premise predicates. This improves recall by 25.1 and
16.3 percentage points on two difficult directional entailment datasets, while
raising average precision and maintaining model explainability. Further, in a
QA task we show that EG smoothing is most useful for answering questions with
less supporting text, where missing premise predicates are more costly.
Finally, controlled experiments with WordNet confirm our theory and show that
hypothesis smoothing is difficult, but possible in principle.
Comment: Published at AACL 202