How to Evaluate Your Question Answering System Every Day and Still Get Real Work Done
In this paper, we report on Qaviar, an experimental automated evaluation
system for question answering applications. The goal of our research was to
find an automatically calculated measure that correlates well with human
judges' assessment of answer correctness in the context of question answering
tasks. Qaviar judges the response by computing recall against the stemmed
content words in the human-generated answer key. It counts the answer correct
if it exceeds a given recall threshold. We determined that the answer
correctness predicted by Qaviar agreed with the human judges 93% to 95% of the
time. Forty-one question-answering systems were ranked by both Qaviar and human assessors,
and these rankings correlated with a Kendall's Tau measure of 0.920, compared
to a correlation of 0.956 between human assessors on the same data.
Comment: 6 pages, 3 figures, to appear in Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000).
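The scoring rule is simple enough to sketch. Below is a minimal, hypothetical re-implementation of a Qaviar-style judge in Python, not the original system: the stopword list, the crude suffix-stripping stemmer, and the 0.5 recall threshold are all illustrative stand-ins (the abstract does not fix a threshold, and a real system would use a proper stemmer).

```python
# Hypothetical sketch of a Qaviar-style judge (not the original system):
# recall of stemmed content words from the answer key, thresholded to a
# correct/incorrect decision.
import re

STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "was", "to", "and"}

def stem(word):
    # Crude suffix stripping; a production system would use a real
    # stemmer such as Porter's.
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

def content_stems(text):
    words = re.findall(r"[a-z]+", text.lower())
    return {stem(w) for w in words if w not in STOPWORDS}

def judge(response, answer_key, threshold=0.5):
    key = content_stems(answer_key)
    if not key:
        return False
    recall = len(key & content_stems(response)) / len(key)
    return recall >= threshold

print(judge("Neil Armstrong was the first person to walk on the Moon",
            "Neil Armstrong was the first man on the Moon"))  # True
```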
Separating Regular Languages with First-Order Logic
Given two languages, a separator is a third language that contains the first
one and is disjoint from the second one. We investigate the following decision
problem: given two regular input languages of finite words, decide whether
there exists a first-order definable separator. We prove that in order to
answer this question, sufficient information can be extracted from semigroups
recognizing the input languages, using a fixpoint computation. This yields an
EXPTIME algorithm for checking first-order separability. Moreover, the
correctness proof of this algorithm yields a stronger result, namely a
description of a possible separator. Finally, we generalize this technique to
answer the same question for regular languages of infinite words.
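As a toy illustration of the separator definition only (not of the paper's semigroup-based fixpoint computation or the EXPTIME decision procedure), the sketch below tests the two defining conditions, L1 ⊆ K and K ∩ L2 = ∅, by brute force over short words. The three languages are invented for the example, and a bounded check can refute separation but only gives evidence in its favor.

```python
# Toy check of the separator definition: K separates L1 from L2 iff
# L1 is contained in K and K is disjoint from L2. We test both
# conditions on every word over {a, b} up to a length bound, which is
# enough to refute separation, never to prove it.
from itertools import product

def words(alphabet, max_len):
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            yield "".join(w)

def is_separator(in_k, in_l1, in_l2, alphabet="ab", max_len=8):
    for w in words(alphabet, max_len):
        if in_l1(w) and not in_k(w):
            return False  # K fails to contain L1
        if in_k(w) and in_l2(w):
            return False  # K is not disjoint from L2
    return True

# L1: words containing "aa"; L2: words with no "a" at all.
# K: words containing at least one "a", a first-order-definable
# candidate separator ("there exists a position labeled a").
print(is_separator(lambda w: "a" in w,        # K
                   lambda w: "aa" in w,       # L1
                   lambda w: "a" not in w))   # L2 -> True
```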
Improving Retrieval-Based Question Answering with Deep Inference Models
Question answering is one of the most important and difficult applications at
the intersection of information retrieval and natural language processing,
especially for complex science questions that require some form of inference
to determine the correct answer. In this paper, we present a two-step method
that combines information retrieval techniques optimized for question
answering with deep learning models for natural language inference to tackle
multiple-choice question answering in the science domain. For each
question-answer pair, we use standard retrieval-based models to find relevant
candidate contexts and decompose the main problem into two different
sub-problems. First, we assign a correctness score to each candidate answer
based on its context, using retrieval models from Lucene. Second, we use deep
learning architectures to determine whether a candidate answer can be inferred from some
well-chosen context consisting of sentences retrieved from the knowledge base.
In the end, all these solvers are combined using a simple neural network to
predict the correct answer. This proposed two-step model outperforms the best
retrieval-based solver by over 3% in absolute accuracy.
Comment: 8 pages, 2 figures, 8 tables, accepted at IJCNN 201
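The final combination step can be pictured with a small sketch. The weights below are invented for illustration, and the two features per candidate (a retrieval score and an entailment score) merely stand in for the solver outputs described above; the paper's actual network and its trained parameters are not reproduced here.

```python
# Hypothetical sketch of the final combination step: each candidate answer
# carries two features, a retrieval score and an NLI entailment score, and
# a linear layer plus softmax yields a distribution over candidates.
import numpy as np

def combine(features, w, b):
    # features: (n_candidates, 2) array of [retrieval, entailment] scores.
    logits = features @ w + b
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()

features = np.array([[0.9, 0.2],    # candidate 0: strong retrieval only
                     [0.4, 0.8],    # candidate 1: strong entailment
                     [0.1, 0.1],    # candidate 2: weak on both
                     [0.5, 0.6]])   # candidate 3: middling
w, b = np.array([1.0, 2.0]), 0.0    # made-up weights favoring entailment
probs = combine(features, w, b)
print("predicted answer:", int(probs.argmax()))   # candidate 1
```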
An Atypical Survey of Typical-Case Heuristic Algorithms
Heuristic approaches often perform so well that they seem to give the right
answer virtually every time. How close can heuristic algorithms get to always
giving the right answer without inducing seismic complexity-theoretic consequences?
This article first discusses how a series of results by Berman, Buhrman,
Hartmanis, Homer, Longpr\'{e}, Ogiwara, Sch\"{o}ning, and Watanabe, from the
early 1970s through the early 1990s, explicitly or implicitly limited how well
heuristic algorithms can do on NP-hard problems. In particular, many desirable
levels of heuristic success cannot be obtained unless severe, highly unlikely
complexity class collapses occur. Second, we survey work initiated by Goldreich
and Wigderson, who showed how under plausible assumptions deterministic
heuristics for randomized computation can achieve a very high frequency of
correctness. Finally, we consider formal ways in which theory can help explain
the effectiveness of heuristics that solve NP-hard problems in practice.
Comment: This article is currently scheduled to appear in the December 2012 issue of SIGACT News.
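To make the notion of a heuristic's "frequency of correctness" concrete, here is a small self-contained experiment, purely illustrative and unrelated to the specific results surveyed: it measures how often the classic greedy heuristic for vertex cover (an NP-hard problem) happens to return an optimal-size cover on small random graphs. All parameters (8 vertices, edge probability 0.3) are arbitrary.

```python
# Toy experiment in the spirit of "frequency of correctness": how often
# does a simple greedy heuristic return an optimal vertex cover on small
# random graphs?
import random
from itertools import combinations

def greedy_cover(edges):
    # Classic 2-approximation: repeatedly pick an uncovered edge and
    # take both of its endpoints.
    cover, uncovered = set(), set(edges)
    while uncovered:
        u, v = uncovered.pop()
        cover |= {u, v}
        uncovered = {e for e in uncovered if u not in e and v not in e}
    return cover

def optimal_cover_size(n, edges):
    # Brute force over all vertex subsets, smallest first (fine for n = 8).
    for k in range(n + 1):
        for cand in combinations(range(n), k):
            s = set(cand)
            if all(u in s or v in s for u, v in edges):
                return k

random.seed(0)
n, hits, trials = 8, 0, 0
for _ in range(200):
    edges = [(u, v) for u, v in combinations(range(n), 2)
             if random.random() < 0.3]
    if not edges:
        continue
    trials += 1
    hits += len(greedy_cover(edges)) == optimal_cover_size(n, edges)
print(f"greedy matched the optimum on {hits}/{trials} random graphs")
```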
The CIFF Proof Procedure for Abductive Logic Programming with Constraints: Theory, Implementation and Experiments
We present the CIFF proof procedure for abductive logic programming with
constraints, and we prove its correctness. CIFF is an extension of the IFF
proof procedure for abductive logic programming, relaxing the original
restrictions over variable quantification (allowedness conditions) and
incorporating a constraint solver to deal with numerical constraints as in
constraint logic programming. Finally, we describe the CIFF system, comparing
it with state-of-the-art abductive systems and answer set solvers, and showing
how to use it to program some applications. (To appear in Theory and Practice
of Logic Programming - TPLP)
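For readers unfamiliar with abduction, the toy sketch below shows the core idea in propositional form: find sets of designated abducible atoms whose closure under the program's rules entails an observation. It is a deliberately minimal illustration, far simpler than CIFF, with no variables, no quantification, and no constraint solving; the rule and atom names are invented.

```python
# Minimal propositional abduction sketch (not CIFF). Rules are Horn
# clauses (body_atoms, head); abducibles are the atoms we may assume.
from itertools import combinations

def closure(facts, rules):
    # Forward-chain the rules to a fixpoint.
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if set(body) <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts

def abduce(observation, rules, abducibles):
    # Return all minimum-cardinality sets of abducibles that explain
    # the observation.
    for k in range(len(abducibles) + 1):
        found = [set(c) for c in combinations(abducibles, k)
                 if observation in closure(c, rules)]
        if found:
            return found
    return []

# Invented example: wet grass is explained by rain or by the sprinkler.
rules = [(("rain",), "wet_grass"),
         (("sprinkler",), "wet_grass")]
print(abduce("wet_grass", rules, ["rain", "sprinkler"]))
# [{'rain'}, {'sprinkler'}]
```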
