225,328 research outputs found

    How to Evaluate your Question Answering System Every Day and Still Get Real Work Done

    Full text link
    In this paper, we report on Qaviar, an experimental automated evaluation system for question answering applications. The goal of our research was to find an automatically calculated measure that correlates well with human judges' assessment of answer correctness in the context of question answering tasks. Qaviar judges the response by computing recall against the stemmed content words in the human-generated answer key. It counts the answer correct if it exceeds agiven recall threshold. We determined that the answer correctness predicted by Qaviar agreed with the human 93% to 95% of the time. 41 question-answering systems were ranked by both Qaviar and human assessors, and these rankings correlated with a Kendall's Tau measure of 0.920, compared to a correlation of 0.956 between human assessors on the same data.Comment: 6 pages, 3 figures, to appear in Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000

    Separating Regular Languages with First-Order Logic

    Full text link
    Given two languages, a separator is a third language that contains the first one and is disjoint from the second one. We investigate the following decision problem: given two regular input languages of finite words, decide whether there exists a first-order definable separator. We prove that in order to answer this question, sufficient information can be extracted from semigroups recognizing the input languages, using a fixpoint computation. This yields an EXPTIME algorithm for checking first-order separability. Moreover, the correctness proof of this algorithm yields a stronger result, namely a description of a possible separator. Finally, we generalize this technique to answer the same question for regular languages of infinite words

    Improving Retrieval-Based Question Answering with Deep Inference Models

    Full text link
    Question answering is one of the most important and difficult applications at the border of information retrieval and natural language processing, especially when we talk about complex science questions which require some form of inference to determine the correct answer. In this paper, we present a two-step method that combines information retrieval techniques optimized for question answering with deep learning models for natural language inference in order to tackle the multi-choice question answering in the science domain. For each question-answer pair, we use standard retrieval-based models to find relevant candidate contexts and decompose the main problem into two different sub-problems. First, assign correctness scores for each candidate answer based on the context using retrieval models from Lucene. Second, we use deep learning architectures to compute if a candidate answer can be inferred from some well-chosen context consisting of sentences retrieved from the knowledge base. In the end, all these solvers are combined using a simple neural network to predict the correct answer. This proposed two-step model outperforms the best retrieval-based solver by over 3% in absolute accuracy.Comment: 8 pages, 2 figures, 8 tables, accepted at IJCNN 201

    An Atypical Survey of Typical-Case Heuristic Algorithms

    Full text link
    Heuristic approaches often do so well that they seem to pretty much always give the right answer. How close can heuristic algorithms get to always giving the right answer, without inducing seismic complexity-theoretic consequences? This article first discusses how a series of results by Berman, Buhrman, Hartmanis, Homer, Longpr\'{e}, Ogiwara, Sch\"{o}ening, and Watanabe, from the early 1970s through the early 1990s, explicitly or implicitly limited how well heuristic algorithms can do on NP-hard problems. In particular, many desirable levels of heuristic success cannot be obtained unless severe, highly unlikely complexity class collapses occur. Second, we survey work initiated by Goldreich and Wigderson, who showed how under plausible assumptions deterministic heuristics for randomized computation can achieve a very high frequency of correctness. Finally, we consider formal ways in which theory can help explain the effectiveness of heuristics that solve NP-hard problems in practice.Comment: This article is currently scheduled to appear in the December 2012 issue of SIGACT New

    The CIFF Proof Procedure for Abductive Logic Programming with Constraints: Theory, Implementation and Experiments

    Get PDF
    We present the CIFF proof procedure for abductive logic programming with constraints, and we prove its correctness. CIFF is an extension of the IFF proof procedure for abductive logic programming, relaxing the original restrictions over variable quantification (allowedness conditions) and incorporating a constraint solver to deal with numerical constraints as in constraint logic programming. Finally, we describe the CIFF system, comparing it with state of the art abductive systems and answer set solvers and showing how to use it to program some applications. (To appear in Theory and Practice of Logic Programming - TPLP)
    corecore