625 research outputs found

    Training Curricula for Open Domain Answer Re-Ranking

    In precision-oriented tasks like answer ranking, it is more important to rank many relevant answers highly than to retrieve all relevant answers. It follows that a good ranking strategy would be to learn how to identify the easiest correct answers first (i.e., assign a high ranking score to answers that have characteristics that usually indicate relevance, and a low ranking score to those with characteristics that do not), before incorporating more complex logic to handle difficult cases (e.g., semantic matching or reasoning). In this work, we apply this idea to the training of neural answer rankers using curriculum learning. We propose several heuristics to estimate the difficulty of a given training sample. We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process. As training progresses, our approach gradually shifts to weighting all samples equally, regardless of difficulty. We present a comprehensive evaluation of the proposed idea on three answer ranking datasets. Results show that our approach leads to superior performance for two leading neural ranking architectures, namely BERT and ConvKNRM, using both pointwise and pairwise losses. When applied to a BERT-based ranker, our method yields up to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model trained without a curriculum). This results in models that can achieve comparable performance to more expensive state-of-the-art techniques. Comment: Accepted at SIGIR 2020 (long paper).
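
    As a concrete illustration of the weighting schedule sketched in the abstract, the following Python snippet interpolates from difficulty-based weights to uniform weights as training progresses. It is a minimal sketch, not the paper's code: the function names, the hinge-style pairwise loss, and the toy difficulty values are assumptions introduced here (the paper's actual difficulty heuristics are only summarized above).

```python
import numpy as np

def curriculum_weight(difficulty, progress):
    """Down-weight difficult samples early in training.

    difficulty: value in [0, 1], 0 = easiest, 1 = hardest
        (e.g., derived from an unsupervised ranker's output; assumption).
    progress:   fraction of training completed, in [0, 1].

    Early on (progress ~ 0) the weight is roughly (1 - difficulty);
    by the end (progress = 1) every sample gets weight 1.0.
    """
    return (1.0 - progress) * (1.0 - difficulty) + progress

def weighted_pairwise_loss(pos_scores, neg_scores, difficulties, progress):
    """Hinge-style pairwise loss with per-sample curriculum weights."""
    margins = np.maximum(0.0, 1.0 - (pos_scores - neg_scores))
    weights = curriculum_weight(difficulties, progress)
    return float(np.mean(weights * margins))

# Toy usage: two training pairs, one easy and one hard, early in training.
pos = np.array([0.9, 0.2])
neg = np.array([0.1, 0.8])
diff = np.array([0.1, 0.9])          # heuristic difficulty estimates (toy values)
print(weighted_pairwise_loss(pos, neg, diff, progress=0.1))
```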

    Dependency-based Convolutional Neural Networks for Sentence Embedding

    In sentence modeling and classification, convolutional neural network approaches have recently achieved state-of-the-art results, but all such efforts process word vectors sequentially and neglect long-distance dependencies. To exploit both deep learning and linguistic structures, we propose a tree-based convolutional neural network model which exploits various long-distance relationships between words. Our model improves over the sequential baselines on all three sentiment and question classification tasks, and achieves the highest published accuracy on TREC. Comment: this paper has been accepted by ACL 2015.
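
    A rough sketch of the core idea, assuming a dependency parse is available as a head-index array: instead of convolving over sequential n-grams, each filter is applied to a token and its tree ancestors. The function names, shapes, and the single-filter/max-pooling setup are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ancestor_window(heads, i, k):
    """Indices of token i and its ancestors in the dependency tree,
    up to length k (heads[i] is the parent of token i; the root has -1)."""
    window = [i]
    while len(window) < k and heads[window[-1]] != -1:
        window.append(heads[window[-1]])
    while len(window) < k:            # pad by repeating the last ancestor
        window.append(window[-1])
    return window

def dependency_convolution(embeddings, heads, W, b):
    """One convolutional feature over ancestor paths instead of the
    usual sequential n-gram windows, followed by max-pooling."""
    d = embeddings.shape[1]
    k = W.shape[1] // d               # filter spans k tree ancestors
    feats = []
    for i in range(len(heads)):
        window = ancestor_window(heads, i, k)
        x = np.concatenate([embeddings[j] for j in window])    # (k*d,)
        feats.append(np.tanh(W @ x + b))
    return float(np.max(feats))       # max-pooled feature value

# Toy example: 4 tokens, 5-dim embeddings, one filter over 3-node paths.
rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 5))
heads = np.array([1, -1, 1, 2])       # token 1 is the root
W, b = rng.standard_normal((1, 15)), np.zeros(1)
print(dependency_convolution(emb, heads, W, b))
```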

    Combining information seeking services into a meta supply chain of facts

    The World Wide Web has become a vital supplier of information that allows organizations to carry out such tasks as business intelligence, security monitoring, and risk assessment. Having a quick and reliable supply of correct facts is often mission critical. By following design science guidelines, we have explored ways to recombine facts from multiple sources, each with possibly different levels of responsiveness and accuracy, into one robust supply chain. Inspired by prior research on keyword-based meta-search engines (e.g., metacrawler.com), we have adapted existing question answering algorithms for the task of analysis and triangulation of facts. We present a first prototype for a meta approach to fact seeking. Our meta engine sends a user's question to several fact seeking services that are publicly available on the Web (e.g., ask.com, brainboost.com, answerbus.com, NSIR, etc.) and analyzes the returned results jointly to identify and present to the user those that are most likely to be factually correct. The results of our evaluation on the standard test sets widely used in prior research support the following: 1) the value added by the meta approach: its performance surpasses the performance of each supplier; 2) the importance of using fact seeking services as suppliers to the meta engine rather than keyword-driven search portals; and 3) the resilience of the meta approach: eliminating a single service does not noticeably impact the overall performance. We show that these properties make the meta approach a more reliable supplier of facts than any of the currently available stand-alone services.
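
    The triangulation step could look roughly like the sketch below: each service contributes a ranked answer list, and candidates are scored by a reciprocal-rank vote across services. This is an assumption-laden illustration; the services named above are not being called here, so the per-service results are hard-coded placeholders and the scoring rule is not necessarily the one used in the prototype.

```python
from collections import Counter

def triangulate(candidate_lists):
    """Combine ranked answer lists from several fact-seeking services.

    candidate_lists: list of lists of answer strings, one list per service,
    best answer first.  Answers are normalised and scored by how many
    services return them and how highly each service ranks them.
    """
    scores = Counter()
    for answers in candidate_lists:
        for rank, answer in enumerate(answers):
            key = answer.strip().lower()
            scores[key] += 1.0 / (rank + 1)       # reciprocal-rank vote
    return [answer for answer, _ in scores.most_common()]

# Hypothetical per-service results for "What is the capital of Australia?"
service_results = [
    ["Canberra", "Sydney"],          # service A
    ["Sydney", "Canberra"],          # service B
    ["Canberra"],                    # service C
]
print(triangulate(service_results)[0])   # -> "canberra"
```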

    Concept-based Interactive Query Expansion Support Tool (CIQUEST)

    This report describes a three-year project (2000-03) undertaken in the Information Studies Department at The University of Sheffield and funded by Resource, The Council for Museums, Archives and Libraries. The overall aim of the research was to provide user support for query formulation and reformulation in searching large-scale textual resources, including those of the World Wide Web. More specifically, the objectives were: to investigate and evaluate methods for the automatic generation and organisation of concepts derived from retrieved document sets, based on statistical methods for term weighting; and to conduct user-based evaluations of the understanding, presentation and retrieval effectiveness of concept structures in selecting candidate terms for interactive query expansion. The TREC test collection formed the basis for the seven evaluative experiments conducted in the course of the project, which fell into four distinct phases. In the first phase, a series of experiments investigated techniques for concept derivation and hierarchical organisation and structure. The second phase was concerned with user-based validation of the concept structures. Results of phases 1 and 2 informed the design of the test system and user interface, which were developed in phase 3. The final phase entailed a user-based summative evaluation of the CiQuest system. The main findings demonstrate that concept hierarchies can effectively be generated from sets of retrieved documents and displayed to searchers in a meaningful way. The approach provides the searcher with an overview of the contents of the retrieved documents, which in turn facilitates the viewing of documents and selection of the most relevant ones. Concept hierarchies are a good source of terms for query expansion and can improve precision. The extraction of descriptive phrases as an alternative source of terms was also effective. With respect to presentation, cascading menus were easy to browse for selecting terms and for viewing documents. The report concludes by outlining the project dissemination programme and future work.
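
    The abstract does not spell out the statistical method used to build the concept hierarchies, so the sketch below uses the well-known document-subsumption criterion purely as an illustration: term x becomes a parent of term y if most retrieved documents containing y also contain x, but not the reverse. The function names, the 0.8 threshold, and the toy documents are assumptions.

```python
from collections import defaultdict

def build_subsumption_hierarchy(doc_terms, threshold=0.8):
    """Organise terms from a retrieved document set into parent/child
    relations: term x subsumes term y if most documents containing y
    also contain x, but not vice versa."""
    postings = defaultdict(set)
    for doc_id, terms in enumerate(doc_terms):
        for t in terms:
            postings[t].add(doc_id)

    children = defaultdict(list)
    terms = list(postings)
    for x in terms:
        for y in terms:
            if x == y:
                continue
            overlap = len(postings[x] & postings[y])
            if (overlap / len(postings[y]) >= threshold
                    and overlap / len(postings[x]) < threshold):
                children[x].append(y)       # x is a parent concept of y
    return dict(children)

# Toy retrieved set: "retrieval" subsumes the more specific "query expansion".
docs = [
    {"retrieval", "query expansion"},
    {"retrieval", "query expansion"},
    {"retrieval", "indexing"},
    {"retrieval"},
]
print(build_subsumption_hierarchy(docs))
```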

    Interaction history based answer formulation for question answering

    With the rapid growth in information access methodologies, question answering has drawn considerable attention. Although question answering has emerged as an interesting new research domain, it is still largely concentrated on question processing and answer extraction; later steps such as answer ranking, formulation and presentation are not treated in depth. A weakness we found in this area is that the answers a particular user has already acquired are not considered when processing new questions. As a result, current systems are not capable of linking a new question such as "When was Apple founded?" to a previously processed question "When was Microsoft founded?" and generating an answer of the form "Apple was founded one year after Microsoft, in 1976". In this paper we present an approach to question answering that devises an answer based on the questions already processed by the system for a particular user, which we term the user's interaction history. Our approach combines question processing, relation extraction and knowledge representation with inference models. We primarily focus on acquiring knowledge and building a scalable user model to formulate future answers based on the answers the same user has already received. An evaluation carried out on TREC resources shows that the proposed technique is promising and effective for question answering.
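
    A minimal sketch of the interaction-history idea, under the assumption that each answered question has already been reduced to an (entity, relation, value) triple by the question-processing and relation-extraction steps described above. The class, its methods, and the comparative answer template are hypothetical, not the paper's implementation.

```python
class InteractionHistory:
    """Minimal sketch of a per-user interaction history that links a new
    question to previously answered ones sharing the same relation."""

    def __init__(self):
        self.facts = []          # list of (entity, relation, value)

    def record(self, entity, relation, value):
        """Store a fact extracted from a previously answered question."""
        self.facts.append((entity, relation, value))

    def formulate(self, entity, relation, value):
        """Answer the current question, comparing against history when a
        previously answered question used the same relation."""
        for prev_entity, prev_relation, prev_value in self.facts:
            if prev_relation == relation and prev_entity != entity:
                delta = value - prev_value
                direction = "after" if delta > 0 else "before"
                return (f"{entity} was {relation} {abs(delta)} year(s) "
                        f"{direction} {prev_entity}, in {value}.")
        return f"{entity} was {relation} in {value}."

history = InteractionHistory()
history.record("Microsoft", "founded", 1975)      # earlier interaction
print(history.formulate("Apple", "founded", 1976))
# -> "Apple was founded 1 year(s) after Microsoft, in 1976."
```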

    The Web as a Resource for Question Answering: Perspectives and Challenges

    The vast amounts of information readily available on the World Wide Web can be effectively used for question answering in two fundamentally different ways. In the federated approach, techniques for handling semistructured data are applied to access Web sources as if they were databases, allowing large classes of common questions to be answered uniformly. In the distributed approach, large-scale text-processing techniques are used to extract answers directly from unstructured Web documents. Because the Web is orders of magnitude larger than any human-collected corpus, question answering systems can capitalize on its unparalleled levels of data redundancy. Analysis of real-world user questions reveals that the federated and distributed approaches complement each other nicely, suggesting a hybrid approach for future question answering systems.
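
    The distributed approach's reliance on redundancy can be illustrated with a toy n-gram voting step over retrieved snippets, as below. The function, the snippet texts, and the bigram/top-3 choices are assumptions for illustration only; real systems combine such counting with answer typing and much more.

```python
import re
from collections import Counter

def redundancy_vote(snippets, n=2):
    """Illustrative distributed-approach step: exploit web redundancy by
    counting how often each n-gram recurs across retrieved snippets and
    treating the most frequent ones as answer candidates."""
    counts = Counter()
    for snippet in snippets:
        tokens = re.findall(r"[a-z0-9]+", snippet.lower())
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(3)

# Hypothetical snippets retrieved for "What is the tallest mountain on Earth?"
snippets = [
    "Mount Everest is the tallest mountain on Earth.",
    "The tallest mountain in the world is Mount Everest.",
    "Mount Everest, at 8,849 m, is Earth's highest peak.",
]
print(redundancy_vote(snippets))
```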

    μ§ˆμ˜μ‘λ‹΅ μ‹œμŠ€ν…œμ„ μœ„ν•œ ν…μŠ€νŠΈ λž­ν‚Ή 심측 신경망

    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Advisor: Kyomin Jung.
    The question answering (QA) system has attracted huge interest due to its applicability in real-world settings. This dissertation proposes novel ranking algorithms for QA systems based on deep neural networks. We first tackle long-text QA, which requires the model to understand excessively long sequences of text input. To solve this problem, we propose a hierarchical recurrent dual encoder that encodes texts from the word level up to the paragraph level. We further propose a latent topic clustering method that utilizes semantic information in the target corpus and thus increases the performance of the QA system. Secondly, we investigate short-text QA, where the information in each text pair is limited. To overcome this insufficiency, we combine a pretrained language model and an enhanced latent clustering method with the QA model. This novel architecture enables the model to utilize additional information, achieving state-of-the-art performance on the standard answer-selection tasks (i.e., WikiQA, TREC-QA). Finally, we investigate detecting supporting sentences for complex QA systems. In contrast to the previous settings, the model here needs to understand the relationships between sentences to answer the question. Inspired by the hierarchical nature of text, we propose a graph neural network-based model that iteratively propagates the necessary information between text nodes and achieves the best performance among existing methods.
    Korean abstract (translated): This dissertation proposes deep neural network models for question answering systems. First, to answer questions over long texts, we propose a hierarchical recurrent neural network model. It lets the model handle a given text efficiently in short sequence units, which yields a large performance gain. We also propose a model that automatically clusters the topics latent in the training data, and merge it into the existing QA model for a further improvement. In a follow-up study, we propose a QA model for short sentences. As sentences become shorter, the amount of information obtainable from them decreases; to address this, we apply a pretrained language model and a new topic clustering technique. The proposed model achieves the best performance among previous short-sentence QA studies. Finally, we study question answering in which the answer must be found by using the relationships among several sentences. We represent each sentence in a document as a node in a graph and propose a graph neural network that learns over this graph. The proposed model successfully captures the relatedness of the sentences and thereby achieves the best performance among previously proposed models on this highly complex QA task.
    Contents: 1 Introduction; 2 Background (Textual Data Representation; Encoding Sequential Information in Text); 3 Question-Answer Pair Ranking for Long Text (proposed approach: HRDE+LTC); 4 Answer-Selection for Short Sentences (proposed approach: Comp-Clip+LM+LC+TL); 5 Supporting Sentence Detection for Question Answering (proposed approach: Propagate-Selector); 6 Conclusion.