31,745 research outputs found

    Neural Networks for Information Retrieval

    Get PDF
    Machine learning plays a role in many aspects of modern IR systems, and deep learning is applied in all of them. The fast pace of modern-day research has given rise to many different approaches for many different IR problems. The amount of information available can be overwhelming both for junior students and for experienced researchers looking for new research topics and directions. Additionally, it is interesting to see what key insights into IR problems the new technologies are able to give us. The aim of this full-day tutorial is to give a clear overview of current tried-and-trusted neural methods in IR and how they benefit IR research. It covers key architectures, as well as the most promising future directions.Comment: Overview of full-day tutorial at SIGIR 201

    Query Resolution for Conversational Search with Limited Supervision

    Get PDF
    In this work we focus on multi-turn passage retrieval as a crucial component of conversational search. One of the key challenges in multi-turn passage retrieval comes from the fact that the current turn query is often underspecified due to zero anaphora, topic change, or topic return. Context from the conversational history can be used to arrive at a better expression of the current turn query, defined as the task of query resolution. In this paper, we model the query resolution task as a binary term classification problem: for each term appearing in the previous turns of the conversation decide whether to add it to the current turn query or not. We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers. We propose a distant supervision method to automatically generate training data by using query-passage relevance labels. Such labels are often readily available in a collection either as human annotations or inferred from user interactions. We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval architecture and demonstrate its effectiveness on the TREC CAsT dataset.Comment: SIGIR 2020 full conference pape

    Improving Distributed Representations of Tweets - Present and Future

    Full text link
    Unsupervised representation learning for tweets is an important research field which helps in solving several business applications such as sentiment analysis, hashtag prediction, paraphrase detection and microblog ranking. A good tweet representation learning model must handle the idiosyncratic nature of tweets which poses several challenges such as short length, informal words, unusual grammar and misspellings. However, there is a lack of prior work which surveys the representation learning models with a focus on tweets. In this work, we organize the models based on its objective function which aids the understanding of the literature. We also provide interesting future directions, which we believe are fruitful in advancing this field by building high-quality tweet representation learning models.Comment: To be presented in Student Research Workshop (SRW) at ACL 201

    Improving Distributed Representations of Tweets - Present and Future

    Get PDF
    Unsupervised representation learning for tweets is an important research field which helps in solving several business applications such as sentiment analysis, hashtag prediction, paraphrase detection and microblog ranking. A good tweet representation learning model must handle the idiosyncratic nature of tweets which poses several challenges such as short length, informal words, unusual grammar and misspellings. However, there is a lack of prior work which surveys the representation learning models with a focus on tweets. In this work, we organize the models based on its objective function which aids the understanding of the literature. We also provide interesting future directions, which we believe are fruitful in advancing this field by building high-quality tweet representation learning models.Comment: To be presented in Student Research Workshop (SRW) at ACL 201

    Fourteenth Biennial Status Report: März 2017 - February 2019

    No full text

    A Quantum Many-body Wave Function Inspired Language Modeling Approach

    Full text link
    The recently proposed quantum language model (QLM) aimed at a principled approach to modeling term dependency by applying the quantum probability theory. The latest development for a more effective QLM has adopted word embeddings as a kind of global dependency information and integrated the quantum-inspired idea in a neural network architecture. While these quantum-inspired LMs are theoretically more general and also practically effective, they have two major limitations. First, they have not taken into account the interaction among words with multiple meanings, which is common and important in understanding natural language text. Second, the integration of the quantum-inspired LM with the neural network was mainly for effective training of parameters, yet lacking a theoretical foundation accounting for such integration. To address these two issues, in this paper, we propose a Quantum Many-body Wave Function (QMWF) inspired language modeling approach. The QMWF inspired LM can adopt the tensor product to model the aforesaid interaction among words. It also enables us to reveal the inherent necessity of using Convolutional Neural Network (CNN) in QMWF language modeling. Furthermore, our approach delivers a simple algorithm to represent and match text/sentence pairs. Systematic evaluation shows the effectiveness of the proposed QMWF-LM algorithm, in comparison with the state of the art quantum-inspired LMs and a couple of CNN-based methods, on three typical Question Answering (QA) datasets.Comment: 10 pages,4 figures,CIK

    An inquiry-based learning approach to teaching information retrieval

    Get PDF
    The study of information retrieval (IR) has increased in interest and importance with the explosive growth of online information in recent years. Learning about IR within formal courses of study enables users of search engines to use them more knowledgeably and effectively, while providing the starting point for the explorations of new researchers into novel search technologies. Although IR can be taught in a traditional manner of formal classroom instruction with students being led through the details of the subject and expected to reproduce this in assessment, the nature of IR as a topic makes it an ideal subject for inquiry-based learning approaches to teaching. In an inquiry-based learning approach students are introduced to the principles of a subject and then encouraged to develop their understanding by solving structured or open problems. Working through solutions in subsequent class discussions enables students to appreciate the availability of alternative solutions as proposed by their classmates. Following this approach students not only learn the details of IR techniques, but significantly, naturally learn to apply them in solution of problems. In doing this they not only gain an appreciation of alternative solutions to a problem, but also how to assess their relative strengths and weaknesses. Developing confidence and skills in problem solving enables student assessment to be structured around solution of problems. Thus students can be assessed on the basis of their understanding and ability to apply techniques, rather simply their skill at reciting facts. This has the additional benefit of encouraging general problem solving skills which can be of benefit in other subjects. This approach to teaching IR was successfully implemented in an undergraduate module where students were assessed in a written examination exploring their knowledge and understanding of the principles of IR and their ability to apply them to solving problems, and a written assignment based on developing an individual research proposal
    corecore