Deep Fusion of Multiple Term-Similarity Measures For Biomedical Passage Retrieval

Author: Dong
Lindberg
Liu
Steinberger
Tsatsaronis
Wasim
Yang
Yin
Publication venue: 'IOS Press'
Publication date: 31/08/2020

Passage retrieval is an important stage of question answering systems. Closed-domain passage retrieval, e.g. biomedical passage retrieval, presents additional challenges such as specialized terminology, more complex and elaborate queries, and scarcity of available data. However, closed domains also offer advantages, such as the availability of specialized structured information sources (e.g. ontologies and thesauri) that can be used to improve retrieval performance. This paper presents a novel approach for biomedical passage retrieval that combines different information sources using a similarity matrix fusion strategy based on a convolutional neural network architecture. The method was evaluated on the standard BioASQ dataset, a dataset specialized in biomedical question answering. The results show that the method is an effective strategy for biomedical passage retrieval, able to outperform other state-of-the-art methods in this domain.

COLCIENCIAS, REF. Agreement #727, 2016, provided financial as well as logistical and planning support. The Mindlab research group (Universidad Nacional de Colombia, sede Bogota), with the cooperation of INAOE (Instituto Nacional de Astrofisica, Optica y Electronica) and Universitat Politecnica de Valencia, also provided technical support for this work. The work of Paolo Rosso was carried out in the framework of the research project PROMETEO/2019/121.

Rosso-Mateus, A.; Montes Gomez, M.; Rosso, P.; González, F. (2020). Deep Fusion of Multiple Term-Similarity Measures For Biomedical Passage Retrieval. Journal of Intelligent & Fuzzy Systems. 39(2):2239-2248. https://doi.org/10.3233/JIFS-179887
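
As a rough illustration of the idea in the abstract above, the sketch below builds one question-by-passage term-similarity matrix per measure and stacks them as channels, which is the kind of input a convolutional network could then fuse. The two measures used here (exact match and character-trigram Jaccard) are stand-ins chosen for this example, not the measures used in the paper, and all function names are hypothetical. The convolutional fusion step itself is not shown.

```python
from typing import Callable, List

Matrix = List[List[float]]


def exact_match(a: str, b: str) -> float:
    """1.0 when the two terms are equal ignoring case, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def trigram_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of character trigrams (illustrative stand-in measure)."""
    def grams(word: str) -> set:
        padded = f"#{word.lower()}#"
        return {padded[i:i + 3] for i in range(len(padded) - 2)}

    ga, gb = grams(a), grams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def similarity_matrix(question: List[str], passage: List[str],
                      measure: Callable[[str, str], float]) -> Matrix:
    """Rows index question terms, columns index passage terms."""
    return [[measure(q, p) for p in passage] for q in question]


def stack_measures(question: List[str], passage: List[str],
                   measures: List[Callable[[str, str], float]]) -> List[Matrix]:
    """One matrix per measure: shape (n_measures, len(question), len(passage))."""
    return [similarity_matrix(question, passage, m) for m in measures]


channels = stack_measures(
    ["protein", "folding"],
    ["Protein", "folds", "slowly"],
    [exact_match, trigram_jaccard],
)
```

Each channel has the same shape, so a structured source such as an ontology-based similarity could be added as a further channel without changing the rest of the pipeline.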

Crossref

RiuNet

BERT with History Answer Embedding for Conversational Question Answering

Author: Croft W. Bruce
Iyyer Mohit
Qiu Minghui
Qu Chen
Yang Liu
Zhang Yongfeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/10/2019

Conversational search is an emerging topic in the information retrieval community. One of the major challenges to multi-turn conversational search is to model the conversation history to answer the current question. Existing methods either prepend history turns to the current question or use complicated attention mechanisms to model the history. We propose a conceptually simple yet highly effective approach referred to as history answer embedding. It enables seamless integration of conversation history into a conversational question answering (ConvQA) model built on BERT (Bidirectional Encoder Representations from Transformers). We first explain our view that ConvQA is a simplified but concrete setting of conversational search, and then we provide a general framework to solve ConvQA. We further demonstrate the effectiveness of our approach under this framework. Finally, we analyze the impact of different numbers of history turns under different settings to provide new insights into conversation history modeling in ConvQA.

Comment: Accepted to SIGIR 2019 as a short paper
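
The abstract does not spell out the mechanism, so the following is only one plausible reading, sketched under stated assumptions: each passage token gets a marker saying whether it falls inside an answer from an earlier turn, and a small embedding table indexed by that marker is added to the token's input vector. The function names, the inclusive span convention, and the two-row table are assumptions made for this example.

```python
from typing import List, Tuple


def history_answer_markers(num_tokens: int,
                           history_spans: List[Tuple[int, int]]) -> List[int]:
    """Return 1 for tokens inside any earlier answer span, else 0.

    Spans are (start, end) token indices with an inclusive end; indices
    outside the passage are clipped.
    """
    markers = [0] * num_tokens
    for start, end in history_spans:
        for i in range(max(start, 0), min(end, num_tokens - 1) + 1):
            markers[i] = 1
    return markers


def add_history_embedding(token_vectors: List[List[float]],
                          markers: List[int],
                          history_table: List[List[float]]) -> List[List[float]]:
    """Add the table row selected by each marker to that token's vector."""
    return [
        [t + h for t, h in zip(vec, history_table[m])]
        for vec, m in zip(token_vectors, markers)
    ]


markers = history_answer_markers(6, [(1, 2), (4, 9)])
shifted = add_history_embedding(
    [[1.0, 1.0], [2.0, 2.0]],
    [0, 1],
    [[0.0, 0.0], [0.5, -0.5]],  # hypothetical learned rows: not-in-history, in-history
)
```

In a trained model the table rows would be learned together with the rest of the network; here they are fixed numbers only to make the addition visible.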

arXiv.org e-Print Archive

Crossref

Retrieve-and-Read: Multi-task Learning of Information Retrieval and Reading Comprehension

Author: Asano Hisako
Nishida Kyosuke
Otsuka Atsushi
Saito Itsumi
Tomita Junji
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 31/08/2018

This study considers the task of machine reading at scale (MRS) wherein, given a question, a system first performs the information retrieval (IR) task of finding relevant passages in a knowledge source and then carries out the reading comprehension (RC) task of extracting an answer span from the passages. Previous MRS studies, in which the IR component was trained without considering answer spans, struggled to accurately find a small number of relevant passages from a large set of passages. In this paper, we propose a simple and effective approach that incorporates the IR and RC tasks by using supervised multi-task learning so that the IR component can be trained by considering answer spans. Experimental results on the standard benchmark, answering SQuAD questions using the full Wikipedia as the knowledge source, showed that our model achieved state-of-the-art performance. Moreover, we thoroughly evaluated the individual contributions of our model components with our new Japanese dataset and SQuAD. The results showed significant improvements in the IR task and provided a new perspective on IR for RC: it is effective to teach which part of the passage answers the question rather than to give only a relevance score to the whole passage.

Comment: 10 pages, 6 figures. Accepted as a full paper at CIKM 2018
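
To make the multi-task idea in the abstract above concrete, here is a minimal sketch of a joint objective for a single relevant passage: a span-extraction loss (start and end positions) plus a passage-relevance loss, summed so that both signals reach a shared encoder. The exact losses, the `ir_weight` parameter, and all function names are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import List


def softmax_nll(scores: List[float], target: int) -> float:
    """Negative log-likelihood of `target` under a softmax over `scores`."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target]


def binary_nll(score: float, label: int) -> float:
    """Negative log-likelihood of a 0/1 label under sigmoid(score)."""
    z = -score if label == 1 else score
    # Numerically stable softplus(z) = log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def joint_loss(start_scores: List[float], end_scores: List[float],
               start: int, end: int,
               relevance_score: float, relevant: int,
               ir_weight: float = 1.0) -> float:
    """RC span loss plus weighted IR relevance loss (ir_weight is hypothetical)."""
    rc = softmax_nll(start_scores, start) + softmax_nll(end_scores, end)
    ir = binary_nll(relevance_score, relevant)
    return rc + ir_weight * ir
```

For a passage that does not contain the answer, only the relevance term would apply; that case is left out here to keep the sketch short.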

arXiv.org e-Print Archive

Crossref

Controlling Risk of Web Question Answering

Author: Devlin Jacob
Dunn Matthew
Ferrucci David
Gal Yarin
Geifman Yonatan
Guo Chuan
Lai Guokun
Levy Omer
Malinin Andrey
Nguyen Tri
Richardson Matthew
Vinyals Oriol
Voorhees Ellen M.
Wang Shuohang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2019

Web question answering (QA) has become an indispensable component in modern search systems, which can significantly improve users' search experience by providing a direct answer to users' information needs. This can be achieved by applying machine reading comprehension (MRC) models over the retrieved passages to extract answers with respect to the search query. With the development of deep learning techniques, state-of-the-art MRC performance has been achieved by recent deep methods. However, existing studies on MRC seldom address the predictive uncertainty issue, i.e., how likely it is that the prediction of an MRC model is wrong, leading to uncontrollable risks in real-world Web QA applications. In this work, we first conduct an in-depth investigation of the risk of Web QA. We then introduce a novel risk control framework, which consists of a qualify model for uncertainty estimation using the probe idea, and a decision model for selective output. For evaluation, we introduce risk-related metrics, rather than the traditional EM and F1 in MRC, for risk-aware Web QA. The empirical results over both the real-world Web QA dataset and the academic MRC benchmark collection demonstrate the effectiveness of our approach.

Comment: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
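
The selective-output idea in the abstract above can be sketched as follows: answer only when an uncertainty estimate clears a threshold, then measure how often the system answers (coverage) and how often those answers are wrong (risk). Coverage and risk are common selective-prediction measures used here as an assumption; the abstract does not name the paper's own metrics, and the function names and the fixed-threshold rule are hypothetical.

```python
from typing import List, Optional, Tuple


def selective_output(predictions: List[Tuple[str, float]],
                     threshold: float) -> List[Optional[str]]:
    """Keep an answer only if its confidence is at least `threshold`."""
    return [ans if conf >= threshold else None for ans, conf in predictions]


def coverage_and_risk(outputs: List[Optional[str]],
                      gold: List[str]) -> Tuple[float, float]:
    """Coverage: share of questions answered. Risk: error rate among answered."""
    answered = [(o, g) for o, g in zip(outputs, gold) if o is not None]
    coverage = len(answered) / len(outputs) if outputs else 0.0
    risk = sum(o != g for o, g in answered) / len(answered) if answered else 0.0
    return coverage, risk


predictions = [("Paris", 0.9), ("Rome", 0.4), ("Oslo", 0.8), ("Lima", 0.7)]
gold = ["Paris", "Madrid", "Bergen", "Lima"]

loose = coverage_and_risk(selective_output(predictions, 0.75), gold)
strict = coverage_and_risk(selective_output(predictions, 0.85), gold)
```

Raising the threshold trades coverage for lower risk: in this toy data the looser threshold answers half the questions with one error, while the stricter one answers a quarter of them with none.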

arXiv.org e-Print Archive

Crossref

The State-of-the-arts in Focused Search

Author: Li Rongmei
Publication venue: University of Twente, Centre for Telematics and Information Technology
Publication date: 01/01/2009

The continuous influx of various text data on the Web requires search engines to improve their retrieval abilities for more specific information. The need for results relevant to a user’s topic of interest has gone beyond searching for domain- or type-specific documents to more focused results (e.g. document fragments or answers to a query). The introduction of XML provides a standard format for data representation, storage, and exchange. It helps focused search to be carried out at different granularities of a structured document with XML markup. This report reviews the state of the art in focused search, particularly techniques for topic-specific document retrieval, passage retrieval, XML retrieval, and entity ranking. It concludes by highlighting open problems.

University of Twente Research Information