20,065 research outputs found
Top K Relevant Passage Retrieval for Biomedical Question Answering
Question answering is a task that answers factoid questions using a large
collection of documents. It aims to provide precise answers in response to the
user's questions in natural language. Question answering relies on efficient
passage retrieval to select candidate contexts, where traditional sparse vector
space models, such as TF-IDF or BM25, are the de facto method. On the web,
there is no single article that could provide all the possible answers
available on the internet to the question of the problem asked by the user. The
existing Dense Passage Retrieval model has been trained on Wikipedia dump from
Dec. 20, 2018, as the source documents for answering questions. Question
answering (QA) has made big strides with several open-domain and machine
comprehension systems built using large-scale annotated datasets. However, in
the clinical domain, this problem remains relatively unexplored. According to
multiple surveys, Biomedical Questions cannot be answered correctly from
Wikipedia Articles. In this work, we work on the existing DPR framework for the
biomedical domain and retrieve answers from the Pubmed articles which is a
reliable source to answer medical questions. When evaluated on a BioASQ QA
dataset, our fine-tuned dense retriever results in a 0.81 F1 score.Comment: 6 pages, 5 figures. arXiv admin note: text overlap with
arXiv:2004.04906 by other author
Reading Wikipedia to Answer Open-Domain Questions
This paper proposes to tackle open- domain question answering using Wikipedia
as the unique knowledge source: the answer to any factoid question is a text
span in a Wikipedia article. This task of machine reading at scale combines the
challenges of document retrieval (finding the relevant articles) with that of
machine comprehension of text (identifying the answer spans from those
articles). Our approach combines a search component based on bigram hashing and
TF-IDF matching with a multi-layer recurrent neural network model trained to
detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA
datasets indicate that (1) both modules are highly competitive with respect to
existing counterparts and (2) multitask learning using distant supervision on
their combination is an effective complete system on this challenging task.Comment: ACL2017, 10 page
- …