Reading Wikipedia to Answer Open-Domain Questions
This paper proposes to tackle open-domain question answering using Wikipedia
as the unique knowledge source: the answer to any factoid question is a text
span in a Wikipedia article. This task of machine reading at scale combines the
challenges of document retrieval (finding the relevant articles) with that of
machine comprehension of text (identifying the answer spans from those
articles). Our approach combines a search component based on bigram hashing and
TF-IDF matching with a multi-layer recurrent neural network model trained to
detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA
datasets indicate that (1) both modules are highly competitive with respect to
existing counterparts and (2) multitask learning using distant supervision on
their combination is an effective complete system on this challenging task.
Comment: ACL 2017, 10 pages
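To make the retrieval component concrete, here is a minimal sketch of a bigram-hashing TF-IDF retriever in the spirit of the search module described above; the toy corpus, hashing dimensionality, and cosine scoring are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a bigram-hashing TF-IDF retriever (illustrative, not the
# authors' exact implementation).
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus standing in for the Wikipedia article collection.
articles = [
    "Paris is the capital and most populous city of France.",
    "The Amazon is the largest rainforest on Earth.",
    "Marie Curie conducted pioneering research on radioactivity.",
]

# Unigrams + bigrams hashed into a fixed-size sparse space, so the
# vocabulary never has to be stored explicitly.
hasher = HashingVectorizer(ngram_range=(1, 2), n_features=2**20,
                           alternate_sign=False, norm=None)
counts = hasher.transform(articles)

# Reweight hashed counts by TF-IDF.
tfidf = TfidfTransformer()
doc_vectors = tfidf.fit_transform(counts)

def retrieve(question: str, k: int = 2):
    """Return the k articles most similar to the question."""
    q_vec = tfidf.transform(hasher.transform([question]))
    scores = cosine_similarity(q_vec, doc_vectors).ravel()
    top = scores.argsort()[::-1][:k]
    return [(articles[i], float(scores[i])) for i in top]

print(retrieve("Who studied radioactivity?"))
```

The retrieved paragraphs would then be handed to the reading model, which scores candidate answer spans.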
Open-Retrieval Conversational Question Answering
Conversational search is one of the ultimate goals of information retrieval.
Recent research approaches conversational search by simplified settings of
response ranking and conversational question answering, where an answer is
either selected from a given candidate set or extracted from a given passage.
These simplifications neglect the fundamental role of retrieval in
conversational search. To address this limitation, we introduce an
open-retrieval conversational question answering (ORConvQA) setting, where we
learn to retrieve evidence from a large collection before extracting answers,
as a further step towards building functional conversational search systems. We
create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an
end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader
that are all based on Transformers. Our extensive experiments on OR-QuAC
demonstrate that a learnable retriever is crucial for ORConvQA. We further show
that our system can make a substantial improvement when we enable history
modeling in all system components. Moreover, we show that the reranker
component contributes to the model performance by providing a regularization
effect. Finally, further in-depth analyses are performed to provide new
insights into ORConvQA.
Comment: Accepted to SIGIR '20
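As a rough illustration of the retriever-reranker-reader decomposition with history modeling, the sketch below wires the three stages together; the scoring functions are trivial stand-ins for the paper's Transformer components, and all names here are hypothetical.

```python
# Structural sketch of a retriever -> reranker -> reader pipeline with
# conversation-history modeling (stand-in scorers, illustrative names).
from dataclasses import dataclass

@dataclass
class Turn:
    question: str
    answer: str = ""

def build_query(history: list[Turn], question: str, window: int = 2) -> str:
    """Prepend the last few questions so each component sees dialogue context."""
    prior = [t.question for t in history[-window:]]
    return " ".join(prior + [question])

def retrieve(query: str, collection: list[str], k: int = 5) -> list[str]:
    # Stand-in lexical retriever: rank passages by word overlap with the query.
    words = set(query.lower().split())
    ranked = sorted(collection,
                    key=lambda p: -len(words & set(p.lower().split())))
    return ranked[:k]

def rerank(query: str, passages: list[str], k: int = 2) -> list[str]:
    # Stand-in reranker: the full system rescores each (query, passage)
    # pair with a cross-attention Transformer.
    return passages[:k]

def read(query: str, passages: list[str]) -> str:
    # Stand-in reader: the real reader extracts an answer span.
    return passages[0] if passages else ""

collection = ["ORConvQA retrieves evidence from a large collection.",
              "The reranker provides a regularization effect."]
history = [Turn("What is ORConvQA?")]
query = build_query(history, "What does the reranker contribute?")
print(read(query, rerank(query, retrieve(query, collection))))
```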
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
State-of-the-art slot filling models for goal-oriented human/machine
conversational language understanding systems rely on deep learning methods.
While multi-task training of such models alleviates the need for large
in-domain annotated datasets, bootstrapping a semantic parsing model for a new
domain using only the semantic frame, such as the back-end API or knowledge
graph schema, is still one of the holy grail tasks of language understanding
for dialogue systems. This paper proposes a deep learning based approach that
can utilize only the slot description in context without the need for any
labeled or unlabeled in-domain examples, to quickly bootstrap a new domain. The
main idea of this paper is to leverage the encoding of the slot names and
descriptions within a multi-task deep learned slot filling model, to implicitly
align slots across domains. The proposed approach is promising for solving the
domain scaling problem and eliminating the need for any manually annotated data
or explicit schema alignment. Furthermore, our experiments on multiple domains
show that this approach results in significantly better slot-filling
performance when compared to using only in-domain data, especially in the low
data regime.
Comment: 4 pages + 1 reference
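A minimal PyTorch sketch of the core idea, conditioning a token tagger on an encoded slot description so that a new domain needs only its schema text; the architecture and dimensions are illustrative assumptions, not the paper's exact model.

```python
# Conditioning a token tagger on an encoded slot description, so unseen
# domains can reuse the shared slot space (illustrative architecture).
import torch
import torch.nn as nn

class DescriptionConditionedTagger(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.utt_enc = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.desc_enc = nn.LSTM(dim, dim, batch_first=True)
        # Scores each token for one slot: O vs. B/I of that slot (3 labels).
        self.scorer = nn.Linear(2 * dim + dim, 3)

    def forward(self, utterance_ids, description_ids):
        tokens, _ = self.utt_enc(self.embed(utterance_ids))      # (B, T, 2*dim)
        _, (desc, _) = self.desc_enc(self.embed(description_ids))
        desc = desc[-1]                                          # (B, dim)
        desc = desc.unsqueeze(1).expand(-1, tokens.size(1), -1)  # per token
        return self.scorer(torch.cat([tokens, desc], dim=-1))    # (B, T, 3)

# A new domain only needs a textual slot description, no labeled examples:
model = DescriptionConditionedTagger(vocab_size=1000)
utt = torch.randint(0, 1000, (1, 7))   # toy token ids for an utterance
desc = torch.randint(0, 1000, (1, 4))  # toy ids for, say, "departure city"
print(model(utt, desc).shape)          # torch.Size([1, 7, 3])
```

Because the tagger is shared across domains and the slot identity enters only through the description encoding, slots with similar descriptions are implicitly aligned across domains.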
Preclinical risk of bias assessment and PICO extraction using natural language processing
Drug development starts with preclinical studies, which test the efficacy and
toxicology of potential candidates in living animals before proceeding to
clinical trials on human subjects. Many drugs shown to be effective in
preclinical animal studies fail in clinical trials, indicating potential
reproducibility issues and translation failure. To obtain less biased research
findings, systematic reviews are performed to collate all relevant evidence
from publications. However, systematic reviews are time-consuming, and
researchers have advocated the use of automation techniques to speed up the
process and reduce human effort. Good progress has been made in implementing
automation tools in reviews of clinical trials, while tools developed for
preclinical systematic reviews remain scarce. Such tools need to be designed
specifically for preclinical reviews, because preclinical experiments differ
from clinical trials. In this thesis, I explore natural language
processing models for facilitating two stages in preclinical systematic reviews:
risk of bias assessment and PICO extraction.
A range of measures is used to reduce bias in animal experiments, and many
checklist criteria require the reporting of those measures in publications. In
the first part of the thesis, I implement several binary classification models
to indicate the reporting of random allocation to groups, blinded assessment
of outcome, conflicts of interest, compliance with animal welfare regulations,
and statements of animal exclusions in preclinical publications. I compare
traditional machine learning classifiers paired with several text
representation methods against convolutional, recurrent, and hierarchical
neural networks, and propose two
strategies to adapt BERT models to long documents. My findings indicate that
neural networks and BERT-based models achieve better performance than
traditional classifiers and rule-based approaches. The attention mechanism
and hierarchical architecture in neural networks do not improve performance
but are useful for extracting relevant words or sentences from publications to
inform users’ judgement. The advantages of the transformer structure are
hindered when documents are long and computing resources are limited.
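One way to adapt a 512-token encoder such as BERT to long documents, in the spirit of the strategies mentioned above, is to split the document into overlapping chunks, classify each chunk, and pool the chunk scores. The sketch below uses a keyword heuristic as a stand-in for the fine-tuned encoder, and the max-pooling choice is an assumption, one of several options.

```python
# Chunk-and-pool strategy for long-document classification with a
# fixed-length encoder (stand-in classifier, illustrative parameters).
def chunk(tokens: list[str], size: int = 512, stride: int = 128) -> list[list[str]]:
    """Overlapping windows so no sentence is lost at a chunk boundary."""
    step = size - stride
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - stride, 1), step)]

def classify_chunk(chunk_tokens: list[str]) -> float:
    # Stand-in for BERT + sigmoid head: probability that the chunk reports,
    # e.g., random allocation to groups.
    keywords = {"randomly", "randomised", "randomized"}
    return 1.0 if keywords & set(t.lower() for t in chunk_tokens) else 0.0

def classify_document(text: str) -> float:
    probs = [classify_chunk(c) for c in chunk(text.split())]
    # Document is positive if any chunk reports the measure (max-pooling).
    return max(probs)

doc = "Animals were randomly allocated to treatment groups. " * 200
print(classify_document(doc))  # 1.0
```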
In literature retrieval and citation screening of published evidence, the key
elements of interest are Population, Intervention, Comparator and Outcome,
which compose the PICO framework. In the second part of the thesis, I first
apply several question answering models based on attention flows and
transformers to extract phrases describing intervention or method of induction
of disease models from clinical abstracts and preclinical full texts. For
preclinical datasets describing multiple interventions or induction methods in
the full texts, I apply additional unsupervised information retrieval methods to
extract relevant sentences. The question answering models achieve good
performance when the text is at abstract level and contains only one
intervention or induction method, while for truncated documents with multiple
PICO mentions the performance is less satisfactory. Given this limitation, I
then collect preclinical abstracts with finer-grained PICO annotations and
develop named entity recognition models for the extraction of preclinical PICO
elements, including Species, Strain, Induction, Intervention, Comparator and
Outcome. I decompose PICO extraction into two independent tasks: 1) PICO
sentence classification and 2) PICO element detection. For PICO extraction,
BERT-based models pre-trained on biomedical corpora outperform recurrent
networks, and the conditional probabilistic module shows advantages only in
recurrent networks. A self-training strategy that enlarges the training set
with unlabelled abstracts yields better performance for PICO elements that
lack a sufficient number of instances.
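A schematic of the two-stage decomposition, sentence classification followed by entity detection, assuming BIO-style tags over the preclinical PICO types; both stages use trivial rule-based stand-ins for the trained models, and the cue words are purely illustrative.

```python
# Two-stage PICO extraction: (1) select PICO sentences, (2) detect PICO
# entities inside them (rule-based stand-ins for the BERT models).
PICO_CUES = {"rats", "mice", "treated", "outcome", "induced"}

def is_pico_sentence(sentence: str) -> bool:
    # Stage 1 stand-in for a fine-tuned sentence classifier.
    return bool(PICO_CUES & set(sentence.lower().split()))

def tag_entities(sentence: str) -> list[tuple[str, str]]:
    # Stage 2 stand-in for a token-level NER model producing BIO tags
    # over Species, Strain, Induction, Intervention, Comparator, Outcome.
    tags = []
    for token in sentence.split():
        if token.lower() in {"rats", "mice"}:
            tags.append((token, "B-Species"))
        else:
            tags.append((token, "O"))
    return tags

abstract = ["Adult Wistar rats were treated with drug X.",
            "The study was approved by the ethics committee."]
for sent in abstract:
    if is_pico_sentence(sent):
        print(tag_entities(sent))
```

Running entity detection only on sentences that pass the first stage keeps the tagger focused on text that is likely to contain PICO mentions.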
Experimental results demonstrate the potential of natural language processing
to facilitate preclinical risk of bias assessment and PICO extraction.