Active Discriminative Text Representation Learning
We propose a new active learning (AL) method for text classification with
convolutional neural networks (CNNs). In AL, one selects the instances to be
manually labeled with the aim of maximizing model performance with minimal
effort. Neural models capitalize on word embeddings as representations
(features), tuning these to the task at hand. We argue that AL strategies for
multi-layered neural models should focus on selecting instances that most
affect the embedding space (i.e., induce discriminative word representations).
This is in contrast to traditional AL approaches (e.g., entropy-based
uncertainty sampling), which specify higher level objectives. We propose a
simple approach for sentence classification that selects instances containing
words whose embeddings are likely to be updated with the greatest magnitude,
thereby rapidly learning discriminative, task-specific embeddings. We extend
this approach to document classification by jointly considering: (1) the
expected changes to the constituent word representations; and (2) the model's
current overall uncertainty regarding the instance. The relative emphasis
placed on these criteria is governed by a stochastic process that favors
selecting instances likely to improve representations at the outset of
learning, and then shifts toward general uncertainty sampling as AL progresses.
Empirical results show that our method outperforms baseline AL approaches on
both sentence and document classification tasks. We also show that, as
expected, the method quickly learns discriminative word embeddings. To the best
of our knowledge, this is the first work on AL addressing neural models for
text classification.
Comment: Accepted at AAAI 2017.
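Below is a minimal, hedged sketch (in PyTorch, not the authors' released code) of the acquisition idea described in the abstract above: score each unlabeled instance by the expected magnitude of the gradient it would induce on the word-embedding layer, and switch stochastically toward entropy-based uncertainty sampling as active learning progresses. The toy classifier, the exponential-decay schedule, and all hyper-parameters are illustrative assumptions.

```python
# Illustrative sketch of embedding-gradient-based active learning selection.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyTextClassifier(nn.Module):
    """Toy embedding-bag classifier standing in for the CNN in the paper."""

    def __init__(self, vocab_size=1000, embed_dim=50, num_classes=2):
        super().__init__()
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        return self.fc(self.embedding(token_ids.unsqueeze(0)))


def expected_embedding_gradient_norm(model, token_ids):
    """Expectation, under the model's own label distribution, of the norm of
    the gradient of the loss w.r.t. the embedding table."""
    probs = F.softmax(model(token_ids), dim=-1).squeeze(0)
    score = 0.0
    for label, p in enumerate(probs):
        model.zero_grad()
        loss = F.cross_entropy(model(token_ids), torch.tensor([label]))
        loss.backward()
        score += p.item() * model.embedding.weight.grad.norm().item()
    return score


def prediction_entropy(model, token_ids):
    """Standard entropy-based uncertainty score."""
    with torch.no_grad():
        probs = F.softmax(model(token_ids), dim=-1).squeeze(0)
    return -(probs * probs.clamp_min(1e-12).log()).sum().item()


def select_instance(model, unlabeled, al_round, decay=0.2):
    """Pick one instance: early rounds favor expected embedding change,
    later rounds favor plain uncertainty sampling (Bernoulli switch).
    The decay schedule here is an assumption, not the paper's."""
    p_embed = math.exp(-decay * al_round)
    use_embedding_score = torch.bernoulli(torch.tensor(p_embed)).item() == 1.0
    scorer = (expected_embedding_gradient_norm if use_embedding_score
              else prediction_entropy)
    scores = [scorer(model, x) for x in unlabeled]
    return max(range(len(unlabeled)), key=lambda i: scores[i])


if __name__ == "__main__":
    model = TinyTextClassifier()
    pool = [torch.randint(0, 1000, (20,)) for _ in range(8)]  # fake documents
    print("selected index:", select_instance(model, pool, al_round=0))
```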
Evidence Inference 2.0: More Data, Better Models
How do we most effectively treat a disease or condition? Ideally, we could
consult a database of evidence gleaned from clinical trials to answer such
questions. Unfortunately, no such database exists; clinical trial results are
instead disseminated primarily via lengthy natural language articles. Perusing
all such articles would be prohibitively time-consuming for healthcare
practitioners; they instead tend to depend on manually compiled systematic
reviews of medical literature to inform care.
NLP may speed this process up, and eventually facilitate immediate consult of
published evidence. The Evidence Inference dataset was recently released to
facilitate research toward this end. This task entails inferring the
comparative performance of two treatments, with respect to a given outcome,
from a particular article (describing a clinical trial) and identifying
supporting evidence. For instance: Does this article report that chemotherapy
performed better than surgery for five-year survival rates of operable cancers?
In this paper, we collect additional annotations to expand the Evidence
Inference dataset by 25%, provide stronger baseline models, systematically
inspect the errors that these make, and probe dataset quality. We also release
an abstract-only (as opposed to full-text) version of the task for rapid model
prototyping. The updated corpus, documentation, and code for new baselines and
evaluations are available at http://evidence-inference.ebm-nlp.com/.
Comment: Accepted as a workshop paper at BioNLP. Updated results from SciBERT
to Biomed RoBERTa.
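The following is a small illustrative sketch, not the released baselines, of the task structure the abstract describes: an instance pairs a trial report with an (intervention, comparator, outcome) prompt, a comparative label, and supporting evidence spans. The field names, label strings, and the keyword_baseline function are assumptions written for illustration only.

```python
# Illustrative data structures for the Evidence Inference task.
from dataclasses import dataclass
from typing import List

LABELS = ["significantly decreased", "no significant difference",
          "significantly increased"]


@dataclass
class EvidencePrompt:
    intervention: str   # e.g. "chemotherapy"
    comparator: str     # e.g. "surgery"
    outcome: str        # e.g. "five-year survival"


@dataclass
class EvidenceInferenceInstance:
    article_text: str            # full text, or the abstract-only variant
    prompt: EvidencePrompt
    label: str                   # one of LABELS (gold, when training)
    evidence_spans: List[str]    # sentences supporting the label


def keyword_baseline(instance: EvidenceInferenceInstance) -> str:
    """Trivial stand-in for a learned model: look for directional cue words
    in the article text. The baselines reported in the paper are neural."""
    text = instance.article_text.lower()
    if "improved" in text or "higher" in text:
        return "significantly increased"
    if "worse" in text or "lower" in text:
        return "significantly decreased"
    return "no significant difference"


if __name__ == "__main__":
    example = EvidenceInferenceInstance(
        article_text="Five-year survival was higher in the chemotherapy arm.",
        prompt=EvidencePrompt("chemotherapy", "surgery", "five-year survival"),
        label="significantly increased",
        evidence_spans=["Five-year survival was higher in the chemotherapy arm."],
    )
    print(keyword_baseline(example))
```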