Simultaneously Linking Entities and Extracting Relations from Biomedical Text Without Mention-level Supervision
Understanding the meaning of text often involves reasoning about entities and
their relationships. This requires identifying textual mentions of entities,
linking them to a canonical concept, and discerning their relationships. These
tasks are nearly always viewed as separate components within a pipeline, each
requiring a distinct model and training data. While relation extraction can
often be trained with readily available weak or distant supervision, entity
linkers typically require expensive mention-level supervision -- which is not
available in many domains. Instead, we propose a model which is trained to
simultaneously produce entity linking and relation decisions while requiring no
mention-level annotations. This approach avoids cascading errors that arise
from pipelined methods and more accurately predicts entity relationships from
text. We show that our model outperforms a state-of-the-art entity linking and
relation extraction pipeline on two biomedical datasets and can drastically
improve the overall recall of the system.
Comment: Accepted at AAAI 2020
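A minimal sketch of the core idea follows; this is not the paper's actual model. Candidate entity links are treated as latent variables and marginalised out, so training needs only a relation-level label from distant supervision. The concept IDs, the three-way relation inventory, and the random arrays standing in for learned neural scorers are all hypothetical.

```python
import numpy as np

def log_sum_exp(x):
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

# Hypothetical candidate KB concepts for two textual mentions.
cands_a = ["MESH:0001", "MESH:0002"]    # linking candidates for mention a
cands_b = ["MESH:0003", "MESH:0004"]    # linking candidates for mention b
n_rel = 3                               # e.g. {treats, causes, no-relation}

rng = np.random.default_rng(0)
# Random stand-ins for learned scorers: a link score per candidate, and a
# relation score for every (candidate_a, candidate_b, relation) triple.
link_a = rng.normal(size=len(cands_a))
link_b = rng.normal(size=len(cands_b))
rel = rng.normal(size=(len(cands_a), len(cands_b), n_rel))

# Joint score of (link a, link b, relation); summing over the latent
# linking decisions yields a relation marginal that needs no mention labels.
joint = link_a[:, None, None] + link_b[None, :, None] + rel
log_z = log_sum_exp(joint)
log_p_rel = np.array([log_sum_exp(joint[:, :, r]) for r in range(n_rel)]) - log_z

gold = 0                    # relation label supplied by distant supervision
loss = -log_p_rel[gold]     # negative marginal log-likelihood to minimise
print(np.exp(log_p_rel), loss)
```

Because the marginal relation likelihood is differentiable through the linking scores, gradient training can improve the linker as a byproduct, without any mention-level labels.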
Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers
Most approaches to extracting multiple relations from a paragraph require
multiple passes over the paragraph. In practice, multiple passes are
computationally expensive, which makes it difficult to scale to longer
paragraphs and larger text corpora. In this work, we focus on the task of
multiple relation extraction by encoding the paragraph only once (one-pass). We
build our solution on the pre-trained self-attentive (Transformer) models,
where we first add a structured prediction layer to handle extraction between
multiple entity pairs, then enhance the paragraph embedding with an entity-aware
attention technique to capture the relational information associated with each
entity. We show that our approach is not only scalable but also achieves
state-of-the-art performance on the standard ACE 2005 benchmark.
Comment: 7 pages
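As a rough sketch of the one-pass setup (the paper's structured prediction layer and entity-aware attention are simplified here to mean-pooled spans and a bilinear classifier, both assumptions of this sketch), the paragraph is encoded once and every entity pair is classified from the shared token states:

```python
import torch
import torch.nn as nn

class OnePassRelationHead(nn.Module):
    """Score relations for many entity pairs from a single paragraph encoding."""

    def __init__(self, hidden=768, n_relations=7):
        super().__init__()
        # Bilinear stand-in for the paper's structured prediction layer.
        self.classifier = nn.Bilinear(hidden, hidden, n_relations)

    def forward(self, token_states, entity_spans, pairs):
        # token_states: (seq_len, hidden) -- produced by ONE encoder pass.
        ents = torch.stack([token_states[s:e].mean(dim=0)
                            for s, e in entity_spans])
        head = ents[[i for i, _ in pairs]]
        tail = ents[[j for _, j in pairs]]
        return self.classifier(head, tail)   # (n_pairs, n_relations) logits

# Random tensor standing in for a pre-trained Transformer's token states.
token_states = torch.randn(40, 768)
spans = [(2, 4), (10, 12), (25, 27)]   # token spans of three entities
pairs = [(0, 1), (0, 2), (1, 2)]       # every entity pair, still one pass
print(OnePassRelationHead()(token_states, spans, pairs).shape)  # (3, 7)
```

The encoder runs once no matter how many pairs are scored, which is where the scalability advantage over per-pair passes comes from.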
Effective Feature Representation for Clinical Text Concept Extraction
Crucial information about the practice of healthcare is recorded only in
free-form text, which creates an enormous opportunity for high-impact NLP.
However, annotated healthcare datasets tend to be small and expensive to
obtain, which raises the question of how to make maximally efficient use of
the available data. To this end, we develop an LSTM-CRF model for combining
unsupervised word representations and hand-built feature representations
derived from publicly available healthcare ontologies. We show that this
combined model yields superior performance on five datasets of diverse kinds of
healthcare text (clinical, social, scientific, commercial). Each involves the
labeling of complex, multi-word spans that pick out different healthcare
concepts. We also introduce a new labeled dataset for identifying the treatment
relations between drugs and diseases.
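A minimal sketch of the input-representation idea, assuming an invented two-feature gazetteer: pretrained word embeddings are concatenated with hand-built ontology features per token before a BiLSTM tagger, and the CRF layer that would sit on top of the emissions is omitted for brevity.

```python
import torch
import torch.nn as nn

# Invented gazetteer: token -> binary features such as "appears in a drug
# lexicon" / "appears in a disease lexicon" built from a public ontology.
ONTOLOGY = {"metformin": [1.0, 0.0], "diabetes": [0.0, 1.0]}

def ontology_feats(token, dim=2):
    return torch.tensor(ONTOLOGY.get(token.lower(), [0.0] * dim))

class LstmTagger(nn.Module):
    """Word embeddings + hand-built ontology features into a BiLSTM; the
    emissions would normally feed a CRF layer (omitted here)."""

    def __init__(self, vocab, emb_dim=50, feat_dim=2, hidden=64, n_tags=5):
        super().__init__()
        self.index = {w: i for i, w in enumerate(vocab)}
        self.emb = nn.Embedding(len(vocab), emb_dim)  # unsupervised vectors
        self.lstm = nn.LSTM(emb_dim + feat_dim, hidden, bidirectional=True)
        self.emit = nn.Linear(2 * hidden, n_tags)

    def forward(self, tokens):
        ids = torch.tensor([self.index[t] for t in tokens])
        feats = torch.stack([ontology_feats(t) for t in tokens])
        x = torch.cat([self.emb(ids), feats], dim=-1).unsqueeze(1)
        out, _ = self.lstm(x)                    # (seq_len, 1, 2 * hidden)
        return self.emit(out.squeeze(1))         # per-token tag emissions

tokens = ["metformin", "treats", "diabetes"]
print(LstmTagger(vocab=tokens)(tokens).shape)    # torch.Size([3, 5])
```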