Clinical Concept Extraction with Contextual Word Embedding
Automatic extraction of clinical concepts is an essential step for turning
the unstructured data within a clinical note into structured and actionable
information. In this work, we propose a clinical concept extraction model for
automatic annotation of clinical problems, treatments, and tests in clinical
notes utilizing domain-specific contextual word embedding. A contextual word
embedding model is first trained on a corpus with a mixture of clinical reports
and relevant Wikipedia pages in the clinical domain. Next, a bidirectional
LSTM-CRF model is trained for clinical concept extraction using the contextual
word embedding model. We tested our proposed model on the I2B2 2010 challenge
dataset. Our proposed model achieved the best performance among the
reported baseline models, outperforming the state-of-the-art models by
3.4% in terms of F1-score.
Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018
arXiv:1811.0721
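A minimal sketch of a BiLSTM-CRF tagger of the kind described above, in
PyTorch with the pytorch-crf package; the plain embedding lookup stands in
for the paper's domain-specific contextual embeddings, and all vocabulary,
tag, and layer sizes are illustrative assumptions.

```python
# Minimal BiLSTM-CRF tagger sketch (PyTorch + pytorch-crf).
# The plain nn.Embedding below stands in for the paper's domain-specific
# contextual word embeddings; all sizes here are illustrative.
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf


class BiLSTMCRFTagger(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=128, num_tags=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                            bidirectional=True)
        self.emit = nn.Linear(hidden_dim, num_tags)  # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)   # learns tag transitions

    def loss(self, tokens, tags, mask):
        emissions = self.emit(self.lstm(self.embed(tokens))[0])
        return -self.crf(emissions, tags, mask=mask)  # negative log-likelihood

    def decode(self, tokens, mask):
        emissions = self.emit(self.lstm(self.embed(tokens))[0])
        return self.crf.decode(emissions, mask=mask)  # best tag sequences


# Toy usage: a batch of two padded sentences with BIO-style tag ids.
model = BiLSTMCRFTagger()
tokens = torch.randint(0, 5000, (2, 12))
tags = torch.randint(0, 7, (2, 12))
mask = torch.ones(2, 12, dtype=torch.bool)
print(model.loss(tokens, tags, mask).item())
print(model.decode(tokens, mask)[0])
```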
Transfer Learning for Scientific Data Chain Extraction in Small Chemical Corpus with BERT-CRF Model
Computational chemistry has developed rapidly in recent years owing to
breakthroughs in AI. Thanks to progress in natural language processing,
researchers can extract more fine-grained knowledge from publications to
stimulate development in computational chemistry. Because existing work and
corpora for chemical entity extraction have been restricted to the
biomedicine and life science fields rather than chemistry itself, we build
a new corpus in the chemical bond field annotated for 7 entity types:
compound, solvent, method, bond, reaction, pKa, and pKa value. This paper
presents a novel BERT-CRF model that builds scientific chemical data chains
by extracting these 7 chemical entities and their relations from
publications, and we propose a joint model to extract the entities and
relations simultaneously. Experimental results on our Chemical Special
Corpus demonstrate that we achieve state-of-the-art and competitive NER
performance.
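A sketch of what a BERT-CRF tagger along these lines can look like, using
HuggingFace Transformers plus pytorch-crf; the checkpoint name and the BIO
scheme over the 7 entity types are assumptions, not the authors' exact
configuration.

```python
# Sketch of a BERT-CRF entity tagger (Transformers + pytorch-crf).
# The checkpoint name and BIO tag layout are illustrative assumptions.
import torch
import torch.nn as nn
from torchcrf import CRF
from transformers import AutoModel, AutoTokenizer

NUM_TAGS = 2 * 7 + 1  # B-/I- for each of the 7 entity types, plus O


class BertCRF(nn.Module):
    def __init__(self, encoder_name="bert-base-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.emit = nn.Linear(self.encoder.config.hidden_size, NUM_TAGS)
        self.crf = CRF(NUM_TAGS, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.encoder(input_ids,
                              attention_mask=attention_mask).last_hidden_state
        emissions = self.emit(hidden)
        mask = attention_mask.bool()
        if tags is not None:                 # training: CRF log-likelihood loss
            return -self.crf(emissions, tags, mask=mask)
        return self.crf.decode(emissions, mask=mask)  # inference: best paths


tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = BertCRF()
batch = tokenizer("The pKa of acetic acid in water is 4.76.",
                  return_tensors="pt")
print(model(batch["input_ids"], batch["attention_mask"]))  # decoded tag ids
```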
Advancing NLP with Cognitive Language Processing Signals
When we read, our brains process language and generate cognitive processing
data such as gaze patterns and brain activity, and these signals can be
recorded during reading. Cognitive language processing data such as
eye-tracking features have already yielded improvements on single NLP
tasks. We analyze whether such human features can deliver consistent
improvements across tasks and data sources. We present an extensive
investigation of the benefits and limitations of using cognitive processing
data for NLP. Specifically, we use gaze and EEG features to augment models
of named entity recognition, relation classification, and sentiment
analysis. These methods significantly outperform the baselines and
illustrate both the potential and the current limitations of employing
human language processing data for NLP.
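A minimal sketch of the feature-augmentation idea: per-token cognitive
feature vectors are concatenated onto word embeddings before a recurrent
task encoder. The specific gaze/EEG features and all dimensions are
illustrative assumptions.

```python
# Sketch of augmenting token representations with cognitive signals:
# per-token gaze/EEG features concatenated onto word embeddings.
import torch
import torch.nn as nn


class CognitiveAugmentedEncoder(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, cog_dim=5, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # e.g. cog_dim = 5: first-fixation duration, total fixation time,
        # number of fixations, regression count, an EEG band-power feature
        self.lstm = nn.LSTM(embed_dim + cog_dim, hidden_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, tokens, cog_feats):
        x = torch.cat([self.embed(tokens), cog_feats], dim=-1)
        return self.lstm(x)[0]  # contextual states for NER/RE/sentiment heads


enc = CognitiveAugmentedEncoder()
tokens = torch.randint(0, 5000, (2, 10))
cog = torch.rand(2, 10, 5)      # normalized per-token cognitive signals
print(enc(tokens, cog).shape)   # torch.Size([2, 10, 256])
```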
Second-Order Word Embeddings from Nearest Neighbor Topological Features
We introduce second-order vector representations of words, induced from
nearest neighborhood topological features in pre-trained contextual word
embeddings. We then analyze the effects of using second-order embeddings as
input features in two deep natural language processing models, for named entity
recognition and recognizing textual entailment, as well as a linear model for
paraphrase recognition. Surprisingly, we find that nearest neighbor information
alone is sufficient to capture most of the performance benefits derived from
using pre-trained word embeddings. Furthermore, second-order embeddings are
able to handle highly heterogeneous data better than first-order
representations, though at the cost of some specificity. Additionally,
augmenting contextual embeddings with second-order information further improves
model performance in some cases. Due to variance in the random initializations
of word embeddings, utilizing nearest neighbor features from multiple
first-order embedding samples can also contribute to downstream performance
gains. Finally, we identify intriguing characteristics of second-order
embedding spaces for further research, including much higher density and
different semantic interpretations of cosine similarity.
Comment: Submitted to NIPS 2017. (8 pages + 4 reference
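A small sketch of how second-order vectors can be induced from
nearest-neighbor structure, assuming each word is re-represented by its
cosine similarities to its k nearest neighbors in a first-order space; the
random stand-in embedding matrix and k are assumptions, and the paper's
exact construction may differ.

```python
# Second-order embeddings from k-NN topology: each word becomes a
# |V|-dimensional vector, nonzero only at its neighbors' positions,
# weighted by cosine similarity. Sizes here are illustrative.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
first_order = rng.normal(size=(1000, 100))   # stand-in pre-trained embeddings
first_order /= np.linalg.norm(first_order, axis=1, keepdims=True)

k = 10
nn_index = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(first_order)
dist, idx = nn_index.kneighbors(first_order)  # idx[:, 0] is the word itself

vocab = first_order.shape[0]
second_order = np.zeros((vocab, vocab))
rows = np.repeat(np.arange(vocab), k)
cols = idx[:, 1:].ravel()                              # drop the self-neighbor
second_order[rows, cols] = 1.0 - dist[:, 1:].ravel()   # cosine similarities

print(second_order.shape, (second_order > 0).sum(axis=1).mean())  # (1000, 1000) 10.0
```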
Fine Grained Classification of Personal Data Entities
Entity Type Classification can be defined as the task of assigning category
labels to entity mentions in documents. While neural networks have recently
improved the classification of general entity mentions, pattern matching
and other systems continue to be used for classifying personal data
entities (e.g., classifying an organization as a media company or a
government institution for GDPR and HIPAA compliance). We propose a neural
model that expands the class of personal data entities that can be
classified at a fine-grained level, using the output of existing pattern
matching systems as additional contextual features. We introduce new
resources: a personal data entity hierarchy with 134 types, and two
datasets drawn from the Wikipedia pages of elected representatives and from
Enron emails. We hope these resources will aid research in the area of
personal data discovery; to that effect, we provide baseline results on
these datasets and compare our method with state-of-the-art models on the
OntoNotes dataset.
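A sketch of the hybrid idea, assuming the pattern matcher's coarse label is
one-hot encoded and concatenated with a neural mention encoding before
fine-grained classification; the coarse label set, helper names, and sizes
are hypothetical.

```python
# Combining pattern-matcher output with a neural mention encoding.
# COARSE is a hypothetical label set an existing matcher might emit.
import torch
import torch.nn as nn

COARSE = ["PERSON", "ORG", "LOCATION", "CONTACT", "ID", "NONE"]
NUM_FINE = 134  # size of the paper's personal-data type hierarchy


class FineGrainedClassifier(nn.Module):
    def __init__(self, mention_dim=256):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(mention_dim + len(COARSE), 256),
            nn.ReLU(),
            nn.Linear(256, NUM_FINE),
        )

    def forward(self, mention_vec, coarse_id):
        onehot = nn.functional.one_hot(coarse_id, len(COARSE)).float()
        return self.ff(torch.cat([mention_vec, onehot], dim=-1))  # fine logits


clf = FineGrainedClassifier()
logits = clf(torch.rand(4, 256), torch.tensor([0, 1, 1, 5]))
print(logits.shape)  # torch.Size([4, 134])
```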
Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling
Over the last couple of years, Recurrent Neural Networks (RNNs) have
reached state-of-the-art performance on most sequence modelling problems.
In particular, the "sequence to sequence" model and the neural CRF have
proved very effective in this domain. In this article, we propose a new RNN
architecture for sequence labelling that leverages gated recurrent layers
to take arbitrarily long contexts into account and uses two decoders
operating forward and backward. We compare several variants of the proposed
solution against the state of the art. Most of our results improve on the
state of the art or come very close to it, and thanks to the use of recent
technologies, our architecture can scale to corpora larger than those used
in this work.
Comment: Slightly improved version of the paper accepted to the CICLing
2019 conference.
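A minimal sketch of the two-decoder idea: one GRU decoder labels the
encoded sequence left-to-right and another right-to-left, with their
per-token scores summed. It omits the label feedback a full Seq2Biseq
decoder would use, and all sizes are illustrative assumptions.

```python
# Two output-wise decoders over a shared encoder, one per direction.
import torch
import torch.nn as nn


class BiDecoderLabeller(nn.Module):
    def __init__(self, vocab_size=5000, dim=128, num_labels=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.fwd_dec = nn.GRU(dim, dim, batch_first=True)   # left-to-right
        self.bwd_dec = nn.GRU(dim, dim, batch_first=True)   # right-to-left
        self.out = nn.Linear(dim, num_labels)

    def forward(self, tokens):
        enc, _ = self.encoder(self.embed(tokens))
        fwd, _ = self.fwd_dec(enc)
        bwd, _ = self.bwd_dec(torch.flip(enc, dims=[1]))
        bwd = torch.flip(bwd, dims=[1])        # realign to original order
        return self.out(fwd) + self.out(bwd)   # combined per-token scores


model = BiDecoderLabeller()
print(model(torch.randint(0, 5000, (2, 15))).shape)  # torch.Size([2, 15, 10])
```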
Neural Metric Learning for Fast End-to-End Relation Extraction
Relation extraction (RE) is an indispensable information extraction task in
several disciplines. RE models typically assume that named entity recognition
(NER) is already performed in a previous step by another independent model.
Several recent efforts, under the theme of end-to-end RE, seek to exploit
inter-task correlations by modeling both NER and RE tasks jointly. Earlier work
in this area commonly reduces the task to a table-filling problem wherein an
additional expensive decoding step involving beam search is applied to obtain
globally consistent cell labels. In efforts that do not employ table-filling,
global optimization in the form of CRFs with Viterbi decoding for the NER
component is still necessary for competitive performance. We introduce a novel
neural architecture utilizing the table structure, based on repeated
applications of 2D convolutions for pooling local dependency and metric-based
features, that improves on the state-of-the-art without the need for global
optimization. We validate our model on the ADE and CoNLL04 datasets for
end-to-end RE and demonstrate gains (in F-score) over prior best results,
with training and testing times that are seven to ten times faster; the
latter is highly advantageous for time-sensitive end-user applications.
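A sketch of the table-structured scoring idea: each token pair (i, j) gets
a cell vector built from the two token states, and stacked 2D convolutions
pool local dependencies over the table in place of global decoding. The
dimensions and layer counts are assumptions, not the paper's exact design.

```python
# 2D convolutions over a pairwise token table for relation scoring.
import torch
import torch.nn as nn


class PairTableScorer(nn.Module):
    def __init__(self, token_dim=128, channels=64, num_rel=5):
        super().__init__()
        self.proj = nn.Linear(2 * token_dim, channels)
        self.convs = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.rel = nn.Linear(channels, num_rel)

    def forward(self, h):                    # h: (batch, T, token_dim)
        T = h.size(1)
        rows = h.unsqueeze(2).expand(-1, -1, T, -1)  # token i at cell (i, j)
        cols = h.unsqueeze(1).expand(-1, T, -1, -1)  # token j at cell (i, j)
        table = self.proj(torch.cat([rows, cols], dim=-1))  # (B, T, T, C)
        table = self.convs(table.permute(0, 3, 1, 2))       # conv over table
        return self.rel(table.permute(0, 2, 3, 1))          # (B, T, T, num_rel)


scorer = PairTableScorer()
print(scorer(torch.rand(2, 20, 128)).shape)  # torch.Size([2, 20, 20, 5])
```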
An Attentive Sequence Model for Adverse Drug Event Extraction from Biomedical Text
Adverse reactions caused by drugs are a potentially dangerous problem that
may lead to mortality and morbidity in patients. Adverse Drug Event (ADE)
extraction is a significant problem in biomedical research. We model ADE
extraction as a question-answering problem and take inspiration from the
Machine Reading Comprehension (MRC) literature to design our model. Our
objective in designing such a model is to exploit the local linguistic
context in clinical text and enable intra-sequence interaction, in order to
jointly learn to classify drug and disease entities and to extract the
adverse reactions caused by a given drug. Our model makes use of a
self-attention mechanism to facilitate intra-sequence interaction in a text
sequence, which enables us to visualize and understand how the network uses
the local and wider context for classification.
Comment: 7 pages, 5 figures, 4 tables
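A minimal sketch of the single-sequence scaled dot-product self-attention
such a model relies on; the returned weight matrix is the kind of quantity
one would visualize. All sizes are illustrative.

```python
# Scaled dot-product self-attention over one sequence.
import math
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = math.sqrt(dim)

    def forward(self, x):                                  # x: (B, T, dim)
        scores = self.q(x) @ self.k(x).transpose(-2, -1) / self.scale
        weights = scores.softmax(dim=-1)  # (B, T, T): token i attending to j
        return weights @ self.v(x), weights


attn = SelfAttention()
out, w = attn(torch.rand(1, 9, 128))
print(out.shape, w.shape)  # torch.Size([1, 9, 128]) torch.Size([1, 9, 9])
```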
Grounded Recurrent Neural Networks
In this work, we present the Grounded Recurrent Neural Network (GRNN), a
recurrent neural network architecture for multi-label prediction which
explicitly ties labels to specific dimensions of the recurrent hidden state (we
call this process "grounding"). The approach is particularly well-suited for
extracting large numbers of concepts from text. We apply the new model to
an important problem in healthcare: understanding which medical concepts
are discussed in clinical text. Using a publicly available dataset derived
from Intensive Care Units, we learn to label a patient's diagnoses and
procedures from their discharge summary. Our evaluation shows a clear
advantage to using our proposed architecture over a variety of strong
baselines.
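A sketch of the grounding idea, assuming the first num_labels dimensions of
the recurrent hidden state are each tied to one label and max-pooled over
time into multi-label probabilities; this is a simplified reading, not the
paper's exact architecture, and all sizes are assumptions.

```python
# Grounding labels to specific hidden-state dimensions of an RNN.
import torch
import torch.nn as nn


class GroundedRNN(nn.Module):
    def __init__(self, vocab_size=5000, num_labels=50, extra_dim=78):
        super().__init__()
        self.num_labels = num_labels
        hidden = num_labels + extra_dim       # label-tied dims + free dims
        self.embed = nn.Embedding(vocab_size, 100)
        self.rnn = nn.GRU(100, hidden, batch_first=True)

    def forward(self, tokens):
        states, _ = self.rnn(self.embed(tokens))       # (B, T, hidden)
        label_track = states[:, :, :self.num_labels]   # dims tied to labels
        return torch.sigmoid(label_track.max(dim=1).values)  # per-label prob


model = GroundedRNN()
print(model(torch.randint(0, 5000, (2, 30))).shape)  # torch.Size([2, 50])
```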
SciBERT: A Pretrained Language Model for Scientific Text
Obtaining large-scale annotated data for NLP tasks in the scientific domain
is challenging and expensive. We release SciBERT, a pretrained language model
based on BERT (Devlin et al., 2018) to address the lack of high-quality,
large-scale labeled scientific data. SciBERT leverages unsupervised pretraining
on a large multi-domain corpus of scientific publications to improve
performance on downstream scientific NLP tasks. We evaluate on a suite of tasks
including sequence tagging, sentence classification and dependency parsing,
with datasets from a variety of scientific domains. We demonstrate
statistically significant improvements over BERT and achieve new
state-of-the-art results on several of these tasks. The code and pretrained
models are available at https://github.com/allenai/scibert/.
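For reference, loading the released checkpoint through HuggingFace
Transformers; the hub name below matches the allenai release, and the input
sentence is just an example.

```python
# Encoding scientific text with the released SciBERT checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

batch = tokenizer("The catalyst lowers the activation energy.",
                  return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # contextual token vectors
print(hidden.shape)
```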