Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks
Recent papers have shown that neural networks obtain state-of-the-art
performance on several different sequence tagging tasks. One appealing property
of such systems is their generality, as excellent performance can be achieved
with a unified architecture and without task-specific feature engineering.
However, it is unclear if such systems can be used for tasks without large
amounts of training data. In this paper we explore the problem of transfer
learning for neural sequence taggers, where a source task with plentiful
annotations (e.g., POS tagging on Penn Treebank) is used to improve performance
on a target task with fewer available annotations (e.g., POS tagging for
microblogs). We examine the effects of transfer learning for deep hierarchical
recurrent networks across domains, applications, and languages, and show that
significant improvements can often be obtained, yielding gains over the
current state-of-the-art on several well-studied tasks.
Comment: Accepted as a conference paper at ICLR 2017. This is an extended
version of the original paper (https://arxiv.org/abs/1603.06270). The
original paper proposes a new architecture, while this version focuses on
transfer learning for a general model class.
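The core transfer recipe described above, copying the shared lower layers of a source-task tagger into a target-task model before fine-tuning, can be sketched in a few lines. This is an illustrative toy (plain dicts of parameters, invented layer names), not the paper's actual hierarchical RNN:

```python
# Minimal sketch of cross-domain parameter transfer for sequence taggers.
# The layer names and the zero-initialized "weights" are illustrative
# assumptions, not the paper's actual architecture.

def init_tagger(layers):
    """Create a tagger as a dict of named parameter blocks (zeros)."""
    return {name: [0.0] * size for name, size in layers.items()}

def transfer(source, target, shared):
    """Copy the shared (lower) layers from a trained source tagger
    into a freshly initialized target tagger."""
    for name in shared:
        target[name] = list(source[name])  # copy, do not alias
    return target

# Source task (e.g. POS tagging on Penn Treebank): pretend training
# produced these weights.
source = init_tagger({"char_rnn": 4, "word_rnn": 4, "output": 3})
source["word_rnn"] = [0.5, -0.2, 0.1, 0.9]

# Target task (e.g. POS tagging for microblogs) keeps its own output
# layer but starts from the source's shared representation layers.
target = init_tagger({"char_rnn": 4, "word_rnn": 4, "output": 5})
target = transfer(source, target, shared=["char_rnn", "word_rnn"])

print(target["word_rnn"])   # inherited from the source task
print(target["output"])     # task-specific, trained from scratch
```

Fine-tuning on the target annotations would then update all layers, with the shared layers starting from a much better initialization than random.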
SwellShark: A Generative Model for Biomedical Named Entity Recognition without Labeled Data
We present SwellShark, a framework for building biomedical named entity
recognition (NER) systems quickly and without hand-labeled data. Our approach
views biomedical resources like lexicons as function primitives for
autogenerating weak supervision. We then use a generative model to unify and
denoise this supervision and construct large-scale, probabilistically labeled
datasets for training high-accuracy NER taggers. In three biomedical NER tasks,
SwellShark achieves competitive scores with state-of-the-art supervised
benchmarks using no hand-labeled training data. In a drug name extraction task
using patient medical records, one domain expert using SwellShark achieved
within 5.1% of a crowdsourced annotation approach -- which originally utilized
20 teams over the course of several weeks -- in 24 hours.
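The idea of treating lexicons as "function primitives" for weak supervision can be sketched as follows. The labeling functions and the simple vote average below are invented stand-ins; SwellShark itself fits a generative model to unify and denoise the supervision sources:

```python
# Sketch of lexicon-driven weak supervision in the spirit of SwellShark:
# several noisy labeling functions vote on each token, and the votes are
# aggregated into a probabilistic label. The vote average here is a
# simple stand-in for the paper's generative denoising model.

def lf_drug_lexicon(token):
    # a tiny invented lexicon standing in for a biomedical resource
    return 1 if token.lower() in {"aspirin", "ibuprofen"} else 0

def lf_suffix(token):
    # crude morphological cue: many drug names end in "-in"
    return 1 if token.lower().endswith("in") else 0

def lf_capitalized(token):
    return 1 if token[:1].isupper() else 0

LFS = [lf_drug_lexicon, lf_suffix, lf_capitalized]

def prob_label(token):
    """Probability that `token` is an entity: fraction of LFs that fire."""
    votes = [lf(token) for lf in LFS]
    return sum(votes) / len(votes)

tokens = ["Aspirin", "relieves", "pain"]
print([prob_label(t) for t in tokens])
```

The probabilistically labeled tokens can then serve as training data for a conventional discriminative NER tagger, with no hand labeling involved.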
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Multi-task learning (MTL) has achieved remarkable success in natural language
processing applications. In this work, we study a multi-task learning model
with multiple decoders on varieties of biomedical and clinical natural language
processing tasks such as text similarity, relation extraction, named entity
recognition, and text inference. Our empirical results demonstrate that the MTL
fine-tuned models outperform state-of-the-art transformer models (e.g., BERT
and its variants) by 2.0% and 1.3% in biomedical and clinical domains,
respectively. Pairwise MTL further reveals in more detail which tasks can
help or hurt others. This is particularly useful when researchers need to
choose a suitable model for a new problem. The code and models are publicly
available at https://github.com/ncbi-nlp/bluebert
Comment: Accepted by BioNLP 202
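The multi-decoder setup described above follows the standard hard-parameter-sharing pattern: one shared encoder feeds a lightweight decoder per task. A toy sketch with plain functions (the bag-of-words "encoder" and the two decoders are invented placeholders for a BERT encoder and real task heads):

```python
# Sketch of hard-parameter-sharing MTL: a single shared encoder with one
# decoder per task. The encoder and decoders below are toy stand-ins.

def shared_encoder(text):
    # stand-in for a BERT encoder: a bag-of-words feature dict
    feats = {}
    for tok in text.lower().split():
        feats[tok] = feats.get(tok, 0) + 1
    return feats

DECODERS = {
    # toy NER head: flag tokens found in a tiny invented lexicon
    "ner": lambda feats: [t for t in feats if t in {"aspirin"}],
    # toy similarity head: just the total token count
    "similarity": lambda feats: sum(feats.values()),
}

def mtl_forward(text, task):
    """Run the shared encoder, then the requested task decoder."""
    return DECODERS[task](shared_encoder(text))

print(mtl_forward("aspirin reduces pain", "ner"))
print(mtl_forward("aspirin reduces pain", "similarity"))
```

In pairwise MTL, training would jointly update the shared encoder on two tasks at a time, which is what exposes which task pairs help or hurt each other.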
Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition
Biomedical named entity recognition (NER) is a fundamental task in text
mining of medical documents and has many applications. Deep learning based
approaches to this task have been gaining increasing attention in recent years
as their parameters can be learned end-to-end without the need for
hand-engineered features. However, these approaches rely on high-quality
labeled data, which is expensive to obtain. To address this issue, we
investigate how to use unlabeled text data to improve the performance of NER
models. Specifically, we train a bidirectional language model (BiLM) on
unlabeled data and transfer its weights to "pretrain" an NER model with the
same architecture as the BiLM, which results in a better parameter
initialization of the NER model. We evaluate our approach on four benchmark
datasets for biomedical NER and show that it leads to a substantial improvement
in the F1 scores compared with the state-of-the-art approaches. We also show
that BiLM weight transfer leads to faster model training, and the pretrained
model requires fewer training examples to achieve a particular F1 score.
Comment: Machine Learning for Healthcare (MLHC) 2018, 12 pages, updated
authors' affiliation
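The intuition behind using a BiLM as an initializer is that forward and backward language-model statistics learned from unlabeled text already encode useful context for NER. A toy count-based illustration (bigram counts stand in for the recurrent BiLM; the real method transfers RNN weights, not counts):

```python
# Toy illustration of the BiLM-as-initializer idea: forward and backward
# language-model statistics from unlabeled text become the starting
# representation for NER. Bigram counting stands in for the BiLM here.

from collections import Counter

def train_bigram_lm(corpus, reverse=False):
    """Count bigrams over the corpus, optionally right-to-left."""
    counts = Counter()
    for sent in corpus:
        toks = sent.split()[::-1] if reverse else sent.split()
        counts.update(zip(toks, toks[1:]))
    return counts

unlabeled = ["the patient received aspirin", "the patient improved"]
fwd = train_bigram_lm(unlabeled)
bwd = train_bigram_lm(unlabeled, reverse=True)

def bilm_feature(prev_tok, tok):
    """A token's 'pretrained feature': how expected it is from each side."""
    return (fwd[(prev_tok, tok)], bwd[(tok, prev_tok)])

print(bilm_feature("the", "patient"))
```

An NER model initialized with such context statistics starts closer to a good solution, which is consistent with the faster training and smaller data requirements reported above.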
A Study of Recent Contributions on Information Extraction
This paper reports on modern approaches in Information Extraction (IE) and
its two main sub-tasks of Named Entity Recognition (NER) and Relation
Extraction (RE). Basic concepts and the most recent approaches in this area are
reviewed, which mainly include Machine Learning (ML) based approaches and the
more recent trend toward Deep Learning (DL) based methods.
Learning Named Entity Tagger using Domain-Specific Dictionary
Recent advances in deep neural models allow us to build reliable named entity
recognition (NER) systems without handcrafting features. However, such methods
require large amounts of manually-labeled training data. There have been
efforts on replacing human annotations with distant supervision (in conjunction
with external dictionaries), but the generated noisy labels pose significant
challenges on learning effective neural models. Here we propose two neural
models to suit noisy distant supervision from the dictionary. First, under the
traditional sequence labeling framework, we propose a revised fuzzy CRF layer
to handle tokens with multiple possible labels. After identifying the nature of
noisy labels in distant supervision, we go beyond the traditional framework and
propose a novel, more effective neural model AutoNER with a new Tie or Break
scheme. In addition, we discuss how to refine distant supervision for better
NER performance. Extensive experiments on three benchmark datasets demonstrate
that AutoNER achieves the best performance when only using dictionaries with no
additional human effort, and delivers competitive results with state-of-the-art
supervised benchmarks.
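The "Tie or Break" scheme can be illustrated by deriving tags directly from dictionary matches: adjacent tokens inside the same matched entry are tied, and any boundary between an entity and surrounding text is a break. The longest-match scanner and two-entry dictionary below are deliberately simplified assumptions:

```python
# Sketch of deriving "Tie or Break" supervision from dictionary matches,
# in the spirit of AutoNER. The dictionary and the greedy longest-match
# scan are toy simplifications.

DICTIONARY = {("heart", "failure"), ("aspirin",)}

def tie_or_break(tokens):
    """Return one Tie/Break tag per gap between adjacent tokens."""
    span_id = [None] * len(tokens)  # which match covers each token
    i, next_id = 0, 0
    while i < len(tokens):
        for length in (2, 1):  # try the longest match first
            cand = tuple(t.lower() for t in tokens[i:i + length])
            if cand in DICTIONARY:
                for j in range(i, min(i + length, len(tokens))):
                    span_id[j] = next_id
                next_id += 1
                i += length
                break
        else:
            i += 1  # no match starting here
    return ["Tie" if span_id[k] is not None and span_id[k] == span_id[k + 1]
            else "Break" for k in range(len(tokens) - 1)]

print(tie_or_break(["chronic", "heart", "failure", "treated"]))
```

Because the model predicts only whether adjacent tokens belong together, tokens the dictionary knows nothing about can simply be left unlabeled instead of being forced into a noisy O tag, which is the key to tolerating distant supervision.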
Unlocking the Power of Deep PICO Extraction: Step-wise Medical NER Identification
The PICO framework (Population, Intervention, Comparison, and Outcome) is
usually used to formulate evidence in the medical domain. The major task of
PICO extraction is to extract sentences from medical literature and classify
them into each class. However, in most circumstances, an extracted sentence
contains more than one piece of evidence even after it has been categorized
into a certain class. In order to address this problem, we propose a step-wise
Named Entity Recognition (DNER) extraction and PICO identification method. With
our method, sentences in paper title and abstract are first classified into
different classes of PICO, and medical entities are then identified and
classified into P and O. Different kinds of deep learning frameworks are used
and experimental results show that our method achieves high performance and
fine-grained extraction results compared with conventional PICO extraction
approaches.
Comment: 9 pages, 3 figures
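The step-wise pipeline can be sketched as two chained stages: route each sentence to a PICO class, then run entity extraction only within that class. The keyword rules below are invented placeholders for the deep models used in the paper:

```python
# Sketch of the step-wise idea: sentence-level PICO classification first,
# then class-conditioned entity extraction. The keyword rules are toy
# stand-ins for the paper's deep learning models.

def classify_pico(sentence):
    """Stage 1: assign a sentence to a PICO class."""
    s = sentence.lower()
    if "patients" in s or "participants" in s:
        return "P"
    if "outcome" in s or "mortality" in s:
        return "O"
    return "Other"

def extract_entities(sentence, pico_class):
    """Stage 2: extract entities, tagged with their PICO class."""
    cues = {"P": ["diabetes"], "O": ["mortality"]}  # toy entity lexicons
    return [(w, pico_class) for w in sentence.lower().split()
            if w in cues.get(pico_class, [])]

sent = "patients with diabetes were enrolled"
cls = classify_pico(sent)
print(cls, extract_entities(sent, cls))
```

Conditioning the second stage on the first is what yields the fine-grained P/O entity labels rather than a single coarse class per sentence.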
Neural Metric Learning for Fast End-to-End Relation Extraction
Relation extraction (RE) is an indispensable information extraction task in
several disciplines. RE models typically assume that named entity recognition
(NER) is already performed in a previous step by another independent model.
Several recent efforts, under the theme of end-to-end RE, seek to exploit
inter-task correlations by modeling both NER and RE tasks jointly. Earlier work
in this area commonly reduces the task to a table-filling problem wherein an
additional expensive decoding step involving beam search is applied to obtain
globally consistent cell labels. In efforts that do not employ table-filling,
global optimization in the form of CRFs with Viterbi decoding for the NER
component is still necessary for competitive performance. We introduce a novel
neural architecture utilizing the table structure, based on repeated
applications of 2D convolutions for pooling local dependency and metric-based
features, that improves on the state-of-the-art without the need for global
optimization. We validate our model on the ADE and CoNLL04 datasets for
end-to-end RE and demonstrate gains (in F-score) over prior best results,
with training and testing times that are seven to ten times faster; the
latter is highly advantageous for time-sensitive end-user applications.
MASK: A flexible framework to facilitate de-identification of clinical texts
Medical health records and clinical summaries contain a vast amount of
important information in textual form that can help advance research on
treatments, drugs and public health. However, most of this information is not
shared because it contains private details about patients, their families, or
the medical staff treating them. Regulations such as HIPAA in the US, PHIPA in
Canada and the GDPR regulate the protection, processing and distribution of
this information. If the information is de-identified, with personal details
replaced or redacted, it can be distributed to the research community. In this
paper, we present MASK, a software package designed to perform the
de-identification task. The software performs named entity recognition using
some state-of-the-art techniques and then masks or redacts the recognized
entities. The user can select a named entity recognition algorithm (currently
implemented are two versions of CRF-based techniques and a BiLSTM-based neural
network with pre-trained GloVe and ELMo embeddings) and a masking algorithm
(e.g. shift dates, replace names/locations, totally redact an entity).
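The recognize-then-mask flow can be sketched as a per-entity-type policy applied to recognized spans. The entity spans here are given by hand and the note is invented; in MASK they would come from the CRF or BiLSTM NER models mentioned above:

```python
# Sketch of the two-stage de-identification flow: recognized entity
# spans are passed to a configurable masking step. Entities are
# hand-supplied here; in MASK they come from an NER model. The masking
# strategies mirror the ones listed above.

import datetime

def mask(text, entities, strategy):
    """Apply a per-entity-type masking strategy to recognized spans."""
    out = text
    for surface, etype in entities:
        action = strategy.get(etype, "redact")
        if action == "redact":
            repl = "XXXX"
        elif action == "shift_date":
            d = datetime.date.fromisoformat(surface)
            repl = (d + datetime.timedelta(days=30)).isoformat()
        else:  # replace with a type placeholder
            repl = f"[{etype}]"
        out = out.replace(surface, repl)
    return out

note = "John Smith was admitted on 2020-01-15 in Edinburgh."
ents = [("John Smith", "NAME"), ("2020-01-15", "DATE"),
        ("Edinburgh", "LOCATION")]
policy = {"NAME": "placeholder", "DATE": "shift_date",
          "LOCATION": "redact"}
print(mask(note, ents, policy))
```

Keeping the masking policy separate from the recognizer is what makes the framework flexible: the same NER output can be redacted for release or consistently pseudonymized for research use.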
Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches
This work investigates multiple approaches to Named Entity Recognition (NER)
for text in Electronic Health Record (EHR) data. In particular, we look into
the application of (i) rule-based, (ii) deep learning and (iii) transfer
learning systems for the task of NER on brain imaging reports with a focus on
records from patients with stroke. We explore the strengths and weaknesses of
each approach, develop rules and train on a common dataset, and evaluate each
system's performance on common test sets of Scottish radiology reports from two
sources (brain imaging reports in ESS -- Edinburgh Stroke Study data collected
by NHS Lothian as well as radiology reports created in NHS Tayside). Our
comparison shows that a hand-crafted system is the most accurate way to
automatically label EHR, but machine learning approaches can provide a feasible
alternative where resources for a manual system are not readily available.
Comment: 8 pages, presented at HealTAC 2019, Cardiff, 24-25/04/201
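A hand-crafted system of the kind compared above typically amounts to a set of curated patterns run over each report. A minimal regex sketch (the patterns are invented examples, not the actual rules developed for the ESS/Tayside reports):

```python
# Minimal regex-based rule sketch in the spirit of the hand-crafted
# system compared above. The patterns are invented examples.

import re

RULES = [
    ("STROKE_TYPE",
     re.compile(r"\b(ischaemic|haemorrhagic)\s+stroke\b", re.I)),
    ("LOCATION",
     re.compile(r"\b(left|right)\s+(frontal|parietal)\s+lobe\b", re.I)),
]

def rule_ner(report):
    """Return (label, matched text) pairs found by the rules."""
    hits = []
    for label, pattern in RULES:
        for m in pattern.finditer(report):
            hits.append((label, m.group(0)))
    return hits

print(rule_ner("Acute ischaemic stroke in the left parietal lobe."))
```

Such rules are precise on the reports they were written for, which matches the finding above that the hand-crafted system is the most accurate, while machine-learned taggers generalize with less curation effort.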