Deep learning from crowds
Over the last few years, deep learning has revolutionized the field of
machine learning by dramatically improving the state-of-the-art in various
domains. However, as the size of supervised artificial neural networks grows,
typically so does the need for larger labeled datasets. Recently, crowdsourcing
has established itself as an efficient and cost-effective solution for labeling
large sets of data in a scalable manner, but it often requires aggregating
labels from multiple noisy contributors with different levels of expertise. In
this paper, we address the problem of learning deep neural networks from
crowds. We begin by describing an EM algorithm for jointly learning the
parameters of the network and the reliabilities of the annotators. Then, a
novel general-purpose crowd layer is proposed, which allows us to train deep
neural networks end-to-end, directly from the noisy labels of multiple
annotators, using only backpropagation. We empirically show that the proposed
approach is able to internally capture the reliability and biases of different
annotators and achieve new state-of-the-art results for various crowdsourced
datasets across different settings, namely classification, regression and
sequence labeling.
Comment: 10 pages, The Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI), 201
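The crowd-layer idea described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact form of the layer, so the per-annotator weight matrix, the function names (`crowd_layer_forward`, `masked_cross_entropy`), and the use of `None` for missing labels are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def crowd_layer_forward(class_probs, annotator_weights):
    """Map the shared class probabilities to one distribution per annotator.

    Assumption: each annotator has a C x C weight matrix that is applied
    to the network's class probabilities, followed by a softmax.
    """
    outputs = []
    for weights in annotator_weights:
        scores = [sum(w * p for w, p in zip(row, class_probs)) for row in weights]
        outputs.append(softmax(scores))
    return outputs


def masked_cross_entropy(annotator_probs, labels):
    """Mean negative log-likelihood over annotators who labeled the item.

    Assumption: a label of None means the annotator skipped this item,
    so that annotator contributes nothing to the loss.
    """
    losses = [
        -math.log(probs[label])
        for probs, label in zip(annotator_probs, labels)
        if label is not None
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Two hypothetical annotators: one faithful, one who tends to flip labels.
identity = [[1.0, 0.0], [0.0, 1.0]]
flipped = [[0.0, 1.0], [1.0, 0.0]]
outputs = crowd_layer_forward([0.9, 0.1], [identity, flipped])
loss = masked_cross_entropy(outputs, [0, None])
```

In a trained model the matrices would be learned by backpropagation together with the rest of the network; here they are fixed by hand only to show how a per-annotator transformation could encode reliability and bias.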
An attentive neural architecture for joint segmentation and parsing and its application to real estate ads
When processing human-produced text with natural language processing (NLP)
techniques, two fundamental subtasks that arise are (i) segmentation of the
plain text into meaningful subunits (e.g., entities), and (ii) dependency
parsing, to establish relations between subunits. In this paper, we develop a
relatively simple and effective neural joint model that performs both
segmentation and dependency parsing together, instead of one after the other as
in most state-of-the-art works. We will focus in particular on the real estate
ad setting, aiming to convert an ad to a structured description, which we name
property tree, comprising the tasks of (1) identifying important entities of a
property (e.g., rooms) from classifieds and (2) structuring them into a tree
format. In this work, we propose a new joint model that is able to tackle the
two tasks simultaneously and construct the property tree by (i) avoiding the
error propagation that would arise from performing the subtasks one after the
other in a pipelined fashion, and (ii) exploiting the interactions between the
subtasks. For this purpose, we perform an extensive comparative study of the
pipeline methods and the newly proposed joint model, reporting an improvement of over
three percentage points in the overall edge F1 score of the property tree.
Also, we propose attention methods to encourage our model to focus on salient
tokens during the construction of the property tree. Thus we experimentally
demonstrate the usefulness of attentive neural architectures for the proposed
joint model, showcasing a further improvement of two percentage points in edge
F1 score for our application.
Comment: Preprint - Accepted for publication in Expert Systems with
Application
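The edge F1 score reported in this abstract can be made concrete with a small sketch. The abstract does not define how edges are represented or matched, so this example assumes an edge is a (parent, child) pair and that a predicted edge counts as correct only on an exact match; the function name `edge_f1` and the example entities are hypothetical.

```python
def edge_f1(gold_edges, predicted_edges):
    """Harmonic mean of edge precision and recall.

    Assumption: edges are hashable (parent, child) pairs compared by
    exact equality; duplicates are ignored.
    """
    gold = set(gold_edges)
    predicted = set(predicted_edges)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical property tree: two of three predicted edges match the gold tree.
gold = [("house", "bedroom"), ("house", "kitchen"), ("bedroom", "closet")]
predicted = [("house", "bedroom"), ("house", "kitchen"), ("kitchen", "closet")]
score = edge_f1(gold, predicted)  # precision = recall = 2/3
```

Because a tree edge depends on both of its endpoints being segmented correctly, a metric of this kind penalizes segmentation errors and attachment errors together, which is one reason a pipeline can lose score to error propagation.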
LM-Based Word Embeddings Improve Biomedical Named Entity Recognition: A Detailed Analysis
Recent studies have shown that contextualized word embeddings outperform other types of embeddings on a variety of tasks. However, little research has evaluated their effectiveness in the biomedical domain under multi-task settings. We derive contextualized word embeddings from the Flair framework and apply them to biomedical NER on 5 benchmark datasets, yielding major improvements over the baseline and results competitive with the current best systems. We analyze the sources of these improvements, reporting model performance over different combinations of word embeddings, as well as fine-tuning and casing modes.
A Named Entity Recognition Method Enhanced with Lexicon Information and Text Local Feature
At present, Named Entity Recognition (NER) is one of the fundamental tasks for extracting knowledge from traditional Chinese medicine (TCM) texts. The variable length of TCM entities and the linguistic characteristics of TCM texts make TCM entity boundaries ambiguous. In addition, better extraction and exploitation of local text features can improve the accuracy of named entity recognition. In this paper, we propose a TCM NER model enhanced with lexicon information and local text features. In this model, a lexicon is introduced to encode the characters in the text and obtain a context-sensitive global semantic representation of the text. A convolutional neural network (CNN) and a gate joined collaborative attention network form a local feature extraction module that captures the important semantic features of local text. Experiments were conducted on two TCM domain datasets, and the F1 values are 91.13% and 90.21%, respectively.
HunFlair2 in a cross-corpus evaluation of biomedical named entity recognition and normalization tools
With the exponential growth of the life science literature, biomedical text
mining (BTM) has become an essential technology for accelerating the extraction
of insights from publications. Identifying named entities (e.g., diseases,
drugs, or genes) in texts and their linkage to reference knowledge bases are
crucial steps in BTM pipelines to enable information aggregation from different
documents. However, tools for these two steps are rarely applied in the same
context in which they were developed. Instead, they are applied in the wild,
i.e., on application-dependent text collections different from those used for
the tools' training, varying, e.g., in focus, genre, style, and text type. This
raises the question of whether the reported performance of BTM tools can be
trusted for downstream applications. Here, we report on the results of a
carefully designed cross-corpus benchmark for named entity extraction, where
tools were applied systematically to corpora not used during their training.
Based on a survey of 28 published systems, we selected five for an in-depth
analysis on three publicly available corpora encompassing four different entity
types. Comparison between tools results in a mixed picture and shows that, in a
cross-corpus setting, the performance is significantly lower than the one
reported in an in-corpus setting. HunFlair2 showed the best performance on
average, being closely followed by PubTator. Our results indicate that users of
BTM tools should expect lower performance when applying them in the wild than
reported in the original publications and show that further research is
necessary to make BTM tools more robust.
Improving Broad-Coverage Medical Entity Linking with Semantic Type Prediction and Large-Scale Datasets
Medical entity linking is the task of identifying and standardizing medical
concepts referred to in an unstructured text. Most of the existing methods
adopt a three-step approach of (1) detecting mentions, (2) generating a list of
candidate concepts, and finally (3) picking the best concept among them. In
this paper, we probe into alleviating the problem of overgeneration of
candidate concepts in the candidate generation module, the most under-studied
component of medical entity linking. For this, we present MedType, a fully
modular system that prunes out irrelevant candidate concepts based on the
predicted semantic type of an entity mention. We incorporate MedType into five
off-the-shelf toolkits for medical entity linking and demonstrate that it
consistently improves entity linking performance across several benchmark
datasets. To address the dearth of annotated training data for medical entity
linking, we present WikiMed and PubMedDS, two large-scale medical entity
linking datasets, and demonstrate that pre-training MedType on these datasets
further improves entity linking performance. We make our source code and
datasets publicly available for medical entity linking research.
Comment: 35 page
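The pruning step described in this abstract, dropping candidate concepts whose semantic type disagrees with the type predicted for the mention, can be sketched as follows. This is not MedType's actual code or API: the `Candidate` class, the `prune_candidates` function, the placeholder concept identifiers, and the fallback that keeps all candidates when none match are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate concept with the semantic types assigned to it (hypothetical)."""
    concept_id: str
    semantic_types: frozenset


def prune_candidates(candidates, predicted_types):
    """Keep candidates that share at least one type with the prediction.

    Assumption: if pruning would remove every candidate, the original
    list is returned so the downstream linker still has options.
    """
    predicted = frozenset(predicted_types)
    kept = [c for c in candidates if c.semantic_types & predicted]
    return kept if kept else list(candidates)


# Placeholder identifiers, not real knowledge-base entries.
candidates = [
    Candidate("CONCEPT_COLD_ILLNESS", frozenset({"Disease"})),
    Candidate("CONCEPT_COLD_TEMPERATURE", frozenset({"Temperature"})),
]
pruned = prune_candidates(candidates, {"Disease"})
```

Keeping the filter separate from both the candidate generator and the final ranking step mirrors the modular design the abstract describes, where the pruning component is added to existing entity linking toolkits.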