2,427 research outputs found
Improving Object Detection in Medical Image Analysis through Multiple Expert Annotators: An Empirical Investigation
The work discusses the use of machine learning algorithms for anomaly
detection in medical image analysis and how the performance of these algorithms
depends on the number of annotators and the quality of labels. To address the
issue of subjectivity in labeling with a single annotator, we introduce a
simple and effective approach that aggregates annotations from multiple
annotators with varying levels of expertise. We then aim to improve the
efficiency of predictive models in abnormal detection tasks by estimating
hidden labels from multiple annotations and using a re-weighted loss function
to improve detection performance. Our method is evaluated on a real-world
medical imaging dataset and outperforms relevant baselines that do not consider
disagreements among annotators.Comment: This is a short version submitted to the Midwest Machine Learning
Symposium (MMLS 2023), Chicago, IL, US
Modelling Instance-Level Annotator Reliability for Natural Language Labelling Tasks
When constructing models that learn from noisy labels produced by multiple
annotators, it is important to accurately estimate the reliability of
annotators. Annotators may provide labels of inconsistent quality due to their
varying expertise and reliability in a domain. Previous studies have mostly
focused on estimating each annotator's overall reliability on the entire
annotation task. However, in practice, the reliability of an annotator may
depend on each specific instance. Only a limited number of studies have
investigated modelling per-instance reliability and these only considered
binary labels. In this paper, we propose an unsupervised model which can handle
both binary and multi-class labels. It can automatically estimate the
per-instance reliability of each annotator and the correct label for each
instance. We specify our model as a probabilistic model which incorporates
neural networks to model the dependency between latent variables and instances.
For evaluation, the proposed method is applied to both synthetic and real data,
including two labelling tasks: text classification and textual entailment.
Experimental results demonstrate our novel method can not only accurately
estimate the reliability of annotators across different instances, but also
achieve superior performance in predicting the correct labels and detecting the
least reliable annotators compared to state-of-the-art baselines.Comment: 9 pages, 1 figures, 10 tables, 2019 Annual Conference of the North
American Chapter of the Association for Computational Linguistics (NAACL2019
Fusing Continuous-valued Medical Labels using a Bayesian Model
With the rapid increase in volume of time series medical data available
through wearable devices, there is a need to employ automated algorithms to
label data. Examples of labels include interventions, changes in activity (e.g.
sleep) and changes in physiology (e.g. arrhythmias). However, automated
algorithms tend to be unreliable resulting in lower quality care. Expert
annotations are scarce, expensive, and prone to significant inter- and
intra-observer variance. To address these problems, a Bayesian
Continuous-valued Label Aggregator(BCLA) is proposed to provide a reliable
estimation of label aggregation while accurately infer the precision and bias
of each algorithm. The BCLA was applied to QT interval (pro-arrhythmic
indicator) estimation from the electrocardiogram using labels from the 2006
PhysioNet/Computing in Cardiology Challenge database. It was compared to the
mean, median, and a previously proposed Expectation Maximization (EM) label
aggregation approaches. While accurately predicting each labelling algorithm's
bias and precision, the root-mean-square error of the BCLA was
11.780.63ms, significantly outperforming the best Challenge entry
(15.372.13ms) as well as the EM, mean, and median voting strategies
(14.760.52ms, 17.610.55ms, and 14.430.57ms respectively with
)
Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa
This paper presents a generic Bayesian framework that enables any deep
learning model to actively learn from targeted crowds. Our framework inherits
from recent advances in Bayesian deep learning, and extends existing work by
considering the targeted crowdsourcing approach, where multiple annotators with
unknown expertise contribute an uncontrolled amount (often limited) of
annotations. Our framework leverages the low-rank structure in annotations to
learn individual annotator expertise, which then helps to infer the true labels
from noisy and sparse annotations. It provides a unified Bayesian model to
simultaneously infer the true labels and train the deep learning model in order
to reach an optimal learning efficacy. Finally, our framework exploits the
uncertainty of the deep learning model during prediction as well as the
annotators' estimated expertise to minimize the number of required annotations
and annotators for optimally training the deep learning model.
We evaluate the effectiveness of our framework for intent classification in
Alexa (Amazon's personal assistant), using both synthetic and real-world
datasets. Experiments show that our framework can accurately learn annotator
expertise, infer true labels, and effectively reduce the amount of annotations
in model training as compared to state-of-the-art approaches. We further
discuss the potential of our proposed framework in bridging machine learning
and crowdsourcing towards improved human-in-the-loop systems
A quick guide for student-driven community genome annotation
High quality gene models are necessary to expand the molecular and genetic
tools available for a target organism, but these are available for only a
handful of model organisms that have undergone extensive curation and
experimental validation over the course of many years. The majority of gene
models present in biological databases today have been identified in draft
genome assemblies using automated annotation pipelines that are frequently
based on orthologs from distantly related model organisms. Manual curation is
time consuming and often requires substantial expertise, but is instrumental in
improving gene model structure and identification. Manual annotation may seem
to be a daunting and cost-prohibitive task for small research communities but
involving undergraduates in community genome annotation consortiums can be
mutually beneficial for both education and improved genomic resources. We
outline a workflow for efficient manual annotation driven by a team of
primarily undergraduate annotators. This model can be scaled to large teams and
includes quality control processes through incremental evaluation. Moreover, it
gives students an opportunity to increase their understanding of genome biology
and to participate in scientific research in collaboration with peers and
senior researchers at multiple institutions
- …