Suggestive Annotation: A Deep Active Learning Framework for Biomedical Image Segmentation
Image segmentation is a fundamental problem in biomedical image analysis.
Recent advances in deep learning have achieved promising results on many
biomedical image segmentation benchmarks. However, due to large variations in
biomedical images (different modalities, image settings, objects, noise, etc.),
applying deep learning to a new application usually requires a new set of
training data. This can incur a great deal of annotation effort and cost,
because only biomedical experts can annotate effectively, and often there are
too many instances in images (e.g., cells) to annotate. In this paper, we aim
to address the following question: With limited effort (e.g., time) for
annotation, what instances should be annotated in order to attain the best
performance? We present a deep active learning framework that combines a fully
convolutional network (FCN) with active learning to significantly reduce
annotation effort by making judicious suggestions on the most effective
annotation areas. We utilize uncertainty and similarity information provided by
the FCN and formulate a generalized version of the maximum set cover problem to
determine the most representative and uncertain areas for annotation. Extensive
experiments using the 2015 MICCAI Gland Challenge dataset and a lymph node
ultrasound image segmentation dataset show that, using annotation suggestions
by our method, state-of-the-art segmentation performance can be achieved by
using only 50% of training data.
Comment: Accepted at MICCAI 201
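The selection idea in this abstract (pick areas that are both uncertain and representative) can be illustrated with a small greedy sketch. This is not the authors' implementation: the function name `suggest_annotations`, the parameters `num_candidates` and `budget`, and the toy scores are all hypothetical, and the uncertainty and similarity values are assumed to have been computed already (in the paper they come from the FCN).

```python
def suggest_annotations(uncertainty, similarity, num_candidates, budget):
    """Greedy sketch: choose `budget` uncertain, representative items.

    uncertainty: list of floats, one score per unlabeled image (assumed given).
    similarity:  square list-of-lists, similarity[i][j] in [0, 1] (assumed given).
    """
    n = len(uncertainty)
    # Step 1: keep only the most uncertain images as candidates.
    candidates = sorted(range(n), key=lambda i: uncertainty[i], reverse=True)
    candidates = candidates[:num_candidates]

    selected = []
    # best_cover[j]: how well image j is represented by the current selection.
    best_cover = [0.0] * n
    for _ in range(min(budget, len(candidates))):
        best_gain, best_item = -1.0, None
        for c in candidates:
            if c in selected:
                continue
            # Marginal gain in total coverage if candidate c were added.
            gain = sum(max(similarity[c][j] - best_cover[j], 0.0) for j in range(n))
            if gain > best_gain:
                best_gain, best_item = gain, c
        selected.append(best_item)
        best_cover = [max(best_cover[j], similarity[best_item][j]) for j in range(n)]
    return selected


# Toy data: images 0 and 1 are near-duplicates; image 2 resembles image 3.
toy_uncertainty = [0.9, 0.8, 0.7, 0.1]
toy_similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
picked = suggest_annotations(toy_uncertainty, toy_similarity, num_candidates=3, budget=2)
```

On the toy data the second pick skips the near-duplicate of the first pick, which is the behavior a coverage objective is meant to encourage.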
FedA3I: Annotation Quality-Aware Aggregation for Federated Medical Image Segmentation against Heterogeneous Annotation Noise
Federated learning (FL) has emerged as a promising paradigm for training
segmentation models on decentralized medical data, owing to its
privacy-preserving property. However, existing research overlooks the prevalent
annotation noise encountered in real-world medical datasets, which limits the
performance ceilings of FL. In this paper, we, for the first time, identify and
tackle this problem. For problem formulation, we propose a contour evolution
for modeling non-independent and identically distributed (Non-IID) noise across
pixels within each client and then extend it to the case of multi-source data
to form a heterogeneous noise model (i.e., Non-IID annotation noise across
clients). For robust learning from annotations with such two-level Non-IID
noise, we emphasize the importance of data quality in model aggregation,
allowing high-quality clients to have a greater impact on FL. To achieve this,
we propose Federated learning with Annotation quAlity-aware AggregatIon, named
FedA3I, by introducing a quality factor based on client-wise noise estimation.
Specifically, noise estimation at each client is accomplished through the
Gaussian mixture model and then incorporated into model aggregation in a
layer-wise manner to up-weight high-quality clients. Extensive experiments on
two real-world medical image segmentation datasets demonstrate the superior
performance of FedA3I against the state-of-the-art approaches in dealing
with cross-client annotation noise. The code is available at
https://github.com/wnn2000/FedAAAI.
Comment: Accepted at AAAI'2
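As a rough illustration of quality-aware aggregation, the sketch below down-weights clients with higher estimated annotation noise when averaging parameters. It is a simplification under stated assumptions, not FedA3I itself: the weighting rule `size * (1 - noise)` is a stand-in for the paper's quality factor, the noise estimates are assumed given (the paper obtains them with a Gaussian mixture model), and aggregation here is over a flat parameter list rather than layer-wise.

```python
def quality_weighted_average(client_params, client_sizes, noise_estimates):
    """Average client parameters, giving cleaner clients more weight.

    client_params:   list of flat parameter lists, one per client.
    client_sizes:    number of training samples per client.
    noise_estimates: assumed per-client noise rates in [0, 1].
    """
    # Hypothetical quality factor: sample count discounted by estimated noise.
    raw = [n * (1.0 - e) for n, e in zip(client_sizes, noise_estimates)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("all clients have zero effective weight")
    weights = [r / total for r in raw]

    num_params = len(client_params[0])
    aggregated = [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(num_params)
    ]
    return aggregated, weights


# Two equally sized clients; the second is assumed to have 50% label noise.
aggregated, weights = quality_weighted_average(
    client_params=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[100, 100],
    noise_estimates=[0.0, 0.5],
)
```

With equal data sizes, plain federated averaging would weight both clients at 0.5; here the cleaner client receives weight 2/3.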
Learning from Crowds by Modeling Common Confusions
Crowdsourcing provides a practical way to obtain large amounts of labeled
data at a low cost. However, the annotation quality of annotators varies
considerably, which imposes new challenges in learning a high-quality model
from the crowdsourced annotations. In this work, we provide a new perspective
to decompose annotation noise into common noise and individual noise and
differentiate the source of confusion based on instance difficulty and
annotator expertise on a per-instance-annotator basis. We realize this new
crowdsourcing model by an end-to-end learning solution with two types of noise
adaptation layers: one is shared across annotators to capture their commonly
shared confusions, and the other is specific to each annotator to capture
individual confusion. To recognize the source of noise in each annotation, we
use an auxiliary network to choose the two noise adaptation layers with respect
to both instances and annotators. Extensive experiments on both synthesized and
real-world benchmarks demonstrate the effectiveness of our proposed common
noise adaptation solution.
Comment: Accepted by AAAI 202
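The two noise adaptation layers can be pictured as two confusion matrices whose outputs are mixed per instance and annotator. The sketch below is only an illustration under assumptions: the function names are hypothetical, the matrices are fixed toy values rather than learned layers, and `common_weight` is a plain number here, whereas the abstract describes an auxiliary network that makes this choice.

```python
def apply_confusion(clean_probs, confusion):
    """Map a clean class distribution through a row-stochastic confusion matrix."""
    num_classes = len(clean_probs)
    return [
        sum(clean_probs[i] * confusion[i][j] for i in range(num_classes))
        for j in range(len(confusion[0]))
    ]


def noisy_label_distribution(clean_probs, common, individual, common_weight):
    """Mix the common and annotator-specific confusions.

    common_weight in [0, 1] says how much of the noise is attributed to the
    shared confusion (assumed given; hypothetical stand-in for a gating network).
    """
    from_common = apply_confusion(clean_probs, common)
    from_individual = apply_confusion(clean_probs, individual)
    return [
        common_weight * c + (1.0 - common_weight) * i
        for c, i in zip(from_common, from_individual)
    ]


# Toy example: the true class is 0; annotators commonly confuse 0 with 1,
# while this particular annotator makes no individual mistakes.
predicted = noisy_label_distribution(
    clean_probs=[1.0, 0.0],
    common=[[0.8, 0.2], [0.3, 0.7]],
    individual=[[1.0, 0.0], [0.0, 1.0]],
    common_weight=0.5,
)
```

Because both matrices are row-stochastic and the mixing weight lies in [0, 1], the result remains a valid probability distribution over the annotator's labels.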
Learning Multimodal Latent Attributes
The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity by transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular, we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and their complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce a concept of semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces requirements for an exhaustive accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches for addressing a variety of realistic multimedia sparse data learning tasks including multi-task learning, learning with label noise, N-shot transfer learning and, importantly, zero-shot learning.
SeCoST: Sequential co-supervision for large scale weakly labeled audio event detection
Weakly supervised learning algorithms are critical for scaling audio event
detection to several hundreds of sound categories. Such learning models should
not only disambiguate sound events efficiently with minimal class-specific
annotation but also be robust to label noise, which is more apparent with weak
labels than with strong annotations. In this work, we propose a new framework
for designing learning models with weak supervision by bridging ideas from
sequential learning and knowledge distillation. We refer to the proposed
methodology as SeCoST (pronounced Sequest) -- Sequential Co-supervision for
training generations of Students. SeCoST incrementally builds a cascade of
student-teacher pairs via a novel knowledge transfer method. Our evaluations on
Audioset (the largest weakly labeled dataset available) show that SeCoST
achieves a mean average precision of 0.383 while outperforming prior state of
the art by a considerable margin.
Comment: Accepted IEEE ICASSP 202