Differentiable Meta-learning Model for Few-shot Semantic Segmentation
To address the annotation scarcity issue in some cases of semantic
segmentation, there have been a few attempts to develop segmentation models
in the few-shot learning paradigm. However, most existing methods focus only
on the traditional 1-way segmentation setting (i.e., one image contains only
a single object). This is far from practical semantic segmentation tasks,
where the K-way setting (K>1) is usually required to perform accurate
multi-object segmentation. To deal with this issue, we formulate the few-shot
semantic segmentation task as a learning-based pixel classification problem and
propose a novel framework called MetaSegNet based on meta-learning. In
MetaSegNet, an embedding module consisting of global and local feature
branches is developed to extract the appropriate meta-knowledge for few-shot
segmentation. Moreover, we incorporate a linear model into
MetaSegNet as a base learner to directly predict the label of each pixel for
the multi-object segmentation. Furthermore, our MetaSegNet can be trained by
the episodic training mechanism in an end-to-end manner from scratch.
Experiments on two popular semantic segmentation datasets, i.e., PASCAL VOC and
COCO, reveal the effectiveness of the proposed MetaSegNet in the K-way few-shot
semantic segmentation task.
Comment: Accepted by AAAI 2020
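As a rough illustration of the linear base learner idea, the sketch below
fits a closed-form ridge-regression classifier on support-pixel embeddings
and applies it per pixel to a query; the shapes, the solver, and all names
are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fit_linear_pixel_classifier(feats, labels, num_classes, l2=1.0):
    """Closed-form ridge regression from support-pixel embeddings to labels.

    feats:  (N, D) pixel embeddings from the support set (hypothetical shapes)
    labels: (N,)   integer class ids in [0, num_classes)
    """
    one_hot = F.one_hot(labels, num_classes).float()    # (N, C)
    d = feats.shape[1]
    # W = (X^T X + l2*I)^{-1} X^T Y, solved without an explicit inverse
    gram = feats.T @ feats + l2 * torch.eye(d)
    return torch.linalg.solve(gram, feats.T @ one_hot)  # (D, C)

torch.manual_seed(0)
K = 3                                        # K-way episode (multi-object)
support_feats = torch.randn(500, 64)         # stand-in embedding outputs
support_labels = torch.randint(0, K, (500,))
W = fit_linear_pixel_classifier(support_feats, support_labels, K)
query_feats = torch.randn(200, 64)
pred = (query_feats @ W).argmax(dim=1)       # per-pixel class predictions
print(pred.shape)                            # torch.Size([200])
```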
Part-aware Prototype Network for Few-shot Semantic Segmentation
Few-shot semantic segmentation aims to learn to segment new object classes
with only a few annotated examples, which has a wide range of real-world
applications. Most existing methods either focus on the restrictive setting of
one-way few-shot segmentation or suffer from incomplete coverage of object
regions. In this paper, we propose a novel few-shot semantic segmentation
framework based on the prototype representation. Our key idea is to decompose
the holistic class representation into a set of part-aware prototypes, capable
of capturing diverse and fine-grained object features. In addition, we propose
to leverage unlabeled data to enrich our part-aware prototypes, resulting in
better modeling of intra-class variations of semantic objects. We develop a
novel graph neural network model to generate and enhance the proposed
part-aware prototypes based on labeled and unlabeled images. Extensive
experimental evaluations on two benchmarks show that our method outperforms
the prior art by a sizable margin.
Comment: Accepted by ECCV 2020
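The part-aware decomposition can be approximated with a simple clustering
sketch; the paper generates and refines prototypes with a graph neural
network, so the k-means stand-in below, along with all shapes and names, is
an assumption rather than the authors' method.

```python
import torch
import torch.nn.functional as F

def part_prototypes(feats, mask, num_parts=5, iters=10):
    """Cluster masked support features into part-level prototypes.

    feats: (D, H, W) feature map; mask: (H, W) binary foreground mask.
    Returns (num_parts, D) part-aware prototypes.
    """
    fg = feats.permute(1, 2, 0)[mask.bool()]      # (N, D) foreground pixels
    centers = fg[torch.randperm(fg.shape[0])[:num_parts]].clone()
    for _ in range(iters):                        # plain k-means refinement
        assign = torch.cdist(fg, centers).argmin(dim=1)
        for p in range(num_parts):
            sel = fg[assign == p]
            if sel.numel() > 0:
                centers[p] = sel.mean(dim=0)
    return centers

def part_score_map(query_feats, prototypes):
    """Each query pixel keeps the similarity to its best-matching part."""
    q = F.normalize(query_feats.permute(1, 2, 0), dim=-1)   # (H, W, D)
    p = F.normalize(prototypes, dim=-1)                     # (P, D)
    return torch.einsum("hwd,pd->hwp", q, p).max(dim=-1).values  # (H, W)

torch.manual_seed(0)
sup = torch.randn(64, 32, 32)
msk = (torch.rand(32, 32) > 0.5).float()
protos = part_prototypes(sup, msk)
score = part_score_map(torch.randn(64, 32, 32), protos)
print(protos.shape, score.shape)    # (5, 64) and (32, 32)
```

Taking the max over parts is what lets diverse, fine-grained object regions
contribute, instead of averaging everything into one holistic prototype.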
Generalized Few-shot Semantic Segmentation
Training semantic segmentation models requires a large amount of finely
annotated data, making it hard to quickly adapt to novel classes that lack
such annotations. Few-Shot Segmentation (FS-Seg) tackles this problem, but
under many
constraints. In this paper, we introduce a new benchmark, called Generalized
Few-Shot Semantic Segmentation (GFS-Seg), to analyze the generalization
ability of simultaneously segmenting novel categories with very few examples
and base categories with sufficient examples. This is the first study to show
that previous representative state-of-the-art FS-Seg methods fall short in
GFS-Seg, and that the performance discrepancy mainly comes from the constrained
setting of FS-Seg. To make GFS-Seg tractable, we set up a GFS-Seg baseline
that achieves decent performance without structural changes to the original
model.
Then, since context is essential for semantic segmentation, we propose the
Context-Aware Prototype Learning (CAPL) that significantly improves performance
by 1) leveraging the co-occurrence prior knowledge from support samples, and 2)
dynamically enriching the classifier with contextual information, conditioned
on the content of each query image. Both contributions are experimentally
shown to have substantial practical merit. Extensive experiments on PASCAL VOC
and COCO demonstrate the effectiveness of CAPL, and CAPL generalizes well to
FS-Seg by achieving competitive performance. Code will be made publicly
available.
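A minimal sketch of the classifier-enrichment idea follows, assuming a
cosine-similarity gate conditioned on a pooled query feature; the gating
function and shapes are illustrative guesses, not the paper's exact CAPL
formulation.

```python
import torch
import torch.nn.functional as F

def enrich_classifier(base_weights, support_protos, query_feat):
    """Blend base classifier weights with support prototypes, gated by
    the content of the current query image.

    base_weights:   (C, D) per-class weights of the original classifier
    support_protos: (C, D) class prototypes averaged over support pixels
    query_feat:     (D,)   globally pooled feature of the query image
    """
    sim = F.cosine_similarity(base_weights, query_feat.unsqueeze(0), dim=1)
    alpha = torch.sigmoid(sim).unsqueeze(1)     # (C, 1) dynamic gate
    return alpha * base_weights + (1 - alpha) * support_protos

torch.manual_seed(0)
W = enrich_classifier(torch.randn(21, 64), torch.randn(21, 64),
                      torch.randn(64))
print(W.shape)   # torch.Size([21, 64])
```

Because the gate depends on the query, base and novel classes can share one
classifier, which is the structural point that separates GFS-Seg from the
episodic FS-Seg setting.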
Progressive One-shot Human Parsing
Prior human parsing models are limited to parsing humans into classes
pre-defined in the training data, which makes them inflexible in generalizing
to unseen classes, e.g., new clothing in fashion analysis. In this paper, we
propose a new problem named one-shot human parsing (OSHP), which requires
parsing a human into an open set of reference classes defined by any single
reference example.
During training, only base classes defined in the training set are exposed,
which can overlap with part of the reference classes. We then devise a
novel Progressive One-shot Parsing network (POPNet) to address two critical
challenges, i.e., testing bias and small sizes. POPNet consists of two
collaborative metric learning modules named Attention Guidance Module and
Nearest Centroid Module, which can learn representative prototypes for base
classes and quickly transfer the ability to unseen classes during testing,
thereby reducing testing bias. Moreover, POPNet adopts a progressive human
parsing framework that can incorporate the learned knowledge of parent classes
at the coarse granularity to help recognize the descendant classes at the fine
granularity, thereby handling the small-size issue. Experiments on the ATR-OS
benchmark tailored for OSHP demonstrate that POPNet outperforms other
representative
one-shot segmentation models by large margins and establishes a strong
baseline. Source code can be found at
https://github.com/Charleshhy/One-shot-Human-Parsing.
Comment: Accepted in AAAI 2021. 9 pages, 4 figures
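The metric-learning core of the Nearest Centroid Module can be sketched as
plain nearest-centroid pixel classification; this standalone version omits
the Attention Guidance Module and the coarse-to-fine hierarchy, and all
shapes and names are assumptions.

```python
import torch
import torch.nn.functional as F

def class_centroids(feats, labels, num_classes):
    """feats: (N, D) support-pixel features; labels: (N,) part-class ids."""
    centroids = torch.zeros(num_classes, feats.shape[1])
    for c in range(num_classes):
        sel = feats[labels == c]
        if sel.numel() > 0:
            centroids[c] = sel.mean(dim=0)   # representative prototype
    return centroids

def nearest_centroid_parse(query_feats, centroids):
    """Assign each query pixel to its nearest centroid (cosine similarity)."""
    q = F.normalize(query_feats, dim=1)
    c = F.normalize(centroids, dim=1)
    return (q @ c.T).argmax(dim=1)           # (M,) predicted part labels

torch.manual_seed(0)
cent = class_centroids(torch.randn(300, 32), torch.randint(0, 6, (300,)), 6)
pred = nearest_centroid_parse(torch.randn(100, 32), cent)
print(pred.shape)   # torch.Size([100])
```

Since centroids are computed from whatever support is given at test time, the
same routine transfers to unseen reference classes, which is what reduces the
testing bias the abstract describes.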
Self-supervised learning for few-shot medical image segmentation
Fully-supervised deep learning segmentation models are inflexible when encountering new unseen semantic classes, and their fine-tuning often requires significant amounts of annotated data. Few-shot semantic segmentation (FSS) aims to solve this inflexibility by learning to segment an arbitrary unseen semantically meaningful class by referring to only a few labeled examples, without involving fine-tuning. State-of-the-art FSS methods are typically designed for segmenting natural images and rely on abundant annotated data of training classes to learn image representations that generalize well to unseen testing classes. However, such a training mechanism is impractical in annotation-scarce medical imaging scenarios. To address this challenge, we propose a novel self-supervised FSS framework for medical images, named SSL-ALPNet, in order to bypass the requirement for annotations during training. The proposed method exploits superpixel-based pseudo-labels to provide supervision signals. In addition, we propose a simple yet effective adaptive local prototype pooling module which is plugged into the prototype networks to further boost segmentation accuracy. We demonstrate the general applicability of the proposed approach using three different tasks: organ segmentation on abdominal CT and MRI images, and cardiac segmentation on MRI images. In our experiments, the proposed method yields higher Dice scores than conventional FSS methods that require manual annotations for training.
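The adaptive local prototype pooling idea can be sketched as masked average
pooling over a grid of local windows, keeping one prototype per
mostly-foreground window instead of a single global prototype; the window
size, threshold, and names below are illustrative assumptions, not the
released SSL-ALPNet code.

```python
import torch
import torch.nn.functional as F

def local_prototypes(feats, mask, window=4, min_fg=0.5):
    """Pool masked features over local windows into multiple prototypes.

    feats: (1, D, H, W) feature map; mask: (1, 1, H, W) binary mask.
    Returns (P, D) local prototypes, one per sufficiently-foreground window.
    """
    pooled_f = F.avg_pool2d(feats * mask, window)    # masked mean numerator
    pooled_m = F.avg_pool2d(mask, window)            # foreground fraction
    protos = (pooled_f / pooled_m.clamp(min=1e-6)).flatten(2).squeeze(0).T
    keep = pooled_m.flatten() > min_fg               # drop background windows
    return protos[keep]

torch.manual_seed(0)
f = torch.randn(1, 64, 32, 32)
m = (torch.rand(1, 1, 32, 32) > 0.3).float()
print(local_prototypes(f, m).shape)   # (P, 64), P <= 64 windows
```

Keeping several local prototypes preserves within-class appearance variation
(e.g., across an organ's interior and boundary) that a single averaged
prototype would wash out.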