8,418 research outputs found
Image segmentation using fuzzy LVQ clustering networks
In this note we formulate image segmentation as a clustering problem. Feature vectors extracted from a raw image are clustered into subregions, thereby segmenting the image. A fuzzy generalization of a Kohonen learning vector quantization (LVQ) which integrates the Fuzzy c-Means (FCM) model with the learning rate and updating strategies of the LVQ is used for this task. This network, which segments images in an unsupervised manner, is thus related to the FCM optimization problem. Numerical examples on photographic and magnetic resonance images are given to illustrate this approach to image segmentation
Deformable Prototypes for Encoding Shape Categories in Image Databases
We describe a method for shape-based image database search that uses deformable prototypes to represent categories. Rather than directly comparing a candidate shape with all shape entries in the database, shapes are compared in terms of the types of nonrigid deformations (differences) that relate them to a small subset of representative prototypes. To solve the shape correspondence and alignment problem, we employ the technique of modal matching, an information-preserving shape decomposition for matching, describing, and comparing shapes despite sensor variations and nonrigid deformations. In modal matching, shape is decomposed into an ordered basis of orthogonal principal components. We demonstrate the utility of this approach for shape comparison in 2-D image databases.Office of Naval Research (Young Investigator Award N00014-06-1-0661
APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation
Few-shot semantic segmentation aims to segment novel-class objects in a given query image with only a few labeled support images. Most advanced solutions exploit a metric learning framework that performs segmentation through matching each query feature to a learned class-specific prototype. However, this framework suffers from biased classification due to incomplete feature comparisons. To address this issue, we present an adaptive prototype representation by introducing class-specific and class-agnostic prototypes and thus construct complete sample pairs for learning semantic alignment with query features. The complementary features learning manner effectively enriches feature comparison and helps yield an unbiased segmentation model in the few-shot setting. It is implemented with a two-branch end-to-end network (\ie, a class-specific branch and a class-agnostic branch), which generates prototypes and then combines query features to perform comparisons. In addition, the proposed class-agnostic branch is simple yet effective. In practice, it can adaptively generate multiple class-agnostic prototypes for query images and learn feature alignment in a self-contrastive manner. Extensive experiments on PASCAL-5 i and COCO-20 i demonstrate the superiority of our method. At no expense of inference efficiency, our model achieves state-of-the-art results in both 1-shot and 5-shot settings for few-shot semantic segmentation
APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation
Few-shot semantic segmentation aims to segment novel-class objects in a given query image with only a few labeled support images. Most advanced solutions exploit a metric learning framework that performs segmentation through matching each query feature to a learned class-specific prototype. However, this framework suffers from biased classification due to incomplete feature comparisons. To address this issue, we present an adaptive prototype representation by introducing class-specific and class-agnostic prototypes and thus construct complete sample pairs for learning semantic alignment with query features. The complementary features learning manner effectively enriches feature comparison and helps yield an unbiased segmentation model in the few-shot setting. It is implemented with a two-branch end-to-end network (\ie, a class-specific branch and a class-agnostic branch), which generates prototypes and then combines query features to perform comparisons. In addition, the proposed class-agnostic branch is simple yet effective. In practice, it can adaptively generate multiple class-agnostic prototypes for query images and learn feature alignment in a self-contrastive manner. Extensive experiments on PASCAL-5 i and COCO-20 i demonstrate the superiority of our method. At no expense of inference efficiency, our model achieves state-of-the-art results in both 1-shot and 5-shot settings for few-shot semantic segmentation
Semantic Autoencoder for Zero-Shot Learning
Existing zero-shot learning (ZSL) models typically learn a projection
function from a feature space to a semantic embedding space (e.g.~attribute
space). However, such a projection function is only concerned with predicting
the training seen class semantic representation (e.g.~attribute prediction) or
classification. When applied to test data, which in the context of ZSL contains
different (unseen) classes without training data, a ZSL model typically suffers
from the project domain shift problem. In this work, we present a novel
solution to ZSL based on learning a Semantic AutoEncoder (SAE). Taking the
encoder-decoder paradigm, an encoder aims to project a visual feature vector
into the semantic space as in the existing ZSL models. However, the decoder
exerts an additional constraint, that is, the projection/code must be able to
reconstruct the original visual feature. We show that with this additional
reconstruction constraint, the learned projection function from the seen
classes is able to generalise better to the new unseen classes. Importantly,
the encoder and decoder are linear and symmetric which enable us to develop an
extremely efficient learning algorithm. Extensive experiments on six benchmark
datasets demonstrate that the proposed SAE outperforms significantly the
existing ZSL models with the additional benefit of lower computational cost.
Furthermore, when the SAE is applied to supervised clustering problem, it also
beats the state-of-the-art.Comment: accepted to CVPR201
GOGGLES: Automatic Image Labeling with Affinity Coding
Generating large labeled training data is becoming the biggest bottleneck in
building and deploying supervised machine learning models. Recently, the data
programming paradigm has been proposed to reduce the human cost in labeling
training data. However, data programming relies on designing labeling functions
which still requires significant domain expertise. Also, it is prohibitively
difficult to write labeling functions for image datasets as it is hard to
express domain knowledge using raw features for images (pixels).
We propose affinity coding, a new domain-agnostic paradigm for automated
training data labeling. The core premise of affinity coding is that the
affinity scores of instance pairs belonging to the same class on average should
be higher than those of pairs belonging to different classes, according to some
affinity functions. We build the GOGGLES system that implements affinity coding
for labeling image datasets by designing a novel set of reusable affinity
functions for images, and propose a novel hierarchical generative model for
class inference using a small development set.
We compare GOGGLES with existing data programming systems on 5 image labeling
tasks from diverse domains. GOGGLES achieves labeling accuracies ranging from a
minimum of 71% to a maximum of 98% without requiring any extensive human
annotation. In terms of end-to-end performance, GOGGLES outperforms the
state-of-the-art data programming system Snuba by 21% and a state-of-the-art
few-shot learning technique by 5%, and is only 7% away from the fully
supervised upper bound.Comment: Published at 2020 ACM SIGMOD International Conference on Management
of Dat
- …