Improving Image Clustering With Multiple Pretrained CNN Feature Extractors
For many image clustering problems, replacing raw image data with features
extracted by a pretrained convolutional neural network (CNN) leads to better
clustering performance. However, the specific features extracted, and, by
extension, the selected CNN architecture, can have a major impact on the
clustering results. In practice, this crucial design choice is often decided
arbitrarily due to the impossibility of using cross-validation with
unsupervised learning problems. However, information contained in the different
pretrained CNN architectures may be complementary, even when pretrained on the
same data. To improve clustering performance, we rephrase the image clustering
problem as a multi-view clustering (MVC) problem that considers multiple
different pretrained feature extractors as different "views" of the same data.
We then propose a multi-input neural network architecture that is trained
end-to-end to solve the MVC problem effectively. Our experimental results,
conducted on three different natural image datasets, show that: 1. using
multiple pretrained CNNs jointly as feature extractors improves image
clustering; 2. using an end-to-end approach improves MVC; and 3. combining both
produces state-of-the-art results for the problem of image clustering.
Comment: 13 pages, 3 figures, 4 tables. Poster presentation at BMVC 2018 (29.9% acceptance rate).
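As a rough illustration of the multi-view framing, the sketch below extracts features from two pretrained torchvision backbones and clusters a naive concatenation of the resulting "views" with K-means. The choice of backbones, dataset, and the concatenate-then-K-means fusion are illustrative assumptions; the paper's actual contribution, the end-to-end multi-input network, is not reproduced here.

# Rough sketch of the multi-view framing: features from two pretrained CNNs
# serve as two "views" of the same images. Backbones, dataset, and the naive
# concatenate-then-K-means fusion are illustrative choices, not the paper's
# end-to-end multi-input network.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import CIFAR10
from sklearn.cluster import KMeans

preprocess = T.Compose([
    T.Resize(224),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
loader = DataLoader(CIFAR10("data", train=False, download=True, transform=preprocess),
                    batch_size=64, shuffle=False)

# Two pretrained extractors = two "views" of the same data.
backbones = {
    "resnet50": torch.nn.Sequential(*list(models.resnet50(weights="DEFAULT").children())[:-1]),
    "vgg16": models.vgg16(weights="DEFAULT").features,
}
for net in backbones.values():
    net.eval()

views = {name: [] for name in backbones}
with torch.no_grad():
    for images, _ in loader:
        for name, net in backbones.items():
            feats = torch.nn.functional.adaptive_avg_pool2d(net(images), 1)
            views[name].append(torch.flatten(feats, 1))
views = {name: torch.cat(chunks).numpy() for name, chunks in views.items()}

# Naive fusion baseline: concatenate the views and run K-means on the result.
fused = np.concatenate(list(views.values()), axis=1)
labels = KMeans(n_clusters=10, n_init=10).fit_predict(fused)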
Big-Data Clustering: K-Means or K-Indicators?
The K-means algorithm is arguably the most popular data clustering method,
commonly applied to datasets that have already been mapped into some "feature
space", as in spectral clustering. However, K-means is highly sensitive to
initialization and encounters a scalability bottleneck as the number of
clusters K grows in big-data applications. In this work, we promote a closely
related model, the K-indicators model, and construct an efficient
semi-convex-relaxation algorithm that requires no randomized initialization.
We present extensive empirical results showing the advantages of the new
algorithm when K is large. In particular, using the new algorithm to
initialize K-means, without any replication, can significantly outperform
standard K-means run with a large number of state-of-the-art random
replications.
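The warm-start idea can be illustrated with a minimal sketch: a single K-means run seeded with centers supplied by another algorithm, compared against standard K-means with random replications. The k_indicators_centers function below is a hypothetical placeholder; the paper's semi-convex-relaxation solver is not reproduced.

# Sketch of warm-starting K-means from externally supplied centers, compared to
# standard K-means with random replications. `k_indicators_centers` is a
# hypothetical placeholder, not the paper's semi-convex-relaxation algorithm.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=50_000, centers=100, n_features=20, random_state=0)

def k_indicators_centers(X, k):
    # Placeholder initializer: any center-producing method could go here.
    rng = np.random.default_rng(0)
    return X[rng.choice(len(X), size=k, replace=False)]

# Baseline: standard K-means with several random replications (n_init).
baseline = KMeans(n_clusters=100, n_init=10, random_state=0).fit(X)

# Warm start: one K-means run seeded with the precomputed centers.
warm = KMeans(n_clusters=100, init=k_indicators_centers(X, 100), n_init=1).fit(X)

print("baseline inertia:", baseline.inertia_, "warm-start inertia:", warm.inertia_)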
Self-Supervised Learning in Multi-Task Graphs through Iterative Consensus Shift
The human ability to synchronize feedback from all the senses has inspired
recent works in multi-task and multi-modal learning. While these works rely on
expensive supervision, our multi-task graph requires only pseudo-labels from
expert models. Every graph node represents a task, and each edge learns
transformations between tasks. Once initialized, the graph learns in a
self-supervised manner,
based on a novel consensus shift algorithm that intelligently exploits the
agreement between graph pathways to generate new pseudo-labels for the next
learning cycle. We demonstrate significant improvement from one unsupervised
learning iteration to the next, outperforming related recent methods in
extensive multi-task learning experiments on two challenging datasets. Our code
is available at https://github.com/bit-ml/cshift.
Comment: Accepted at The British Machine Vision Conference (BMVC) 2021, 12 pages, 6 figures, 5 tables.
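To make the consensus idea concrete, here is a small sketch of how agreement between several pathways predicting the same target could be turned into a pseudo-label with a confidence mask. The median/variance rule is an assumed stand-in, not the paper's consensus shift algorithm.

# Sketch of turning agreement between graph pathways into pseudo-labels.
# The median/variance rule is an assumed stand-in for the paper's
# consensus shift algorithm, shown here only to illustrate the idea.
import numpy as np

def consensus_pseudo_label(pathway_predictions, variance_threshold=0.05):
    """pathway_predictions: (n_pathways, H, W) predictions for one target task."""
    preds = np.asarray(pathway_predictions)
    consensus = np.median(preds, axis=0)            # agreed estimate per pixel
    disagreement = preds.var(axis=0)                # how much the pathways diverge
    confident = disagreement < variance_threshold   # keep only agreeing regions
    return consensus, confident

# Example: three pathways predicting a 4x4 map (e.g., depth) for one image.
pathways = np.random.rand(3, 4, 4)
pseudo_label, mask = consensus_pseudo_label(pathways)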
Combining pretrained CNN feature extractors to enhance clustering of complex natural images
In recent work, a common starting point for solving complex unsupervised image
classification tasks is to use generic features extracted with deep
Convolutional Neural Networks (CNNs) pretrained on a large and versatile dataset
(ImageNet). However, in most research, the CNN architecture for feature
extraction is chosen arbitrarily, without justification. This paper aims to
provide insight into the use of pretrained CNN features for image clustering
(IC). First, extensive experiments are conducted and show that, for a given
dataset, the choice of the CNN architecture for feature extraction has a huge
impact on the final clustering. These experiments also demonstrate that proper
extractor selection for a given IC task is difficult. To solve this issue, we
propose to rephrase the IC problem as a multi-view clustering (MVC) problem
that considers features extracted from different architectures as different
"views" of the same data. This approach is based on the assumption that
information contained in the different CNNs may be complementary, even when
pretrained on the same data. We then propose a multi-input neural network
architecture that is trained end-to-end to solve the MVC problem effectively.
This approach is tested on nine natural image datasets, and produces
state-of-the-art results for IC.
Comment: 21 pages, 16 figures, 10 tables, preprint of our paper published in Neurocomputing.
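The multi-input architecture can be sketched roughly as one encoder branch per CNN "view" fused into a shared embedding. Layer sizes, the fusion rule, and the soft-assignment head below are illustrative assumptions; the paper's end-to-end training objective is not reproduced here.

# Rough sketch of a multi-input network for multi-view clustering: one encoder
# branch per pretrained-CNN view, fused into a shared embedding. Dimensions,
# fusion by concatenation, and the softmax clustering head are assumptions.
import torch
import torch.nn as nn

class MultiViewEncoder(nn.Module):
    def __init__(self, view_dims, embed_dim=256, n_clusters=10):
        super().__init__()
        # One branch per view (e.g., 2048-d ResNet features, 512-d VGG features).
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(d, 512), nn.ReLU(), nn.Linear(512, embed_dim))
            for d in view_dims
        ])
        # Fuse by concatenation, then project to soft cluster assignments.
        self.head = nn.Sequential(
            nn.Linear(embed_dim * len(view_dims), embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, n_clusters),
        )

    def forward(self, views):
        fused = torch.cat([b(v) for b, v in zip(self.branches, views)], dim=1)
        return torch.softmax(self.head(fused), dim=1)

model = MultiViewEncoder(view_dims=[2048, 512])
assignments = model([torch.randn(8, 2048), torch.randn(8, 512)])  # shape (8, 10)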