Deep Metric Learning via Facility Location
Learning the representation and the similarity metric in an end-to-end
fashion with deep networks has demonstrated outstanding results for clustering
and retrieval. However, these recent approaches still suffer from
performance degradation stemming from the local metric training procedure, which
is unaware of the global structure of the embedding space.
We propose a global metric learning scheme for optimizing the deep metric
embedding with the learnable clustering function and the clustering metric
(NMI) in a novel structured prediction framework.
Our experiments on the CUB200-2011, Cars196, and Stanford Online Products
datasets show state-of-the-art performance on both the clustering and retrieval
tasks, as measured by the NMI and Recall@K evaluation metrics.
Comment: Submission accepted at CVPR 201
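The two evaluation metrics named in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' evaluation code. It assumes NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist) and that Recall@K uses squared Euclidean distance with the query excluded from its own neighbour list. The function names `entropy`, `nmi`, and `recall_at_k` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_labels):
    """Normalized mutual information between two labelings.

    Assumption: normalization by the arithmetic mean of the entropies.
    """
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    count_true = Counter(true_labels)
    count_clus = Counter(cluster_labels)
    mi = 0.0
    for (t, c), n_tc in joint.items():
        mi += (n_tc / n) * math.log(n_tc * n / (count_true[t] * count_clus[c]))
    denom = 0.5 * (entropy(true_labels) + entropy(cluster_labels))
    # Both labelings are a single cluster: treat them as identical.
    return mi / denom if denom > 0 else 1.0


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with a same-label item among their k nearest neighbours."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Squared Euclidean distance to every other item (query excluded).
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(query, other)), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        if any(labels[j] == labels[i] for _, j in dists[:k]):
            hits += 1
    return hits / len(embeddings)
```

Calling `nmi(ground_truth, predicted_clusters)` scores a clustering independently of how the cluster ids are named, and `recall_at_k(embeddings, labels, k)` scores retrieval directly in the embedding space.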
Geometric deep learning
The goal of these course notes is to describe the main mathematical ideas behind geometric deep learning and to provide implementation details for several applications in shape analysis and synthesis, computer vision and computer graphics. The text in the course materials is primarily based on previously published work. With these notes we gather and provide a clear picture of the key concepts and techniques that fall under the umbrella of geometric deep learning, and illustrate the applications they enable. We also aim to provide practical implementation details for the methods presented in these works, as well as suggest further readings and extensions of these ideas.
Attribute Graph Clustering via Learnable Augmentation
Contrastive deep graph clustering (CDGC) utilizes contrastive learning to
group nodes into different clusters. Better augmentation techniques improve the
quality of the contrastive samples and are thus one of the key factors in
performance. However, the augmented samples in existing methods are always
predefined from human experience and agnostic to the downstream clustering
task, leading to high human resource costs and poor performance. To
this end, we propose an Attribute Graph Clustering method via Learnable
Augmentation (AGCLA), which introduces learnable augmentors that generate
high-quality and suitable augmented samples for CDGC. Specifically, we design
two learnable augmentors for attribute and structure information, respectively.
Besides, two refinement matrices, including the high-confidence pseudo-label
matrix and the cross-view sample similarity matrix, are generated to improve
the reliability of the learned affinity matrix. During the training procedure,
we notice that there exist differences between the optimization goals for
training learnable augmentors and contrastive learning networks. In other
words, we should guarantee both the consistency of the embeddings and
the diversity of the augmented samples. Thus, an adversarial learning mechanism
is designed in our method. Moreover, a two-stage training strategy is leveraged
for the high-confidence refinement matrices. Extensive experimental results
demonstrate the effectiveness of AGCLA on six benchmark datasets.
DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning
Deep learning has proved to be very effective in learning with a large amount
of labelled data. Few-shot learning, in contrast, attempts to learn from only a
few labelled examples. In this work, we develop methods for few-shot image
classification from a new perspective of optimal matching between image
regions. We employ the Earth Mover's Distance (EMD) as a metric to compute a
structural distance between dense image representations to determine image
relevance. The EMD generates the optimal matching flows between structural
elements that have the minimum matching cost, which is used to calculate the
image distance for classification. To generate the important weights of
elements in the EMD formulation, we design a cross-reference mechanism, which
can effectively alleviate the adverse impact caused by the cluttered background
and large intra-class appearance variations. To handle k-shot classification,
we propose to learn a structured fully connected layer that can directly
classify dense image representations with the proposed EMD. Based on the
implicit function theorem, the EMD can be inserted as a layer into the network
for end-to-end training. Our extensive experiments validate the effectiveness
of our algorithm which outperforms state-of-the-art methods by a significant
margin on four widely used few-shot classification benchmarks, namely,
miniImageNet, tieredImageNet, Fewshot-CIFAR100 (FC100) and Caltech-UCSD
Birds-200-2011 (CUB).
Comment: Extended version of DeepEMD in CVPR2020 (oral)
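The idea of scoring two images by an optimal matching between their local features can be illustrated with a small pure-Python sketch. It is not the DeepEMD implementation: it swaps in an entropy-regularized Sinkhorn approximation instead of an exact EMD solver, omits the cross-reference weighting and the implicit-function-theorem layer, and assumes uniform weights, non-zero feature vectors, and a cost of one minus cosine similarity. All function names and the parameters `reg` and `n_iters` are hypothetical.

```python
import math


def cosine_cost(u, v):
    """Matching cost between two non-zero feature vectors: 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def sinkhorn_plan(cost, a, b, reg=0.05, n_iters=500):
    """Approximate transport plan with row marginals a and column marginals b.

    Assumption: entropy-regularized Sinkhorn iterations, not an exact LP solve.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def approx_emd(xs, ys, reg=0.05, n_iters=500):
    """Approximate EMD between two sets of local features with uniform weights."""
    a = [1.0 / len(xs)] * len(xs)
    b = [1.0 / len(ys)] * len(ys)
    cost = [[cosine_cost(x, y) for y in ys] for x in xs]
    plan = sinkhorn_plan(cost, a, b, reg, n_iters)
    # Total cost of moving mass along the (approximately) optimal flows.
    return sum(plan[i][j] * cost[i][j] for i in range(len(xs)) for j in range(len(ys)))
```

A smaller `approx_emd(query_features, support_features)` value indicates that the local features of the two images can be matched at low cost. Smaller `reg` values bring the result closer to the exact EMD at the price of numerical stability.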