Manifold Contrastive Learning with Variational Lie Group Operators
Self-supervised learning of deep neural networks has become a prevalent
paradigm for learning representations that transfer to a variety of downstream
tasks. Similar to proposed models of the ventral stream of biological vision,
it is observed that these networks lead to a separation of category manifolds
in the representations of the penultimate layer. Although this observation
matches the manifold hypothesis of representation learning, current
self-supervised approaches are limited in their ability to explicitly model
this manifold. Indeed, current approaches often only apply augmentations from a
pre-specified set of "positive pairs" during learning. In this work, we propose
a contrastive learning approach that directly models the latent manifold using
Lie group operators parameterized by coefficients with a sparsity-promoting
prior. A variational distribution over these coefficients provides a generative
model of the manifold, whose samples provide feature augmentations
applicable both during contrastive training and in downstream tasks. Additionally,
learned coefficient distributions provide a quantification of which
transformations are most likely at each point on the manifold while preserving
identity. We demonstrate benefits in self-supervised benchmarks for image
datasets, as well as a downstream semi-supervised task. In the former case, we
demonstrate that the proposed methods can effectively apply manifold feature
augmentations and improve learning both with and without a projection head. In
the latter case, we demonstrate that feature augmentations sampled from learned
Lie group operators can improve classification performance when using few
labels.
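
The idea of augmenting a feature vector with a Lie group operator can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the operator takes the form exp(sum_m c_m Psi_m) applied to a feature z, and the names `matrix_exp`, `sample_sparse_coeffs`, and `augment_feature`, the Laplace scale, and the threshold are all hypothetical. Sampling from a Laplace distribution and soft-thresholding is only a stand-in for the learned variational distribution described in the abstract.

```python
import numpy as np

def matrix_exp(A, terms=20, squarings=8):
    """Approximate exp(A) by scaling and squaring with a truncated Taylor series."""
    A_scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A_scaled / k      # A_scaled^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result        # undo the scaling: exp(A) = exp(A/2^s)^(2^s)
    return result

def sample_sparse_coeffs(rng, num_ops, scale=0.5, threshold=0.3):
    """Hypothetical stand-in for the variational sampler: Laplace draws,
    soft-thresholded so that small coefficients become exactly zero."""
    c = rng.laplace(loc=0.0, scale=scale, size=num_ops)
    return np.sign(c) * np.maximum(np.abs(c) - threshold, 0.0)

def augment_feature(z, generators, coeffs):
    """Apply exp(sum_m c_m * Psi_m) to the feature vector z.

    generators has shape (M, d, d); coeffs has shape (M,)."""
    A = np.tensordot(coeffs, generators, axes=1)   # weighted sum of generators, shape (d, d)
    return matrix_exp(A) @ z

# Example: a single generator of 2-D rotations (an assumption for illustration).
rotation_generator = np.array([[[0.0, -1.0], [1.0, 0.0]]])
rng = np.random.default_rng(0)
z = np.array([1.0, 0.0])
z_aug = augment_feature(z, rotation_generator, sample_sparse_coeffs(rng, 1))
```

With a skew-symmetric generator such as the one above, the operator is a rotation, so the augmented feature keeps the norm of the original; a coefficient of zero leaves the feature unchanged.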
Active Learning of Ordinal Embeddings: A User Study on Football Data
Humans innately measure distance between instances in an unlabeled dataset
using an unknown similarity function. Distance metrics can only serve as a proxy
for similarity in information retrieval of similar instances. Learning a good
similarity function from human annotations improves the quality of retrievals.
This work uses deep metric learning to learn these user-defined similarity
functions from few annotations for a large football trajectory dataset. We
adapt an entropy-based active learning method with recent work from triplet
mining to collect easy-to-answer but still informative annotations from human
participants and use them to train a deep convolutional network that
generalizes to unseen samples. Our user study shows that our approach improves
the quality of the information retrieval compared to a previous deep metric
learning approach that relies on a Siamese network. Specifically, we shed light
on the strengths and weaknesses of passive sampling heuristics and active
learners alike by analyzing the participants' response efficacy. To this end,
we collect accuracy, algorithmic time complexity, the participants' fatigue and
time-to-response, qualitative self-assessment and statements, as well as the
effects of mixed-expertise annotators and their consistency on model
performance and transfer learning.
Comment: 23 pages, 17 figures
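
As a rough illustration of how a triplet objective and an entropy-based query rule fit together, consider the sketch below. It is a generic example under stated assumptions, not the method evaluated in the study: the logistic model of annotator answers, the margin value, and the function names are all hypothetical, and the sketch covers only the "informative" part of query selection, not the "easy-to-answer" part.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge triplet loss: the positive should be closer than the negative by a margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def answer_probability(anchor, a, b):
    """Assumed logistic model: probability that an annotator says the anchor is more similar to a than to b."""
    diff = euclidean(anchor, b) - euclidean(anchor, a)
    return 1.0 / (1.0 + math.exp(-diff))

def binary_entropy(p):
    """Entropy in bits of a yes/no answer with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def most_uncertain_triplet(embeddings, candidates):
    """Pick the candidate (anchor, a, b) index triplet whose predicted answer has the highest entropy."""
    def score(triplet):
        anchor, a, b = (embeddings[i] for i in triplet)
        return binary_entropy(answer_probability(anchor, a, b))
    return max(candidates, key=score)

# Toy embeddings (hypothetical data for illustration only).
embeddings = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [-1.0, 0.0], 3: [5.0, 0.0]}
query = most_uncertain_triplet(embeddings, [(0, 1, 3), (0, 1, 2)])
```

In the toy data, items 1 and 2 are equally far from the anchor, so the model is maximally unsure which one an annotator would choose, and that triplet is selected as the next query.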