Towards Understanding the Mechanism of Contrastive Learning via Similarity Structure: A Theoretical Analysis
Contrastive learning is an efficient approach to self-supervised
representation learning. Although recent studies have made progress in the
theoretical understanding of contrastive learning, how to characterize the
clusters formed by the learned representations remains largely unexplored. In
this paper, we aim to elucidate this characterization from a theoretical
perspective. To this end, we consider a kernel-based contrastive learning
framework termed Kernel Contrastive Learning (KCL), in which kernel functions
play an important role when our theoretical results are applied to other
frameworks. We introduce a formulation of the similarity structure of learned
representations from a statistical dependency viewpoint, and we investigate
the theoretical properties of the kernel-based contrastive loss via this
formulation. We first prove that the formulation characterizes the structure
of representations learned with the kernel-based contrastive learning
framework. We then show a new upper bound on the classification error of a
downstream task, which demonstrates that our theory is consistent with the
empirical success of contrastive learning. We also establish a generalization
error bound for KCL. Finally, we give a guarantee for the generalization
ability of KCL to the downstream classification task via a surrogate bound.
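
As a concrete illustration (a minimal sketch, not the paper's exact objective), the snippet below implements one plausible form of a kernel contrastive loss in PyTorch: positive pairs are pulled together and in-batch negatives pushed apart under a Gaussian (RBF) kernel. The loss form, the kernel choice, and all function names here are assumptions made for illustration.

```python
import torch

def rbf_kernel(u, v, gamma=1.0):
    # Gaussian (RBF) kernel k(u, v) = exp(-gamma * ||u - v||^2).
    return torch.exp(-gamma * (u - v).pow(2).sum(dim=-1))

def kernel_contrastive_loss(z1, z2, gamma=1.0):
    """Sketch of a kernel contrastive objective: maximize kernel similarity
    for positive pairs (two views of the same sample) and minimize it for
    in-batch negatives (views of different samples)."""
    n = z1.size(0)
    # Positive term: similarity between matched rows of z1 and z2.
    pos = rbf_kernel(z1, z2, gamma).mean()
    # Negative term: all cross-sample similarities, excluding the diagonal.
    sims = rbf_kernel(z1.unsqueeze(1), z2.unsqueeze(0), gamma)  # (n, n)
    off_diag = ~torch.eye(n, dtype=torch.bool, device=z1.device)
    neg = sims[off_diag].mean()
    return -pos + neg

# Toy usage: 8 samples, two augmented views, 16-dimensional embeddings.
z1, z2 = torch.randn(8, 16), torch.randn(8, 16)
loss = kernel_contrastive_loss(z1, z2)
```

Swapping the RBF kernel for another positive-definite kernel changes the similarity geometry; the abstract's point is that the kernel is the bridge for transferring the theory to other contrastive frameworks.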
Denoising Cosine Similarity: A Theory-Driven Approach for Efficient Representation Learning
Representation learning has been increasing its impact on the research and
practice of machine learning, since it enables models to learn representations
that transfer efficiently to various downstream tasks. However, recent works
pay little attention to the fact that real-world datasets used at the
representation learning stage are commonly contaminated by noise, which can
degrade the quality of the learned representations. This paper tackles the
problem of learning representations that are robust to noise in the raw
dataset. To this end, inspired by recent works on denoising and the success of
cosine-similarity-based objective functions in representation learning, we
propose the denoising Cosine-Similarity (dCS) loss. The dCS loss is a modified
cosine-similarity loss that incorporates a denoising property, which is
supported by both our theoretical and empirical findings. To make the dCS loss
implementable, we also construct estimators of the dCS loss with statistical
guarantees. Finally, we empirically demonstrate the effectiveness of the dCS
loss over baseline objective functions in the vision and speech domains.
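
To make the objective concrete, the sketch below shows a baseline negative cosine-similarity loss alongside a hypothetical denoising-style variant that corrects the noisy target's squared norm by the expected noise energy, assuming additive isotropic noise of known per-coordinate variance. The correction term is illustrative only; the actual dCS loss and its estimators are defined in the paper.

```python
import torch
import torch.nn.functional as F

def neg_cosine_similarity(p, z):
    # Baseline objective: negative cosine similarity between a predicted
    # representation p and a target representation z.
    return -F.cosine_similarity(p, z, dim=-1).mean()

def denoising_cosine_similarity(p, z_noisy, noise_var):
    """Hypothetical denoising-style variant (not the paper's estimator):
    if z_noisy = z + eps with isotropic noise of per-coordinate variance
    noise_var, then E[||z_noisy||^2] = ||z||^2 + d * noise_var, so
    subtracting the expected noise energy approximates the clean norm."""
    d = z_noisy.size(-1)
    corrected_sq_norm = (z_noisy.pow(2).sum(dim=-1) - d * noise_var).clamp(min=1e-8)
    denom = corrected_sq_norm.sqrt() * p.norm(dim=-1).clamp(min=1e-8)
    return -((p * z_noisy).sum(dim=-1) / denom).mean()

# Toy usage with synthetic additive noise of known variance.
p, z = torch.randn(8, 16), torch.randn(8, 16)
z_noisy = z + 0.1 * torch.randn_like(z)
loss = denoising_cosine_similarity(p, z_noisy, noise_var=0.01)
```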