Contrastive learning is an efficient approach to self-supervised
representation learning. Although recent studies have made progress in the
theoretical understanding of contrastive learning, the investigation of how to
characterize the clusters of the learned representations remains limited. In
this paper, we aim to elucidate this characterization from a theoretical
perspective. To this end, we consider a kernel-based contrastive learning
framework termed Kernel Contrastive Learning (KCL), where kernel functions play
an important role when applying our theoretical results to other frameworks. We
introduce a formulation of the similarity structure of learned representations
by utilizing a statistical dependency viewpoint. We investigate the theoretical
properties of the kernel-based contrastive loss via this formulation. We first
prove that the formulation characterizes the structure of representations
learned with the kernel-based contrastive learning framework. We show a new
upper bound on the classification error of a downstream task, which shows
that our theory is consistent with the empirical success of contrastive
learning. We also establish a generalization error bound for KCL. Finally, we
show a guarantee for the generalization ability of KCL to the downstream
classification task via a surrogate bound.
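
For intuition, a kernel-based contrastive loss rewards kernel similarity between representations of positive pairs while penalizing it between representations of negative pairs. The PyTorch sketch below is illustrative only: the Gaussian kernel, the function names, and the simple mean-difference form of the loss are assumptions made for exposition, not the exact loss analyzed in the paper.

```python
import torch

def gaussian_kernel(u, v, gamma=1.0):
    # Gaussian (RBF) kernel between corresponding rows of u and v.
    # An illustrative choice; the framework allows other kernels on the
    # representation space.
    return torch.exp(-gamma * (u - v).pow(2).sum(dim=-1))

def kernel_contrastive_loss(anchor, positive, negatives, gamma=1.0):
    # anchor, positive: (batch, dim) representations forming positive pairs.
    # negatives: (batch, num_neg, dim) representations of negative samples.
    pos_sim = gaussian_kernel(anchor, positive, gamma)                 # (batch,)
    neg_sim = gaussian_kernel(anchor.unsqueeze(1), negatives, gamma)   # (batch, num_neg)
    # Encourage high kernel similarity for positive pairs and low average
    # kernel similarity for negative pairs (hypothetical mean-difference form).
    return (neg_sim.mean(dim=1) - pos_sim).mean()
```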