Contrastive learning typically compares a positive anchor sample against many
negative samples to perform Self-Supervised Learning (SSL). By contrast,
non-contrastive learning, as exemplified by methods such as BYOL, SimSiam, and
Barlow Twins, accomplishes SSL without the explicit use of negative samples.
Inspired by existing analyses of contrastive learning, we provide a
reproducing kernel Hilbert space (RKHS) understanding of many existing
non-contrastive learning methods. Building on this analysis, we propose a
novel loss function, Kernel-SSL, which directly optimizes the mean embedding
and the covariance operator within the RKHS. In experiments, our method
Kernel-SSL outperforms state-of-the-art methods by a large margin on ImageNet
under the linear evaluation protocol. Specifically, with 100 epochs of
pre-training, our method outperforms SimCLR by 4.6%.
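The loss itself is not specified in this abstract. As a rough, non-authoritative sketch of how an SSL objective can act on empirical kernel mean embeddings and covariance operators, one might write something like the following, where the Gaussian kernel and the hyperparameters `sigma` and `lam` are illustrative assumptions rather than the paper's actual choices:

```python
# Illustrative sketch only: an SSL-style objective built from empirical
# kernel mean embeddings and covariance operators of two augmented views.
# The actual Kernel-SSL loss may be defined differently; the RBF kernel,
# `sigma`, and `lam` are hypothetical choices.
import torch

def rbf_gram(a, b, sigma=1.0):
    # Gaussian (RBF) Gram matrix between the rows of a and b.
    return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))

def kernel_ssl_sketch(z1, z2, sigma=1.0, lam=1.0):
    # z1, z2: (n, d) embeddings of two augmented views of the same batch.
    K11 = rbf_gram(z1, z1, sigma)
    K22 = rbf_gram(z2, z2, sigma)
    K12 = rbf_gram(z1, z2, sigma)

    # Invariance term: squared RKHS distance between the empirical mean
    # embeddings of the two views (an MMD^2 estimate).
    mean_term = K11.mean() + K22.mean() - 2 * K12.mean()

    # Anti-collapse term: trace of each view's empirical (centered)
    # covariance operator, tr(C) = mean(diag(K)) - mean(K);
    # for an RBF kernel the diagonal entries are 1.
    trace_c1 = K11.diagonal().mean() - K11.mean()
    trace_c2 = K22.diagonal().mean() - K22.mean()

    return mean_term - lam * (trace_c1 + trace_c2)
```

In this sketch the first term pulls the two views' mean embeddings together in the RKHS, while the trace terms discourage representational collapse by keeping each view's covariance operator large.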