This paper presents a new way to identify additional positive pairs for BYOL,
a state-of-the-art (SOTA) self-supervised learning framework, to improve its
representation learning ability. Unlike conventional BYOL, which relies on only
one positive pair generated by two augmented views of the same image, we argue
that information from different images with the same label can bring more
diversity and variations to the target features, thus benefiting representation
learning. To identify such pairs without any labels, we investigate TracIn, an
instance-based and computationally efficient influence function, for BYOL
training. Specifically, TracIn is a gradient-based method that reveals the
impact of a training sample on a test sample in supervised learning. We extend
it to the self-supervised learning setting and propose an efficient batch-wise
per-sample gradient computation method to estimate the pairwise TracIn to
represent the similarity of samples in the mini-batch during training. For each
image, we select the most similar sample from other images as the additional
positive and pull their features together with the BYOL loss. Experimental results
on two public medical datasets (i.e., ISIC 2019 and ChestX-ray) demonstrate
that the proposed method improves classification performance over
competitive baselines in both semi-supervised and transfer learning
settings.

Comment: 8 pages
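
The selection step described above can be illustrated with a minimal sketch. It assumes that the per-sample gradients for one mini-batch are already available as flat vectors, and it scores each pair by a learning-rate-scaled gradient dot product, which is the single-checkpoint form of TracIn. The function names, the `lr` parameter, and the toy gradient values are hypothetical and are not taken from the paper; in the actual method the gradients would come from the BYOL loss with respect to network parameters.

```python
from typing import List


def pairwise_tracin(grads: List[List[float]], lr: float = 1.0) -> List[List[float]]:
    """Score every pair (i, j) by lr * <grad_i, grad_j> at a single checkpoint."""
    n = len(grads)
    return [
        [lr * sum(a * b for a, b in zip(grads[i], grads[j])) for j in range(n)]
        for i in range(n)
    ]


def select_additional_positives(scores: List[List[float]]) -> List[int]:
    """For each sample, pick the index of the highest-scoring *other* sample."""
    positives = []
    for i, row in enumerate(scores):
        candidates = [j for j in range(len(row)) if j != i]  # exclude the sample itself
        positives.append(max(candidates, key=lambda j: row[j]))
    return positives


# Hypothetical per-sample gradients for a mini-batch of four images.
toy_grads = [
    [1.0, 0.0],
    [0.9, 0.1],
    [-1.0, 0.2],
    [-0.8, 0.3],
]

scores = pairwise_tracin(toy_grads)
positives = select_additional_positives(scores)
print(positives)  # index of the additional positive chosen for each sample
```

In this toy batch, samples 0 and 1 have aligned gradients, as do samples 2 and 3, so each is paired with the other. The diagonal is excluded because a sample's own augmented views already form the standard BYOL positive pair.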