Remote photoplethysmography (rPPG) is an important technique for measuring
human vital signs and has received extensive attention. Research has long
focused on supervised methods, which are limited by their need for large
amounts of labeled data and by the difficulty of acquiring ground-truth
physiological signals. To
address these issues, several self-supervised methods based on contrastive
learning have been proposed. However, they focus on contrastive learning
between samples, which neglects the inherent self-similar prior in physiological
signals, and they seem to have limited ability to cope with noise. In this paper, a
linear self-supervised reconstruction task is designed to extract the
inherent self-similar prior in physiological signals. In addition, a specific
noise-insensitive strategy is explored to reduce the interference of motion
and illumination. The proposed framework, named rPPG-MAE,
demonstrates excellent performance even on the challenging VIPL-HR dataset. We
also evaluate the proposed method on two public datasets, namely PURE and
UBFC-rPPG. The results show that our method not only outperforms existing
self-supervised methods but also exceeds the state-of-the-art (SOTA) supervised
methods. One important observation is that dataset quality seems to matter
more than dataset size in self-supervised pre-training for rPPG. The
source code is released at https://github.com/linuxsino/rPPG-MAE