2,966 research outputs found
Facial Video-based Remote Physiological Measurement via Self-supervised Learning
Facial video-based remote physiological measurement aims to estimate remote
photoplethysmography (rPPG) signals from human face videos and then measure
multiple vital signs (e.g. heart rate, respiration frequency) from rPPG
signals. Recent approaches achieve it by training deep neural networks, which
normally require abundant facial videos and synchronously recorded
photoplethysmography (PPG) signals for supervision. However, the collection of
these annotated corpora is not easy in practice. In this paper, we introduce a
novel frequency-inspired self-supervised framework that learns to estimate rPPG
signals from facial videos without the need for ground-truth PPG signals. Given
a video sample, we first augment it into multiple positive/negative samples
which contain similar/dissimilar signal frequencies to the original one.
Specifically, positive samples are generated using spatial augmentation.
Negative samples are generated via a learnable frequency augmentation module,
which performs non-linear signal frequency transformation on the input without
excessively changing its visual appearance. Next, we introduce a local rPPG
expert aggregation module to estimate rPPG signals from augmented samples. It
encodes complementary pulsation information from different face regions and
aggregates them into one rPPG prediction. Finally, we propose a series of
frequency-inspired losses, i.e. frequency contrastive loss, frequency ratio
consistency loss, and cross-video frequency agreement loss, for the
optimization of estimated rPPG signals from multiple augmented video samples
and across temporally neighboring video samples. We conduct rPPG-based heart
rate, heart rate variability and respiration frequency estimation on four
standard benchmarks. The experimental results demonstrate that our method
improves the state of the art by a large margin.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
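The frequency ratio consistency idea can be illustrated with a toy sketch: if an augmentation is known to scale the underlying signal frequency by a factor r, the dominant frequency of the rPPG estimate from the augmented sample should be r times that of the original. This is an illustrative reconstruction under that assumption, not the paper's implementation; the function names and the penalty form are my own.

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Frequency (Hz) of the strongest non-DC spectral peak."""
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spec[1:]) + 1]  # skip the DC bin

def frequency_ratio_penalty(original, augmented, ratio, fs):
    """Deviation of the augmented sample's peak frequency from ratio * original (toy sketch)."""
    return abs(dominant_frequency(augmented, fs) - ratio * dominant_frequency(original, fs))
```

For example, a signal sped up by a factor of two should incur near-zero penalty when `ratio=2.0` and a large one when `ratio=1.0`.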
Remote Heart Rate Monitoring in Smart Environments from Videos with Self-supervised Pre-training
Recent advances in deep learning have made it increasingly feasible to
estimate heart rate remotely in smart environments by analyzing videos.
However, a notable limitation of deep learning methods is their heavy reliance
on extensive sets of labeled data for effective training. To address this
issue, self-supervised learning has emerged as a promising avenue. Building on
this, we introduce a solution that utilizes self-supervised contrastive
learning for the estimation of remote photoplethysmography (PPG) and heart rate
monitoring, thereby reducing the dependence on labeled data and enhancing
performance. We propose the use of 3 spatial and 3 temporal augmentations for
training an encoder through a contrastive framework, followed by utilizing the
late-intermediate embeddings of the encoder for remote PPG and heart rate
estimation. Our experiments on two publicly available datasets showcase the
improvement of our proposed approach over several related works as well as
supervised learning baselines, as our results approach the state-of-the-art. We
also perform thorough experiments to showcase the effects of using different
design choices such as the video representation learning method, the
augmentations used in the pre-training stage, and others. We also demonstrate
the robustness of our proposed method over the supervised learning approaches
on reduced amounts of labeled data.
Comment: Accepted in IEEE Internet of Things Journal 202
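The abstract does not enumerate the three spatial and three temporal augmentations, so the following is only a hedged sketch of plausible choices: spatial transforms (flips, crops) leave the pulse unchanged, while temporal transforms (reversal, subsampling) alter the signal's time axis. All function names are illustrative, operating on a `(T, H, W)` clip array.

```python
import numpy as np

def horizontal_flip(clip):
    """Spatial: mirror every frame left-right; the pulse signal is unaffected."""
    return clip[:, :, ::-1]

def random_crop(clip, size, rng):
    """Spatial: crop the same square window from every frame."""
    _, h, w = clip.shape
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    return clip[:, y:y + size, x:x + size]

def temporal_reverse(clip):
    """Temporal: play the clip backwards."""
    return clip[::-1]

def temporal_subsample(clip, stride):
    """Temporal: drop frames, which rescales the apparent signal frequency."""
    return clip[::stride]
```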
Contrast-Phys+: Unsupervised and Weakly-supervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast
Video-based remote physiological measurement utilizes facial videos to
measure the blood volume change signal, which is also called remote
photoplethysmography (rPPG). Supervised methods for rPPG measurements have been
shown to achieve good performance. However, the drawback of these methods is
that they require facial videos with ground truth (GT) physiological signals,
which are often costly and difficult to obtain. In this paper, we propose
Contrast-Phys+, a method that can be trained in both unsupervised and
weakly-supervised settings. We employ a 3DCNN model to generate multiple
spatiotemporal rPPG signals and incorporate prior knowledge of rPPG into a
contrastive loss function. We further incorporate the GT signals into
contrastive learning to adapt to partial or misaligned labels. The contrastive
loss encourages rPPG/GT signals from the same video to be grouped together,
while pushing those from different videos apart. We evaluate our methods on
five publicly available datasets that include both RGB and Near-infrared
videos. Contrast-Phys+ outperforms the state-of-the-art supervised methods,
even when using partially available or misaligned GT signals, or no labels at
all. Additionally, we highlight the advantages of our methods in terms of
computational efficiency, noise robustness, and generalization.
rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement
Remote photoplethysmography (rPPG) is an important technique for perceiving
human vital signs, which has received extensive attention. For a long time,
researchers have focused on supervised methods that rely on large amounts of
labeled data. These methods are limited by the requirement for large amounts of
data and the difficulty of acquiring ground truth physiological signals. To
address these issues, several self-supervised methods based on contrastive
learning have been proposed. However, they focus on contrastive learning
between samples, which neglects the inherent self-similar prior in physiological
signals and offers limited ability to cope with noise. In this paper, a
linear self-supervised reconstruction task is designed to extract the
inherent self-similar prior in physiological signals. In addition, a
noise-insensitive strategy is explored to reduce the interference of motion
and illumination. The proposed framework, named rPPG-MAE,
demonstrates excellent performance even on the challenging VIPL-HR dataset. We
also evaluate the proposed method on two public datasets, namely PURE and
UBFC-rPPG. The results show that our method not only outperforms existing
self-supervised methods but also exceeds the state-of-the-art (SOTA) supervised
methods. One important observation is that the quality of the dataset seems
more important than the size in self-supervised pre-training of rPPG. The
source code is released at https://github.com/linuxsino/rPPG-MAE
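As a toy illustration of the self-similar prior that the reconstruction task exploits (this is not the rPPG-MAE architecture itself): masked samples of a quasi-periodic signal can be recovered from the values one period earlier.

```python
import numpy as np

def reconstruct_from_self_similarity(signal, period, mask_idx):
    """Fill masked samples with the value one period earlier (toy sketch of the prior)."""
    mask_idx = np.asarray(mask_idx)
    recon = signal.copy()
    recon[mask_idx] = signal[mask_idx - period]
    return recon
```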
Comparative evaluation of the applicability of self-organized operational neural networks to remote photoplethysmography
Abstract. Photoplethysmography (PPG) is a widely applied means of obtaining blood volume pulse (BVP) information from subjects, which can be used for monitoring numerous physiological signs such as heart rate and respiration. Following observations that blood volume information can also be retrieved from videos recorded of the human face, several approaches for the remote extraction of PPG signals have been proposed in the literature. These methods are collectively referred to as remote photoplethysmography (rPPG). The current state of the art of rPPG approaches is represented by deep convolutional neural network (CNN) models, which have been successfully applied in a wide range of computer vision tasks.
A novel technology called operational neural networks (ONNs) has recently been proposed in the literature as an extension of convolutional neural networks. ONNs attempt to overcome the limitations of conventional CNN models, which are primarily caused by exclusively employing the linear neuron model. In addition, to address certain drawbacks of ONNs, a technology called self-organized operational neural networks (Self-ONNs) has recently been proposed as an extension of ONNs.
This thesis presents a novel method for rPPG extraction based on self-organized operational neural networks. To comprehensively evaluate the applicability of Self-ONNs as an approach for rPPG extraction, three Self-ONN models with varying numbers of layers are implemented and evaluated on test data from three data sets representing different distributions. The performance of the proposed models is compared against corresponding CNN architectures as well as a typical unsupervised rPPG pipeline. The performance of the methods is evaluated based on heart rate estimations calculated from the extracted rPPG signals.
In the presented experimental setup, Self-ONN models did not improve heart rate estimation performance over parameter-equivalent CNN alternatives. However, every Self-ONN model showed a superior ability to fit the training target, which both shows promise for the applicability of Self-ONNs and suggests inherent problems in the training setup. Additionally, when the required computational resources are taken into account alongside raw HR estimation performance, certain Self-ONN models showcased improved efficiency over CNN alternatives. As such, the experiments nonetheless present a promising proof of concept that can serve as grounds for future research.
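For context, the generative neuron underlying Self-ONNs replaces the fixed linear neuron with a learnable Maclaurin-series nonlinearity, y = Σ_q w_q x^q; with a single term it reduces to the ordinary linear neuron used in CNNs. A minimal one-dimensional sketch, with names of my own choosing:

```python
import numpy as np

def generative_neuron(x, weights):
    """Self-ONN-style generative neuron (sketch): sum_q w_q * x**q for q = 1..Q.
    With weights = [w1], this is the ordinary linear neuron."""
    return sum(w * x ** (q + 1) for q, w in enumerate(weights))
```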
Privacy-Preserving Remote Heart Rate Estimation from Facial Videos
Remote Photoplethysmography (rPPG) is the process of estimating PPG from
facial videos. While this approach benefits from contactless interaction, it is
reliant on videos of faces, which often constitutes an important privacy
concern. Recent research has revealed that deep learning techniques are
vulnerable to attacks that can result in significant data breaches, making
deep rPPG estimation even more sensitive. To address this issue, we propose a
data perturbation method that involves extraction of certain areas of the face
with less identity-related information, followed by pixel shuffling and
blurring. Our experiments on two rPPG datasets (PURE and UBFC) show that our
approach reduces the accuracy of facial recognition algorithms by over 60%,
with minimal impact on rPPG extraction. We also test our method on three facial
recognition datasets (LFW, CALFW, and AgeDB), where our approach reduced
performance by nearly 50%. Our findings demonstrate the potential of our
approach as an effective privacy-preserving solution for rPPG estimation.
Comment: Accepted in IEEE International Conference on Systems, Man, and
Cybernetics (SMC) 202
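The perturbation pipeline (region selection, pixel shuffling, blurring) can be sketched as below. Shuffling destroys the spatial structure that identity recognition relies on while preserving each patch's pixel statistics, and it is the spatially averaged skin color over time that carries the pulse. This is an illustrative reconstruction; the block and kernel sizes are assumptions, not the paper's settings.

```python
import numpy as np

def shuffle_blocks(patch, block, rng):
    """Permute non-overlapping block x block tiles of an (H, W, C) patch."""
    h, w, c = patch.shape
    hb, wb = h // block, w // block
    tiles = patch[:hb * block, :wb * block]
    tiles = tiles.reshape(hb, block, wb, block, c).transpose(0, 2, 1, 3, 4)
    tiles = tiles.reshape(hb * wb, block, block, c)[rng.permutation(hb * wb)]
    tiles = tiles.reshape(hb, wb, block, block, c).transpose(0, 2, 1, 3, 4)
    return tiles.reshape(hb * block, wb * block, c)

def box_blur(img, k):
    """Naive k x k mean blur with edge padding."""
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)
```

Note that block shuffling is a permutation, so per-patch means (the raw rPPG observable) are exactly preserved.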
Camera-Based Heart Rate Extraction in Noisy Environments
Remote photoplethysmography (rPPG) is a non-invasive technique that uses video to measure vital signs such as heart rate (HR). In rPPG estimation, noise can introduce artifacts that distort the rPPG signal and jeopardize accurate HR measurement. Since most rPPG studies have been conducted in lab-controlled environments, the issue of noise in realistic conditions remains open.
This thesis aims to examine the challenges of noise in rPPG estimation in realistic scenarios, specifically investigating the effect of noise arising from illumination variation and motion artifacts on the predicted rPPG HR. To mitigate the impact of noise, a modular rPPG measurement framework is developed, comprising data preprocessing, region of interest (RoI) selection, signal extraction, preparation, processing, and HR extraction. The proposed pipeline is tested on the public LGI-PPGI-Face-Video-Database dataset, which covers four candidates and real-life scenarios. In the RoI module, raw rPPG signals are extracted from the dataset using three machine-learning-based face detectors, namely Haarcascade, Dlib, and MediaPipe, in parallel. Subsequently, the collected signals undergo preprocessing, independent component analysis, denoising, and frequency-domain conversion for peak detection.
Overall, the Dlib face detector yields the most successful HR estimates for the majority of scenarios. In 50% of all scenarios and candidates, the average predicted HR for Dlib is either in line with or very close to the average reference HR. The HRs extracted with the Haarcascade and MediaPipe architectures account for 31.25% and 18.75% of plausible results, respectively. The analysis highlights the importance of fixated facial landmarks in collecting quality raw data and reducing noise.
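The final HR-extraction step described above (frequency-domain conversion followed by peak detection) amounts to picking the dominant spectral peak inside the physiological band; a minimal sketch, with the band limits as assumptions:

```python
import numpy as np

def estimate_hr_bpm(rppg, fs, lo_hz=0.7, hi_hz=3.0):
    """Heart rate (BPM) as the dominant in-band spectral peak (42-180 BPM)."""
    spec = np.abs(np.fft.rfft(rppg))
    freqs = np.fft.rfftfreq(len(rppg), d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    return 60.0 * freqs[band][np.argmax(spec[band])]
```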
Non-Contrastive Unsupervised Learning of Physiological Signals from Video
Subtle periodic signals such as blood volume pulse and respiration can be
extracted from RGB video, enabling remote health monitoring at low cost.
Advancements in remote pulse estimation -- or remote photoplethysmography
(rPPG) -- are currently driven by deep learning solutions. However, modern
approaches are trained and evaluated on benchmark datasets with associated
ground truth from contact-PPG sensors. We present the first non-contrastive
unsupervised learning framework for signal regression to break free from the
constraints of labelled video data. With minimal assumptions of periodicity and
finite bandwidth, our approach is capable of discovering the blood volume pulse
directly from unlabelled videos. We find that encouraging sparse power spectra
within normal physiological bandlimits and variance over batches of power
spectra is sufficient for learning visual features of periodic signals. We
perform the first experiments utilizing unlabelled video data not specifically
created for rPPG to train robust pulse rate estimators. Given the limited
inductive biases and impressive empirical results, the approach is
theoretically capable of discovering other periodic signals from video,
enabling multiple physiological measurements without the need for ground truth
signals. Codes to fully reproduce the experiments are made available along with
the paper.
Comment: Accepted to CVPR 202
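The "sparse power spectra within normal physiological bandlimits" objective can be sketched as a spectral-entropy penalty: a clean pulse concentrates its in-band power at one frequency, while noise spreads it out. This is an illustrative stand-in for the idea, not the paper's exact loss or bandlimits.

```python
import numpy as np

def bandlimited_psd(signal, fs, lo_hz=0.66, hi_hz=3.0):
    """In-band power spectrum, normalized to sum to one."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    p = power[band]
    return p / (p.sum() + 1e-8)

def spectral_entropy_loss(signal, fs):
    """Lower entropy = sparser spectrum = more pulse-like signal (sketch)."""
    p = bandlimited_psd(signal, fs)
    return -np.sum(p * np.log(p + 1e-8))
```

Minimizing this over a network's outputs rewards pulse-like predictions; the batch-variance term mentioned in the abstract would additionally prevent all outputs collapsing to one frequency.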