
    Reading subtle information from human faces

    Abstract The face plays an important role in our social interactions, as it conveys rich information. We can read a lot from a single face image, but some information cannot be perceived without special devices. This thesis applies computer vision methodologies to analyse two kinds of subtle facial information that can hardly be perceived by the naked eye: the micro-expression (ME) and the heart rate (HR). MEs are rapid, involuntary facial expressions which reveal emotions people do not intend to show. They are hard for people to perceive because they are too fast and subtle, so automatic ME analysis is valuable work which may lead to important applications. The thesis reviews progress in ME research and describes four parts of work. 1) We introduce the first spontaneous ME database, SMIC. The lack of data hinders ME analysis research, as spontaneous MEs are difficult to collect; the protocol for inducing and annotating SMIC is described to guide future ME collection. 2) A framework combining three features with a video magnification process is introduced for ME recognition, which outperforms other state-of-the-art methods on two ME databases. 3) An ME spotting method based on feature difference analysis is described, which can spot MEs in long spontaneous videos. 4) An automatic ME analysis system (MESR) is proposed, which first spots and then recognises MEs. The HR is an important indicator of our health and emotional status. Traditional HR measurement requires skin contact and cannot be applied remotely. We propose a method that compensates for illumination changes and head motions and measures HR remotely from color facial videos, and we apply it to the face anti-spoofing problem. We show that the pulse-based feature is more robust than traditional texture-based features against unseen mask spoofs, and that it can be combined with other features in a cascade system for detecting multiple types of attacks. Finally, we summarize the contributions of the work and propose future plans for ME and HR research based on the limitations of the current work, including combining ME and HR (and possibly other subtle facial signals) into a multimodal system for affective status analysis.
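    To make the remote HR measurement concrete, below is a minimal Python sketch of the basic rPPG idea the thesis builds on: spatially average the green channel over a face region, band-pass to the plausible pulse band, and take the dominant FFT frequency as the HR. It illustrates the principle only; the thesis's method additionally counters illumination changes and head motions, and `estimate_hr` and `face_box` are hypothetical names, not the thesis's code.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_hr(frames, fps, face_box):
    """frames: (T, H, W, 3) uint8 RGB video; face_box: (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = face_box
    # 1) Raw temporal signal: mean green intensity inside the face region.
    g = frames[:, y0:y1, x0:x1, 1].reshape(len(frames), -1).mean(axis=1)
    g = (g - g.mean()) / (g.std() + 1e-8)           # normalize
    # 2) Band-pass to 0.7-4.0 Hz (42-240 bpm), the usual pulse band.
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    pulse = filtfilt(b, a, g)
    # 3) HR = dominant frequency of the filtered signal, in beats per minute.
    freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
    power = np.abs(np.fft.rfft(pulse)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)
    return 60.0 * freqs[band][np.argmax(power[band])]
```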

    Privacy-Phys: facial video-based physiological modification for privacy protection

    Abstract The invisible remote photoplethysmography (rPPG) signals in facial videos can reveal the cardiac rhythm and physiological status. Recent studies show that rPPG enables non-contact emotion recognition, disease detection, and biometric identification, which raises a potential privacy problem: physiological information can leak from facial videos. It is therefore essential to process facial videos to prevent rPPG extraction in privacy-sensitive situations such as online video meetings. In this letter, we propose Privacy-Phys, a novel method based on a pre-trained 3D convolutional neural network, to modify rPPG in facial videos for privacy protection. Our experimental results show that our approach modifies rPPG signals in facial videos more effectively and efficiently than the previous baseline. Our method can be applied to facial videos in online video meetings or on video-sharing platforms to prevent rPPG from being captured maliciously.
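    The letter's method is a pre-trained 3D CNN; as a much cruder illustration of the same privacy goal, the sketch below simply filters the pulse band (roughly 0.7-4 Hz) out of every pixel's temporal trace so that simple rPPG extractors find no cardiac rhythm. All names here are hypothetical, and such brute-force filtering would also distort genuine motion in that band, which is one reason a learned model is preferable.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def suppress_pulse_band(frames, fps):
    """frames: (T, H, W, 3) float32 video. Returns video with 0.7-4 Hz removed."""
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    T = frames.shape[0]
    flat = frames.reshape(T, -1)                  # (T, H*W*3) pixel traces
    pulse_band = filtfilt(b, a, flat, axis=0)     # per-pixel band-passed component
    return (flat - pulse_band).reshape(frames.shape)
```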

    Contrast-Phys: unsupervised video-based remote physiological measurement via spatiotemporal contrast

    Abstract Video-based remote physiological measurement uses face videos to measure the blood volume change signal, also called remote photoplethysmography (rPPG). Supervised methods for rPPG measurement achieve state-of-the-art performance, but they require face videos paired with ground-truth physiological signals for model training. In this paper, we propose an unsupervised rPPG measurement method that does not require ground-truth signals for training. We use a 3D CNN model to generate multiple rPPG signals from different spatiotemporal locations of each video and train the model with a contrastive loss in which rPPG signals from the same video are pulled together while those from different videos are pushed apart. We test on five public datasets, including both RGB and NIR videos. The results show that our method outperforms the previous unsupervised baseline and achieves accuracies very close to the current best supervised rPPG methods on all five datasets. Furthermore, our approach runs at a much faster speed and is more robust to noise than the previous unsupervised baseline. Our code is available at https://github.com/zhaodongsun/contrast-phys.
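    The core training signal can be sketched as follows, assuming (as the abstract describes) that rPPG signals from the same video should have similar power spectra while signals from different videos should not. Tensor shapes and the distance choice are illustrative simplifications, not the released code.

```python
import torch

def psd(x):
    """x: (N, T) rPPG signals -> (N, F) normalized power spectral densities."""
    p = torch.fft.rfft(x, dim=-1).abs() ** 2
    return p / (p.sum(dim=-1, keepdim=True) + 1e-8)

def contrastive_loss(sig_a, sig_b):
    """sig_a, sig_b: (N, T) rPPG signals sampled from video A and video B."""
    pa, pb = psd(sig_a), psd(sig_b)
    # Pull together: pairwise distances between spectra from the same video.
    pos = torch.cdist(pa, pa).mean() + torch.cdist(pb, pb).mean()
    # Push apart: distances between spectra from different videos.
    neg = torch.cdist(pa, pb).mean()
    return pos - neg
```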

    Facial-video-based physiological signal measurement: recent advances and affective applications

    Abstract Monitoring physiological changes [e.g., heart rate (HR), respiration, and HR variability (HRV)] is important for measuring human emotions. Physiological responses are more reliable and harder to alter than explicit behaviors (such as facial expressions and speech), but they require special contact sensors to obtain. Research in the last decade has shown that photoplethysmography (PPG) signals can be measured remotely (rPPG) from facial videos under ambient light, and physiological changes can be extracted from them. This promising finding has attracted much interest from researchers, and the field of rPPG measurement has grown rapidly. In this article, we review current progress on intelligent signal processing approaches for rPPG measurement, covering earlier unsupervised approaches, recently proposed supervised models, benchmark datasets, and performance evaluation. We also review studies on rPPG-based affective applications and compare them with other affective computing modalities. We conclude by emphasizing the main current challenges and highlighting future directions.
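    As an example of the earlier unsupervised approaches such a review covers, here is a compact sketch of the classic chrominance-based (CHROM) method of de Haan and Jeanne, which projects normalized RGB means onto two chrominance axes and combines them to suppress specular and motion distortion. Windowing and overlap-add, used in the full method, are omitted for brevity.

```python
import numpy as np

def chrom_pulse(rgb):
    """rgb: (T, 3) mean skin color per frame -> (T,) pulse signal."""
    n = rgb / rgb.mean(axis=0)                     # temporal normalization per channel
    xs = 3.0 * n[:, 0] - 2.0 * n[:, 1]             # chrominance axis 1
    ys = 1.5 * n[:, 0] + n[:, 1] - 1.5 * n[:, 2]   # chrominance axis 2
    alpha = xs.std() / (ys.std() + 1e-8)           # balance the two projections
    return xs - alpha * ys
```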

    Disentangling 3D/4D facial affect recognition with faster multi-view transformer

    Abstract In this paper, we propose MiT, a novel multi-view transformer model for 3D/4D facial affect recognition. MiT incorporates patch and position embeddings from patches of multiple views and uses them to learn the facial muscle movements underlying effective recognition. We also propose a multi-view loss function that is not only gradient-friendly, speeding up gradient computation during back-propagation, but also leverages the correlation among the underlying facial patterns across views. Additionally, we introduce trainable multi-view weights that substantially help training. Finally, we equip our model with distributed training for faster learning and computational convenience. Extensive experiments show that our model outperforms existing methods on widely used 3D/4D FER datasets.
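    The trainable multi-view weights can be pictured as a learned, softmax-normalized fusion of per-view embeddings, as in the sketch below. MiT's transformer blocks are omitted, and the class name, dimensions, and fusion placement are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    def __init__(self, num_views, dim, num_classes):
        super().__init__()
        self.view_weights = nn.Parameter(torch.zeros(num_views))  # learnable per-view weights
        self.head = nn.Linear(dim, num_classes)

    def forward(self, view_feats):
        """view_feats: (B, num_views, dim) per-view embeddings."""
        w = torch.softmax(self.view_weights, dim=0)         # normalize weights over views
        fused = (view_feats * w[None, :, None]).sum(dim=1)  # weighted sum of views
        return self.head(fused)
```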

    Remote photoplethysmograph signal measurement from facial videos using spatio-temporal networks

    Abstract Recent studies have demonstrated that average heart rate (HR) can be measured from facial videos using non-contact remote photoplethysmography (rPPG). However, for many medical applications (e.g., atrial fibrillation (AF) detection), knowing only the average HR is not sufficient; precise rPPG signals must be measured from the face for heart rate variability (HRV) analysis. Here we propose an rPPG measurement method, the first to use deep spatio-temporal networks to reconstruct precise rPPG signals from raw facial videos. Under a trend-consistency constraint with ground-truth pulse curves, our method recovers rPPG signals with accurate pulse peaks. Comprehensive experiments on two benchmark datasets demonstrate that our method achieves superior performance at both the HR and HRV levels compared to state-of-the-art methods. We also achieve promising results when using the reconstructed rPPG signals for AF detection and emotion recognition.
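    Trend consistency with ground-truth pulse curves is commonly enforced in this line of rPPG work with a negative Pearson correlation loss, which rewards matching the shape (peaks and troughs) of the curve rather than its absolute values; treating this as the paper's exact formulation is an assumption.

```python
import torch

def neg_pearson_loss(pred, target):
    """pred, target: (B, T) predicted and ground-truth pulse curves."""
    pred = pred - pred.mean(dim=1, keepdim=True)      # remove per-curve offset
    target = target - target.mean(dim=1, keepdim=True)
    corr = (pred * target).sum(dim=1) / (
        pred.norm(dim=1) * target.norm(dim=1) + 1e-8
    )
    return (1.0 - corr).mean()  # 0 when perfectly correlated, 2 when inverted
```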

    Deep learning-based remote-photoplethysmography measurement from short-time facial video

    Abstract Objective: Efficient non-contact heart rate (HR) measurement from facial video has received much attention in health monitoring. Past methods relied on prior knowledge and unproven assumptions to extract remote photoplethysmography (rPPG) signals, e.g. manually designed regions of interest (ROIs) and the skin reflection model. Approach: This paper presents a short-time, end-to-end HR estimation framework based on facial features and the temporal relationships of video frames. In the proposed method, a deep 3D multi-scale network with a cross-layer residual structure is designed to construct an autoencoder and extract robust rPPG features. A spatial-temporal fusion mechanism is then proposed to help the network focus on features related to rPPG signals. Both shallow and fused 3D spatial-temporal features are distilled to suppress redundant information in complex environments. Finally, a data augmentation strategy is presented to address the uneven distribution of HR in existing datasets. Main results: Experimental results on four face-rPPG datasets show that our method outperforms state-of-the-art methods while requiring fewer video frames. Compared with the previous best results, the proposed method improves the root mean square error (RMSE) by 5.9%, 3.4% and 21.4% on the OBF dataset (intra-test), COHFACE dataset (intra-test) and UBFC dataset (cross-test), respectively. Significance: Our method achieves good results on diverse datasets (i.e. highly compressed, low-resolution, and illumination-varying video), demonstrating that it can extract stable rPPG signals from short videos.
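    One plausible form of the HR-balancing augmentation the abstract mentions is temporal resampling: playing a clip faster or slower rescales its apparent heart rate (a 60 bpm pulse played at 1.25x becomes 75 bpm), so rare HR ranges can be synthesized from common ones. The sketch below illustrates this idea; the paper's exact strategy may differ, and `resample_clip` is a hypothetical name.

```python
import numpy as np

def resample_clip(frames, hr_bpm, speed):
    """frames: (T, H, W, C); returns (frames', hr') with HR scaled by `speed`."""
    T = frames.shape[0]
    # Nearest-frame resampling: speed > 1 drops frames (faster playback),
    # speed < 1 repeats frames (slower playback).
    idx = np.clip(np.round(np.arange(0, T, speed)).astype(int), 0, T - 1)
    return frames[idx], hr_bpm * speed  # faster playback => higher apparent HR
```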