180 research outputs found
PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer
Remote photoplethysmography (rPPG), which aims at measuring heart activities
and physiological signals from facial video without any contact, has great
potential in many applications (e.g., remote healthcare and affective
computing). Recent deep learning approaches focus on mining subtle rPPG clues
using convolutional neural networks with limited spatio-temporal receptive
fields, which neglect the long-range spatio-temporal perception and interaction
for rPPG modeling. In this paper, we propose the PhysFormer, an end-to-end
video transformer based architecture, to adaptively aggregate both local and
global spatio-temporal features for rPPG representation enhancement. As key
modules in PhysFormer, the temporal difference transformers first enhance the
quasi-periodic rPPG features with temporal difference guided global attention,
and then refine the local spatio-temporal representation against interference.
Furthermore, we propose label distribution learning and a
curriculum-learning-inspired dynamic constraint in the frequency domain, which
provide elaborate supervision for PhysFormer and alleviate overfitting. Comprehensive
experiments are performed on four benchmark datasets to show our superior
performance on both intra- and cross-dataset testing. One highlight is that,
unlike most transformer networks, which need pretraining on large-scale datasets,
the proposed PhysFormer can be easily trained from scratch on rPPG datasets,
which makes it promising as a novel transformer baseline for the rPPG
community. The code will be released at
https://github.com/ZitongYu/PhysFormer.
Comment: Accepted by CVPR202
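The abstract above mentions label distribution learning without giving details. One common way to realize the idea, shown here as a minimal sketch and not as the PhysFormer implementation, is to replace a scalar heart-rate label with a Gaussian distribution over heart-rate bins and to compare it with a predicted distribution using a KL divergence. The bin range (40 to 180 bpm), the width `sigma`, and both function names are assumptions made for illustration.

```python
import math


def gaussian_label_distribution(hr_bpm, bins, sigma=1.0):
    """Turn a scalar heart rate into a normalized Gaussian over the bins.

    `bins` and `sigma` are illustrative assumptions, not values from the paper.
    """
    weights = [math.exp(-((b - hr_bpm) ** 2) / (2.0 * sigma ** 2)) for b in bins]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions of equal length."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0.0  # terms with zero target mass contribute nothing
    )


if __name__ == "__main__":
    bins = list(range(40, 181))  # hypothetical 1-bpm bins
    target = gaussian_label_distribution(72.0, bins)
    uniform = [1.0 / len(bins)] * len(bins)
    print("KL against a uniform prediction:", kl_divergence(target, uniform))
```

A soft target of this kind rewards predictions that land near the true heart rate instead of treating every wrong bin as equally wrong.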
Learning Meta Model for Zero- and Few-shot Face Anti-spoofing
Face anti-spoofing is crucial to the security of face recognition systems.
Most previous methods formulate face anti-spoofing as a supervised learning
problem to detect various predefined presentation attacks, which needs
large-scale training data to cover as many attacks as possible. However, the trained
model easily overfits to several common attacks and remains vulnerable to
unseen attacks. To overcome this challenge, the detector should: 1) learn
discriminative features that can generalize to unseen spoofing types from
predefined presentation attacks; 2) quickly adapt to new spoofing types by
learning from both the predefined attacks and a few examples of the new
spoofing types. Therefore, we define face anti-spoofing as a zero- and few-shot
learning problem. In this paper, we propose a novel Adaptive Inner-update Meta
Face Anti-Spoofing (AIM-FAS) method to tackle this problem through
meta-learning. Specifically, AIM-FAS trains a meta-learner focusing on the task
of detecting unseen spoofing types by learning from predefined living and
spoofing faces and a few examples of new attacks. To assess the proposed
approach, we propose several benchmarks for zero- and few-shot FAS. Experiments
show its superior performance over existing methods, both on the presented
benchmarks and in existing zero-shot FAS protocols.
Comment: Accepted by AAAI202
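The inner update at the heart of meta-learning methods like the one described above can be illustrated with a toy example. The sketch below is not AIM-FAS itself: it uses a tiny logistic classifier, and it models the "adaptive" part as one step size per parameter, which is an assumption. All names (`loss_and_grad`, `inner_update`, `step_sizes`) and the two-feature data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(weights, examples):
    """Mean binary cross-entropy and its gradient for a linear classifier.

    Each example is (features, label) with label 0 = live, 1 = spoof.
    """
    grad = [0.0] * len(weights)
    loss = 0.0
    for features, label in examples:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)))
        p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        loss -= label * math.log(p_safe) + (1 - label) * math.log(1.0 - p_safe)
        for i, x in enumerate(features):
            grad[i] += (p - label) * x
    n = len(examples)
    return loss / n, [g / n for g in grad]


def inner_update(weights, support, step_sizes, steps=1):
    """Adapt the weights to a new task using a few support examples.

    `step_sizes` holds one step size per parameter (assumed to be learned
    by an outer loop that is not shown here).
    """
    adapted = list(weights)
    for _ in range(steps):
        _, grad = loss_and_grad(adapted, support)
        adapted = [w - a * g for w, a, g in zip(adapted, step_sizes, grad)]
    return adapted


if __name__ == "__main__":
    support = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]  # a few examples of a new attack
    meta_weights = [0.0, 0.0]
    adapted = inner_update(meta_weights, support, step_sizes=[0.5, 0.5])
    print("loss before:", loss_and_grad(meta_weights, support)[0])
    print("loss after: ", loss_and_grad(adapted, support)[0])
```

In a full meta-learning setup, an outer loop would evaluate the adapted weights on held-out query examples and update both the initial weights and the step sizes; that part is omitted here.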
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the past decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on handcrafted features, which are outdated and uninspiring
for the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary labels (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB cameras, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(TPAMI)
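The contrast between binary and pixel-wise supervision mentioned in the survey abstract can be made concrete with two small loss functions. This is a sketch under stated assumptions, not code from the survey: the function names are hypothetical, mean squared error is one possible pixel-wise loss among several, and using an all-zero target map for presentation attacks is an assumed convention.

```python
import math


def binary_loss(spoof_probability, label):
    """Binary cross-entropy for one image; label 0 = bonafide, 1 = PA."""
    p = min(max(spoof_probability, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def pixelwise_loss(predicted_map, target_map):
    """Mean squared error between two equally sized 2-D maps."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(predicted_map, target_map):
        for pred, target in zip(pred_row, target_row):
            total += (pred - target) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    # Hypothetical 2x2 pseudo depth map for a bonafide face.
    pseudo_depth = [[0.2, 0.8], [0.8, 0.2]]
    # Assumed convention: a presentation attack is supervised with a flat map.
    flat_map = [[0.0, 0.0], [0.0, 0.0]]
    prediction = [[0.1, 0.7], [0.9, 0.3]]
    print("binary loss:", binary_loss(0.3, 0))
    print("pixel-wise loss vs. bonafide target:", pixelwise_loss(prediction, pseudo_depth))
    print("pixel-wise loss vs. PA target:", pixelwise_loss(prediction, flat_map))
```

The binary loss gives one training signal per image, while the pixel-wise loss gives one per map location.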
Learning from Sparse Offline Datasets via Conservative Density Estimation
Offline reinforcement learning (RL) offers a promising direction for learning
policies from pre-collected datasets without requiring further interactions
with the environment. However, existing methods struggle to handle
out-of-distribution (OOD) extrapolation errors, especially in sparse reward or
scarce data settings. In this paper, we propose a novel training algorithm
called Conservative Density Estimation (CDE), which addresses this challenge by
explicitly imposing constraints on the state-action occupancy stationary
distribution. CDE overcomes the limitations of existing approaches, such as the
stationary distribution correction method, by addressing the support mismatch
issue in marginal importance sampling. Our method achieves state-of-the-art
performance on the D4RL benchmark. Notably, CDE consistently outperforms
baselines in challenging tasks with sparse rewards or insufficient data,
demonstrating the advantages of our approach in addressing the extrapolation
error problem in offline RL.
Comment: ICLR 202
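The support mismatch issue named in the abstract above can be illustrated on a toy discrete problem. Marginal importance sampling relies on the ratio between the policy's state-action occupancy and the dataset's occupancy, and that ratio is undefined wherever the dataset has no mass. The sketch below only demonstrates the problem; it is not the CDE algorithm, and the function name and the occupancy values are made up for illustration.

```python
def importance_ratios(policy_occupancy, data_occupancy):
    """Compute d_pi(s, a) / d_D(s, a) where defined.

    Returns the ratios plus the state-action pairs that the policy visits
    but the dataset never covers (the support mismatch).
    """
    ratios = {}
    unsupported = []
    for pair, d_pi in policy_occupancy.items():
        d_data = data_occupancy.get(pair, 0.0)
        if d_data > 0.0:
            ratios[pair] = d_pi / d_data
        elif d_pi > 0.0:
            unsupported.append(pair)  # ratio would require dividing by zero
    return ratios, unsupported


if __name__ == "__main__":
    # Hypothetical occupancies over (state, action) pairs.
    data = {("s0", "a0"): 0.5, ("s0", "a1"): 0.5}
    policy = {("s0", "a0"): 0.25, ("s1", "a0"): 0.75}
    ratios, unsupported = importance_ratios(policy, data)
    print("ratios:", ratios)
    print("outside dataset support:", unsupported)
```

Constraining the policy's occupancy to stay within the dataset's support, as the abstract describes, removes the pairs that end up in the second list.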
Hydrogen-induced degradation dynamics in silicon heterojunction solar cells via machine learning
Among silicon-based solar cells, heterojunction cells hold the world
efficiency record. However, their market acceptance is hindered by an initial
0.5% per year degradation of their open circuit voltage which doubles the
overall cell degradation rate. Here, we study the performance degradation of
crystalline-Si/amorphous-Si:H heterojunction stacks. First, we experimentally
measure the interface defect density, the primary driver of the degradation,
over a year. Second, we develop SolDeg, a multiscale, hierarchical simulator to
analyze this degradation by combining Machine Learning, Molecular Dynamics,
Density Functional Theory, and Nudged Elastic Band methods with analytical
modeling. We discover that the chemical potential for mobile hydrogen develops
a gradient, forcing the hydrogen to drift from the interface, leaving behind
recombination-active defects. We find quantitative correspondence between the
calculated and experimentally determined defect generation dynamics. Finally,
we propose a reversed Si-density gradient architecture for the amorphous-Si:H
layer that promises to reduce the initial open circuit voltage degradation from
0.5% per year to 0.1% per year.
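To give a feel for the two degradation rates quoted above, the following sketch compounds a fixed yearly loss over several years. It assumes the rate stays constant over the whole period, which is a simplification: the abstract speaks of an initial degradation rate. The function name and the 10-year horizon are illustrative choices.

```python
def remaining_fraction(rate_per_year, years):
    """Fraction of the initial open circuit voltage left after `years`,
    assuming a constant relative loss of `rate_per_year` each year."""
    return (1.0 - rate_per_year) ** years


if __name__ == "__main__":
    years = 10  # hypothetical horizon
    for rate in (0.005, 0.001):  # 0.5% and 0.1% per year
        print(f"{rate:.1%} per year -> {remaining_fraction(rate, years):.4f} "
              f"of the initial voltage after {years} years")
```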