Video Representation Learning by Dense Predictive Coding
The objective of this paper is self-supervised learning of spatio-temporal
embeddings from video, suitable for human action recognition. We make three
contributions: First, we introduce the Dense Predictive Coding (DPC) framework
for self-supervised representation learning on videos. This learns a dense
encoding of spatio-temporal blocks by recurrently predicting future
representations; Second, we propose a curriculum training scheme to predict
further into the future with progressively less temporal context. This
encourages the model to encode only slowly varying spatio-temporal signals,
therefore leading to semantic representations; Third, we evaluate the approach
by first training the DPC model on the Kinetics-400 dataset with
self-supervised learning, and then finetuning the representation on a
downstream task, i.e., action recognition. With a single stream (RGB only), DPC
pretrained representations achieve state-of-the-art self-supervised performance
on both UCF101 (75.7% top-1 accuracy) and HMDB51 (35.7% top-1 accuracy),
outperforming all previous learning methods by a significant margin, and
approaching the performance of a baseline pre-trained on ImageNet.
Recent Advances in Deep Learning Techniques for Face Recognition
  In recent years, researchers have proposed many deep learning (DL) methods
for various tasks, and face recognition (FR) in particular has made an enormous
leap using these techniques. Deep FR systems benefit from the hierarchical
architecture of DL methods to learn discriminative face representations.
Therefore, DL techniques significantly improve state-of-the-art performance on
FR systems and encourage diverse and efficient real-world applications. In this
paper, we present a comprehensive analysis of various FR systems that leverage
different types of DL techniques, summarizing 168 recent contributions from
this area. We discuss the papers related to different
algorithms, architectures, loss functions, activation functions, datasets,
challenges, improvement ideas, current and future trends of DL-based FR
systems. We provide a detailed discussion of various DL methods to understand
the current state-of-the-art, and then we discuss various activation and loss
functions for the methods. Additionally, we summarize different datasets used
widely for FR tasks and discuss challenges related to illumination, expression,
pose variations, and occlusion. Finally, we discuss improvement ideas, current
and future trends of FR tasks.
Comment: 32 pages. Citation: M. T. H. Fuad et al., "Recent Advances in Deep
Learning Techniques for Face Recognition," in IEEE Access, vol. 9, pp.
99112-99142, 2021, doi: 10.1109/ACCESS.2021.309613