FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces
With recent advances in computer vision and graphics, it is now possible to
generate videos with extremely realistic synthetic faces, even in real time.
Countless applications are possible, some of which raise a legitimate alarm,
calling for reliable detectors of fake videos. In fact, distinguishing between
original and manipulated video can be a challenge for humans and computers
alike, especially when the videos are compressed or have low resolution, as
often happens on social networks. Research on the detection of face
manipulations has been seriously hampered by the lack of adequate datasets. To
this end, we introduce a novel face manipulation dataset of about half a
million edited images (from over 1000 videos). The manipulations have been
generated with a state-of-the-art face editing approach. The dataset exceeds all
existing video manipulation datasets by at least an order of magnitude. Using
our new dataset, we introduce benchmarks for classical image forensic tasks,
including classification and segmentation, considering videos compressed at
various quality levels. In addition, we introduce a benchmark evaluation for
creating indistinguishable forgeries with known ground truth; for instance with
generative refinement models.
Comment: Video: https://youtu.be/Tle7YaPkO_
Evaluation of the Spatio-Temporal features and GAN for Micro-expression Recognition System
Owing to advances in artificial intelligence, numerous works have addressed
human facial expression recognition. Meanwhile, the detection and
classification of micro-expressions have attracted attention from various
research communities in recent years. In this
paper, we first review the processes of a conventional optical-flow-based
recognition system, which comprises facial landmark annotation, optical
flow guided image computation, feature extraction and emotion class
categorization. Secondly, a few approaches have been proposed to improve the
feature extraction part, such as exploiting GAN to generate more image samples.
Particularly, several variations of optical flow are computed in order to
generate optimal images to lead to high recognition accuracy. Next, GAN, a
combination of Generator and Discriminator, is utilized to generate new "fake"
images to increase the sample size. Thirdly, a modified state-of-the-art
convolutional neural network is proposed. To verify the effectiveness of the
proposed method, the results are evaluated on spontaneous micro-expression
databases, namely SMIC, CASME II and SAMM. Both the F1-score and accuracy
performance metrics are reported in this paper.
Comment: 15 pages, 16 figures, 6 table
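The abstract above reports F1-score and accuracy but does not say how F1 is averaged across emotion classes. The following is a minimal sketch of both metrics, assuming macro averaging over the classes present in the ground truth; the function names and the toy labels are hypothetical and not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1-score of one class, treating that class as the positive one."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumption: macro averaging).

    Only classes that occur in y_true are averaged.
    """
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


if __name__ == "__main__":
    truth = ["pos", "pos", "neg", "neg"]      # made-up emotion labels
    guess = ["pos", "neg", "neg", "neg"]
    print(accuracy(truth, guess), macro_f1(truth, guess))
```

Macro averaging weights every class equally, which matters when some emotion classes have few samples; other averaging choices would give different numbers.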
Anisotropic Diffusion-based Kernel Matrix Model for Face Liveness Detection
Facial recognition and verification is a widely used biometric technology in
security systems. Unfortunately, face biometrics is vulnerable to spoofing
attacks using photographs or videos. In this paper, we present an anisotropic
diffusion-based kernel matrix model (ADKMM) for face liveness detection to
prevent face spoofing attacks. We use the anisotropic diffusion to enhance the
edges and boundary locations of a face image, and the kernel matrix model to
extract face image features which we call the diffusion-kernel (D-K) features.
The D-K features reflect the inner correlation of the face image sequence. We
introduce convolutional neural networks to extract deep features, and then
employ a generalized multiple kernel learning method to fuse the D-K features
and the deep features to achieve better performance. Our experimental
evaluation on two publicly available datasets shows that the proposed
method outperforms the state-of-the-art face liveness detection methods.
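The abstract above says anisotropic diffusion is used to enhance edges and boundary locations, without giving the formulation. As an illustration only, here is a minimal sketch of the classic Perona-Malik scheme on a grayscale image stored as a list of lists; the choice of Perona-Malik, the exponential conduction function, and the parameters `kappa` and `step` are assumptions, not details from the paper.

```python
import math


def anisotropic_diffusion(image, iterations=10, kappa=20.0, step=0.2):
    """Perona-Malik diffusion on a 2D list of floats (grayscale image).

    Assumed parameters: kappa sets which gradients count as edges, and
    step is the update rate (kept at or below 0.25 for 4 neighbours).
    """
    rows, cols = len(image), len(image[0])
    current = [row[:] for row in image]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                total = 0.0
                # Visit the four nearest neighbours; borders are replicated.
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr = min(max(r + dr, 0), rows - 1)
                    nc = min(max(c + dc, 0), cols - 1)
                    grad = current[nr][nc] - current[r][c]
                    # Conduction is near 1 in flat regions and near 0
                    # across strong edges, so edges are not blurred.
                    total += math.exp(-((grad / kappa) ** 2)) * grad
                updated[r][c] = current[r][c] + step * total
        current = updated
    return current


if __name__ == "__main__":
    noisy_edge = [[10.0, 12.0, 250.0, 252.0]]   # made-up pixel values
    print(anisotropic_diffusion(noisy_edge))
```

Small variations on either side of the large jump are smoothed, while the jump itself is left almost untouched because its conduction coefficient is close to zero.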
Learn Convolutional Neural Network for Face Anti-Spoofing
Despite some progress, hand-crafted texture features,
e.g., LBP [23] and LBP-TOP [11], are still unable to capture the most
discriminative cues between genuine and fake faces. In this paper, instead of
designing features ourselves, we rely on the deep convolutional neural
network (CNN) to learn features of high discriminative ability in a supervised
manner. Combined with some data pre-processing, the face anti-spoofing
performance improves drastically. In the experiments, over 70% relative
decrease of Half Total Error Rate (HTER) is achieved on two challenging
datasets, CASIA [36] and REPLAY-ATTACK [7] compared with the state-of-the-art.
Meanwhile, the experimental results from inter-tests between the two datasets
indicate that the CNN can obtain features with better generalization ability. Moreover,
the nets trained using combined data from the two datasets show less bias between
the two datasets.
Comment: 8 pages, 9 figures, 7 table
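For readers unfamiliar with the metric named in the abstract above, Half Total Error Rate is commonly defined as the mean of the false acceptance rate (FAR) and the false rejection rate (FRR) at a fixed threshold. The sketch below computes it and the relative decrease against a baseline; the scores, the threshold, and the function names are made up for illustration and are not results from the paper.

```python
def error_rates(genuine_scores, attack_scores, threshold):
    """Return (FAR, FRR) for a detector that accepts scores >= threshold."""
    # FAR: fraction of attack samples wrongly accepted as genuine.
    far = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    # FRR: fraction of genuine samples wrongly rejected.
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def hter(far, frr):
    """Half Total Error Rate: the mean of FAR and FRR."""
    return (far + frr) / 2.0


def relative_decrease(baseline, new):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - new) / baseline


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.75, 0.4]   # hypothetical scores for real faces
    attacks = [0.1, 0.3, 0.6, 0.2]    # hypothetical scores for spoofed faces
    far, frr = error_rates(genuine, attacks, threshold=0.5)
    print(far, frr, hter(far, frr))
```

Under this definition, a drop in HTER from 10% to 3% would be a 70% relative decrease.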
Deep Learning in Information Security
Machine learning has a long tradition of helping to solve complex information
security problems that are difficult to solve manually. Machine learning
techniques learn models from data representations to solve a task. These data
representations are hand-crafted by domain experts. Deep Learning is a
sub-field of machine learning, which uses models that are composed of multiple
layers. Consequently, representations that are used to solve a task are learned
from the data instead of being manually designed.
In this survey, we study the use of DL techniques within the domain of
information security. We systematically reviewed 77 papers and presented them
from a data-centric perspective. This data-centric perspective reflects one of
the most crucial advantages of DL techniques -- domain independence. If
DL-methods succeed in solving problems on a data type in one domain, they most
likely will also succeed on similar data from another domain. Other advantages
of DL methods are unrivaled scalability and efficiency, both regarding the
number of examples that can be analyzed and with respect to the
dimensionality of the input data. DL methods generally are capable of achieving
high performance and generalize well.
However, information security is a domain with unique requirements and
challenges. Based on an analysis of our reviewed papers, we point out
shortcomings of DL-methods with respect to those requirements and discuss further research
opportunities.
An Overview of Face Liveness Detection
Face recognition is a widely used biometric approach. Face recognition
technology has developed rapidly in recent years and it is more direct, user
friendly and convenient compared to other methods. But face recognition systems
are vulnerable to spoof attacks made by non-real faces. It is easy to
spoof face recognition systems with facial pictures such as portrait photographs.
A secure system needs liveness detection in order to guard against such
spoofing. In this work, face liveness detection approaches are categorized
based on the types of techniques used for liveness detection. This
categorization helps in understanding different spoof attack scenarios and their
relation to the developed solutions. A review of the latest works on
face liveness detection is presented. The main aim is to provide a simple
path for the future development of novel and more secure face liveness
detection approaches.
Comment: International Journal on Information Theory (IJIT), Vol.3, No.2,
April 201
A Hierarchical Fuzzy System for an Advanced Driving Assistance System
In this study, we present a hierarchical fuzzy system that evaluates the risk
state for a Driver Assistance System in order to help reduce the
number of road accidents. A key component of this system is its ability to
continually detect and test the inside and outside risks in real time. The
outside risks are assessed by detecting various moving objects on the road;
this part of the system relies on computer vision approaches. The inside risks
are assessed by an automatic system for drowsy driving detection that evaluates
EEG signals of the driver; this part of the system is based on computer vision
techniques and biometric factors (electroencephalogram, EEG). The proposed
system is composed of three main modules. The first module is responsible
for identifying the driver's drowsiness state through his eye movements (physical
drowsiness). The second one is responsible for detecting and analysing his
physiological signals to also identify his drowsiness state (moral drowsiness).
The third module is responsible for evaluating the road driving risks by detecting
the different moving objects on the road in real time. The final decision is
obtained by merging the three detection systems through the use of fuzzy
decision rules. Finally, the proposed approach has been improved on ten samples
from a proposed dataset.
Discriminative Representation Combinations for Accurate Face Spoofing Detection
Three discriminative representations for face presentation attack detection
are introduced in this paper. Firstly, we design a descriptor called spatial
pyramid coding micro-texture (SPMT) feature to characterize local appearance
information. Secondly, we utilize the SSD, which is a deep learning framework
for detection, to excavate context cues and conduct end-to-end face
presentation attack detection. Finally, we design a descriptor called template
face matched binocular depth (TFBD) feature to characterize stereo structures
of real and fake faces. For accurate presentation attack detection, we also
design two kinds of representation combinations. Firstly, we propose a
decision-level cascade strategy to combine SPMT with SSD. Secondly, we use a
simple score fusion strategy to combine face structure cues (TFBD) with local
micro-texture features (SPMT). To demonstrate the effectiveness of our design,
we evaluate the representation combination of SPMT and SSD on three public
datasets, which outperforms all other state-of-the-art methods. In addition, we
evaluate the representation combination of SPMT and TFBD on our dataset and
excellent performance is also achieved.
Comment: To be published in Pattern Recognitio
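The two combination strategies named in the abstract above, a decision-level cascade and a simple score fusion, can be sketched generically as follows. This is an illustration under assumptions: the abstract does not state which model runs first in the cascade, how scores are scaled, or what thresholds and weights are used, so the `low`, `high`, and `weight` values and all function names here are hypothetical.

```python
def score_fusion(texture_score, depth_score, weight=0.5):
    """Weighted sum of two liveness scores, each assumed to lie in [0, 1]."""
    return weight * texture_score + (1.0 - weight) * depth_score


def cascade_decision(first_score, second_stage, low=0.2, high=0.8):
    """Decision-level cascade: trust the first stage only when confident.

    second_stage is a zero-argument callable returning the second score,
    so the second model runs only for ambiguous inputs. Returns True
    when the face is judged real.
    """
    if first_score >= high:
        return True            # first stage is confident: real face
    if first_score <= low:
        return False           # first stage is confident: attack
    return second_stage() >= 0.5


if __name__ == "__main__":
    # Made-up scores standing in for the outputs of two detectors.
    print(score_fusion(0.8, 0.4))
    print(cascade_decision(0.5, lambda: 0.7))
```

Passing the second stage as a callable keeps the cascade lazy, which is the usual reason to prefer a cascade over always running both models.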
People Counting in Crowded and Outdoor Scenes using a Hybrid Multi-Camera Approach
This paper presents two novel approaches for people counting in crowded and
open environments that combine the information gathered by multiple views.
Multiple cameras are used to expand the field of view as well as to mitigate the
problem of occlusion that commonly affects the performance of counting methods
using single cameras. The first approach is regarded as a direct approach and
it attempts to segment and count each individual in the crowd. For such an aim,
two head detectors trained with head images are employed: one based on support
vector machines and another based on Adaboost perceptron. The second approach,
regarded as an indirect approach, employs learning algorithms and statistical
analysis on the whole crowd to achieve counting. For such an aim, corner points
are extracted from groups of people in a foreground image and computed by a
learning algorithm which estimates the number of people in the scene. Both
approaches count the number of people in the scene and not only in a given
image or video frame of the scene. The experimental results obtained on the
benchmark PETS2009 video dataset show that the proposed indirect method surpasses
other methods with improvements of up to 46.7% and provides accurate counting
results for the crowded scenes. On the other hand, the direct method shows high
error rates because it has much more complex problems to
solve, such as segmentation of heads.
Attended End-to-end Architecture for Age Estimation from Facial Expression Videos
The main challenges of age estimation from facial expression videos lie not
only in the modeling of the static facial appearance, but also in the capturing
of the temporal facial dynamics. Traditional approaches to this problem focus
on constructing handcrafted features to explore the discriminative information
contained in facial appearance and dynamics separately. This relies on
sophisticated feature-refinement and framework-design. In this paper, we
present an end-to-end architecture for age estimation, called Spatially-Indexed
Attention Model (SIAM), which is able to simultaneously learn both the
appearance and dynamics of age from raw videos of facial expressions.
Specifically, we employ convolutional neural networks to extract effective
latent appearance representations and feed them into recurrent networks to
model the temporal dynamics. More importantly, we propose to leverage attention
models for salience detection in both the spatial domain for each single image
and the temporal domain for the whole video as well. We design a specific
spatially-indexed attention mechanism among the convolutional layers to extract
the salient facial regions in each individual image, and a temporal attention
layer to assign attention weights to each frame. This two-pronged approach not
only improves the performance by allowing the model to focus on informative
frames and facial areas, but it also offers an interpretable correspondence
between the spatial facial regions as well as temporal frames, and the task of
age estimation. We demonstrate the strong performance of our model in
experiments on a large, gender-balanced database with 400 subjects with ages
spanning from 8 to 76 years. Experiments reveal that our model exhibits
significant superiority over the state-of-the-art methods given sufficient
training data.
Comment: Accepted by Transactions on Image Processing (TIP
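The temporal attention layer described in the abstract above assigns a weight to each frame. A common way to realise this idea, shown here as a minimal sketch and not as the SIAM architecture itself, is to normalise per-frame relevance scores with a softmax and take the weighted sum of frame features. How the scores are produced is not given in the abstract; here they are simply passed in, and the function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(frame_features, frame_scores):
    """Pool per-frame feature vectors into one vector using attention.

    frame_features: list of equal-length feature vectors, one per frame.
    frame_scores: one unnormalised relevance score per frame (assumed
    to come from some learned scoring function, not modelled here).
    Returns (pooled_vector, attention_weights).
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]   # made-up frame features
    scores = [0.1, 2.0, 0.3]                          # made-up relevance scores
    pooled, weights = temporal_attention_pool(features, scores)
    print(pooled, weights)
```

The weights sum to one, so they can be read directly as how much each frame contributes to the pooled representation, which is the interpretability property the abstract mentions.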