38,353 research outputs found

FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces

Author: Cozzolino Davide
Nießner Matthias
Riess Christian
Rössler Andreas
Thies Justus
Verdoliva Luisa
Publication venue
Publication date: 24/03/2018
Field of study

With recent advances in computer vision and graphics, it is now possible to generate videos with extremely realistic synthetic faces, even in real time. Countless applications are possible, some of which raise a legitimate alarm, calling for reliable detectors of fake videos. In fact, distinguishing between original and manipulated video can be a challenge for humans and computers alike, especially when the videos are compressed or have low resolution, as often happens on social networks. Research on the detection of face manipulations has been seriously hampered by the lack of adequate datasets. To this end, we introduce a novel face manipulation dataset of about half a million edited images (from over 1000 videos). The manipulations have been generated with a state-of-the-art face editing approach. The dataset exceeds all existing video manipulation datasets by at least an order of magnitude. Using our new dataset, we introduce benchmarks for classical image forensic tasks, including classification and segmentation, considering videos compressed at various quality levels. In addition, we introduce a benchmark evaluation for creating indistinguishable forgeries with known ground truth, for instance with generative refinement models. Comment: Video: https://youtu.be/Tle7YaPkO_

arXiv.org e-Print Archive

Evaluation of the Spatio-Temporal features and GAN for Micro-expression Recognition System

Author: Gan Y. S.
Lic Shu-Meng
Liong Sze-Teng
Liu Kun-Hong
Lyu Ran-Ke
Xua Hao-Xuan
Zhang Han-Zhe
Zheng Danna
Publication venue
Publication date: 02/04/2019
Field of study

Owing to advances in artificial intelligence, numerous works have addressed human facial expression recognition. Meanwhile, the detection and classification of micro-expressions have attracted attention from various research communities in recent years. In this paper, we first review the processes of a conventional optical-flow-based recognition system, which comprises facial landmark annotation, computation of optical-flow-guided images, feature extraction and emotion class categorization. Secondly, a few approaches have been proposed to improve the feature extraction part, such as exploiting a GAN to generate more image samples. In particular, several variations of optical flow are computed in order to generate optimal images that lead to high recognition accuracy. Next, a GAN, a combination of a Generator and a Discriminator, is utilized to generate new "fake" images to increase the sample size. Thirdly, a modified state-of-the-art convolutional neural network is proposed. To verify the effectiveness of the proposed method, the results are evaluated on spontaneous micro-expression databases, namely SMIC, CASME II and SAMM. Both the F1-score and accuracy performance metrics are reported in this paper. Comment: 15 pages, 16 figures, 6 tables
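
The two metrics named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not say how the F1-score is averaged across emotion classes, so macro-averaging is an assumption here, and the labels in the usage lines are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores (macro-averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels, chosen only to exercise the functions.
y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Reporting both metrics is useful on small, imbalanced micro-expression sets, where accuracy alone can hide poor performance on rare classes.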

arXiv.org e-Print Archive

Anisotropic Diffusion-based Kernel Matrix Model for Face Liveness Detection

Author: Jia Yunde
Yu Changyong
Publication venue
Publication date: 10/07/2017
Field of study

Facial recognition and verification is a widely used biometric technology in security systems. Unfortunately, face biometrics is vulnerable to spoofing attacks using photographs or videos. In this paper, we present an anisotropic diffusion-based kernel matrix model (ADKMM) for face liveness detection to prevent face spoofing attacks. We use anisotropic diffusion to enhance the edges and boundary locations of a face image, and the kernel matrix model to extract face image features, which we call the diffusion-kernel (D-K) features. The D-K features reflect the inner correlation of the face image sequence. We introduce convolutional neural networks to extract the deep features, and then employ a generalized multiple kernel learning method to fuse the D-K features and the deep features to achieve better performance. Our experimental evaluation on two publicly available datasets shows that the proposed method outperforms state-of-the-art face liveness detection methods.

arXiv.org e-Print Archive

Learn Convolutional Neural Network for Face Anti-Spoofing

Author: Lei Zhen
Li Stan Z.
Yang Jianwei
Publication venue
Publication date: 25/08/2014
Field of study

Despite some progress, hand-crafted texture features, e.g., LBP [23] and LBP-TOP [11], are still unable to capture the most discriminative cues between genuine and fake faces. In this paper, instead of designing features ourselves, we rely on a deep convolutional neural network (CNN) to learn features of high discriminative ability in a supervised manner. Combined with some data pre-processing, the face anti-spoofing performance improves drastically. In the experiments, a relative decrease of over 70% in Half Total Error Rate (HTER) is achieved on two challenging datasets, CASIA [36] and REPLAY-ATTACK [7], compared with the state of the art. Meanwhile, the experimental results from inter-tests between the two datasets indicate that the CNN can obtain features with better generalization ability. Moreover, the nets trained using combined data from the two datasets show less bias between the two datasets. Comment: 8 pages, 9 figures, 7 tables
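
For readers unfamiliar with the metric, the sketch below shows how HTER and a relative decrease are computed. HTER is the mean of the false acceptance rate (FAR) and the false rejection rate (FRR). The error rates used in the example are hypothetical and are not taken from the paper.

```python
def hter(far: float, frr: float) -> float:
    """Half Total Error Rate: mean of false acceptance and false rejection rates."""
    return (far + frr) / 2.0


def relative_decrease(baseline: float, new: float) -> float:
    """Fractional reduction of `new` with respect to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


# Hypothetical error rates, chosen only to illustrate the arithmetic.
baseline_hter = hter(far=0.12, frr=0.08)
cnn_hter = hter(far=0.03, frr=0.02)
print(f"relative decrease: {relative_decrease(baseline_hter, cnn_hter):.0%}")
```

Note that a relative decrease compares the two error rates as a ratio, so a drop from 10% to 2.5% HTER is a 75% relative decrease even though the absolute change is only 7.5 percentage points.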

arXiv.org e-Print Archive

Deep Learning in Information Security

Author: Menkovski Vlado
Petkovic Milan
Thaler Stefan
Publication venue
Publication date: 12/09/2018
Field of study

Machine learning has a long tradition of helping to solve complex information security problems that are difficult to solve manually. Machine learning techniques learn models from data representations to solve a task. These data representations are hand-crafted by domain experts. Deep Learning (DL) is a sub-field of machine learning that uses models composed of multiple layers. Consequently, the representations used to solve a task are learned from the data instead of being manually designed. In this survey, we study the use of DL techniques within the domain of information security. We systematically reviewed 77 papers and present them from a data-centric perspective. This data-centric perspective reflects one of the most crucial advantages of DL techniques -- domain independence. If DL methods succeed in solving problems on a data type in one domain, they will most likely also succeed on similar data from another domain. Other advantages of DL methods are unrivaled scalability and efficiency, both regarding the number of examples that can be analyzed and with respect to the dimensionality of the input data. DL methods are generally capable of achieving high performance and generalize well. However, information security is a domain with unique requirements and challenges. Based on an analysis of the reviewed papers, we point out shortcomings of DL methods with respect to those requirements and discuss further research opportunities.

arXiv.org e-Print Archive

An Overview of Face Liveness Detection

Author: Chakraborty Saptarshi
Das Dhrubajyoti
Publication venue
Publication date: 09/05/2014
Field of study

Face recognition is a widely used biometric approach. Face recognition technology has developed rapidly in recent years, and it is more direct, user-friendly and convenient than other methods. But face recognition systems are vulnerable to spoof attacks made with non-real faces. Facial pictures such as portrait photographs are an easy way to spoof face recognition systems. A secure system needs liveness detection in order to guard against such spoofing. In this work, face liveness detection approaches are categorized based on the various types of techniques used for liveness detection. This categorization helps in understanding different spoof attack scenarios and their relation to the developed solutions. A review of the latest works on face liveness detection is presented. The main aim is to provide a simple path for the future development of novel and more secure face liveness detection approaches. Comment: International Journal on Information Theory (IJIT), Vol.3, No.2, April 201

arXiv.org e-Print Archive

A Hierarchical Fuzzy System for an Advanced Driving Assistance System

Author: Alimi Adel M.
Dkhil Mejdi Ben
Wali Ali
Publication venue
Publication date: 02/06/2018
Field of study

In this study, we present a hierarchical fuzzy system that evaluates the risk state for a Driver Assistance System in order to help reduce the number of road accidents. A key component of this system is its ability to continually detect and test the inside and outside risks in real time. The outside risks are assessed by detecting various moving objects on the road; this part of the system relies on computer vision approaches. The inside risks are assessed by an automatic system for drowsy driving detection that evaluates the driver's EEG signals; this part of the system is based on computer vision techniques and biometric factors (electroencephalogram, EEG). The proposed system is thus composed of three main modules. The first module is responsible for identifying the driver's drowsiness state through his eye movements (physical drowsiness). The second one is responsible for detecting and analysing his physiological signals to also identify his drowsiness state (moral drowsiness). The third module is responsible for evaluating the road driving risks by detecting the different moving objects on the road in real time. The final decision is obtained by merging the three detection systems through the use of fuzzy decision rules. Finally, the proposed approach has been improved on ten samples from a proposed dataset.

arXiv.org e-Print Archive

Discriminative Representation Combinations for Accurate Face Spoofing Detection

Author: Fang Liangji
Lin Tianwei
Song Xiao
Zhao Xu
Publication venue
Publication date: 27/08/2018
Field of study

Three discriminative representations for face presentation attack detection are introduced in this paper. Firstly, we design a descriptor called the spatial pyramid coding micro-texture (SPMT) feature to characterize local appearance information. Secondly, we utilize SSD, a deep learning framework for detection, to excavate context cues and conduct end-to-end face presentation attack detection. Finally, we design a descriptor called the template face matched binocular depth (TFBD) feature to characterize the stereo structures of real and fake faces. For accurate presentation attack detection, we also design two kinds of representation combinations. Firstly, we propose a decision-level cascade strategy to combine SPMT with SSD. Secondly, we use a simple score fusion strategy to combine face structure cues (TFBD) with local micro-texture features (SPMT). To demonstrate the effectiveness of our design, we evaluate the representation combination of SPMT and SSD on three public datasets, where it outperforms all other state-of-the-art methods. In addition, we evaluate the representation combination of SPMT and TFBD on our dataset, where excellent performance is also achieved. Comment: To be published in Pattern Recognition
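
A score fusion of the kind mentioned in the abstract above could look like the sketch below. The abstract does not specify the fusion rule, so the weighted sum, the `weight` and `threshold` parameters, and the convention that higher scores mean "more likely genuine" are all assumptions made for illustration.

```python
def fuse_scores(texture_score: float, depth_score: float, weight: float = 0.5) -> float:
    """Weighted-sum fusion of two liveness scores in [0, 1].

    `texture_score` stands in for a micro-texture (SPMT-like) classifier output
    and `depth_score` for a depth-structure (TFBD-like) classifier output.
    Both names and the fusion rule are hypothetical.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * texture_score + (1.0 - weight) * depth_score


def is_attack(fused_score: float, threshold: float = 0.5) -> bool:
    """Flag a presentation attack when the fused liveness score falls below the threshold."""
    return fused_score < threshold


# Hypothetical scores: the texture cue is unsure, the depth cue says "flat surface".
fused = fuse_scores(texture_score=0.55, depth_score=0.10, weight=0.5)
print(fused, is_attack(fused))
```

The point of fusing at the score level is that a cue which is weak on its own (here the texture score) can be overruled by a complementary cue without retraining either classifier.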

arXiv.org e-Print Archive

People Counting in Crowded and Outdoor Scenes using a Hybrid Multi-Camera Approach

Author: Britto Jr. Alceu S.
de Oliveira Luiz E. S.
Dittrich Fabio
Koerich Alessandro L.
Publication venue
Publication date: 08/05/2017
Field of study

This paper presents two novel approaches for people counting in crowded and open environments that combine the information gathered from multiple views. Multiple cameras are used to expand the field of view as well as to mitigate the problem of occlusion that commonly affects the performance of counting methods using single cameras. The first approach is regarded as a direct approach, and it attempts to segment and count each individual in the crowd. To this end, two head detectors trained with head images are employed: one based on support vector machines and another based on Adaboost perceptron. The second approach, regarded as an indirect approach, employs learning algorithms and statistical analysis on the whole crowd to achieve counting. To this end, corner points are extracted from groups of people in a foreground image and processed by a learning algorithm that estimates the number of people in the scene. Both approaches count the number of people in the scene and not only in a given image or video frame of the scene. The experimental results obtained on the benchmark PETS2009 video dataset show that the proposed indirect method surpasses other methods with improvements of up to 46.7% and provides accurate counting results for crowded scenes. On the other hand, the direct method shows high error rates because it has much more complex problems to solve, such as the segmentation of heads.

arXiv.org e-Print Archive

Attended End-to-end Architecture for Age Estimation from Facial Expression Videos

Author: Baltrušaitis Tadas
Dibeklioğlu Hamdi
Pei Wenjie
Tax David M. J.
Publication venue
Publication date: 30/11/2019
Field of study

The main challenges of age estimation from facial expression videos lie not only in the modeling of the static facial appearance, but also in the capturing of the temporal facial dynamics. Traditional techniques for this problem focus on constructing handcrafted features to explore the discriminative information contained in facial appearance and dynamics separately. This relies on sophisticated feature refinement and framework design. In this paper, we present an end-to-end architecture for age estimation, called the Spatially-Indexed Attention Model (SIAM), which is able to simultaneously learn both the appearance and dynamics of age from raw videos of facial expressions. Specifically, we employ convolutional neural networks to extract effective latent appearance representations and feed them into recurrent networks to model the temporal dynamics. More importantly, we propose to leverage attention models for salience detection in both the spatial domain for each single image and the temporal domain for the whole video. We design a specific spatially-indexed attention mechanism among the convolutional layers to extract the salient facial regions in each individual image, and a temporal attention layer to assign attention weights to each frame. This two-pronged approach not only improves the performance by allowing the model to focus on informative frames and facial areas, but also offers an interpretable correspondence between the spatial facial regions, the temporal frames, and the task of age estimation. We demonstrate the strong performance of our model in experiments on a large, gender-balanced database of 400 subjects with ages spanning from 8 to 76 years. Experiments reveal that our model exhibits significant superiority over the state-of-the-art methods given sufficient training data. Comment: Accepted by Transactions on Image Processing (TIP)

arXiv.org e-Print Archive