Search CORE

200 research outputs found

Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos

Author: Echizen Isao
Fang Fuming
H. Nguyen Huy
Yamagishi Junichi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/06/2019
Field of study

Detecting manipulated images and videos is an important topic in digital media forensics. Most detection methods use binary classification to determine the probability of a query being manipulated. Another important topic is locating manipulated regions (i.e., performing segmentation), which are mostly created by three commonly used attacks: removal, copy-move, and splicing. We have designed a convolutional neural network that uses the multi-task learning approach to simultaneously detect manipulated images and videos and locate the manipulated regions for each query. Information gained by performing one task is shared with the other task and thereby enhance the performance of both tasks. A semi-supervised learning approach is used to improve the network's generability. The network includes an encoder and a Y-shaped decoder. Activation of the encoded features is used for the binary classification. The output of one branch of the decoder is used for segmenting the manipulated regions while that of the other branch is used for reconstructing the input, which helps improve overall performance. Experiments using the FaceForensics and FaceForensics++ databases demonstrated the network's effectiveness against facial reenactment attacks and face swapping attacks as well as its ability to deal with the mismatch condition for previously seen attacks. Moreover, fine-tuning using just a small amount of data enables the network to deal with unseen attacks.Comment: Accepted to be Published in Proceedings of the IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS) 2019, Florida, US

arXiv.org e-Print Archive

Edinburgh Research Explorer

Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems

Author: Echizen Isao
Marcel Sébastien
Nguyen Huy H.
Yamagishi Junichi
Publication venue
Publication date: 15/06/2020
Field of study

Due to its convenience, biometric authentication, especial face authentication, has become increasingly mainstream and thus is now a prime target for attackers. Presentation attacks and face morphing are typical types of attack. Previous research has shown that finger-vein- and fingerprint-based authentication methods are susceptible to wolf attacks, in which a wolf sample matches many enrolled user templates. In this work, we demonstrated that wolf (generic) faces, which we call "master faces," can also compromise face recognition systems and that the master face concept can be generalized in some cases. Motivated by recent similar work in the fingerprint domain, we generated high-quality master faces by using the state-of-the-art face generator StyleGAN in a process called latent variable evolution. Experiments demonstrated that even attackers with limited resources using only pre-trained models available on the Internet can initiate master face attacks. The results, in addition to demonstrating performance from the attacker's point of view, can also be used to clarify and improve the performance of face recognition systems and harden face authentication systems.Comment: Accepted to be Published in Proceedings of the 2020 International Joint Conference on Biometrics (IJCB 2020), Houston, US

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Edinburgh Research Explorer

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data

Author: Echizen Isao
Fang Fuming
Kinnunen Tomi
Lorenzo-Trueba Jaime
Wang Xin
Yamagishi Junichi
Publication venue: 'International Speech Communication Association'
Publication date: 02/03/2018
Field of study

Thanks to the growing availability of spoofing databases and rapid advances in using them, systems for detecting voice spoofing attacks are becoming more and more capable, and error rates close to zero are being reached for the ASVspoof2015 database. However, speech synthesis and voice conversion paradigms that are not considered in the ASVspoof2015 database are appearing. Such examples include direct waveform modelling and generative adversarial networks. We also need to investigate the feasibility of training spoofing systems using only low-quality found data. For that purpose, we developed a generative adversarial network-based speech enhancement system that improves the quality of speech data found in publicly available sources. Using the enhanced data, we trained state-of-the-art text-to-speech and voice conversion models and evaluated them in terms of perceptual speech quality and speaker similarity. The results show that the enhancement models significantly improved the SNR of low-quality degraded data found in publicly available sources and that they significantly improved the perceptual cleanliness of the source speech without significantly degrading the naturalness of the voice. However, the results also show limitations when generating speech with the low-quality found data.Comment: conference manuscript submitted to Speaker Odyssey 201

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos

Author: Echizen Isao
Nguyen Huy H.
Yamagishi Junichi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/10/2018
Field of study

Recent advances in media generation techniques have made it easier for attackers to create forged images and videos. State-of-the-art methods enable the real-time creation of a forged version of a single video obtained from a social network. Although numerous methods have been developed for detecting forged images and videos, they are generally targeted at certain domains and quickly become obsolete as new kinds of attacks appear. The method introduced in this paper uses a capsule network to detect various kinds of spoofs, from replay attacks using printed images or recorded videos to computer-generated videos using deep convolutional neural networks. It extends the application of capsule networks beyond their original intention to the solving of inverse graphics problems

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Distinguishing Computer Graphics from Natural Images Using Convolution Neural Networks

Author: Echizen Isao
Nozick Vincent
Rahmouni Nicolas
Yamagishi Junichi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2017
Field of study

International audienceThis paper presents a deep-learning method for distinguishing computer generated graphics from real photographic images. The proposed method uses a Convolutional Neural Network (CNN) with a custom pooling layer to optimize current best-performing algorithms feature extraction scheme. Local estimates of class probabilities are computed and aggregated to predict the label of the whole picture. We evaluate our work on recent photo-realistic computer graphics and show that it outperforms state of the art methods for both local and full image classification

Crossref

Edinburgh Research Explorer

Hal-Diderot

HAL-Polytechnique

HAL - UPEC / UPEM