144 research outputs found

    A review of digital video tampering: from simple editing to full synthesis.

    Video tampering methods have witnessed considerable progress in recent years. This is partly due to the rapid development of advanced deep learning methods, and partly due to the large volume of video footage now in the public domain. Historically, convincing video tampering has been too labour-intensive to achieve on a large scale. However, recent developments in deep learning-based methods have made it possible not only to produce convincing forged video but also to fully synthesize video content. Such advancements provide new means to improve visual content itself, but at the same time they raise new challenges for state-of-the-art tampering detection methods. Video tampering detection has been an active field of research for some time, with periodic reviews of the subject. However, little attention has been paid to video tampering techniques themselves. This paper provides an objective and in-depth examination of current techniques for digital video manipulation. We thoroughly examine their development and show how current evaluation techniques provide opportunities for the advancement of video tampering detection. A critical and extensive review of photo-realistic video synthesis is provided, with emphasis on deep learning-based methods. Existing tampered-video datasets are also qualitatively reviewed and critically discussed. Finally, conclusions are drawn from an exhaustive and thorough review of tampering methods, with discussion of future research directions aimed at improving detection methods.

    LatentKeypointGAN: Controlling GANs via Latent Keypoints

    Generative adversarial networks (GANs) have attained photo-realistic quality in image generation. However, how best to control the image content remains an open challenge. We introduce LatentKeypointGAN, a two-stage GAN which is trained end-to-end on the classical GAN objective with internal conditioning on a set of spatial keypoints. These keypoints have associated appearance embeddings that respectively control the position and style of the generated objects and their parts. A major difficulty that we address with suitable network architectures and training schemes is disentangling the image into spatial and appearance factors without domain knowledge or supervision signals. We demonstrate that LatentKeypointGAN provides an interpretable latent space that can be used to re-arrange the generated images by re-positioning and exchanging keypoint embeddings, such as generating portraits by combining the eyes, nose, and mouth from different images. In addition, the explicit generation of keypoints and matching images enables a new, GAN-based method for unsupervised keypoint detection.
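The conditioning mechanism the abstract describes can be sketched as rendering each keypoint as a spatial heatmap and weighting it by its appearance embedding. The sketch below is a minimal, illustrative stand-in using numpy; the function names (`keypoint_maps`, `condition`), the Gaussian rendering, and the tensor shapes are assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def keypoint_maps(keypoints, size=16, sigma=2.0):
    """Render each (x, y) keypoint in [0, 1]^2 as a Gaussian heatmap."""
    ys, xs = np.mgrid[0:size, 0:size] / (size - 1)
    maps = [np.exp(-((xs - kx) ** 2 + (ys - ky) ** 2) / (2 * (sigma / size) ** 2))
            for kx, ky in keypoints]
    return np.stack(maps)  # shape (K, size, size)

def condition(keypoints, embeddings, size=16):
    """Weight each keypoint's heatmap by its appearance embedding.

    Returns a (D, size, size) conditioning tensor that a generator could
    consume; position is carried by the heatmap, style by the embedding.
    """
    maps = keypoint_maps(keypoints, size)           # (K, H, W)
    return np.einsum('khw,kd->dhw', maps, embeddings)

# Swapping rows of `embeddings` between two images would exchange part
# styles while keeping part positions fixed, mirroring the edits described.
kps = [(0.25, 0.75), (0.8, 0.2)]
emb = np.eye(2)  # toy 2-D appearance embeddings, one per keypoint
tensor = condition(kps, emb)
```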

    GradICON: Approximate Diffeomorphisms via Gradient Inverse Consistency

    We present an approach to learning regular spatial transformations between image pairs in the context of medical image registration. Contrary to optimization-based registration techniques and many modern learning-based methods, we do not directly penalize transformation irregularities but instead promote transformation regularity via an inverse consistency penalty. We use a neural network to predict a map between a source and a target image as well as the map when swapping the source and target images. Different from existing approaches, we compose these two resulting maps and regularize deviations of the Jacobian of this composition from the identity matrix. This regularizer -- GradICON -- results in much better convergence when training registration models compared to promoting inverse consistency of the composition of maps directly, while retaining the desirable implicit regularization effects of the latter. We achieve state-of-the-art registration performance on a variety of real-world medical image datasets using a single set of hyperparameters and a single non-dataset-specific training protocol. Comment: 29 pages, 16 figures, CVPR 202
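The core idea above can be illustrated numerically: compose the forward and backward maps and penalise the deviation of the composition's Jacobian from the identity matrix. The sketch below is a toy numpy version under stated assumptions: the maps are simple analytic affine transforms standing in for network predictions, and the Jacobian is taken by finite differences rather than automatic differentiation.

```python
import numpy as np

def jacobian(f, x, eps=1e-5):
    """Central finite-difference Jacobian of a map f: R^2 -> R^2 at x."""
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

def gradicon_penalty(phi_ab, phi_ba, points):
    """Mean squared Frobenius deviation of Jac(phi_ab o phi_ba) from I.

    Plain inverse consistency would instead penalise the composed map's
    deviation from the identity *map*; GradICON penalises its *Jacobian*.
    """
    comp = lambda x: phi_ab(phi_ba(x))
    I = np.eye(2)
    return np.mean([np.sum((jacobian(comp, p) - I) ** 2) for p in points])

# Toy transforms standing in for network outputs: a scaling and its inverse.
phi_ab = lambda x: 2.0 * x + 1.0
phi_ba = lambda x: 0.5 * (x - 1.0)
pts = [np.array([0.3, 0.7]), np.array([1.5, -0.2])]
penalty = gradicon_penalty(phi_ab, phi_ba, pts)  # near zero: exact inverses
```

When the two maps are not inverses, the composed Jacobian departs from the identity and the penalty grows, which is the training signal the paper exploits.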

    Beyond the pixels: learning and utilising video compression features for localisation of digital tampering.

    Video compression is pervasive in digital society. With rising usage of deep convolutional neural networks (CNNs) in the fields of computer vision, video analysis and video tampering detection, it is important to investigate how patterns invisible to human eyes may be influencing modern computer vision techniques and how they can be used advantageously. This work thoroughly explores how video compression influences the accuracy of CNNs and shows that optimal performance is achieved when compression levels in the training set closely match those of the test set. A novel method is then developed, using CNNs, to derive compression features directly from the pixels of video frames. It is then shown that these features can be readily used to detect inauthentic video content with good accuracy across multiple different video tampering techniques. Moreover, the ability to explain these features allows predictions to be made about their effectiveness against future tampering methods. The problem is motivated by a novel investigation into recent video manipulation methods, which shows that there is a consistent drive to produce convincing, photorealistic, manipulated or synthetic video. Humans, blind to the presence of video tampering, are also blind to the type of tampering. New detection techniques are required and, in order to compensate for human limitations, they should be broadly applicable to multiple tampering types. This thesis details the steps necessary to develop and evaluate such techniques.
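The thesis derives compression features with a learned CNN; as a much simpler hand-crafted proxy for the kind of compression trace involved, one can measure blocking artifacts at the 8x8 block boundaries used by common codecs. The sketch below is not the thesis method: the `blockiness` feature, its name, and the ratio formulation are illustrative assumptions, shown only to make the idea of a pixel-derived compression feature concrete.

```python
import numpy as np

def blockiness(frame, block=8):
    """Ratio of mean luminance jumps at 8x8 block boundaries vs elsewhere.

    Heavily compressed frames tend to show larger discontinuities exactly
    at block boundaries, so this ratio rises with compression strength.
    """
    diffs = np.abs(np.diff(frame, axis=1))           # horizontal gradients
    at_boundary = diffs[:, block - 1::block].mean()  # columns at block edges
    elsewhere = np.delete(diffs, np.s_[block - 1::block], axis=1).mean()
    return at_boundary / (elsewhere + 1e-9)

# A frame that is piecewise-constant in 8x8 tiles (crude stand-in for a
# heavily quantised frame) scores far higher than a smooth gradient.
rng = np.random.default_rng(0)
blocky = np.repeat(np.repeat(rng.random((8, 8)), 8, axis=0), 8, axis=1)
smooth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
```

A detector could threshold or classify on such features per region to localise inauthentic content, which is the role the learned CNN features play in the thesis.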

    Unsupervised landmark discovery via self-training correspondence

    Object parts, also known as landmarks, convey information about an object’s shape and spatial configuration in 3D space, especially for deformable objects. The goal of landmark detection is to have a model that, for a particular object instance, can estimate the locations of its parts. Research in this field is mainly driven by supervised approaches, where a sufficient amount of human-annotated data is available. As annotating landmarks for all objects is impractical, this thesis focuses on learning landmark detectors without supervision. Despite good performance in limited scenarios (objects showcasing minor rigid deformation), unsupervised landmark discovery mostly remains an open problem. Existing work fails to capture semantic landmarks, i.e. points similar to the ones assigned by human annotators, and may not generalise well to highly articulated objects like the human body, complicated backgrounds or large viewpoint variations. In this thesis, we propose a novel self-training framework for the discovery of unsupervised landmarks. Contrary to existing methods that build on auxiliary tasks such as image generation or equivariance, we depart from generic keypoints and train a landmark detector and descriptor to improve itself, turning the keypoints into distinctive landmarks. We propose an iterative algorithm that alternates between producing new pseudo-labels through feature clustering and learning distinctive features for each pseudo-class through contrastive learning. Our detector can discover highly semantic landmarks that are more flexible in terms of capturing large viewpoint changes and out-of-plane (3D) rotations. New state-of-the-art performance is achieved on multiple challenging datasets.
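The alternation the abstract describes, clustering descriptors into pseudo-classes and then sharpening the features of each class, can be sketched in miniature. In this toy numpy version the "contrastive learning" step is replaced by an illustrative stand-in that pulls each descriptor toward its pseudo-class mean; the function name, the farthest-point initialisation, and the update rule are all assumptions, not the thesis algorithm.

```python
import numpy as np

def self_train(feats, k, iters=5, step=0.5):
    """Alternate pseudo-labelling (clustering) with feature sharpening.

    feats: (N, D) descriptor array, modified in place.
    Returns the sharpened descriptors and the final pseudo-labels.
    """
    # Farthest-point initialisation of k cluster centroids.
    centroids = [feats[0]]
    for _ in range(1, k):
        d = np.min([((feats - c) ** 2).sum(1) for c in centroids], axis=0)
        centroids.append(feats[np.argmax(d)])
    centroids = np.stack(centroids)

    for _ in range(iters):
        # Pseudo-labelling: assign each descriptor to its nearest centroid.
        dists = ((feats[:, None] - centroids[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        # Stand-in for contrastive learning: pull descriptors toward their
        # pseudo-class mean, making each pseudo-class more distinctive.
        for c in range(k):
            m = labels == c
            if m.any():
                centroids[c] = feats[m].mean(0)
                feats[m] += step * (centroids[c] - feats[m])
    return feats, labels
```

With two well-separated descriptor blobs the loop recovers a consistent pseudo-label per blob, mirroring on toy data how repeated clustering and feature refinement can converge to stable, distinctive landmark classes.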