DeepFake detection based on high-frequency enhancement network for highly compressed content
DeepFake techniques, which generate synthetic facial content, have sparked serious concerns about deception and forgery. However, most existing DeepFake detection methods focus on improving detection performance on high-quality data while ignoring low-quality synthetic content that suffers from heavy compression. To address this issue, we propose a novel High-Frequency Enhancement framework, which leverages a learnable adaptive high-frequency enhancement network to enrich the weak high-frequency information in compressed content without supervision from uncompressed data. The framework consists of three branches, i.e., the Basic branch operating in the RGB domain, the Local High-Frequency Enhancement branch based on the Block-wise Discrete Cosine Transform, and the Global High-Frequency Enhancement branch based on the Multi-level Discrete Wavelet Transform. Among them, the local branch utilizes Discrete Cosine Transform coefficients and a channel attention mechanism to indirectly achieve adaptive frequency-aware multi-spatial attention, while the global branch supplements high-frequency information by extracting coarse-to-fine multi-scale high-frequency cues and performing cascade-residual-based multi-level fusion of Discrete Wavelet Transform coefficients. In addition, we design a Two-Stage Cross-Fusion module to effectively integrate all of this information, thereby greatly enhancing the weak high-frequency information in low-quality data. Experimental results on the FaceForensics++, Celeb-DF, and OpenForensics datasets show that the proposed method outperforms existing state-of-the-art methods and can effectively improve the detection performance of DeepFakes, especially on low-quality data. The code is available here
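The local branch's core idea, isolating high-frequency content with a block-wise DCT, can be sketched as follows. The block size, the number of suppressed low-frequency coefficients, and the plain NumPy implementation are illustrative assumptions of ours, not the paper's learnable network:

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    # Orthonormal DCT-II basis matrix (n x n).
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0] *= 1 / np.sqrt(2)
    return m * np.sqrt(2 / n)

def blockwise_high_freq(img: np.ndarray, block: int = 8, keep: int = 4) -> np.ndarray:
    # Zero out the `keep` x `keep` lowest-frequency DCT coefficients in each
    # block x block tile, then invert, leaving only high-frequency content.
    h, w = img.shape
    d = dct_matrix(block)
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            tile = img[i:i + block, j:j + block].astype(np.float64)
            coef = d @ tile @ d.T                # 2-D DCT of the tile
            coef[:keep, :keep] = 0.0             # suppress low frequencies
            out[i:i + block, j:j + block] = d.T @ coef @ d  # inverse DCT
    return out
```

On a flat (fully compressed-away) region this residual is zero, which is exactly why heavy compression leaves so little high-frequency signal for a detector to use.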
Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An Extensive Review
With large volumes of social media data being created daily and the
parallel rise of realistic multimedia tampering methods, detecting and
localising tampering in images and videos has become essential. This survey
focuses on approaches for tampering detection in multimedia data using deep
learning models. Specifically, it presents a detailed analysis of publicly
available benchmark datasets for malicious manipulation detection. It
also offers a comprehensive list of tampering clues and commonly used deep
learning architectures. Next, it discusses the current state-of-the-art
tampering detection methods, categorising them into meaningful types such as
deepfake detection, splice tampering detection, and copy-move tampering
detection methods, and discusses their strengths and weaknesses. Top results
achieved on benchmark datasets, comparisons of deep learning approaches
against traditional methods, and critical insights from recent tampering
detection methods are also discussed. Lastly, the research gaps, future
directions, and conclusions are presented to provide an in-depth
understanding of the tampering detection research arena.
MaLP: Manipulation Localization Using a Proactive Scheme
Advancements in the generation quality of various Generative Models (GMs) have
made it necessary to not only perform binary manipulation detection but also
localize the modified pixels in an image. However, prior works termed as
passive for manipulation localization exhibit poor generalization performance
over unseen GMs and attribute modifications. To combat this issue, we propose a
proactive scheme for manipulation localization, termed MaLP. We encrypt the
real images by adding a learned template. If the image is manipulated by any
GM, this added protection from the template not only aids binary detection but
also helps in identifying the pixels modified by the GM. The template is
learned by leveraging local and global-level features estimated by a two-branch
architecture. We show that MaLP performs better than prior passive works. We
also show the generalizability of MaLP by testing on 22 different GMs,
providing a benchmark for future research on manipulation localization.
Finally, we show that MaLP can be used as a discriminator for improving the
generation quality of GMs. Our models/codes are available at
www.github.com/vishal3477/pro_loc
Comment: Published at Conference on Computer Vision and Pattern Recognition
202
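A toy version of this proactive idea can be sketched as follows, with a fixed random pattern standing in for MaLP's learned template and a patch-wise correlation test standing in for its two-branch localization network; both are simplifications of ours, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed random pattern stands in for MaLP's learned template.
template = 0.02 * rng.standard_normal((64, 64))

def protect(image: np.ndarray) -> np.ndarray:
    # "Encrypt" the real image by adding the low-magnitude template.
    return image + template

def localize(suspect: np.ndarray, clean_estimate: np.ndarray,
             patch: int = 8, thresh: float = 0.5) -> np.ndarray:
    # Recover the embedded residual and correlate it patch-wise with the
    # known template; patches where the correlation collapses are flagged
    # as manipulated.
    residual = suspect - clean_estimate
    mask = np.zeros(suspect.shape, dtype=np.uint8)
    for i in range(0, suspect.shape[0], patch):
        for j in range(0, suspect.shape[1], patch):
            r = residual[i:i + patch, j:j + patch].ravel()
            t = template[i:i + patch, j:j + patch].ravel()
            corr = float(r @ t) / (np.linalg.norm(r) * np.linalg.norm(t) + 1e-12)
            if corr < thresh:
                mask[i:i + patch, j:j + patch] = 1
    return mask
```

In an untouched protected image the residual matches the template in every patch, so nothing is flagged; a patch whose pixels a GM rewrites destroys that correlation and is localized.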
Infrared Image Super-Resolution: Systematic Review, and Future Trends
Image Super-Resolution (SR) is essential for a wide range of computer vision
and image processing tasks. Investigating infrared (IR, or thermal) image
super-resolution is a continuing concern within the development of deep
learning. This survey aims to provide a comprehensive perspective of IR image
super-resolution, including its applications, hardware imaging system dilemmas,
and taxonomy of image processing methodologies. In addition, the datasets and
evaluation metrics in IR image super-resolution tasks are also discussed.
Furthermore, the deficiencies in current technologies and possible promising
directions for the community to explore are highlighted. To cope with the rapid
development in this field, we intend to regularly update the relevant excellent
work at \url{https://github.com/yongsongH/Infrared_Image_SR_Survey}.
Comment: Submitted to IEEE TNNL
Probabilistic-based Feature Embedding of 4-D Light Fields for Compressive Imaging and Denoising
The high-dimensional nature of the 4-D light field (LF) poses great
challenges in achieving efficient and effective feature embedding, which
severely impacts the performance of downstream tasks. To tackle this crucial
issue, in contrast to existing methods with empirically-designed architectures,
we propose a probabilistic-based feature embedding (PFE), which learns a
feature embedding architecture by assembling various low-dimensional
convolution patterns in a probability space for fully capturing spatial-angular
information. Building upon the proposed PFE, we then leverage the intrinsic
linear imaging model of the coded aperture camera to construct a
cycle-consistent 4-D LF reconstruction network from coded measurements.
Moreover, we incorporate PFE into an iterative optimization framework for 4-D
LF denoising. Our extensive experiments demonstrate the significant superiority
of our methods on both real-world and synthetic 4-D LF images, both
quantitatively and qualitatively, when compared with state-of-the-art methods.
The source code will be publicly available at
https://github.com/lyuxianqiang/LFCA-CR-NET
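One way to read "assembling low-dimensional convolution patterns in a probability space" is as a softmax-weighted mixture over candidate operators, with the mixture logits learned jointly with the network. The toy operators and the NumPy mixture below are our own illustrative stand-ins, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

# Candidate low-dimensional "convolution patterns" over a 4-D light-field
# tensor (spatial-only, angular-only, mixed). Each is a toy linear operator.
def spatial_op(x):  return np.roll(x, 1, axis=0)
def angular_op(x):  return np.roll(x, 1, axis=2)
def mixed_op(x):    return 0.5 * (spatial_op(x) + angular_op(x))

candidates = [spatial_op, angular_op, mixed_op]
logits = rng.standard_normal(len(candidates))  # learnable in the paper

def pfe_layer(x: np.ndarray) -> np.ndarray:
    # Expected output over the probability space of candidate patterns;
    # training would update `logits` so the distribution concentrates on
    # the patterns that best capture spatial-angular information.
    probs = softmax(logits)
    return sum(p * op(x) for p, op in zip(probs, candidates))
```

The appeal of this formulation is that the architecture itself becomes a learned quantity rather than an empirically hand-designed one.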
Deep panoramic depth prediction and completion for indoor scenes
We introduce a novel end-to-end deep-learning solution for rapidly estimating a dense spherical depth map of an indoor environment. Our input is a single equirectangular image registered with a sparse depth map, as provided by a variety of common capture setups. Depth is inferred by an efficient and lightweight single-branch network, which employs a dynamic gating system to process together dense visual data and sparse geometric data. We exploit the characteristics of typical man-made environments to efficiently compress multi-resolution features and find short- and long-range relations among scene parts. Furthermore, we introduce a new augmentation strategy to make the model robust to different types of sparsity, including those generated by various structured light sensors and LiDAR setups. The experimental results demonstrate that our method provides interactive performance and outperforms state-of-the-art solutions in computational efficiency, adaptivity to variable depth sparsity patterns, and prediction accuracy for challenging indoor data, even when trained solely on synthetic data without any fine-tuning.
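The sparsity-augmentation idea can be sketched as randomly sampling different validity masks over the dense ground-truth depth at training time. The two patterns and their parameters below are hypothetical examples of ours, not the paper's exact augmentation:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_dot_mask(h: int, w: int, keep: float = 0.01) -> np.ndarray:
    # Structured-light-style sparsity: uniformly scattered valid pixels.
    return rng.random((h, w)) < keep

def scanline_mask(h: int, w: int, lines: int = 16) -> np.ndarray:
    # LiDAR-style sparsity: only a few scanlines carry valid depth.
    mask = np.zeros((h, w), dtype=bool)
    mask[rng.choice(h, size=lines, replace=False), :] = True
    return mask

def augment_sparse_depth(dense_depth: np.ndarray) -> np.ndarray:
    # Randomly pick a sparsity pattern each call, so the network sees
    # many sensor types during training and generalizes across them.
    h, w = dense_depth.shape
    mask = random_dot_mask(h, w) if rng.random() < 0.5 else scanline_mask(h, w)
    return np.where(mask, dense_depth, 0.0)
```

Training against many such masks is what lets a single model adapt to variable depth sparsity patterns at inference time.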
A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal
Face Restoration (FR) aims to restore High-Quality (HQ) faces from
Low-Quality (LQ) input images, which is a domain-specific image restoration
problem in the low-level computer vision area. Early face restoration
methods mainly use statistical priors and degradation models, which struggle
to meet the requirements of real-world applications. In recent years, face
restoration has witnessed great progress after stepping into the deep
learning era. However, few works have systematically studied deep
learning-based face restoration methods. Thus, this paper comprehensively
surveys recent advances in deep learning techniques for face restoration.
Specifically, we first summarize different problem formulations and analyze
the characteristics of face images. Second, we discuss the challenges of face
restoration. Concerning these challenges, we present a comprehensive review of
existing FR methods, including prior based methods and deep learning-based
methods. Then, we explore developed techniques in the task of FR covering
network architectures, loss functions, and benchmark datasets. We also conduct
a systematic benchmark evaluation on representative methods. Finally, we
discuss future directions, including network designs, metrics, benchmark
datasets, applications, etc. We also provide an open-source repository for
all the discussed methods, which is available at
https://github.com/TaoWangzj/Awesome-Face-Restoration.
Comment: 21 pages, 19 figures
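A common problem formulation in such surveys models the LQ face as a degraded HQ face, e.g. blur, downsampling, and additive noise applied in sequence. The minimal NumPy pipeline below is a generic illustration of that formulation; the box-blur kernel, scale factor, and noise level are arbitrary choices of ours (JPEG compression, often included, is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def degrade(hq: np.ndarray, scale: int = 4, noise_sigma: float = 0.05) -> np.ndarray:
    # Classical degradation pipeline: blur -> downsample -> additive noise.
    k = np.ones((3, 3)) / 9.0                      # 3x3 box blur kernel
    h, w = hq.shape
    padded = np.pad(hq.astype(np.float64), 1, mode="edge")
    blurred = np.zeros((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            blurred[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
    lq = blurred[::scale, ::scale]                 # downsample by `scale`
    lq = lq + noise_sigma * rng.standard_normal(lq.shape)  # sensor noise
    return np.clip(lq, 0.0, 1.0)
```

A face restoration network is then trained to invert this mapping, which is why the realism of the assumed degradation model directly bounds real-world performance.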