488 research outputs found
Preserving low-quality video through deep learning
Lossy video stream compression is performed to reduce bandwidth and storage requirements; image compression is likewise needed in many circumstances. It is often the case that older archives are stored at low resolution and with a compression rate suited to the technology available at the time the video was created. Unfortunately, lossy compression algorithms introduce artifacts. Such artifacts usually damage higher-frequency details, adding noise or novel image patterns. This phenomenon raises several issues: low-quality images are less pleasant to viewers, and object detection algorithms may see their performance reduced. Given a perturbed version of an image, we therefore aim to remove such artifacts and recover the original. To do so, one must reverse the compression process through a complicated non-linear image transformation. We propose a deep neural network able to improve image quality. We show that this model can be optimized either traditionally, by directly optimizing an image similarity loss (SSIM), or with a generative adversarial approach (GAN). Our restored images have more photorealistic details than those of traditional image enhancement networks. Our training procedure based on sub-patches is novel. Moreover, we propose a novel testing protocol to evaluate restored images quantitatively. Unlike previously proposed approaches, we are able to remove artifacts generated at any quality by inferring the image quality directly from the data. Human evaluation and quantitative experiments in object detection show that our GAN generates images with finer, consistent details, and these details make a difference both for machines and for humans.
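The SSIM objective mentioned above can be illustrated with a minimal sketch. This is a simplified single-window SSIM over the whole image (real implementations, and presumably the paper's, average SSIM over local Gaussian windows); the constants follow the standard SSIM definition, and the training loss would typically be 1 - SSIM.

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Simplified single-window SSIM over the whole image.
    Standard implementations average over local 11x11 Gaussian
    windows; this global version is only an illustration."""
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizers
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

x = np.random.rand(32, 32)
print(ssim_global(x, x))  # ≈ 1.0: identical images score maximally
# As a loss for a restoration network one would minimize 1 - SSIM.
```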
Increasing Video Perceptual Quality with GANs and Semantic Coding
We have seen a rise in video-based user communication in the last year, unfortunately fueled by the spread of COVID-19. Efficient low-latency video transmission is a challenging problem, which must also cope with the fragmented nature of network infrastructure that does not always allow high throughput. Lossy video compression is a basic requirement for enabling such technology widely. While this may compromise the quality of the streamed video, recent deep learning based solutions can restore the quality of lossy compressed video. Given the very nature of video conferencing, bitrate allocation in video streaming can be driven semantically, differentiating quality between the talking subjects and the background. To date, no work has studied the restoration of semantically coded video using deep learning. In this work we show how such videos can be efficiently generated by shifting bitrate with masks derived via computer vision, and how a deep generative adversarial network can be trained to restore video quality. Our study shows that combining semantic coding and learning-based video restoration can provide superior results.
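The mask-driven bitrate shifting described above can be sketched as follows. Real codecs shift bitrate via per-macroblock quantization (QP) maps; this toy version merely encodes a frame twice at different JPEG qualities and composites with a mask standing in for a detected talking-subject region. The function names and quality values are illustrative assumptions, not the paper's pipeline.

```python
import io
import numpy as np
from PIL import Image

def jpeg(img: Image.Image, quality: int) -> Image.Image:
    """In-memory JPEG encode/decode round trip."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def semantic_compress(frame: Image.Image, mask: np.ndarray,
                      q_fg: int = 90, q_bg: int = 20) -> Image.Image:
    """Spend more bits on the masked region by compositing a
    high-quality encode over a heavily compressed background."""
    hi = np.asarray(jpeg(frame, q_fg))
    lo = np.asarray(jpeg(frame, q_bg))
    m = mask[..., None].astype(bool)          # broadcast over channels
    return Image.fromarray(np.where(m, hi, lo))

frame = Image.fromarray((np.random.rand(64, 64, 3) * 255).astype("uint8"))
mask = np.zeros((64, 64), dtype=bool)
mask[16:48, 16:48] = True  # stand-in for a computer-vision subject mask
out = semantic_compress(frame, mask)
print(out.size)  # (64, 64)
```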
Recent Advances in Digital Image and Video Forensics, Anti-forensics and Counter Anti-forensics
Image and video forensics have recently gained increasing attention due to
the proliferation of manipulated images and videos, especially on social media
platforms, such as Twitter and Instagram, which spread disinformation and fake
news. This survey explores image and video identification and forgery detection
covering both manipulated digital media and generative media. However, media
forgery detection techniques are susceptible to anti-forensics; on the other
hand, such anti-forensics techniques can themselves be detected. We therefore
further cover both anti-forensics and counter anti-forensics techniques for
images and videos. Finally, we conclude this survey by highlighting some open
problems in this domain.
(Compress and Restore)^N: a Robust Defense Against Adversarial Attacks on Image Classification
Modern image classification approaches often rely on deep neural networks, which have shown pronounced weakness to
adversarial examples: images corrupted with specifically designed yet imperceptible noise that causes the network to misclassify.
In this paper, we propose a conceptually simple yet robust solution to tackle adversarial attacks on image classification. Our
defense works by first applying a JPEG compression with a random quality factor; compression artifacts are subsequently
removed by means of a generative model (AR-GAN). The process can be iterated,
ensuring the image is not degraded and hence the classification is not
compromised. We train different AR-GANs for different compression factors, so
that we can change the model's parameters dynamically at each iteration
depending on the current compression, making gradient approximation difficult.
We evaluate our defense against three white-box and two black-box attacks,
with a particular focus on the state-of-the-art BPDA attack. Our method does
not require any adversarial training and is independent of both the classifier
and the attack. Experiments demonstrate that dynamically changing the AR-GAN
parameters is of fundamental importance for obtaining significant robustness.
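The iterated compress-and-restore loop can be sketched as below. The random-quality JPEG round trip is real; the `restore` function is a placeholder for the compression-factor-specific AR-GAN described in the abstract (here it returns the decoded image unchanged), so the quality bounds and iteration count are illustrative assumptions only.

```python
import io
import random
import numpy as np
from PIL import Image

def jpeg_round_trip(img: Image.Image, quality: int) -> Image.Image:
    """Compress to JPEG in memory at the given quality, then decode."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def restore(img: Image.Image, quality: int) -> Image.Image:
    # Placeholder for the AR-GAN trained for this compression factor;
    # a real defense would select the matching restoration model here.
    return img

def compress_and_restore_n(img: Image.Image, n: int = 3) -> Image.Image:
    """n rounds of random-quality compression followed by restoration;
    the per-round randomness is what makes gradient approximation hard."""
    for _ in range(n):
        q = random.randint(40, 90)  # random quality factor each round
        img = restore(jpeg_round_trip(img, q), q)
    return img

arr = (np.random.rand(64, 64, 3) * 255).astype("uint8")
out = compress_and_restore_n(Image.fromarray(arr), n=3)
print(out.size)  # (64, 64)
```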
Palette: Image-to-Image Diffusion Models
This paper develops a unified framework for image-to-image translation based
on conditional diffusion models and evaluates this framework on four
challenging image-to-image translation tasks, namely colorization, inpainting,
uncropping, and JPEG restoration. Our simple implementation of image-to-image
diffusion models outperforms strong GAN and regression baselines on all tasks,
without task-specific hyper-parameter tuning, architecture customization,
auxiliary losses, or other sophisticated new techniques. We uncover the
impact of an L2 vs. L1 loss in the denoising diffusion objective on sample
diversity, and demonstrate the importance of self-attention in the neural
architecture through empirical studies. Importantly, we advocate a unified
evaluation protocol based on ImageNet, with human evaluation and sample quality
scores (FID, Inception Score, Classification Accuracy of a pre-trained
ResNet-50, and Perceptual Distance against original images). We expect this
standardized evaluation protocol to play a role in advancing image-to-image
translation research. Finally, we show that a generalist, multi-task diffusion
model performs as well or better than task-specific specialist counterparts.
Check out https://diffusion-palette.github.io for an overview of the results
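The L2 vs. L1 comparison in the denoising diffusion objective can be written down as a toy sketch. The noise schedule below is a deliberately simple linear one (not the schedule used in the paper), and the zero-predicting lambda stands in for the conditional U-Net; only the shape of the objective is faithful.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffusion_denoising_loss(x0, eps_pred_fn, t_frac, norm="l2"):
    """One-sample denoising objective: the model predicts the noise
    eps that was mixed into x0. t_frac in (0, 1] plays the role of
    the timestep; the linear schedule here is illustrative only."""
    eps = rng.standard_normal(x0.shape)
    alpha = 1.0 - t_frac                       # toy linear schedule
    x_t = np.sqrt(alpha) * x0 + np.sqrt(1 - alpha) * eps
    residual = eps_pred_fn(x_t, t_frac) - eps
    if norm == "l1":
        return np.abs(residual).mean()         # L1: higher sample diversity
    return (residual ** 2).mean()              # L2

x0 = rng.standard_normal((8, 8, 3))
dummy_pred = lambda x_t, t: np.zeros_like(x_t)  # stand-in for the U-Net
l2 = diffusion_denoising_loss(x0, dummy_pred, t_frac=0.5, norm="l2")
l1 = diffusion_denoising_loss(x0, dummy_pred, t_frac=0.5, norm="l1")
print(l2 > 0 and l1 > 0)  # True
```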
Boosting Cross-Quality Face Verification using Blind Face Restoration
In recent years, various Blind Face Restoration (BFR) techniques were
developed. These techniques transform low quality faces suffering from multiple
degradations to more realistic and natural face images with high perceptual
quality. However, it is crucial for the task of face verification to not only
enhance the perceptual quality of the low quality images but also to improve
the biometric-utility face quality metrics. Furthermore, preserving the
valuable identity information is of great importance. In this paper, we
investigate the impact of applying three state-of-the-art blind face
restoration techniques namely, GFP-GAN, GPEN and SGPN on the performance of
face verification systems in a very challenging setting characterized by
very low-quality images. Extensive experimental results on the recently
proposed cross-quality LFW database using three state-of-the-art deep face
recognition models demonstrate the effectiveness of GFP-GAN in significantly
boosting face verification accuracy. Comment: paper accepted at the BIOSIG 2023 conference.