24 research outputs found
Deep Gaussian Denoiser Epistemic Uncertainty and Decoupled Dual-Attention Fusion
Following the performance breakthrough of denoising networks, improvements
have come chiefly through novel architecture designs and increased depth. While
novel denoising networks have been designed for real images from different
distributions, or for specific applications, comparatively little improvement
has been achieved on Gaussian denoising. These denoising solutions suffer from
epistemic uncertainty that can limit further advancements. This uncertainty is
traditionally mitigated through different ensemble approaches. However, such
ensembles are prohibitively costly with deep networks, which are already large
in size.
Our work focuses on pushing the performance limits of state-of-the-art
methods on Gaussian denoising. We propose a model-agnostic approach for
reducing epistemic uncertainty while using only a single pretrained network. We
achieve this by tapping into the epistemic uncertainty through augmented and
frequency-manipulated images to obtain denoised images with varying error. We
propose an ensemble method with two decoupled attention paths, over the pixel
domain and over that of our different manipulations, to learn the final fusion.
Our results significantly improve over the state-of-the-art baselines and
across varying noise levels.
Comment: Code and models are publicly available on https://github.com/IVRL/DE
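The augmentation-based ensembling described above can be illustrated at a toy scale. The sketch below is an assumption for illustration only: `denoise` is a hypothetical box-filter stand-in for a pretrained network, and the final averaging replaces the paper's learned dual-attention fusion. It denoises the eight flip/rotation variants of an image, undoes each transform, and averages the results:

```python
import numpy as np

def denoise(img):
    # Hypothetical stand-in for a pretrained Gaussian denoiser:
    # a simple 3x3 box filter computed via edge padding.
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + img.shape[0],
                          1 + dx : 1 + dx + img.shape[1]]
    return out / 9.0

def self_ensemble_denoise(img):
    """Denoise the 8 flip/rotation variants of `img`, undo each
    transform, and average -- a simple stand-in for the learned
    attention fusion described in the abstract."""
    outputs = []
    for k in range(4):                  # four 90-degree rotations
        for flip in (False, True):      # with and without horizontal flip
            aug = np.rot90(img, k)
            if flip:
                aug = np.fliplr(aug)
            den = denoise(aug)
            if flip:                    # undo the transforms in reverse order
                den = np.fliplr(den)
            outputs.append(np.rot90(den, -k))
    return np.mean(outputs, axis=0)
```

With a learned denoiser the eight outputs would differ, and a fusion module can exploit that variation; with this symmetric toy filter they coincide.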
Boosting Cross-Quality Face Verification using Blind Face Restoration
In recent years, various Blind Face Restoration (BFR) techniques were
developed. These techniques transform low quality faces suffering from multiple
degradations to more realistic and natural face images with high perceptual
quality. However, it is crucial for the task of face verification to not only
enhance the perceptual quality of the low quality images but also to improve
the biometric-utility face quality metrics. Furthermore, preserving the
valuable identity information is of great importance. In this paper, we
investigate the impact of applying three state-of-the-art blind face
restoration techniques, namely GFP-GAN, GPEN, and SGPN, on the performance of
a face verification system in a very challenging setting characterized by
very low quality images. Extensive experimental results on the recently
proposed cross-quality LFW database using three state-of-the-art deep face
recognition models demonstrate the effectiveness of GFP-GAN in significantly
boosting face verification accuracy.
Comment: paper accepted at the BIOSIG 2023 conference
SVW-UCF Dataset for Video Domain Adaptation
Unsupervised video domain adaptation (DA) has recently seen a lot of success, achieving near-perfect results on the majority of existing benchmark datasets. The natural next step for the field is therefore to pose new, more challenging problems that call for creative solutions. By combining two well-known datasets, SVW and UCF, we propose a large-scale video domain adaptation dataset that is not only larger in terms of samples and average video length, but also presents additional obstacles, such as orientation and intra-class variations, differences in resolution, and greater domain discrepancy, both in terms of content and capturing conditions. We perform an accuracy-gap comparison which shows that both SVW→UCF and UCF→SVW are empirically more difficult to solve than existing adaptation paths. Finally, we evaluate two state-of-the-art video DA algorithms on the dataset to establish benchmark results, and discuss the properties that create the most confusion for modern video domain adaptation methods.
Two-stage Progressive Residual Dense Attention Network for Image Denoising
Deep convolutional neural networks (CNNs) for image denoising can effectively
exploit rich hierarchical features and have achieved great success. However,
many deep CNN-based denoising models equally utilize the hierarchical features
of noisy images without paying attention to the more important and useful
features, leading to relatively low performance. To address this issue, we
design a new Two-stage Progressive Residual Dense Attention Network
(TSP-RDANet) for image denoising, which divides the whole process of denoising
into two sub-tasks to remove noise progressively. Two different attention
mechanism-based denoising networks are designed for the two sequential
sub-tasks: the residual dense attention module (RDAM) is designed for the first
stage, and the hybrid dilated residual dense attention module (HDRDAM) is
proposed for the second stage. The proposed attention modules are able to learn
appropriate local features through dense connections between different
convolutional layers, while irrelevant features are suppressed. The two
sub-networks are then connected by a long skip connection that retains shallow
features to enhance the denoising performance. Experiments on seven
benchmark datasets have verified that compared with many state-of-the-art
methods, the proposed TSP-RDANet can obtain favorable results both on synthetic
and real noisy image denoising. The code of our TSP-RDANet is available at
https://github.com/WenCongWu/TSP-RDANet
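The two-stage progressive layout with a long skip connection can be sketched at a toy scale. This is an assumption for illustration, not the actual TSP-RDANet: `stage` is a hypothetical blend with a 3x3 local mean standing in for the RDAM and HDRDAM sub-networks, and the long skip is modeled as averaging the first stage's output back in:

```python
import numpy as np

def stage(img, weight):
    # Hypothetical stand-in for one attention-based denoising
    # sub-network: blend the input with its 3x3 local mean.
    padded = np.pad(img, 1, mode="edge")
    local_mean = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            local_mean += padded[1 + dy : 1 + dy + img.shape[0],
                                 1 + dx : 1 + dx + img.shape[1]]
    local_mean /= 9.0
    return (1 - weight) * img + weight * local_mean

def two_stage_denoise(noisy, w1=0.5, w2=0.5):
    """Progressive two-stage denoising with a long skip connection,
    mirroring the TSP-RDANet layout (RDAM -> HDRDAM) at a toy scale."""
    shallow = stage(noisy, w1)        # first sub-task (RDAM role)
    refined = stage(shallow, w2)      # second sub-task (HDRDAM role)
    return 0.5 * (refined + shallow)  # long skip: reuse shallow features
```

The point of the structure is that noise is removed progressively: the second stage refines a partially denoised input rather than the raw noisy image, while the skip path preserves shallow detail.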
EWT: Efficient Wavelet-Transformer for Single Image Denoising
Transformer-based image denoising methods have achieved encouraging results
in the past year. However, they must use linear operations to model long-range
dependencies, which greatly increases model inference time and GPU memory
consumption. Compared with convolutional neural network-based methods,
current Transformer-based image denoising methods cannot achieve a balance
between performance improvement and resource consumption. In this paper, we
propose an Efficient Wavelet Transformer (EWT) for image denoising.
Specifically, we use Discrete Wavelet Transform (DWT) and Inverse Wavelet
Transform (IWT) for downsampling and upsampling, respectively. This method can
fully preserve the image features while reducing the image resolution, thereby
greatly reducing the device resource consumption of the Transformer model.
Furthermore, we propose a novel Dual-stream Feature Extraction Block (DFEB) to
extract image features at different levels, which can further reduce model
inference time and GPU memory usage. Experiments show that our method speeds up
the original Transformer by more than 80%, reduces GPU memory usage by more
than 60%, and achieves excellent denoising results. All code will be public.
Comment: 12 pages, 11 figures
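The DWT/IWT down- and upsampling idea can be made concrete with a one-level 2-D Haar transform, which halves the resolution while keeping all of the information (this is a generic Haar sketch in numpy, not EWT's actual implementation):

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2-D Haar DWT: splits an even-sized image into
    four half-resolution sub-bands (LL, LH, HL, HH)."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # low-pass approximation
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def haar_iwt2(ll, lh, hl, hh):
    """Inverse transform: reassembles the full-resolution image
    exactly from the four sub-bands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```

Because the transform is invertible, a Transformer operating on the half-resolution LL band (plus detail bands) processes 4x fewer spatial positions without discarding image content, which is the source of the memory and speed savings the abstract describes.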
Impact of traditional and embedded image denoising on CNN-based deep learning
In digital image processing, filtering noise is an important step in reconstructing a high-quality image for further processing such as object segmentation, object detection, and object recognition. Various image-denoising approaches, including median, Gaussian, and bilateral filters, are available in the literature. Since convolutional neural networks (CNNs) can directly learn complex patterns and features from data, they have become a popular choice for image-denoising tasks; their ability to learn and adapt to various denoising scenarios makes them powerful tools for image denoising. Some deep learning techniques incorporate denoising strategies directly into the CNN model layers. A primary limitation of these methods is that they require images to be resized to a consistent size, which can result in a loss of vital image details and might compromise the CNN's effectiveness. Because of this issue, we utilize a traditional denoising method as a preliminary noise-reduction step before applying the CNN. To our knowledge, a comparative performance study of CNNs using traditional and embedded denoising against a baseline approach (without denoising) has yet to be performed. To analyze the impact of denoising on CNN performance, in this paper we firstly filter the noise from the images using a traditional denoising method before their use in the CNN model, and secondly embed a denoising layer in the CNN model. To validate the performance of image denoising, we performed extensive experiments on both traffic-sign and object-recognition datasets. To decide whether denoising should be adopted, and which type of filter to use, we also present an approach exploiting the peak signal-to-noise ratio (PSNR) distribution of images. Both CNN accuracy and the PSNR distribution are used to evaluate the effectiveness of the denoising approaches.
As expected, the results vary with the type of filter, its impact, and the dataset used in both the traditional and embedded denoising approaches. However, traditional denoising yields better accuracy, while embedded denoising has lower computational time in most cases. Overall, this comparative study gives insights into whether denoising should be adopted in various CNN-based image analyses, including autonomous driving, animal detection, and facial recognition.
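The PSNR-based filter-selection idea above can be sketched as follows. This is a generic illustration, not the paper's actual procedure: `psnr` is the standard peak signal-to-noise ratio, and `choose_filter` picks whichever candidate filter scores highest against a reference image (the candidate filters passed in are hypothetical):

```python
import numpy as np

def psnr(reference, filtered, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image
    and a filtered version of it."""
    mse = np.mean((reference.astype(float) - filtered.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def choose_filter(reference, noisy, filters):
    """Apply each candidate filter to `noisy` and keep the one whose
    output scores the highest PSNR against the reference."""
    scored = {name: psnr(reference, f(noisy)) for name, f in filters.items()}
    best = max(scored, key=scored.get)
    return best, scored
```

In practice the same comparison would be run over the distribution of PSNR values across a whole dataset, with CNN accuracy as the second criterion, as the abstract describes.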