72 research outputs found
Joint-SRVDNet: Joint Super Resolution and Vehicle Detection Network
In many domestic and military applications, aerial vehicle detection and
super-resolution algorithms are frequently developed and applied independently.
However, aerial vehicle detection on super-resolved images remains a
challenging task due to the lack of discriminative information in the
super-resolved images. To address this problem, we propose a Joint
Super-Resolution and Vehicle Detection Network (Joint-SRVDNet) that tries to
generate discriminative, high-resolution images of vehicles from low-resolution
aerial images. First, aerial images are up-scaled by a factor of 4x using a
Multi-scale Generative Adversarial Network (MsGAN), which has multiple
intermediate outputs with increasing resolutions. Second, a detector is trained
on super-resolved images that are upscaled by a factor of 4x using the MsGAN
architecture, and finally, the detection loss is minimized jointly with the
super-resolution loss to encourage the target detector to be sensitive to the
subsequent super-resolution training. The network jointly learns hierarchical and
discriminative features of targets and produces optimal super-resolution
results. We perform both quantitative and qualitative evaluation of our proposed
network on VEDAI, xView and DOTA datasets. The experimental results show that
our proposed framework achieves better visual quality than the state-of-the-art
methods for aerial super-resolution with 4x up-scaling factor and improves the
accuracy of aerial vehicle detection
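As a rough illustration of the joint training idea in this abstract, the sketch below combines a super-resolution loss and a detection loss into one scalar objective. The function name `joint_loss` and the weight `lambda_det` are hypothetical; the abstract does not state how the two losses are weighted, so a plain weighted sum is assumed here.

```python
def joint_loss(sr_loss: float, det_loss: float, lambda_det: float = 1.0) -> float:
    """Combine a super-resolution loss and a detection loss into one objective.

    Assumption: a simple weighted sum. The abstract only says the two
    losses are minimized jointly, not how they are balanced.
    """
    if lambda_det < 0:
        raise ValueError("lambda_det must be non-negative")
    return sr_loss + lambda_det * det_loss


# Example: a detection loss of 3.0 weighted by 0.5, added to an SR loss of 2.0.
total = joint_loss(sr_loss=2.0, det_loss=3.0, lambda_det=0.5)
```

In a real training loop both terms would be differentiable tensors, so gradients from the detector would also flow back into the super-resolution generator.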
Generative Adversarial Network and Its Application in Aerial Vehicle Detection and Biometric Identification System
In recent years, generative adversarial networks (GANs) have shown great potential in advancing the state-of-the-art in many areas of computer vision, most notably in image synthesis and manipulation tasks. GAN is a generative model which simultaneously trains a generator and a discriminator in an adversarial manner to produce real-looking synthetic data by capturing the underlying data distribution. Due to its powerful ability to generate high-quality and visually pleasing results, we apply it to super-resolution and image-to-image translation techniques to address vehicle detection in low-resolution aerial images and cross-spectral cross-resolution iris recognition. First, we develop a Multi-scale GAN (MsGAN) with multiple intermediate outputs, which progressively learns the details and features of the high-resolution aerial images at different scales. Then the upscaled super-resolved aerial images are fed to a You Only Look Once-version 3 (YOLO-v3) object detector and the detection loss is jointly optimized along with a super-resolution loss to emphasize target vehicles sensitive to the super-resolution process. There is another problem that remains unsolved when detection takes place at night or in a dark environment, which requires an IR detector. Training such a detector needs a lot of infrared (IR) images. To address these challenges, we develop a GAN-based joint cross-modal super-resolution framework where low-resolution (LR) IR images are translated and super-resolved to high-resolution (HR) visible (VIS) images before applying detection. This approach significantly improves the accuracy of aerial vehicle detection by leveraging the benefits of super-resolution techniques in a cross-modal domain. Second, to increase the performance and reliability of deep learning-based biometric identification systems, we focus on developing conditional GAN (cGAN) based cross-spectral cross-resolution iris recognition and offer two different frameworks.
The first approach trains a cGAN to jointly translate and super-resolve LR near-infrared (NIR) iris images to HR VIS iris images to perform cross-spectral cross-resolution iris matching at the same resolution and within the same spectrum. In the second approach, we design a coupled GAN (cpGAN) architecture to project both VIS and NIR iris images into a low-dimensional embedding domain. The goal of this architecture is to ensure maximum pairwise similarity between the feature vectors from the two iris modalities of the same subject. We have also proposed a pose attention-guided coupled profile-to-frontal face recognition network to learn discriminative and pose-invariant features in an embedding subspace. To show that the feature vectors learned by this deep subspace can be used for other tasks beyond recognition, we implement a GAN architecture which is able to reconstruct a frontal face from its corresponding profile face. This capability can be used in various face analysis tasks, such as emotion detection and expression tracking, where having a frontal face image can improve accuracy and reliability. Overall, our research has shown its efficacy by achieving new state-of-the-art results through extensive experiments on publicly available datasets reported in the literature
Implementation of Super Resolution Techniques in Geospatial Satellite Imagery
The potential for more precise land cover classifications and pattern analysis is provided by technological advancements and the growing accessibility of high-resolution satellite images, which might significantly improve the detection and quantification of land cover change for conservation. A group of methods known as "super-resolution imaging" use generative modelling to increase the resolution of an imaging system. Super-Resolution Imaging, which falls under the category of sophisticated computer vision and image processing, has a variety of practical uses, including astronomical imaging, surveillance and security, medical imaging, and satellite imaging. As computer vision is where deep learning algorithms for super-resolution first appeared, they were mostly created on RGB images in 8-bit colour depth, where the sensor and camera are separated by a few meters. But no evaluation of these methods has been done
Underwater Image Super-Resolution using Deep Residual Multipliers
We present a deep residual network-based generative model for single image
super-resolution (SISR) of underwater imagery for use by autonomous underwater
robots. We also provide an adversarial training pipeline for learning SISR from
paired data. In order to supervise the training, we formulate an objective
function that evaluates the perceptual quality of an image based on
its global content, color, and local style information. Additionally, we
present USR-248, a large-scale dataset of three sets of underwater images of
'high' (640x480) and 'low' (80x60, 160x120, and 320x240) spatial resolution.
USR-248 contains paired instances for supervised training of 2x, 4x, or 8x SISR
models. Furthermore, we validate the effectiveness of our proposed model
through qualitative and quantitative experiments and compare the results with
several state-of-the-art models' performances. We also analyze its practical
feasibility for applications such as scene understanding and attention modeling
in noisy visual conditions
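The resolutions quoted in this abstract line up with the 2x, 4x, and 8x scale factors, which the short sketch below checks by dividing the high-resolution size by each factor. The helper name `low_res_size` is hypothetical and is not part of the USR-248 dataset tooling.

```python
HIGH_RES = (640, 480)  # 'high' resolution quoted in the abstract (width, height)


def low_res_size(high_res: tuple[int, int], scale: int) -> tuple[int, int]:
    """Return the low-resolution size paired with high_res at a given SISR scale."""
    width, height = high_res
    if scale <= 0 or width % scale or height % scale:
        raise ValueError("scale must be positive and divide both dimensions evenly")
    return (width // scale, height // scale)


# Map each scale factor mentioned in the abstract to its low-resolution size.
sizes = {scale: low_res_size(HIGH_RES, scale) for scale in (2, 4, 8)}
```

The resulting sizes are 320x240, 160x120, and 80x60, the same three 'low' resolutions the abstract lists.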
Generative Adversarial Super-Resolution at the Edge with Knowledge Distillation
Single-Image Super-Resolution can support robotic tasks in environments where
a reliable visual stream is required to monitor the mission, handle
teleoperation or study relevant visual details. In this work, we propose an
efficient Generative Adversarial Network model for real-time Super-Resolution.
We adopt a tailored architecture of the original SRGAN and model quantization
to boost the execution on CPU and Edge TPU devices, achieving up to 200 fps
inference. We further optimize our model by distilling its knowledge to a
smaller version of the network and obtain remarkable improvements compared to
the standard training approach. Our experiments show that our fast and
lightweight model preserves considerably satisfying image quality compared to
heavier state-of-the-art models. Finally, we conduct experiments on image
transmission with bandwidth degradation to highlight the advantages of the
proposed system for mobile robotic applications
License Plate Super-Resolution Using Diffusion Models
In surveillance, accurately recognizing license plates is hindered by their
often low quality and small dimensions, compromising recognition precision.
Despite advancements in AI-based image super-resolution, methods like
Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs)
still fall short in enhancing license plate images. This study leverages the
cutting-edge diffusion model, which has consistently outperformed other deep
learning techniques in image restoration. By training this model using a
curated dataset of Saudi license plates, both in low and high resolutions, we
discovered the diffusion model's superior efficacy. The method achieves a
12.55% and 37.32% improvement in Peak Signal-to-Noise Ratio (PSNR) over SwinIR
and ESRGAN, respectively. Moreover, our method surpasses these techniques in
terms of Structural Similarity Index (SSIM), registering a 4.89% and 17.66%
improvement over SwinIR and ESRGAN, respectively. Furthermore, 92% of human
evaluators preferred our images over those from other algorithms. In essence,
this research presents a pioneering solution for license plate
super-resolution, with tangible potential for surveillance systems
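For readers unfamiliar with the metric, the sketch below computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE), together with a percentage improvement over a baseline. The abstract does not say how its percentages were derived, so the relative-change formula is an assumption, and the function names are hypothetical.

```python
import math


def psnr(reference: list[float], estimate: list[float], max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_improvement(new: float, baseline: float) -> float:
    """Percentage change of `new` over `baseline` (assumed reporting convention)."""
    return 100.0 * (new - baseline) / baseline
```

For example, under this convention a method scoring 30 dB against a 24 dB baseline would be reported as a 25% improvement.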
- …