
    Deep Adversarial Frameworks for Visually Explainable Periocular Recognition

    Machine Learning (ML) models have pushed state-of-the-art performance closer to (and even beyond) human level. However, the core of such algorithms is usually latent and hardly understandable. Thus, the field of Explainability focuses on researching and adopting techniques that can explain the reasons that support a model's predictions. Such explanations of the decision-making process would help to build trust between said model and the human(s) using it. An explainable system also allows for better debugging during the training phase, and fixing upon deployment. But why should a developer devote time and effort to refactoring or rethinking Artificial Intelligence (AI) systems to make them more transparent? Don't they work just fine? Despite the temptation to answer "yes", are we really considering the cases where these systems fail? Are we assuming that "almost perfect" accuracy is good enough? What if some of the cases these systems get right were just a small margin away from a complete miss? Does that even matter? Considering the ever-growing presence of ML models in crucial areas like forensics, security and healthcare services, it clearly does. Motivating these concerns is the fact that powerful systems often operate as black boxes, hiding the core reasoning underneath layers of abstraction [Gue]. In this scenario, there could be some seriously negative outcomes if opaque algorithms gamble on the presence of tumours in X-ray images or the way autonomous vehicles behave in traffic. It becomes clear, then, that incorporating explainability into AI is imperative. More recently, legislators have addressed this urgency through the General Data Protection Regulation (GDPR) [Com18]. With this document, the European Union (EU) brings forward several important concepts, among them the "right to an explanation". Its definition and scope are still subject to debate [MF17], but these are definite strides towards formally regulating the explainable depth of autonomous systems. Based on the preface above, this work describes a periocular recognition framework that not only performs biometric recognition but also provides clear representations of the features/regions that support a prediction. Being particularly designed to explain non-match ("impostor") decisions, our solution uses adversarial generative techniques to synthesise a large set of "genuine" image pairs, from which the elements most similar to a query are retrieved. Then, assuming alignment between the query and retrieved pairs, the element-wise differences between the query and a weighted average of the retrieved elements yield a visual explanation of the regions in the query pair that would have to be different to transform it into a "genuine" pair. Our quantitative and qualitative experiments validate the proposed solution, yielding recognition rates that are similar to the state-of-the-art, while adding visually pleasing explanations.
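    A minimal sketch of the explanation step described above, assuming the GAN-synthesised "genuine" pairs are already generated and aligned with the query. The L2 retrieval and inverse-distance weighting are illustrative assumptions, not necessarily the paper's exact choices:

```python
import numpy as np

def explain_non_match(query_pair, synthetic_pairs, k=5):
    """Visual explanation of a non-match decision, per the scheme above.

    query_pair      : (H, W) array, the aligned query pair representation
    synthetic_pairs : (N, H, W) array of GAN-synthesised "genuine" pairs
    """
    n = len(synthetic_pairs)
    # Retrieve the k synthetic pairs most similar to the query
    # (plain L2 distance here; the paper's similarity measure may differ).
    dists = np.linalg.norm(
        synthetic_pairs.reshape(n, -1) - query_pair.ravel(), axis=1)
    idx = np.argsort(dists)[:k]

    # Weighted average of the retrieved elements (inverse-distance
    # weights are one plausible choice).
    w = 1.0 / (dists[idx] + 1e-8)
    w /= w.sum()
    prototype = np.tensordot(w, synthetic_pairs[idx], axes=1)

    # Element-wise difference: the regions that would have to change to
    # turn the query into a "genuine" pair.
    return np.abs(query_pair - prototype)
```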

    Understanding How Image Quality Affects Deep Neural Networks

    Image quality is an important practical challenge that is often overlooked in the design of machine vision systems. Commonly, machine vision systems are trained and tested on high-quality image datasets, yet in practical applications the input images cannot be assumed to be of high quality. Recently, deep neural networks have obtained state-of-the-art performance on many machine vision tasks. In this paper we provide an evaluation of four state-of-the-art deep neural network models for image classification under quality distortions. We consider five types of quality distortions: blur, noise, contrast, JPEG compression, and JPEG2000 compression. We show that the existing networks are susceptible to these quality distortions, particularly to blur and noise. These results enable future work in developing deep neural networks that are more invariant to quality distortions.
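    A rough sketch of this kind of evaluation protocol: apply a distortion at a given severity and measure how top-1 accuracy degrades. The `predict` callable and `dataset` iterable are placeholders for the evaluated networks and test set, and the distortion parameterisations are assumptions:

```python
import io
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

def distort(img, kind, level):
    """Apply one quality distortion to a PIL image.
    (JPEG2000 works analogously via Pillow's 'JPEG2000' format.)"""
    if kind == "blur":
        return img.filter(ImageFilter.GaussianBlur(radius=level))
    if kind == "noise":
        arr = np.asarray(img, dtype=np.float32)
        arr += np.random.normal(0.0, level, arr.shape)   # additive Gaussian noise
        return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if kind == "contrast":
        return ImageEnhance.Contrast(img).enhance(level)  # level < 1 reduces contrast
    if kind == "jpeg":
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=int(level))  # level = JPEG quality factor
        buf.seek(0)
        return Image.open(buf)
    raise ValueError(f"unknown distortion: {kind}")

def accuracy_under_distortion(predict, dataset, kind, level):
    """Fraction of (image, label) pairs still classified correctly
    after the distortion is applied."""
    correct = sum(predict(distort(img, kind, level)) == label
                  for img, label in dataset)
    return correct / len(dataset)
```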

    Generative Adversarial Network based machine for fake data generation

    This paper introduces a first approach to using Generative Adversarial Networks (GANs) for the generation of fake data, with the objective of anonymizing patients' information in the health sector. The intent is to create valuable data that can be used in both educational and research settings, while avoiding the risk of a sensitive data leak. For this purpose, a thorough survey of the state of the art in GANs and of the available databases was first carried out. The outcome of the project is a GAN system prototype adapted to generate raw data that imitates samples such as a patient's hypothyroidism status variables or a cardiogram report. The performance of this prototype has been verified, and satisfactory results have been obtained for this first phase. Moreover, a novel research pathway has been opened so that further research can be developed.
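    A minimal sketch of such a GAN for tabular health records, in PyTorch. The layer sizes, the number of record columns (`n_features`), and the training step are illustrative assumptions, not the prototype's actual architecture:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, noise_dim=32, n_features=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 64), nn.ReLU(),
            nn.Linear(64, n_features))           # one output per record column

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self, n_features=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.LeakyReLU(0.2),
            nn.Linear(64, 1))                    # real/fake logit

    def forward(self, x):
        return self.net(x)

def train_step(G, D, real_batch, opt_g, opt_d, noise_dim=32):
    """One adversarial update on a batch of real (anonymised) records."""
    bce = nn.BCEWithLogitsLoss()
    b = len(real_batch)
    fake = G(torch.randn(b, noise_dim))

    # Discriminator: real records -> 1, synthetic records -> 0.
    opt_d.zero_grad()
    d_loss = (bce(D(real_batch), torch.ones(b, 1))
              + bce(D(fake.detach()), torch.zeros(b, 1)))
    d_loss.backward()
    opt_d.step()

    # Generator: try to make synthetic records pass as real.
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(b, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```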

    Generative Adversarial Network and Its Application in Aerial Vehicle Detection and Biometric Identification System

    In recent years, generative adversarial networks (GANs) have shown great potential in advancing the state of the art in many areas of computer vision, most notably in image synthesis and manipulation tasks. A GAN is a generative model that simultaneously trains a generator and a discriminator in an adversarial manner to produce realistic-looking synthetic data by capturing the underlying data distribution. Owing to its powerful ability to generate high-quality and visually pleasing results, we apply it to super-resolution and image-to-image translation techniques to address vehicle detection in low-resolution aerial images and cross-spectral, cross-resolution iris recognition. First, we develop a Multi-scale GAN (MsGAN) with multiple intermediate outputs, which progressively learns the details and features of high-resolution aerial images at different scales. The upscaled, super-resolved aerial images are then fed to a You Only Look Once version 3 (YOLO-v3) object detector, and the detection loss is jointly optimized along with a super-resolution loss to emphasize target vehicles that are sensitive to the super-resolution process. A further problem remains when detection takes place at night or in a dark environment, which requires an infrared (IR) detector; training such a detector needs a large number of IR images. To address these challenges, we develop a GAN-based joint cross-modal super-resolution framework in which low-resolution (LR) IR images are translated and super-resolved to high-resolution (HR) visible (VIS) images before detection is applied. This approach significantly improves the accuracy of aerial vehicle detection by leveraging the benefits of super-resolution techniques in a cross-modal domain. Second, to increase the performance and reliability of deep learning-based biometric identification systems, we focus on developing conditional GAN (cGAN) based cross-spectral, cross-resolution iris recognition and offer two different frameworks. The first approach trains a cGAN to jointly translate and super-resolve LR near-infrared (NIR) iris images to HR VIS iris images, so that cross-spectral, cross-resolution iris matching is performed at the same resolution and within the same spectrum. In the second approach, we design a coupled GAN (cpGAN) architecture to project both VIS and NIR iris images into a low-dimensional embedding domain; the goal of this architecture is to ensure maximum pairwise similarity between the feature vectors from the two iris modalities of the same subject. We have also proposed a pose-attention-guided coupled profile-to-frontal face recognition network to learn discriminative, pose-invariant features in an embedding subspace. To show that the feature vectors learned by this deep subspace can be used for tasks beyond recognition, we implement a GAN architecture that is able to reconstruct a frontal face from its corresponding profile face. This capability can be used in various face analysis tasks, such as emotion detection and expression tracking, where having a frontal face image improves accuracy and reliability. Overall, our research has shown its efficacy by achieving new state-of-the-art results in extensive experiments on publicly available datasets reported in the literature.
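    Two of the objectives described above, sketched under stated assumptions: the jointly optimized super-resolution plus detection loss, and the cpGAN term that maximises pairwise similarity between VIS and NIR embeddings of the same subject. The L1 reconstruction term, the weight `lam`, and the plain L2 coupling are illustrative choices, not necessarily the papers' exact formulations:

```python
import torch
import torch.nn.functional as F

def joint_sr_detection_loss(sr_img, hr_img, detection_loss, lam=0.1):
    """Joint objective sketch for the MsGAN + YOLO-v3 pipeline: a pixel-wise
    super-resolution term plus a weighted detection term, so the SR network
    favours details that matter for vehicle detection."""
    sr_loss = F.l1_loss(sr_img, hr_img)       # reconstruction vs. HR target
    return sr_loss + lam * detection_loss     # detection_loss from YOLO-v3

def coupling_loss(vis_embedding, nir_embedding):
    """cpGAN coupling term sketch: pull the VIS and NIR embeddings of the
    same subject together in the shared low-dimensional subspace."""
    return F.mse_loss(vis_embedding, nir_embedding)
```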