123 research outputs found

    Comparative analysis of universal methods for no-reference quality assessment of digital images

    The main purpose of this article is to conduct a comparative study of two well-known no-reference image quality assessment algorithms, BRISQUE and NIQE, in order to analyze the relationship between subjective and quantitative assessments of image quality. As experimental data, we used images with artificially created distortions and mean expert assessments of their quality from the public databases TID2013, CSIQ and LIVE. Image quality scores were calculated using the NIQE and BRISQUE functions and their average. The Pearson, Spearman and Kendall correlation coefficients were analyzed between expert visual assessments and quantitative image quality scores, as well as between the values of the three compared indicators. For the experiments, the Matlab system was used, with the values of its niqe and brisque functions normalized to the range [0, 1]. The computation time of niqe is slightly lower. The investigated functions estimate image contrast distortions poorly, but handle additive Gaussian noise, Gaussian blur and JPEG2000 compression loss better. The BRISQUE measure shows slightly better results when evaluating images with additive Gaussian noise, while NIQE performs better on Gaussian-blurred images. The average of the normalized NIQE and BRISQUE values is a good compromise. The results of this work may be of interest for practical implementations of digital image analysis.
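    As an illustration of the evaluation protocol described above, the sketch below computes Pearson, Spearman and Kendall correlations between mean opinion scores and min-max-normalized metric scores with Python's scipy.stats; the arrays and the normalization are illustrative assumptions, not the authors' code (the original study used Matlab's niqe and brisque functions).

```python
# Sketch: correlating subjective scores with normalized no-reference metrics.
# The per-image values below are placeholders, not data from the study.
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

def normalize01(x):
    """Min-max normalization to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

mos = np.array([3.2, 4.1, 2.5, 4.8, 1.9])               # mean expert scores (placeholder)
niqe_raw = np.array([5.1, 3.8, 6.4, 2.9, 7.2])           # lower = better quality
brisque_raw = np.array([40.2, 28.5, 55.1, 18.3, 61.0])

niqe_n = normalize01(niqe_raw)
brisque_n = normalize01(brisque_raw)
combined = (niqe_n + brisque_n) / 2                       # average of the normalized scores

for name, score in [("NIQE", niqe_n), ("BRISQUE", brisque_n), ("mean", combined)]:
    plcc, _ = pearsonr(mos, score)
    srocc, _ = spearmanr(mos, score)
    krocc, _ = kendalltau(mos, score)
    print(f"{name}: Pearson={plcc:.3f}, Spearman={srocc:.3f}, Kendall={krocc:.3f}")
```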

    Towards a Robust Thermal-Visible Heterogeneous Face Recognition Approach Based on a Cycle Generative Adversarial Network

    Security is a sensitive area that concerns all authorities around the world due to the emerging terrorism phenomenon. Contactless biometric technologies such as face recognition have grown in interest for their capacity to identify probe subjects without any human interaction. Since traditional face recognition systems use visible-spectrum sensors, their performance decreases rapidly when some visible imaging phenomena occur, mainly illumination changes. Unlike the visible spectrum, infrared spectra are invariant to light changes, which makes them an alternative solution for face recognition. However, in infrared, the textural information is lost. In this paper, we aim to benefit from the visible and thermal spectra by proposing a new heterogeneous face recognition approach. This approach includes four scientific contributions. The first one is the annotation of a thermal face database, which has been shared via GitHub with the whole scientific community. The second is the proposition of a multi-sensor face detector model based on the recent YOLO v3 architecture, able to detect faces captured in visible and thermal images simultaneously. The third contribution takes up the challenge of reducing the modality gap between the visible and thermal spectra by applying a new CycleGAN structure, called TV-CycleGAN, which aims to synthesize visible-like face images from thermal face images. This new thermal-to-visible synthesis method includes all extreme poses and facial expressions in color space. To show the efficacy and robustness of the proposed TV-CycleGAN, experiments have been conducted on three challenging benchmark databases covering different real-world scenarios: TUFTS and its aligned version, NVIE and PUJ. The qualitative evaluation shows that our method generates more realistic faces. The quantitative evaluation demonstrates that the proposed TV-CycleGAN gives the best improvement in face recognition rates. Whereas direct matching from thermal to visible images allows a recognition rate of 47.06% for the TUFTS database, the proposed TV-CycleGAN achieves an accuracy of 57.56% for the same database. It contributes to a rate enhancement of 29.16% and 15.71% for the NVIE and PUJ databases, respectively, and reaches an accuracy enhancement of 18.5% for the aligned TUFTS database. It also outperforms some recent state-of-the-art methods in terms of F1-score, AUC/EER and other evaluation metrics. Furthermore, it should be mentioned that the visible face images synthesized with the TV-CycleGAN method are very promising for thermal facial landmark detection, which constitutes the fourth contribution of this paper.
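    The modality-gap reduction described above relies on CycleGAN-style training. Below is a minimal, hypothetical PyTorch sketch of the cycle-consistency term for a thermal/visible generator pair; the tiny convolutional generators and the loss weight are illustrative placeholders, not the TV-CycleGAN architecture from the paper.

```python
# Minimal sketch of a CycleGAN-style cycle-consistency loss for thermal <-> visible
# synthesis. The generators below are toy stand-ins, not the TV-CycleGAN networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiny_generator(in_ch, out_ch):
    """Placeholder image-to-image generator (real models use ResNet/U-Net blocks)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(32, out_ch, kernel_size=3, padding=1),
        nn.Tanh(),
    )

G_tv = tiny_generator(1, 3)   # thermal (1 channel) -> visible-like (3 channels)
G_vt = tiny_generator(3, 1)   # visible (3 channels) -> thermal-like (1 channel)

def cycle_consistency_loss(thermal, visible, lam=10.0):
    """L1 reconstruction error after a full translation cycle in both directions."""
    rec_thermal = G_vt(G_tv(thermal))   # thermal -> visible -> thermal
    rec_visible = G_tv(G_vt(visible))   # visible -> thermal -> visible
    return lam * (F.l1_loss(rec_thermal, thermal) + F.l1_loss(rec_visible, visible))

# Usage with random tensors standing in for image batches:
thermal_batch = torch.randn(2, 1, 64, 64)
visible_batch = torch.randn(2, 3, 64, 64)
print(cycle_consistency_loss(thermal_batch, visible_batch).item())
```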

    Deep Learning Architectures for Heterogeneous Face Recognition

    Face recognition has been one of the most challenging areas of research in biometrics and computer vision. Many face recognition algorithms are designed to address illumination and pose problems for visible face images. In recent years, there has been a significant amount of research in Heterogeneous Face Recognition (HFR). The large modality gap between faces captured in different spectra, as well as the lack of training data, makes HFR quite a challenging problem. In this work, we present different deep learning frameworks to address the problem of matching non-visible face photos against a gallery of visible faces. Algorithms for thermal-to-visible face recognition can be categorized as cross-spectrum feature-based methods or cross-spectrum image synthesis methods. In cross-spectrum feature-based face recognition, a thermal probe is matched against a gallery of visible faces in a feature subspace, corresponding to the real-world scenario. The second category synthesizes a visible-like image from a thermal image, which can then be used by any commercial visible-spectrum face recognition system. These methods are also beneficial in the sense that the synthesized visible face image can be directly utilized by existing face recognition systems which operate only on visible face imagery. Therefore, using this approach one can leverage existing commercial-off-the-shelf (COTS) and government-off-the-shelf (GOTS) solutions. In addition, the synthesized images can be used by human examiners for different purposes. There are some informative traits, such as age, gender, ethnicity, race, and hair color, which are not distinctive enough for recognition on their own, but can still act as complementary information to other primary information, such as face and fingerprint. These traits, known as soft biometrics, can improve recognition algorithms while being much cheaper and faster to acquire. They can be directly used in a unimodal system for some applications. Usually, soft biometric traits have been utilized jointly with hard biometrics (the face photo) for different tasks, in the sense that they are considered to be available during both the training and testing phases. In our approaches we look at this problem in a different way. We consider the case when soft biometric information does not exist during the testing phase, and our method can predict it directly in a multi-tasking paradigm. There are situations in which training data come equipped with additional information that can be modeled as an auxiliary view of the data, and that unfortunately is not available during testing. This is the Learning Using Privileged Information (LUPI) scenario. We introduce a novel framework based on deep learning techniques that leverages the auxiliary view to improve the performance of the recognition system. We do so by introducing a formulation that is general, in the sense that it can be used with any visual classifier. Every use of auxiliary information has been validated extensively using publicly available benchmark datasets, and several new state-of-the-art accuracy values have been set. Examples of application domains include visual object recognition from RGB images and from depth data, handwritten digit recognition, and gesture recognition from video. We also design a novel aggregation framework which optimizes the landmark locations directly using only one image, without requiring any extra prior, which leads to robust alignment given arbitrary face deformations.
Three different approaches are employed to generate the manipulated faces, and two of them perform the manipulation via adversarial attacks to fool a face recognizer. This step can be decoupled from our framework and potentially used to enhance other landmark detectors. Aggregation of the manipulated faces in different branches of the proposed method leads to robust landmark detection. Finally, we focus on generative adversarial networks, a very powerful tool for synthesizing visible-like images from non-visible images. The main goal of a generative model is to approximate the true data distribution, which is not known. In general, the choice for modeling the density function is challenging. Explicit models have the advantage of explicitly calculating the probability densities. There are two well-known implicit approaches, namely the Generative Adversarial Network (GAN) and the Variational AutoEncoder (VAE), which try to model the data distribution implicitly. VAEs try to maximize a lower bound on the data likelihood, while a GAN performs a minimax game between two players during its optimization. GANs overlook explicit data density characteristics, which leads to undesirable quantitative evaluations and mode collapse. This causes the generator to create similar-looking images with poor sample diversity. In the last chapter of the thesis, we focus on addressing this issue in the GAN framework.
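    As a reminder of the two training objectives contrasted above, the standard formulations below (textbook versions, not equations taken from this thesis) show the GAN minimax game and the VAE evidence lower bound:

```latex
% GAN minimax objective between generator G and discriminator D
\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
             + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% VAE evidence lower bound (ELBO), maximized in place of the intractable likelihood
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
- \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```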

    Orthorectification of helicopter-borne high resolution experimental burn observation from infra red handheld imagers

    To pursue the development and validation of coupled fire-atmosphere models, the wildland fire modeling community needs validation data sets with scenarios where fire-induced winds influence fire front behavior, and with high temporal and spatial resolution. Helicopter-borne infrared thermal cameras have the potential to monitor landscape-scale wildland fires at high resolution during experimental burns. To extract valuable information from those observations, a three-step image processing chain is required: (a) orthorectification to warp raw images onto a fixed coordinate-system grid, (b) segmentation to delineate the fire front location from the orthorectified images, and (c) computation of fire behavior metrics such as the rate of spread from the time-evolving fire front location. This work is dedicated to the first, orthorectification step and presents a series of algorithms designed to process handheld helicopter-borne thermal images collected during savannah experimental burns. The novelty of the approach lies in its recursive design, which does not require the presence of fixed ground control points, hence relaxing the constraint on field-of-view coverage and helping the acquisition of high-frequency observations. For four burns ranging from four to eight hectares, long-wave and mid-infrared images were collected at 1 and 3 Hz, respectively, and orthorectified at a high spatial resolution (<1 m) with an absolute accuracy estimated to be better than 4 m. Subsequent computation of fire radiative power is discussed, with comparison to concurrent space-borne measurements.
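    To make the orthorectification step concrete, here is a minimal, hypothetical Python/OpenCV sketch that warps a raw frame onto a fixed ground grid using a homography estimated from four image/ground correspondences; the recursive, control-point-free scheme proposed in the paper is more involved, and the image and coordinates below are placeholders.

```python
# Sketch: warping a raw thermal frame onto a fixed ground-coordinate grid with a
# homography. Correspondences are illustrative placeholders, not data from the burns.
import cv2
import numpy as np

raw = np.random.rand(480, 640).astype(np.float32)   # stand-in for a raw IR frame

# Four (column, row) pixel locations and their ground-grid (x, y) positions in metres.
img_pts = np.float32([[50, 60], [600, 80], [620, 450], [40, 430]])
gnd_pts = np.float32([[0, 0], [200, 0], [200, 150], [0, 150]])

# Estimate the homography and resample onto a 200 m x 150 m grid at 1 m resolution.
H, _ = cv2.findHomography(img_pts, gnd_pts)
ortho = cv2.warpPerspective(raw, H, (200, 150), flags=cv2.INTER_LINEAR)
print(ortho.shape)   # orthorectified frame on the fixed ground grid
```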

    Uncooled Microbolometer Imaging Systems for Machine Vision

    Over the last 20 years, the cost of uncooled microbolometer-based imaging systems has drastically decreased while performance has increased. In the simplest terms, the figure of merit for these types of thermal detectors is the τ-NETD product, the combination of the thermal time constant and the noise equivalent temperature difference. Considering these factors, optimal system design parameters are investigated to maximize visual information content. This dissertation focuses on improving scene information in the longwave infrared (LWIR) spectrum whose validity and quality have been degraded by noise, blur, and reflected radiance. Taken together, noise and blur degrade image quality, directly affecting system performance for object detectors trained with deep learning. Representing noise with NETD and blur in terms of equivalent angular resolution, this research provides a systematic method for relating design parameters to specific machine vision tasks that are difficult to define in a traditional imaging sense. This method provides a system design approach based on information requirements rather than improvements to machine vision algorithms. As a machine vision function, automated target recognition (ATR) has improved with new technologies and the wide proliferation of infrared staring focal planes. Infrared search and track (IRST), which is the detection and localization of unresolved targets at long range, can be performed by both photon-counting and microbolometer systems. The transition from broadband system design to one that involves spectral characterization of components provides a better understanding of the performance and capabilities of new technologies. Unlike in reflective bands such as the visible and shortwave infrared (SWIR), reflected radiance in the LWIR reduces contrast, resulting in lost information. This research considers the sky path radiance contribution to the radiant exitance of a scene, which reduces contrast and, consequently, information. Results show that the reduced contrast can be overcome by utilizing multiband spectral imaging systems to remove the reflected component, thus increasing the available scene information. In addition, better scene consistency can be achieved between day and night when the reflected radiance is removed. The multiband LWIR system designs presented take advantage of the low τ-NETD product of modern microbolometers and demonstrate feasibility for future multiband applications.
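    As a simplified illustration of removing the reflected component, the sketch below applies the standard single-band radiometric relation L_measured = ε·L_emitted + (1 − ε)·L_downwelling and solves for the emitted term; the emissivity and radiance values are assumed placeholders, and the multiband approach developed in the dissertation is more sophisticated than this one-band correction.

```python
# Sketch: recovering emitted LWIR radiance by subtracting the reflected sky
# (downwelling) contribution, assuming an opaque surface with reflectance = 1 - emissivity.
# All values are illustrative placeholders.
import numpy as np

def emitted_radiance(L_measured, emissivity, L_downwelling):
    """Invert L_measured = eps * L_emitted + (1 - eps) * L_downwelling."""
    return (L_measured - (1.0 - emissivity) * L_downwelling) / emissivity

L_measured = np.array([9.8, 10.4, 11.1])   # W / (m^2 sr um), placeholder scene pixels
emissivity = 0.92                          # assumed surface emissivity
L_downwelling = 6.5                        # assumed reflected sky radiance term

print(emitted_radiance(L_measured, emissivity, L_downwelling))
```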

    MidWave vs LongWave Infrared Search and Track and Aerosol Scattering Target Acquisition Performance

    The decision on whether to use a mid-wave infrared (MWIR) or long-wave infrared (LWIR) sensor for a given task can be a formidable one. It involves facts about the observable source, the atmospheric interactions, and the sensor parameters of the hardware device. Even when all the individual metrics are known, it is their combination that ultimately determines whether an MWIR or LWIR sensor is more appropriate. Despite the vast number of variables at play, the reduction of inputs through focused studies can provide essential insight into MWIR and LWIR comparisons. This dissertation focuses on the roles that point-source target detection, atmospheric scattering and absorption effects, and target identification play in MWIR versus LWIR performance. The point-source analysis details the Pulse Visibility Factor (PVF) and how it affects the signal-to-noise ratio (SNR) for Infrared Search and Track (IRST) tasks. The PVF is an essential parameter that depends not only upon the camera system hardware but also on the dynamics of the imaged point-source target. The numerical predictions of the PVF show how the hardware transfer function spreads the point-source object across the detector array. As a result, it is a critical aspect of MWIR versus LWIR IRST system performance. Atmospheric effects are another essential study for MWIR and LWIR imaging performance. Given the multitude of atmospheric variables, the focus here is to reduce the atmospheric conditions to known particulates and concentrations in order to provide predictable results. The analysis details how a sparse aerosol medium can absorb and scatter incident light to produce blur and compromise image quality. Predictions of the aerosol Modulation Transfer Function (MTF) detail the differences in MWIR versus LWIR performance due to aerosols. The MTFs are then added into the Night Vision Integrated Performance Model (NVIPM) to calculate the ability to identify a target at range for typical MWIR and LWIR sensors.
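    Since overall image quality here is driven by the product of component MTFs, the sketch below composes a diffraction-limited optics MTF, a detector-footprint MTF, and a placeholder Gaussian aerosol MTF in Python; the aerosol term and all parameter values are illustrative assumptions, not outputs of the dissertation's models or of NVIPM.

```python
# Sketch: composing component MTFs multiplicatively to compare system response.
# The aerosol MTF is a generic Gaussian placeholder; real aerosol MTFs depend on
# particle size distribution, concentration, path length and wavelength.
import numpy as np

wavelength = 10e-6        # 10 um (LWIR), assumed
f_number = 2.0            # optics f-number, assumed
pitch = 12e-6             # detector pitch in metres, assumed

f_cutoff = 1.0 / (wavelength * f_number)     # optics cutoff frequency at the focal plane
f = np.linspace(1e-3, f_cutoff, 500)         # spatial frequencies (cycles/m)

def mtf_diffraction(f, f_cutoff):
    """Diffraction-limited MTF of a circular aperture."""
    u = np.clip(f / f_cutoff, 0.0, 1.0)
    return (2.0 / np.pi) * (np.arccos(u) - u * np.sqrt(1.0 - u**2))

def mtf_detector(f, pitch):
    """Detector footprint MTF: |sinc|, using numpy's normalized sinc(x) = sin(pi x)/(pi x)."""
    return np.abs(np.sinc(f * pitch))

def mtf_aerosol(f, f_scale=2e4):
    """Placeholder Gaussian roll-off standing in for aerosol scattering blur."""
    return np.exp(-(f / f_scale) ** 2)

mtf_system = mtf_diffraction(f, f_cutoff) * mtf_detector(f, pitch) * mtf_aerosol(f)
print(mtf_system[:5])
```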

    Evaluation of machine learning classifiers for mineralogy mapping based on near infrared hyperspectral imaging

    The exploration of mineral resources is a major challenge in a world that seeks sustainable energy, renewable energy, advanced engineering, and new commercial technological devices. The rapid decrease in mineral reserves has shifted the focus to under-explored and low-accessibility areas, which has led to the use of on-site portable techniques for mineral mapping purposes, such as near-infrared hyperspectral image sensors. The large datasets acquired with these instruments need data pre-processing, a series of mathematical manipulations that can be achieved using machine learning. The aim of this thesis is to improve an existing method for mineralogy mapping by focusing on the mineral classification phase. More specifically, a spectral similarity index was utilized to support machine learning classifiers. This was introduced because of the inability of the employed classification models to recognize samples that are not part of a given database; the models always classified samples as one of the known labels of the database. This can be a problem in hyperspectral images, as a pure component found in a sample could correspond to a mineral but also to noise or artefacts arising for a variety of reasons, such as baseline correction. The spectral similarity index calculates the similarity between a sample spectrum and the spectrum of its assigned database class; this happens through the use of a threshold that defines whether the sample belongs to a class or not. The metrics utilized in the spectral similarity index were the spectral angle mapper, the correlation coefficient and five different distances. The machine learning classifiers used to evaluate the spectral similarity index were the decision tree, k-nearest neighbor, and support vector machine. Simulated distortions were also introduced into the dataset to test the robustness of the indexes and to choose the best classifier. The spectral similarity index was assessed with a dataset of nine minerals obtained from the Geological Survey of Finland and acquired with a Specim SWIR camera. The validation of the indexes was assessed with two mine samples obtained with a VTT active hyperspectral sensor prototype. The support vector machine was chosen after the comparison between the three classifiers, as it showed higher tolerance to distorted data. The evaluation of the spectral similarity indexes showed that the best performance was achieved with SAM and the Chebyshev distance, which maintained high stability under both small and large threshold changes. The best threshold value found is the one that, in the dataset analysed, corresponded to the number of spectra available for each class. As no reference was available for the validation procedure, the results on the mine samples obtained with the spectral similarity index were compared with results obtainable through visual interpretation, and they were in agreement. The proposed method can be useful for future mineral exploration, as it is of great importance to correctly classify minerals found during exploration, regardless of the database utilized.
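    To illustrate the kind of similarity check described above, the sketch below computes the spectral angle mapper between a pixel spectrum and a class reference spectrum and applies a threshold to decide whether the pixel really belongs to its best-matching class; the spectra, class names and threshold value are illustrative placeholders, not taken from the thesis dataset.

```python
# Sketch: spectral angle mapper (SAM) with a rejection threshold, so a pixel whose
# best-matching class is still too dissimilar is labelled "unknown" instead of being
# forced onto a database class. Spectra and threshold are placeholders.
import numpy as np

def spectral_angle(a, b):
    """Angle (radians) between two spectra; smaller means more similar."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    cos_t = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos_t, -1.0, 1.0))

def classify_with_rejection(pixel, class_refs, threshold=0.10):
    """Assign the nearest class by SAM, or 'unknown' if the angle exceeds the threshold."""
    angles = {name: spectral_angle(pixel, ref) for name, ref in class_refs.items()}
    best = min(angles, key=angles.get)
    return best if angles[best] <= threshold else "unknown"

class_refs = {
    "kaolinite": np.array([0.55, 0.52, 0.40, 0.48, 0.60]),   # placeholder reference spectra
    "muscovite": np.array([0.60, 0.58, 0.50, 0.44, 0.52]),
}
pixel = np.array([0.54, 0.51, 0.41, 0.47, 0.59])
print(classify_with_rejection(pixel, class_refs))
```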

    Reduced reference image and video quality assessments: review of methods

    With the growing demand for image- and video-based applications, the requirements for consistent image and video quality assessment metrics have increased. Different approaches have been proposed in the literature to estimate the perceptual quality of images and videos. These approaches can be divided into three main categories: full reference (FR), reduced reference (RR) and no-reference (NR). In RR methods, instead of providing the original image or video as a reference, we need to provide certain features (e.g., texture, edges) of the original image or video for quality assessment. During the last decade, RR-based quality assessment has been a popular research area for a variety of applications such as social media, online games, and video streaming. In this paper, we present a review and classification of the latest research work on RR-based image and video quality assessment. We have also summarized the different databases used in the field of 2D and 3D image and video quality assessment. This paper should help specialists and researchers stay well-informed about recent progress in RR-based image and video quality assessment. The review and classification presented here will also be useful for gaining an understanding of multimedia quality assessment and the state-of-the-art approaches used for the analysis. In addition, it will help the reader select appropriate quality assessment methods and parameters for their respective applications.
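    As a toy illustration of the RR idea (transmitting a few reference-side features rather than the whole reference), the sketch below extracts a gradient-magnitude histogram from the reference image and then scores a distorted image by comparing its histogram against those transmitted features; the feature choice and the distance are illustrative assumptions, not a specific published RR metric.

```python
# Sketch: a toy reduced-reference comparison. Only a small edge-strength histogram
# of the reference is "transmitted"; the distorted image is scored against it.
import numpy as np

def edge_histogram(image, bins=16):
    """Normalized histogram of gradient magnitudes (a compact RR feature)."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, range=(0.0, 1.0), density=True)
    return hist / (hist.sum() + 1e-12)

def rr_distance(ref_features, distorted_image, bins=16):
    """Chi-square distance between reference features and the distorted image's features."""
    h = edge_histogram(distorted_image, bins)
    return 0.5 * np.sum((ref_features - h) ** 2 / (ref_features + h + 1e-12))

rng = np.random.default_rng(0)
reference = rng.random((64, 64))
distorted = reference + 0.1 * rng.standard_normal((64, 64))   # simulated degradation

ref_features = edge_histogram(reference)       # the only reference-side data sent
print(rr_distance(ref_features, distorted))    # larger = more degraded in this toy model
```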

    Face recognition by means of advanced contributions in machine learning

    Face recognition (FR) has been extensively studied, due both to fundamental scientific challenges and to current and potential applications where human identification is needed. Among their most important benefits, FR systems are non-intrusive, use low-cost equipment, and require no user agreement during acquisition. Nevertheless, despite the progress made in recent years and the different solutions proposed, FR performance is not yet satisfactory under more demanding conditions (different viewpoints, occlusions, illumination changes, strong lighting conditions, etc.). In particular, the effect of such non-controlled lighting conditions on face images leads to one of the strongest distortions in facial appearance. This dissertation addresses the problem of FR when dealing with less constrained illumination situations. In order to approach the problem, a new multi-session and multi-spectral face database has been acquired in the visible, near-infrared (NIR) and thermal infrared (TIR) spectra, under different lighting conditions. A theoretical analysis using information theory to demonstrate the complementarity between the different spectral bands was first carried out. The optimal exploitation of the information provided by the set of multispectral images was subsequently addressed by using multimodal matching-score fusion techniques that efficiently synthesize complementary, meaningful information among the different spectra. Due to peculiarities of thermal images, a specific face segmentation algorithm had to be developed. In the final proposed system, the Discrete Cosine Transform was used as a dimensionality reduction tool together with a fractional distance for matching, so that the cost in processing time and memory was significantly reduced. Prior to this classification task, a selection of the relevant frequency bands is proposed in order to optimize the overall system, based on identifying and maximizing independence relations by means of discriminability criteria. The system has been extensively evaluated on the multispectral face database specifically acquired for this purpose. In this regard, a new visualization procedure has been suggested in order to combine different bands, establish valid comparisons and give statistical information about the significance of the results. This experimental framework has facilitated improving robustness against illumination mismatch between training and testing. Additionally, the focusing problem in the thermal spectrum has also been addressed, first for the more general case of thermal images (or thermograms) and then for the case of facial thermograms, from both a theoretical and a practical point of view. In order to analyze the quality of such facial thermograms degraded by blurring, an appropriate algorithm has been successfully developed. Experimental results strongly support the proposed multispectral facial image fusion, achieving very high performance under several conditions. These results represent a new advance in providing robust matching across changes in illumination, further inspiring highly accurate FR approaches in practical scenarios.
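    To make the dimensionality-reduction and matching step concrete, the sketch below keeps a low-frequency block of the 2-D DCT of a face image as the feature vector and compares two such vectors with a fractional (p < 1) Minkowski-style distance; the block size and the value of p are illustrative assumptions rather than the thesis's tuned parameters.

```python
# Sketch: DCT-based dimensionality reduction plus a fractional distance for matching.
# Block size and the fractional exponent p are illustrative placeholders.
import numpy as np
from scipy.fft import dctn

def dct_features(face, block=8):
    """2-D DCT of the face image, keeping only the low-frequency block x block corner."""
    coeffs = dctn(face.astype(float), norm="ortho")
    return coeffs[:block, :block].ravel()

def fractional_distance(x, y, p=0.5):
    """Minkowski-style distance with p < 1, often better behaved in high dimensions."""
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

rng = np.random.default_rng(1)
probe = rng.random((64, 64))                             # stand-in for a probe face image
gallery = probe + 0.05 * rng.standard_normal((64, 64))   # stand-in for a gallery image

d = fractional_distance(dct_features(probe), dct_features(gallery))
print(d)   # smaller distance = better match in this toy setup
```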