26 research outputs found

    Noise-limited scene-change detection in images

    This thesis describes the theoretical, experimental, and practical aspects of a noise-limited method for scene-change detection in images. The research is divided into three sections: noise analysis and modelling, dual illumination scene-change modelling, and integration of noise into the scene-change model. The sources of noise within commercially available digital cameras are described, and a new model for image noise is derived for charge-coupled device (CCD) cameras. The model is validated experimentally through the development of techniques that allow the individual noise components to be measured from the analysis of output images alone. A generic model for complementary metal-oxide-semiconductor (CMOS) cameras is also derived. Methods for the analysis of spatial (inter-pixel) and temporal (intra-pixel) noise are developed. These are subsequently used to investigate the effects of environmental temperature on camera noise. Based on the cameras tested, the results show that the CCD camera noise response to variation in environmental temperature is complex, whereas the CMOS camera noise simply increases monotonically. A new approach to scene-change detection is proposed, based on a dual illumination concept in which both direct and ambient illumination sources are present in an environment, as occurs in natural outdoor scenes with direct sunlight and ambient skylight. The transition of pixel colour from the combined direct and ambient illuminants to the ambient illuminant only is modelled. A method for shadow-free scene-change detection is then developed that predicts a pixel's colour when the area in the scene is subjected to ambient illumination only, allowing pixel change to be attributed either to a cast shadow or to a genuine change in the scene. Experiments on images captured in controlled lighting demonstrate that 91% of scene-change and 83% of cast shadows are correctly determined from analysis of pixel colour change alone.
A statistical method for detecting shadow-free scene-change is developed. This is achieved by bounding the dual illumination model by the confidence interval associated with the pixel's noise. Integrating noise into the scene-change detection method brings three benefits: the need to pre-filter images for noise is removed; all empirical thresholds are removed; and performance is improved. The noise-limited scene-change detection algorithm correctly classifies 93% of scene-change and 87% of cast shadows from pixel colour change alone. When simple post-analysis size-filtering is applied, both figures increase to 95%.
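
The noise-bounding idea in the abstract above can be illustrated with a minimal sketch. It assumes a single per-channel noise standard deviation `sigma` and a coverage factor `k`, neither of which is stated in the abstract, and it swaps in a plain per-channel interval test in place of the thesis's dual illumination model. The function name, parameters, and the `predicted_shadow` input are hypothetical.

```python
def classify_pixel(observed, predicted_shadow, background, sigma, k=3.0):
    """Label a pixel as 'unchanged', 'shadow', or 'change'.

    observed, predicted_shadow, background: (R, G, B) tuples.
    sigma: assumed per-channel noise standard deviation (hypothetical input).
    k: assumed coverage factor defining the confidence interval half-width.
    """
    def within(a, b):
        # True when every channel differs by no more than k * sigma.
        return all(abs(x - y) <= k * sigma for x, y in zip(a, b))

    if within(observed, background):
        return "unchanged"   # difference is explained by noise alone
    if within(observed, predicted_shadow):
        return "shadow"      # matches the ambient-only colour prediction
    return "change"          # neither noise nor a cast shadow explains it
```

In this sketch the only tunable quantity is the coverage factor, which reflects the abstract's point that a noise-derived bound can stand in for empirical thresholds.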

    Illumination Invariant Deep Learning for Hyperspectral Data

    Motivated by the variability in hyperspectral images due to illumination and the difficulty in acquiring labelled data, this thesis proposes different approaches for learning illumination-invariant feature representations and classification models for hyperspectral data captured outdoors, under natural sunlight. The approaches integrate domain knowledge into learning algorithms and hence do not rely on a priori knowledge of atmospheric parameters, additional sensors, or large amounts of labelled training data. Hyperspectral sensors record rich semantic information from a scene, making them useful for robotics or remote sensing applications where perception systems are used to gain an understanding of the scene. Images recorded by hyperspectral sensors can, however, be affected to varying degrees by intrinsic factors relating to the sensor itself (keystone, smile, and noise, particularly at the limits of the sensed spectral range) and by extrinsic factors such as the way the scene is illuminated. The appearance of the scene in the image is tied to the incident illumination, which depends on variables such as the position of the sun, the geometry of the surface, and the prevailing atmospheric conditions. Effects such as shadows can cause the appearance and spectral characteristics of identical materials to differ significantly. This degrades the performance of high-level algorithms that use hyperspectral data, such as those that perform classification and clustering. If sufficient training data is available, learning algorithms such as neural networks can capture variability in the scene appearance and be trained to compensate for it. Learning algorithms are advantageous for this task because they do not require a priori knowledge of the prevailing atmospheric conditions or data from additional sensors.
Labelling of hyperspectral data is, however, difficult and time-consuming, so acquiring enough labelled samples for the learning algorithm to adequately capture the scene appearance is challenging. Hence, there is a need for the development of techniques that are invariant to the effects of illumination that do not require large amounts of labelled data. In this thesis, an approach to learning a representation of hyperspectral data that is invariant to the effects of illumination is proposed. This approach combines a physics-based model of the illumination process with an unsupervised deep learning algorithm, and thus requires no labelled data. Datasets that vary both temporally and spatially are used to compare the proposed approach to other similar state-of-the-art techniques. The results show that the learnt representation is more invariant to shadows in the image and to variations in brightness due to changes in the scene topography or position of the sun in the sky. The results also show that a supervised classifier can predict class labels more accurately and more consistently across time when images are represented using the proposed method. Additionally, this thesis proposes methods to train supervised classification models to be more robust to variations in illumination where only limited amounts of labelled data are available. The transfer of knowledge from well-labelled datasets to poorly labelled datasets for classification is investigated. A method is also proposed for enabling small amounts of labelled samples to capture the variability in spectra across the scene. These samples are then used to train a classifier to be robust to the variability in the data caused by variations in illumination. The results show that these approaches make convolutional neural network classifiers more robust and achieve better performance when there is limited labelled training data. 
A case study presents a pipeline that incorporates the methods proposed in this thesis for learning robust feature representations and classification models. A scene is clustered using no labelled data. The results show that the pipeline groups the data into clusters that are consistent with the spatial distribution of the classes in the scene as determined from ground truth.

    Sound Processing for Autonomous Driving

    A variety of intelligent systems for autonomous driving have been developed, and they already show a very high level of capability. One of the prerequisites for autonomous driving is an accurate and reliable representation of the environment around the vehicle. Current systems rely on cameras, RADAR, and LiDAR to capture the visual environment and to locate and track other traffic participants. Human drivers, in addition to visual cues, use a great deal of auditory information to understand the environment. In this thesis, we present a sound signal processing system for auditory-based environment representation. Sound is less affected by occlusion than the signals used by other types of sensors, and in some situations it is less sensitive to weather conditions such as snow, ice, fog, or rain. Various audio processing algorithms provide the detection and classification of different audio signals specific to certain types of vehicles, as well as localization. First, the ambient sound is classified into fourteen major categories consisting of traffic objects and actions performed. Additionally, three specific types of emergency vehicle sirens are classified. Secondly, each object is localized using a combined localization algorithm based on time difference of arrival and amplitude. The system is evaluated on real data with a focus on reliable detection and accurate localization of emergency vehicles. In the third stage, the sound source can be visualized on the image from the autonomous vehicle's camera system. For this purpose, a method for camera-to-microphone calibration has been developed. The presented approaches and methods have great potential to increase the accuracy of environment perception and, consequently, to improve the reliability and safety of autonomous driving systems in general.
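
As a rough illustration of the time-difference-of-arrival component mentioned in the abstract above, the sketch below converts a delay between two microphones into a bearing. It assumes a far-field source, a two-microphone pair with known spacing, and a sound speed of 343 m/s; none of these details come from the abstract, the amplitude cue is left out, and the function name is hypothetical.

```python
import math


def bearing_from_tdoa(tau, mic_spacing, speed_of_sound=343.0):
    """Far-field bearing in degrees for a two-microphone pair.

    tau: arrival-time difference in seconds (assumed already estimated).
    mic_spacing: distance between the microphones in metres.
    Returns 0 for a source broadside to the pair, +/-90 along its axis.
    """
    if mic_spacing <= 0:
        raise ValueError("mic_spacing must be positive")
    # Path-length difference divided by spacing equals sin(bearing).
    ratio = speed_of_sound * tau / mic_spacing
    # Clamp so measurement noise or rounding cannot leave asin's domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A single pair only resolves the bearing up to a front-back ambiguity, which is one reason a practical system would combine several pairs or additional cues.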

    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques for navigation and mapping. In addition, distributed low-cost sensing systems acting as a unitary entity have recently been identified as an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature matching populations for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with ill-conditioned inter-image configurations caused by the omnidirectional motion. The feature tracking system efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than is usually assumed for performing stable 3D reconstructions. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the 2-session case.
The management of noise, the stability against ill-conditioned configurations, and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed.

    Face recognition by means of advanced contributions in machine learning

    Face recognition (FR) has been extensively studied, due both to the fundamental scientific challenges it poses and to current and potential applications where human identification is needed. The most important benefits of FR systems are their non-intrusiveness, the low cost of the equipment, and the fact that no user agreement is required during acquisition. Nevertheless, despite the progress made in recent years and the different solutions proposed, FR performance is not yet satisfactory under more demanding conditions (different viewpoints, occlusions, illumination changes, extreme lighting conditions, etc.). In particular, uncontrolled lighting conditions produce one of the strongest distortions in facial appearance. This dissertation addresses the problem of FR in less constrained illumination situations. To approach the problem, a new multi-session and multi-spectral face database has been acquired in the visible, near-infrared (NIR), and thermal infrared (TIR) spectra, under different lighting conditions. A theoretical analysis using information theory was first carried out to demonstrate the complementarity between the different spectral bands. The optimal exploitation of the information provided by the set of multispectral images was then addressed using multimodal matching score fusion techniques that efficiently synthesize complementary meaningful information among the different spectra. Owing to the peculiarities of thermal images, a specific face segmentation algorithm was required and developed. The final proposed system uses the Discrete Cosine Transform as a dimensionality reduction tool and a fractional distance for matching, which significantly reduces the cost in processing time and memory.
Prior to this classification task, a selection of the relevant frequency bands is proposed in order to optimize the overall system, based on identifying and maximizing independence relations by means of discriminability criteria. The system has been extensively evaluated on the multispectral face database acquired specifically for this purpose. In this regard, a new visualization procedure is suggested for combining different bands, so that valid comparisons can be established and statistical information about the significance of the results can be given. This experimental framework has made it easier to improve robustness against illumination mismatch between training and testing. Additionally, the focusing problem in the thermal spectrum has also been addressed, first for the more general case of thermal images (or thermograms), and then for the case of facial thermograms, from both a theoretical and a practical point of view. An algorithm has been developed to analyze the quality of facial thermograms degraded by blurring. Experimental results strongly support the proposed multispectral facial image fusion, which achieves very high performance in several conditions. These results represent a new advance in providing robust matching across changes in illumination, and may inspire highly accurate FR approaches in practical scenarios.
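
The fractional distance mentioned in the abstract above is commonly understood as a Minkowski-style dissimilarity with an exponent below one. The sketch below shows that general form only; the exponent `p = 0.5` is an assumption, since the abstract does not state the value used, and the function name is hypothetical. The feature vectors would be, for example, retained DCT coefficients.

```python
def fractional_distance(a, b, p=0.5):
    """Minkowski-style dissimilarity with fractional exponent 0 < p < 1.

    a, b: equal-length sequences of numeric features.
    p: assumed exponent; the value used in the thesis is not given here.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    # Sum the p-th powers of the absolute differences, then take the 1/p root.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)
```

With `p` below one this quantity does not satisfy the triangle inequality, so it is a dissimilarity score for nearest-neighbour matching, not a true metric.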

    Combining omnidirectional vision with polarization vision for robot navigation

    Polarization is the phenomenon that describes the orientation of the oscillations of light waves, which are restricted in direction. Polarized light has multiple uses in the animal kingdom, ranging from foraging, defense, and communication to orientation and navigation. Chapter (1) briefly covers some important aspects of polarization and explains our research problem.
We aim to use a polarimetric-catadioptric sensor, since many applications in computer vision and robotics can benefit from such a combination, especially robot orientation (attitude estimation) and navigation applications. Chapter (2) mainly covers the state of the art of vision-based attitude estimation. As unpolarized sunlight enters the Earth's atmosphere, it is Rayleigh-scattered by air and becomes partially linearly polarized. This skylight polarization provides a significant clue to understanding the environment. Its state conveys the information needed to obtain the sun orientation. Robot navigation, sensor planning, and many other applications may benefit from using this navigation clue. Chapter (3) covers the state of the art in capturing skylight polarization patterns using omnidirectional sensors (e.g. fisheye and catadioptric sensors). It also explains the skylight polarization characteristics and gives a new theoretical derivation of the skylight angle of polarization pattern. Our aim is to obtain an omnidirectional 360° view combined with polarization characteristics. Hence, this work is based on catadioptric sensors, which are composed of reflective surfaces and lenses. Usually the reflective surface is metallic, and hence the incident skylight polarization state, which is mostly partially linearly polarized, is changed to elliptically polarized after reflection. Given the measured reflected polarization state, we want to obtain the incident polarization state. Chapter (4) proposes a method to measure the light polarization parameters using a catadioptric sensor. It is shown that the Stokes vector of the incident ray can be measured from three of the four components of the Stokes vector of the reflected ray. Once the incident polarization patterns are available, the solar angles can be directly estimated using these patterns.
Chapter (5) discusses polarization-based robot orientation and navigation and proposes new algorithms to estimate these solar angles; to the best of our knowledge, this work is the first to estimate the sun zenith angle from the incident polarization patterns. We also propose to estimate the orientation of a vehicle from these polarization patterns. Finally, the work is concluded and possible future research directions are discussed in chapter (6). More examples of skylight polarization patterns, their calibration, and the proposed applications are given in appendix (B). Our work may pave the way from the conventional polarization vision world to the omnidirectional one. It enables bio-inspired robot orientation and navigation applications, and possibly outdoor localization, since the skylight polarization patterns, combined with the solar angles at a given date and time, may allow the current geographical location of a vehicle to be inferred.
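
For readers unfamiliar with the quantities involved, the sketch below computes the degree and angle of linear polarization from the first three Stokes components using the standard textbook definitions. It is not the thesis's method for recovering the incident state from the reflected one; the function name is hypothetical and the inputs are assumed to be already-measured Stokes components.

```python
import math


def linear_polarization(s0, s1, s2):
    """Return (degree of linear polarization, angle of polarization in degrees).

    s0: total intensity, s1 and s2: linear Stokes components.
    Uses the standard definitions DoLP = sqrt(s1^2 + s2^2) / s0 and
    AoP = 0.5 * atan2(s2, s1).
    """
    if s0 <= 0:
        raise ValueError("s0 (total intensity) must be positive")
    dolp = math.hypot(s1, s2) / s0
    aop = 0.5 * math.degrees(math.atan2(s2, s1))
    return dolp, aop
```

The angle is returned in the range (-90, 90] degrees, reflecting the fact that a polarization orientation is only defined modulo 180 degrees.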

    Colour measurement and colour reproduction systems.

    Thesis (M.Sc.Eng.)-University of Natal, Durban, 1987. Techniques of colour measurement and colour reproduction are important in a wide range of commercial and social activities in most modern economies. Their study thus constitutes one of the major areas of interest to the CIE. The project described in this thesis began as an outgrowth of studies of new types of light sources and of the colorimetry of colour-TV systems, together with a conviction that modern TV cameras can operate effectively with a wide range of different illuminating spectra. It was soon evident that two important prerequisites for this research were an understanding of the processes of human colour vision and a knowledge of the standard, international, colorimetric terminology of the CIE. These topics are discussed fully in the text. Also included is a review of modern gas-discharge lamps, their properties, and their applications. Both high-pressure (HID) types and low-pressure (fluorescent-tube) types are considered. Because of the need to measure the colours of surfaces and their TV reproductions as accurately as possible, various forms of colorimeter were examined, leading to the choice of a spectrophotometer system for this work. The design, construction, and evaluation of an original spectrophotometer system (the UND Spectrophotometer) are described fully in the text. Finally, attention is given to the operation of a television system under nonstandard lighting. Twelve different light sources were evaluated as TV "taking" illuminants, using both subjective and colorimetric methods of assessment. The experimental results tend to confirm that colorimetric methods are unsuited to colour reproduction evaluation, and that subjective methods are more meaningful. A subjective scale of colour reproduction performance was established, and it was found to correlate closely with the CIE general colour rendering index (Ra) for the various test lamps.
The work reported herein predates similar experiments with TV lighting by other workers, and it includes a wider range of light sources. In spite of differences in experimental technique, however, there is broad agreement with their general results.