
    Cast shadow modelling and detection

    Computer vision applications are often confronted by the need to differentiate between objects and their shadows. A number of shadow detection algorithms have been proposed in the literature, based on physical, geometrical, and other heuristic techniques. Most existing approaches depend on the scene environment and object types; those that do not are regarded as conceptually superior and more accurate. Despite these efforts, the design of a generic, accurate, simple, and efficient shadow detection algorithm remains an open problem. In this thesis, based on a physically derived hypothesis for shadow identification, novel multi-domain shadow detection algorithms are proposed and tested in the spatial and transform domains. A novel "Affine Shadow Test Hypothesis" is proposed, derived, and validated across multiple environments. Building on it, several new shadow detection algorithms are proposed and modelled for short-duration video sequences, where a background frame is available as a reliable reference, and for long-duration video sequences, where the use of a dedicated background frame is unreliable. Finally, additional algorithms are proposed to detect shadows in still images, where the use of a separate background frame is not possible. In this approach, the author shows that the proposed algorithms are capable of detecting cast and self shadows simultaneously. All proposed algorithms have been modelled and tested in the spatial (pixel) and transform (frequency) domains and compared against state-of-the-art approaches, using both popular test videos and novel footage covering a wide range of test conditions. The proposed algorithms are shown to outperform most existing methods and to detect different types of shadows effectively under various lighting and environmental conditions.
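    The abstract does not spell out the Affine Shadow Test Hypothesis itself, but pixel-domain shadow tests in this physically motivated family typically check that a candidate pixel darkens the background roughly multiplicatively while approximately preserving chromaticity. The sketch below illustrates that generic test; the bounds ALPHA and BETA and the tolerance TAU are illustrative assumptions, not values from the thesis.

```python
# A minimal sketch of a generic pixel-domain shadow test; the thresholds
# below are illustrative assumptions, not the thesis's own hypothesis.
import numpy as np

ALPHA, BETA = 0.4, 0.95   # assumed bounds on the shadow attenuation ratio
TAU = 0.05                # assumed chromaticity-distortion tolerance

def shadow_mask(frame, background):
    """Label pixels as shadow where the frame darkens the background
    roughly multiplicatively while approximately preserving chromaticity."""
    frame = frame.astype(np.float64) + 1e-6
    background = background.astype(np.float64) + 1e-6

    # Intensity ratio: a cast shadow attenuates brightness within (ALPHA, BETA).
    ratio = frame.sum(axis=2) / background.sum(axis=2)
    darkened = (ratio > ALPHA) & (ratio < BETA)

    # Chromaticity (normalised RGB) should change little under a cast shadow.
    chroma_f = frame / frame.sum(axis=2, keepdims=True)
    chroma_b = background / background.sum(axis=2, keepdims=True)
    chroma_ok = np.abs(chroma_f - chroma_b).sum(axis=2) < TAU

    return darkened & chroma_ok
```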

    Motion Segmentation Aided Super Resolution Image Reconstruction

    This dissertation addresses Super Resolution (SR) Image Reconstruction with a focus on motion segmentation. The main thrust is Information Complexity guided Gaussian Mixture Models (GMMs) for Statistical Background Modeling. In developing the framework, two other topics are also addressed: motion trajectory estimation for global and local scene change detection, and image reconstruction to obtain high resolution (HR) representations of the moving regions. Such a framework serves dynamic scene understanding and the recognition of individuals and threats from image sequences recorded with either stationary or non-stationary camera systems. We introduce a new technique called Information Complexity guided Statistical Background Modeling, employing GMMs that are optimal with respect to information complexity criteria. Moving objects are segmented out through background subtraction using the computed background model, and this technique produces superior results to competing background modeling strategies. State-of-the-art SR Image Reconstruction studies combine the information from a set of only slightly different low resolution (LR) images of a static scene to construct an HR representation. The crucial challenge these studies do not handle is accumulating the corresponding information from highly displaced moving objects. To this end, a framework for SR Image Reconstruction of moving objects with such large displacements is developed. Our assumption is that the LR images differ from one another due to local motion of the objects and global motion of the scene imposed by a non-stationary imaging system. Contrary to traditional SR approaches, we employ several steps: suppression of the global motion; motion segmentation, accompanied by background subtraction, to extract moving objects; suppression of the local motion of the segmented regions; and super-resolving the accumulated information coming from the moving objects rather than the whole scene. The result is a reliable offline SR Image Reconstruction tool that handles several types of dynamic scene change, compensates for the effects of the camera system, and provides data redundancy by removing the background. The framework proved superior to state-of-the-art algorithms that make no significant effort toward dynamic scene representation for non-stationary camera systems.
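    As an illustration of the GMM background-subtraction stage, the sketch below uses OpenCV's stock MOG2 subtractor rather than the information-complexity-guided GMM developed in the dissertation; the video path and parameter values are placeholders.

```python
# A minimal sketch of GMM-based background subtraction for motion
# segmentation, using OpenCV's stock MOG2 model (not the dissertation's
# information-complexity-guided variant).
import cv2

cap = cv2.VideoCapture("scene.avi")  # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(
    history=500, varThreshold=16, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)   # 255 = foreground, 127 = shadow
    fg = (mask == 255).astype("uint8") * 255
    moving = cv2.bitwise_and(frame, frame, mask=fg)
    # `moving` isolates the displaced objects that would feed the
    # per-object super-resolution stage.
cap.release()
```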

    Dataset shift in land-use classification for optical remote sensing

    Multimodal dataset shifts, consisting of both concept and covariate shifts, are addressed in this study to improve texture-based land-use classification accuracy for optical panchromatic and multispectral remote sensing. Multitemporal and multisensor variances between training and test data are caused by atmospheric, phenological, sensor, illumination, and viewing-geometry differences, which lead to supervised classification inaccuracies. The first dataset-shift reduction strategy involves input modification through shadow removal before feature extraction with gray-level co-occurrence matrix and local binary pattern features. Components of a Rayleigh quotient-based manifold alignment framework are investigated to reduce multimodal dataset shift at the input level of the classifier through unsupervised classification, followed by manifold matching to transfer classification labels by finding across-domain cluster correspondences. The ability of weighted hierarchical agglomerative clustering to partition poorly separated feature spaces is explored, and weight-generalized internal validation is used for unsupervised cardinality determination. Manifold matching is cast as an assignment problem solved with the Hungarian algorithm, using a cost matrix of geometric similarity measurements that assume the preservation of intrinsic structure across the dataset shift. Local neighborhood geometric co-occurrence frequency information is recovered, and a novel integration thereof is shown to improve matching accuracy. A final strategy for addressing multimodal dataset shift is multiscale feature learning, used within a convolutional neural network to obtain optimal hierarchical feature representations instead of engineered texture features that may be sub-optimal. Feature learning is shown to produce features that are robust against multimodal acquisition differences in a benchmark land-use classification dataset. A novel multiscale input strategy is proposed for an optimized convolutional neural network that improves classification accuracy to a competitive level on the UC Merced benchmark dataset and outperforms single-scale input methods. All the proposed strategies for addressing multimodal dataset shift in land-use image classification yielded significant accuracy improvements on various multitemporal and multimodal datasets.
    Thesis (PhD), Electrical, Electronic and Computer Engineering, University of Pretoria, 2016.
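    The cluster-correspondence step can be sketched as follows: both domains are partitioned by hierarchical agglomerative clustering, and labels are transferred by solving a minimum-cost assignment with the Hungarian algorithm. The Euclidean centroid distance used as the cost here is an illustrative stand-in for the thesis's geometric similarity measurements, and the unweighted clustering with a fixed cluster count k simplifies the weighted, internally validated procedure described above.

```python
# A minimal sketch of across-domain cluster matching via the Hungarian
# algorithm; the centroid-distance cost is an illustrative assumption.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import AgglomerativeClustering

def match_clusters(train_feats, test_feats, k):
    """Return test-domain cluster labels and a test->train cluster map."""
    c_train = AgglomerativeClustering(n_clusters=k).fit_predict(train_feats)
    c_test = AgglomerativeClustering(n_clusters=k).fit_predict(test_feats)

    # One centroid per cluster in each domain.
    mu_train = np.stack([train_feats[c_train == i].mean(0) for i in range(k)])
    mu_test = np.stack([test_feats[c_test == i].mean(0) for i in range(k)])

    # Hungarian algorithm finds the minimum-cost one-to-one matching.
    cost = cdist(mu_test, mu_train)
    rows, cols = linear_sum_assignment(cost)
    return c_test, dict(zip(rows, cols))
```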

    A Cost-Effective System for Aerial 3D Thermography of Buildings

    Three-dimensional (3D) imaging and infrared (IR) thermography are powerful tools in many areas of engineering and science. Their joint use is of great interest in the buildings sector, allowing inspection and non-destructive testing of building elements as well as evaluation of energy efficiency. When dealing with large and complex structures, as buildings (particularly historical ones) generally are, 3D thermography inspection is enhanced by Unmanned Aerial Vehicles (UAVs, also known as drones). The aim of this paper is to propose a simple and cost-effective system for aerial 3D thermography of buildings; special attention is therefore paid to the choice of instruments and reconstruction software. After a very brief introduction to IR thermography for buildings and to 3D thermography, the system is described, and experimental results are given to validate the proposal.

    Spectral-Spatial Analysis of Remote Sensing Data: An Image Model and A Procedural Design

    The distinguishing property of remotely sensed data is multivariate information coupled with a two-dimensional pictorial representation amenable to visual interpretation. The contribution of this work is the design and implementation of various schemes that exploit this property. The dissertation comprises two distinct parts. The essence of Part One is the algebraic solution for the partition function of a high-order lattice model of a two-dimensional binary particle system. The contribution of Part Two is the development of a procedural framework to guide multispectral image analysis. Part One discusses the characterization of binary (black-and-white) images with little semantic content. Measures of certain observable properties of binary images are proposed. A lattice model is introduced, whose solution yields functional mappings from the model parameters to the measurements on the image. Simulation of the model is explained, as is its use in the design of Bayesian priors that bias classification analysis of spectral data. The implication of such a bias is that spatially adjacent remote sensing data are identified as belonging to the same class with high likelihood. Experiments illustrating the benefit of using the model in multispectral image analysis are also discussed. The second part of the dissertation presents a procedural schema for remote sensing data analysis. It is believed that the data crucial to a successful analysis are provided by the human, as an interpretation of the image representation of the remote sensing spectral data. Emphasis is therefore laid on the intelligent implementation of existing algorithms rather than the development of new algorithms. The development introduces hyperspectral analysis as a problem requiring multi-source data fusion and presents a process model to guide the design of a solution. Part Two concludes with an illustration of the schema as used in the classification analysis of a given hyperspectral data set.
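    To make the spatial-prior idea concrete, the sketch below biases per-pixel maximum-likelihood classification toward spatially coherent labels using a first-order Potts-style smoothness term relaxed by iterated conditional modes; this simple prior and the weight BETA are illustrative stand-ins for the high-order lattice model solved in Part One.

```python
# A minimal sketch of spatially biased classification: a Potts-style
# smoothness prior, relaxed by iterated conditional modes (ICM), stands
# in for the high-order lattice model; BETA is an assumed weight.
import numpy as np

BETA = 1.5  # assumed strength of the spatial prior

def icm_smooth(log_lik, n_iter=5):
    """log_lik: (H, W, K) per-pixel class log-likelihoods."""
    labels = log_lik.argmax(axis=2)
    H, W, K = log_lik.shape
    for _ in range(n_iter):
        for y in range(H):
            for x in range(W):
                # Count 4-neighbours carrying each candidate label.
                nbrs = [labels[v, u]
                        for v, u in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                        if 0 <= v < H and 0 <= u < W]
                agree = np.bincount(nbrs, minlength=K)
                # Posterior score: data term plus Potts agreement bonus,
                # so adjacent pixels tend toward the same class.
                labels[y, x] = (log_lik[y, x] + BETA * agree).argmax()
    return labels
```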