
    Identification and Ranking of Relevant Image Content

    This thesis proposes an image understanding algorithm for automatically identifying and ranking image regions into several levels of importance. Given a color image, specialized maps for classifying image content, namely weighted similarity, weighted homogeneity, image contrast, and memory color maps, are generated and combined into a perceptual importance map. Further analysis of this map yields a region ranking map that sorts the image content into different levels of significance. The algorithm was tested on a large database of varied color images acquired from the Berkeley segmentation dataset as well as internal images. Experimental results show that our technique matches human manual ranking with 90% efficiency. Applications of the proposed algorithm include image rendering, classification, indexing, and retrieval; adaptive compression and camera auto-focus are other potential applications.
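    As a rough illustration of the combination step described above, the sketch below merges four per-pixel feature maps into a single importance map and quantizes it into ranking levels. The equal weights, the normalization, and the four-level quantization are assumptions for illustration only, not the parameters used in the thesis.

        import numpy as np

        def perceptual_importance_map(similarity, homogeneity, contrast, memory_color,
                                      weights=(0.25, 0.25, 0.25, 0.25)):
            """Combine per-pixel feature maps (all in [0, 1]) into one importance map.
            The equal weights here are illustrative, not the thesis's values."""
            maps = [similarity, homogeneity, contrast, memory_color]
            combined = sum(w * m for w, m in zip(weights, maps))
            return combined / combined.max()  # normalize back to [0, 1]

        def region_ranking_map(importance, levels=4):
            """Quantize the importance map into discrete significance levels (0 = least important)."""
            bins = np.linspace(0.0, 1.0, levels + 1)[1:-1]
            return np.digitize(importance, bins)

        # Toy example with random maps standing in for the real feature maps
        h, w = 120, 160
        maps = [np.random.rand(h, w) for _ in range(4)]
        ranking = region_ranking_map(perceptual_importance_map(*maps))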

    Computer Vision for Timber Harvesting


    Symbolic and Deep Learning Based Data Representation Methods for Activity Recognition and Image Understanding at Pixel Level

    Efficient representation of large amounts of data, particularly images and video, helps in the analysis, processing, and overall understanding of the data. In this work, we present two frameworks that encapsulate the information present in such data.

    First, we present an automated symbolic framework to recognize particular activities in real time from videos. The framework uses regular expressions for symbolically representing (possibly infinite) sets of motion characteristics obtained from a video. It is a uniform framework that handles trajectory-based and periodic articulated activities and provides polynomial-time graph algorithms for fast recognition. The regular expressions representing motion characteristics can either be provided manually or learnt automatically from positive and negative examples of strings (that describe dynamic behavior) using offline automata learning frameworks. Confidence measures are associated with recognitions using the Levenshtein distance between a string representing a motion signature and the regular expression describing an activity. We have used our framework to recognize trajectory-based activities like vehicle turns (U-turns, left and right turns, and K-turns), vehicle start and stop, person running and walking, and periodic articulated activities like digging, waving, boxing, and clapping in videos from the VIRAT public dataset, the KTH dataset, and a set of videos obtained from YouTube.

    Next, we present a core sampling framework that uses activation maps from several layers of a Convolutional Neural Network (CNN) as features for another neural network, using transfer learning to provide an understanding of an input image. The intermediate map responses of a CNN contain information about an image that can be used to extract contextual knowledge about it. Our framework creates a representation that combines features from the test data with the contextual knowledge gained from the responses of a pretrained network, processes it, and feeds it to a separate Deep Belief Network. We use this representation to extract more information from an image at the pixel level, thereby gaining understanding of the whole image. We experimentally demonstrate the usefulness of our framework using a pretrained VGG-16 model to perform segmentation on the BAERI dataset of Synthetic Aperture Radar (SAR) imagery and the CAMVID dataset.

    Using this framework, we also reconstruct images by removing noise from noisy character images. The reconstructed images are encoded using quadtrees, which can be an efficient representation for learning from sparse features. Handwritten character images are quite susceptible to noise, so preprocessing stages that make the raw data cleaner can improve the efficacy of their use. We improve upon the efficiency of probabilistic quadtrees by using a pixel-level classifier to extract the character pixels and remove noise from the images. The pixel-level denoiser uses a CNN pretrained on a large image dataset and uses transfer learning to aid the reconstruction of characters. In this work, we primarily deal with classification of noisy characters: we create noisy versions of the handwritten Bangla Numeral and Basic Character datasets and use them, along with the Noisy MNIST dataset, to demonstrate the usefulness of our approach.
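    The core sampling idea above can be sketched with a pretrained VGG-16: capture intermediate activation maps with forward hooks, upsample them to the input resolution, and stack them into a per-pixel feature vector for a downstream model. The chosen layer indices, input size, and the use of PyTorch/torchvision are assumptions for illustration; the thesis feeds such features to a Deep Belief Network, which is not shown here.

        import torch
        import torch.nn.functional as F
        import torchvision

        # VGG-16 backbone; pretrained ImageNet weights are what make the activation
        # maps informative (pass weights=None for a quick structural dry run).
        vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features.eval()

        # Record the outputs of a few convolutional stages with forward hooks.
        # The layer indices are illustrative, not the thesis's selection.
        layer_ids = [3, 8, 15, 22]
        captured = {}
        hooks = [vgg[i].register_forward_hook(
                     lambda m, inp, out, i=i: captured.__setitem__(i, out))
                 for i in layer_ids]

        image = torch.rand(1, 3, 224, 224)   # stand-in for a real test image
        with torch.no_grad():
            vgg(image)

        # Upsample every captured map back to input resolution and stack them so each
        # pixel gets one feature vector, which a separate per-pixel model could consume.
        per_pixel = torch.cat(
            [F.interpolate(captured[i], size=image.shape[-2:], mode="bilinear",
                           align_corners=False) for i in layer_ids], dim=1)
        print(per_pixel.shape)   # (1, total_channels, 224, 224)

        for h in hooks:
            h.remove()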

    Multivariate texture-based segmentation of remotely sensed imagery for extraction of objects and their uncertainty

    In this study, a segmentation procedure based on grey-level and multivariate texture is proposed to extract spatial objects from an image scene. Object uncertainty was quantified to identify transition zones of objects with indeterminate boundaries. The Local Binary Pattern (LBP) operator, modelling texture, was integrated into a hierarchical splitting segmentation to identify homogeneous texture regions in an image. We propose a multivariate extension of the standard univariate LBP operator to describe colour texture. The paper is illustrated with two case studies. The first considers an image with a composite of texture regions. The two LBP operators provided good segmentation results on both grey-scale and colour textures, with accuracy values of 96% and 98%, respectively. The second case study involved segmentation of coastal land cover objects from a multi-spectral Compact Airborne Spectral Imager (CASI) image of a coastal area in the UK. Segmentation based on the univariate LBP measure provided unsatisfactory results from a single CASI band (70% accuracy), whereas a multivariate LBP-based segmentation of three CASI bands improved the results considerably (77% accuracy). Uncertainty values for object building blocks provided valuable information for identification of object transition zones. We conclude that the (multivariate) LBP texture model, in combination with a hierarchical splitting segmentation framework, is suitable for identifying objects and for quantifying their uncertainty.
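    A minimal sketch of the LBP texture description mentioned above, assuming the rotation-invariant uniform variant available in scikit-image. The multivariate extension shown here simply concatenates per-band histograms, which is only a stand-in for the paper's joint multivariate operator, and the hierarchical splitting segmentation is not reproduced.

        import numpy as np
        from skimage.feature import local_binary_pattern

        def lbp_histogram(gray, P=8, R=1.0):
            """Univariate LBP descriptor for one band: code each pixel by comparing
            it with P neighbours on a circle of radius R, then histogram the
            uniform codes (P + 2 possible values)."""
            codes = local_binary_pattern(gray, P, R, method="uniform")
            hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
            return hist

        def multivariate_lbp_histogram(bands, P=8, R=1.0):
            """Illustrative multivariate extension: per-band LBP histograms
            concatenated into one descriptor (an assumption, not the paper's operator)."""
            return np.concatenate([lbp_histogram(b, P, R) for b in bands])

        # Toy example: three random integer "CASI bands" of a 64x64 region
        region = [(np.random.rand(64, 64) * 255).astype(np.uint8) for _ in range(3)]
        descriptor = multivariate_lbp_histogram(region)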

    Content Based Retrieval Using Colour And Texture Of Wavelet Based Compressed Images [TA1637. I67 2008 f rb].

    The growing demand for image retrieval in multimedia fields such as crime prevention, health informatics, and biometrics has pushed application developers to find ways to manage and retrieve images more efficiently.

    Probabilistic and Deep Learning Algorithms for the Analysis of Imagery Data

    Accurate object classification is a challenging problem for various low- to high-resolution imagery data. This applies to both natural and synthetic image datasets. However, each object recognition dataset poses its own distinct set of domain-specific problems, and addressing them requires intelligent learning algorithms built on a deep understanding and careful analysis of the feature space. In this thesis, we introduce three new learning frameworks for the analysis of both airborne images (the NAIP dataset) and handwritten digit datasets without and with noise (MNIST and n-MNIST, respectively).

    First, we propose a probabilistic framework for the analysis of the NAIP dataset which includes (1) an unsupervised segmentation module based on the Statistical Region Merging algorithm, (2) a feature extraction module that extracts a set of standard hand-crafted texture features from the images, (3) a supervised classification algorithm based on Feedforward Backpropagation Neural Networks, and (4) a structured prediction framework using Conditional Random Fields that integrates the results of the segmentation and classification modules into a single composite model to generate the final class labels.

    Next, we introduce two new datasets, SAT-4 and SAT-6, sampled from the NAIP imagery and use them to evaluate a multitude of Deep Learning algorithms, including Deep Belief Networks (DBN), Convolutional Neural Networks (CNN), and Stacked Autoencoders (SAE), for generating class labels. We then propose a learning framework that integrates hand-crafted texture features with a DBN. A DBN uses an unsupervised pre-training phase to initialize the parameters of a Feedforward Backpropagation Neural Network to a global error basin, which can then be improved by a round of supervised fine-tuning; the resulting networks can subsequently be used for classification. We show that integrating hand-crafted features with a DBN yields a significant improvement in performance over traditional DBN models that take raw image pixels as input, and we investigate why this integration proves particularly useful for aerial datasets using a statistical analysis based on the Distribution Separability Criterion.

    We then introduce a new dataset called noisy MNIST (n-MNIST), created by adding (1) additive white Gaussian noise (AWGN), (2) motion blur, and (3) reduced contrast combined with AWGN to the MNIST dataset, and present a learning algorithm that combines probabilistic quadtrees with Deep Belief Networks. This dynamic integration of the Deep Belief Network with probabilistic quadtrees provides a significant improvement over traditional DBN models on both the MNIST and the n-MNIST datasets.

    Finally, we extend our experiments on aerial imagery to the class of general texture images and present a theoretical analysis of Deep Neural Networks applied to texture classification. We derive the size of the feature space of textural features and the Vapnik-Chervonenkis dimension of certain classes of Neural Networks. We also derive some useful results on the intrinsic dimension and relative contrast of texture datasets and use these to highlight the differences between texture datasets and general object recognition datasets.
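    The n-MNIST construction described above can be sketched with simple NumPy/SciPy operations; the noise level, blur length, and contrast factor below are illustrative assumptions, not the values used to build the actual dataset.

        import numpy as np
        from scipy import ndimage

        def add_awgn(img, sigma=0.1):
            """Additive white Gaussian noise; sigma is an assumed value."""
            return np.clip(img + np.random.normal(0.0, sigma, img.shape), 0.0, 1.0)

        def motion_blur(img, length=5):
            """Approximate horizontal motion blur with a 1 x length averaging kernel."""
            kernel = np.ones((1, length)) / length
            return ndimage.convolve(img, kernel, mode="nearest")

        def reduced_contrast_awgn(img, contrast=0.5, sigma=0.1):
            """Scale contrast about mid-grey, then add AWGN."""
            low_contrast = 0.5 + contrast * (img - 0.5)
            return add_awgn(low_contrast, sigma)

        # Toy example: a random 28x28 image standing in for an MNIST digit in [0, 1]
        digit = np.random.rand(28, 28)
        noisy_variants = [add_awgn(digit), motion_blur(digit), reduced_contrast_awgn(digit)]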