
    Hyperspectral image representation and processing with binary partition trees

    The optimal exploitation of the information provided by hyperspectral images requires the development of advanced image processing tools. This PhD thesis, Hyperspectral Image Representation and Processing with Binary Partition Trees, therefore proposes the construction and processing of a new region-based hierarchical representation of hyperspectral images: the Binary Partition Tree (BPT). The representation can be interpreted as a set of hierarchical regions stored in a tree structure. The BPT thus presents (i) the decomposition of the image into coherent regions and (ii) the inclusion relations between the regions in the scene. The construction of the BPT is based on region-merging techniques and is investigated by studying hyperspectral region models and the associated similarity metrics; the very high dimensionality and complexity of the data require region models and similarity measures defined specifically for it. Once the BPT is constructed, the fixed tree structure allows efficient and advanced application-dependent techniques to be implemented on it. Application-dependent processing of the BPT is generally implemented through a specific pruning of the tree. Accordingly, several pruning techniques are proposed and discussed for different applications. The thesis focuses in particular on segmentation, object detection, and classification of hyperspectral imagery. Experimental results on various hyperspectral data sets demonstrate the value and good performance of the BPT representation.
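    The region-merging construction described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's method: it assumes each region is modelled by its mean spectrum and that similarity is the Euclidean distance between means, whereas the thesis studies more specific region models and metrics. The function name `build_bpt` and its arguments are hypothetical.

```python
import heapq

import numpy as np


def build_bpt(spectra, edges):
    """Greedily merge adjacent regions into a binary partition tree.

    spectra: (n_regions, n_bands) array, one mean spectrum per initial region.
    edges:   iterable of (i, j) pairs naming adjacent initial regions.
    Returns (parent, merges): parent[k] is the parent node of node k
    (-1 for a root); merges lists (child_a, child_b, new_node) in merge order.
    """
    spectra = np.asarray(spectra, dtype=float)
    n = len(spectra)
    mean = {i: spectra[i] for i in range(n)}   # assumed region model: mean spectrum
    size = {i: 1 for i in range(n)}
    neighbours = {i: set() for i in range(n)}  # only active (unmerged) nodes are keys
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)

    def dist(a, b):
        # Assumed similarity measure: Euclidean distance between mean spectra.
        return float(np.linalg.norm(mean[a] - mean[b]))

    heap = [(dist(i, j), min(i, j), max(i, j)) for i, j in edges]
    heapq.heapify(heap)
    parent = {i: -1 for i in range(n)}
    merges = []
    next_id = n
    while heap:
        _, a, b = heapq.heappop(heap)
        if a not in neighbours or b not in neighbours:
            continue  # stale entry: one of the regions was already merged
        new = next_id
        next_id += 1
        size[new] = size[a] + size[b]
        mean[new] = (size[a] * mean[a] + size[b] * mean[b]) / size[new]
        nbrs = (neighbours.pop(a) | neighbours.pop(b)) - {a, b}
        neighbours[new] = nbrs
        for k in nbrs:
            neighbours[k] -= {a, b}
            neighbours[k].add(new)
            heapq.heappush(heap, (dist(new, k), k, new))
        parent[a] = parent[b] = new
        parent[new] = -1
        merges.append((a, b, new))
    return parent, merges
```

    For a connected adjacency graph with n initial regions, the loop performs n - 1 merges and the tree has 2n - 1 nodes. Pruning then amounts to choosing a set of nodes whose leaves partition the image.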

    Semi-automatic video object segmentation for multimedia applications

    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach combines automatic segmentation algorithms with manual user interaction. The still image segmentation component comprises a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and represent the segmentation hierarchically in a binary tree. The semantic objects are then built manually by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation; this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique determines which regions in the current partition belong to the projected object. The user can re-initialise the object when the segmentation results become inaccurate. The combination of all these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia

    Video object tracking by label propagation and backward projection

    This paper presents an approach dedicated to tracking one or several semantic objects in a video shot. A review of the state of the art in spatio-temporal segmentation techniques leads into our own approach. It combines three steps: label prediction based on partition projection, local segmentation associated with label propagation, and classification by backward projection. Combining these three steps accumulates the advantages of each approach for rigorous object tracking and reduces the processing time per image. Experimental results illustrate the visual quality obtained with this method: objects that differ in composition and motion can be accurately tracked in different kinds of video sequences.

    Hierarchical analysis of multimodal images (Analyse hiérarchique d'images multimodales)

    There is a growing interest in the development of processing tools adapted to multimodal images (several images acquired over the same scene with different characteristics). Because they allow a more complete description of the scene, multimodal images are of interest in various image processing fields, but their optimal handling and exploitation raise several issues. This thesis extends hierarchical representations, a powerful tool for classical image analysis and processing, to multimodal images in order to better exploit the additional information brought by the multimodality and improve classical image processing techniques. The thesis focuses on three multimodalities frequently encountered in the remote sensing field. We first investigate the spectral-spatial information of hyperspectral images. Based on an adapted construction and processing of the hierarchical representation, we derive a segmentation that is optimal with respect to the spectral unmixing operation. We then focus on temporal multimodality and sequences of hyperspectral images. Using the hierarchical representation of the frames in the sequence, we propose a new method for object tracking and apply it to chemical gas plume tracking in thermal infrared hyperspectral video sequences. Finally, we study sensorial multimodality, that is, images acquired with different sensors. Relying on the concept of braids of partitions, we propose a novel image segmentation methodology based on an energy minimization framework.

    Image segmentation, evaluation, and applications

    This thesis aims to advance research in image segmentation by developing robust techniques for evaluating image segmentation algorithms. The key contributions of this work are as follows. First, we investigate the characteristics of existing measures for supervised evaluation of automatic image segmentation algorithms. We show which of these measures is most effective at distinguishing perceptually accurate image segmentation from inaccurate segmentation. We then apply these measures to evaluating four state-of-the-art automatic image segmentation algorithms, and establish which best emulates human perceptual grouping. Second, we develop a complete framework for evaluating interactive segmentation algorithms by means of user experiments. Our system comprises evaluation measures, ground truth data, and implementation software. We validate our proposed measures by showing their correlation with perceived accuracy. We then use our framework to evaluate four popular interactive segmentation algorithms, and demonstrate their performance. Finally, acknowledging that user experiments are sometimes prohibitive in practice, we propose a method of evaluating interactive segmentation by algorithmically simulating the user interactions. We explore four strategies for this simulation, and demonstrate that the best of these produces results very similar to those from the user experiments

    An attention model and its application in man-made scene interpretation

    The ultimate aim of computer vision research is to design a system that interprets its surrounding environment as effortlessly as a human does. However, the state of technology is far from achieving such a goal. This thesis describes the components of a computer vision system designed to interpret man-made scenes, in particular images of buildings. The flow of information in the proposed system is bottom-up: the image is first segmented into its meaningful components, and the regions are subsequently labelled using a contextual classifier. Starting from simple observations concerning the human vision system and the gestalt laws of human perception, such as the law of “good (simple) shape” and “perceptual grouping”, a blob detector is developed that identifies components in a 2D image. These components are convex regions of interest, with interest defined as significant gradient magnitude content. An eye tracking experiment shows that the regions identified by the blob detector correlate significantly with the regions that attract viewers' attention. Having identified these blobs, it is postulated that a blob represents an object, linguistically identified by its own semantic name. In other words, a blob may contain a window, a door, or a chimney in a building. These regions are used to identify and segment higher-order structures in a building, such as the facade and window arrays, and also environmental regions such as sky and ground. Because the unary features of buildings are inconsistent, a contextual learning algorithm is used to classify the segmented regions. A model that learns spatial and topological relationships between different objects from a set of hand-labelled data is used. The model utilises this information in an MRF to achieve consistent labellings of new scenes.

    Combining semantic information and image segmentation through Markov random fields (Markov rasgele alanları aracılığı ile anlam bilgisi ve imge bölütlemenin birleştirilmesi)

    The formulation of the image segmentation problem has evolved considerably from the early years of computer vision in the 1970s to the 2010s. While the initial studies offered mostly unsupervised approaches, a great deal of recent work has shifted towards supervised solutions. This is due to advances in cognitive science and their influence on computer vision research. The growing availability of computational power also enables researchers to develop complex algorithms. Despite the great effort spent on image segmentation research, state-of-the-art techniques still fall short of the needs of the further processing steps of computer vision. This study is another attempt to generate a “substantially complete” segmentation output for use by object classification, recognition, and detection steps. Our approach is to fuse multiple segmentation outputs in order to achieve the “best” result with respect to a cost function. The proposed approach, called Boosted-MRF, formulates the segmentation fusion problem as a Markov Random Field (MRF) model in an unsupervised framework. For this purpose, a set of initial segmentation outputs is obtained, and the consensus among the segmentation partitions is formulated in the energy function of the MRF model. Minimization of the energy function then yields the “best” consensus among the segmentation ensemble. We proceed one step further and improve the performance of the Boosted-MRF by introducing auxiliary domain information into the segmentation fusion process. This enhanced segmentation fusion method, called the Domain Specific MRF, updates the energy function of the MRF model with the available information received from a domain expert. For this purpose, a top-down segmentation method is employed to obtain a set of Domain Specific Segmentation Maps, which are incomplete segmentations of a given image. Therefore, in this second segmentation fusion method, in addition to the ensemble of bottom-up segmentations, we generate an ensemble of top-down Domain Specific Segmentation Maps. A new MRF energy function is defined on the basis of the bottom-up and top-down segmentation ensembles. Minimization of this energy function yields the “best” consensus that is consistent with the domain specific information. Experiments performed on various datasets show that the proposed segmentation fusion methods improve on the performance of the segmentation outputs in the ensemble, as measured with various indexes such as the Probabilistic Rand Index and Mutual Information. The Boosted-MRF method is also compared to a popular segmentation fusion method, Best of K; the Boosted-MRF is slightly better. The Domain Specific MRF method is applied to a set of outdoor images with vegetation, where vegetation information is used as the domain specific information; a slight improvement in performance is recorded in this experiment. The method is also applied to a remotely sensed dataset of building images, where more advanced domain specific information is available. Segmentation performance is evaluated with a measure defined specifically for building images. In these two experiments with the Domain Specific MRF method, it is observed that, as long as reliable domain specific information is available, the segmentation performance improves significantly. Ph.D. - Doctoral Program
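    As an illustration of one of the evaluation indexes named in this abstract, the sketch below computes the Rand index between two label maps and a Probabilistic Rand Index against several reference segmentations. It assumes the common empirical form of the Probabilistic Rand Index, which reduces to the mean Rand index over the references; the abstract does not state which variant the thesis uses. Label maps are taken as flat sequences of per-pixel labels, and the function names are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs on which two segmentations agree.

    A pair agrees when both segmentations put the two pixels in the same
    region, or both put them in different regions. Label values themselves
    do not need to match between the two segmentations.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("segmentations must cover the same pixels")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two pixels")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def probabilistic_rand_index(test, references):
    """Mean Rand index of `test` against each reference segmentation.

    Assumption: empirical form, where pairwise co-occurrence probabilities
    are estimated as frequencies over the reference set.
    """
    if not references:
        raise ValueError("need at least one reference segmentation")
    return sum(rand_index(test, ref) for ref in references) / len(references)
```

    The pairwise loop is quadratic in the number of pixels, so this form is only practical for small label maps; a contingency-table formulation would be the usual choice for full images.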