32 research outputs found

    Quelques extensions des level sets et des graph cuts et leurs applications à la segmentation d'images et de vidéos

    Image processing techniques are now widely used in many domains, such as medical imaging, film post-production, and games. Automatic detection and extraction of regions of interest inside an image, a volume, or a video is a challenging problem, and it is the starting point for many image processing applications. Although many techniques have been developed in recent years, state-of-the-art methods still suffer from drawbacks: the Level Sets method only provides a local minimum, while the Graph Cuts method comes from the combinatorial community and could take better advantage of the specific structure of image processing problems. In this thesis, we propose two extensions of these methods in order to reduce or remove these drawbacks. We first discuss the existing methods and show how they relate to the segmentation problem through an energy formulation. We then introduce stochastic perturbations into the Level Sets method and build a more generic framework: the Stochastic Level Sets (SLS). Next, we provide a direct application of the SLS to image segmentation and show that it yields a better minimization of energies, essentially because it allows the contours to escape from local minima. We then propose a new formulation of an existing Graph Cuts algorithm in order to introduce concepts of interest to the image processing community, such as initialization of the algorithm for improved speed. We also provide a new approach to layer extraction from video sequences by motion segmentation, which retrieves both the visible and the hidden layers.
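The energy formulation mentioned in this abstract can be illustrated with a textbook binary graph cut. The sketch below is not the thesis's reformulated algorithm: it is a minimal example that assumes a 1D signal, squared-difference data costs against two hypothetical class means (`mu_bg`, `mu_fg`), a constant smoothness weight `lam`, and a plain Edmonds-Karp max-flow. All function and parameter names are illustrative.

```python
from collections import deque


def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a dense capacity matrix.

    Returns (flow value, residual capacity matrix).
    """
    n = len(cap)
    res = [row[:] for row in cap]
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path s -> t.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if res[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return total, res
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = None
        v = t
        while v != s:
            u = parent[v]
            c = res[u][v]
            bottleneck = c if bottleneck is None else min(bottleneck, c)
            v = u
        v = t
        while v != s:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        total += bottleneck


def segment_1d(signal, mu_bg, mu_fg, lam):
    """Minimise sum_i D_i(x_i) + lam * sum_i |x_i - x_{i+1}| over binary labels.

    Returns (labels, minimum energy). Label 1 means foreground.
    """
    n = len(signal)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i, v in enumerate(signal):
        cap[s][i] = (v - mu_bg) ** 2  # paid when pixel i is labelled 0
        cap[i][t] = (v - mu_fg) ** 2  # paid when pixel i is labelled 1
        if i + 1 < n:
            cap[i][i + 1] = cap[i + 1][i] = lam  # paid when neighbours differ
    energy, res = max_flow(cap, s, t)
    # Pixels still reachable from the source in the residual graph get label 1.
    reach = [False] * (n + 2)
    reach[s] = True
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in range(n + 2):
            if res[u][v] > 0 and not reach[v]:
                reach[v] = True
                queue.append(v)
    return [1 if reach[i] else 0 for i in range(n)], energy
```

With integer inputs the capacities stay exact, so the returned flow value equals the minimum energy. Raising `lam` trades data fidelity for smoother label runs, which is the behaviour the energy formulation is meant to capture.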

    Biometric iris image segmentation and feature extraction for iris recognition

    PhD Thesis. The continued threat to security in our interconnected world calls for an urgent solution. Iris biometrics, like many other biometric systems, provide an alternative solution to this lingering problem. Although iris recognition has been extensively studied, it is not a fully solved problem, which inhibits its deployment in real-world situations today. There exist three main problems facing existing iris recognition systems: 1) lack of robustness of the algorithms to non-ideal iris images, 2) slow speed of the algorithms, and 3) applicability to existing systems in real-world situations. In this thesis, six novel approaches were derived and implemented to address these limitations of existing iris recognition systems. A novel fast and accurate segmentation approach based on the combination of graph-cut optimization and an active contour model is proposed to define the irregular boundaries of the iris in a hierarchical two-level approach. In the first level of the hierarchy, the approximate boundary of the pupil/iris is estimated using a method based on the Hough transform for the pupil and an adapted starburst algorithm for the iris. In the second level, the final irregular boundary of the pupil/iris is refined and segmented using the graph-cut based active contour (GCBAC) model proposed in this work. The segmentation itself proceeds in two steps: the pupil is segmented first, then the iris. To detect and eliminate noise and reflection artefacts that might introduce errors into the algorithm, a preprocessing technique based on adaptive weighted edge detection and high-pass filtering is used to detect reflections in the high-intensity areas of the image, while exemplar-based image inpainting is used to eliminate them.
After the segmentation of the iris boundaries, a post-processing operation based on a combination of a block classification method and a statistical prediction approach is used to detect any superimposed occluding eyelashes/eyeshadows. The iris image is normalized with the rubber sheet model. In the second stage, an approach based on constructing complex wavelet filters and rotating them to the principal texture direction is used to extract important iris information, while a modified particle swarm optimization (PSO) is used to select the most prominent iris features for iris encoding. The iris code is classified using adaptive support vector machines (ASVM). Experimental results demonstrate that the proposed approach achieves an accuracy of 98.99% and is computationally about 2 times faster than the best existing approach. Ebonyi State University and Education Task Fund, Nigeri
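The rubber sheet normalization mentioned above maps the iris ring onto a fixed-size rectangle by interpolating linearly between the pupil boundary and the outer iris boundary along each angle. The sketch below assumes both boundaries are circles given as `(cx, cy, radius)`, which is a simplification: the thesis segments irregular boundaries. The function name and parameters are hypothetical.

```python
import math


def rubber_sheet_grid(pupil, iris, n_radial, n_angular):
    """Return grid[j][k] = (x, y) sample positions for the rubber sheet model.

    pupil, iris: (cx, cy, radius) circles (an assumption of this sketch).
    Row j corresponds to normalised radius r = j / (n_radial - 1) in [0, 1];
    column k corresponds to angle theta = 2 * pi * k / n_angular.
    Requires n_radial >= 2.
    """
    px, py, pr = pupil
    ix, iy, ir = iris
    grid = []
    for j in range(n_radial):
        r = j / (n_radial - 1)
        row = []
        for k in range(n_angular):
            theta = 2.0 * math.pi * k / n_angular
            # Boundary points on the pupil and iris circles at this angle.
            xp = px + pr * math.cos(theta)
            yp = py + pr * math.sin(theta)
            xi = ix + ir * math.cos(theta)
            yi = iy + ir * math.sin(theta)
            # Linear interpolation between the two boundaries.
            row.append(((1.0 - r) * xp + r * xi, (1.0 - r) * yp + r * yi))
        grid.append(row)
    return grid
```

Sampling the eye image at these positions (for example with nearest-neighbour or bilinear lookup) yields an `n_radial` by `n_angular` strip whose size does not depend on pupil dilation.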

    Object Segmentation And Recognition Using Gradient Based Descriptors And Shape Driven Fast Marching Methods

    Thesis (PhD) -- İstanbul Technical University, Institute of Science and Technology, 2010. In this thesis, a gradient-based shape description and recognition methodology for use with active contour-based object segmentation systems is proposed. Rather than forcing the active contour toward one of a set of predefined shapes, as earlier work does, the proposed system describes the shape while the contour attaches to the object boundaries. The Fast Marching (FM) active contour evolution model is used for boundary segmentation. A new speed function is defined that uses first- and second-order image intensity derivatives. A local front-stopping algorithm is also proposed to improve the boundary handling performance of the FM model. The most important contribution of the thesis is a new shape descriptor called the Gradient Based Shape Descriptor (GBSD) [1]. GBSD is a boundary-based shape descriptor that suits the structure of active contour segmenters and can operate on both binary and grayscale images. The recognition performance of GBSD is measured on a license plate character database, the MPEG-7 Core Experiments shape data set, and the Kimia data set. The success rates are compared with those of other well-known boundary-based shape descriptors, and GBSD achieves better recognition rates on all of these databases. A new recognition approach is also proposed that samples the progressing active contour as it iterates toward the real object boundary, so that the shape is described more than once. This gives the recognizer many trials for shape description and removes the limitation of traditional recognition systems, which have only one chance at shape classification. Test results, reported in comparison tables, show that the voted decision among these iterated contours outperforms ordinary single-shot shape recognition.
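As background for the Fast Marching model named in this abstract, here is a minimal sketch of the standard first-order Fast Marching method on a 2D grid with unit spacing. It computes the arrival time T of a front that starts at the seed pixels and moves with a given positive speed. The thesis defines its own speed function from image derivatives; here `speed` is simply an input array, and all names are illustrative.

```python
import heapq
import math


def fast_marching(speed, seeds):
    """Arrival times T solving |grad T| * speed = 1, with T = 0 at the seeds.

    speed: 2D list of positive floats; seeds: iterable of (row, col).
    """
    rows, cols = len(speed), len(speed[0])
    T = [[math.inf] * cols for _ in range(rows)]
    done = [[False] * cols for _ in range(rows)]
    heap = []
    for i, j in seeds:
        T[i][j] = 0.0
        heapq.heappush(heap, (0.0, i, j))

    def accepted(i, j):
        # Only already-accepted values take part in the upwind update.
        if 0 <= i < rows and 0 <= j < cols and done[i][j]:
            return T[i][j]
        return math.inf

    def solve(i, j):
        a = min(accepted(i - 1, j), accepted(i + 1, j))  # best vertical neighbour
        b = min(accepted(i, j - 1), accepted(i, j + 1))  # best horizontal neighbour
        h = 1.0 / speed[i][j]
        if abs(a - b) >= h:
            # Only one axis contributes (also covers one neighbour being infinite).
            return min(a, b) + h
        # Both axes contribute: solve the quadratic upwind equation.
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))

    while heap:
        _, i, j = heapq.heappop(heap)
        if done[i][j]:
            continue  # stale heap entry
        done[i][j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and not done[ni][nj]:
                t_new = solve(ni, nj)
                if t_new < T[ni][nj]:
                    T[ni][nj] = t_new
                    heapq.heappush(heap, (t_new, ni, nj))
    return T
```

A speed that drops toward zero near strong edges makes the front slow down at object boundaries, which is the general idea behind using Fast Marching as a segmenter. Level curves of T at increasing thresholds give a sequence of intermediate contours, the kind of progressive contours the abstract proposes to describe repeatedly.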

    Stochastic Filtering on Shape Manifolds

    This thesis addresses the problem of learning the dynamics of deforming objects in image time series. In many biomedical imaging and computer vision applications it is important to satisfy certain geometric constraints which traditional time series methods are not capable of handling. We focus on building topology-preserving spatio-temporal stochastic models for shape deformation, which we combine with the observed images to obtain robust object tracking. The shape of the object is modeled as obtained through the action of a group of diffeomorphisms on the initial object boundary. We formulate a state space model for the diffeomorphic deformation of the object, and implement a particle filter on this shape space to estimate the state of the shape in each video frame. We use a practical method for sampling diffeomorphic shapes in which we generate deformations via flows of finitely generated vector fields. Based on the observations and the proposed samples we obtain an approximate estimate for the posterior distribution of the shape. We present the performance of this framework on various image sequences under different scenarios. We extend the random perturbation models to diffusion models on the manifold of planar (discretized) shapes whose drift component represents a trend in the shape deformation. To obtain trends intrinsic to the shape, we define the drift as a gradient of appropriate functions defined over the boundary of the shape. Given a sequence of observations from the path of the suggested stochastic differential equations, we propose a likelihood-ratio-based technique to estimate the missing parameters in the drift terms. We show how to reduce the computational burden and improve the robustness of the estimators by constraining the motion of the shapes to a lower-dimensional submanifold equipped with a sub-Riemannian metric. We further discuss how to apply this methodology to obtain estimates when we have only a limited number of observations
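The particle filter step described above can be sketched in a much simpler setting. The example below is a bootstrap particle filter on a scalar state with a random-walk model and Gaussian observations. It is an assumption-laden stand-in: the thesis filters on a space of diffeomorphic shape deformations, not on a scalar, and every name and default value here is hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_std=0.5,
                    obs_std=0.5, seed=0):
    """Bootstrap particle filter for x_t = x_{t-1} + noise, y_t = x_t + noise.

    Returns the list of posterior-mean estimates, one per observation.
    """
    rng = random.Random(seed)
    # Broad initial prior (an assumption of this sketch).
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: propagate each particle through the state model.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Update: weight each particle by the Gaussian observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample (multinomial) so particles concentrate on likely states.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

In the shape-tracking setting, each particle would instead be a sampled deformation of the object boundary and the weight would come from how well the deformed boundary matches the current frame.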

    Recent Advances in Signal Processing

    The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity

    Automated Resolution Selection for Image Segmentation

    It is well known in image processing in general, and hence in image segmentation in particular, that computational cost increases rapidly with the number and dimensions of the images to be processed. Several fields, such as astronomy, remote sensing, and medical imaging, use very large images, which might also be 3D and/or captured at several frequency bands, all adding to the computational expense. Multiresolution analysis is one method of increasing the efficiency of the segmentation process. One multiresolution approach is the coarse-to-fine segmentation strategy, whereby the segmentation starts at a coarse resolution and is then fine-tuned during subsequent steps. Until now, the starting resolution for segmentation has been selected arbitrarily with no clear selection criteria. The research conducted for this thesis showed that starting from different resolutions for image segmentation results in different accuracies and speeds, even for images from the same dataset. An automated method for resolution selection for an input image would thus be beneficial. This thesis introduces a framework for the selection of the best resolution for image segmentation. First proposed is a measure for defining the best resolution based on user/system criteria, which offers a trade-off between accuracy and time. A learning approach is then described for the selection of the resolution, whereby extracted image features are mapped to the previously determined best resolution. In the learning process, class (i.e., resolution) distribution is imbalanced, making effective learning from the data difficult. A variant of AdaBoost, called RAMOBoost, is therefore used in this research for the learning-based selection of the best resolution for image segmentation. RAMOBoost is designed specifically for learning from imbalanced data. Two sets of features are used: Local Binary Patterns (LBP) and statistical features. 
Experiments conducted with four datasets using three different segmentation algorithms show that the resolutions selected through learning enable much faster segmentation than the original ones, while retaining at least the original accuracy. For three of the four datasets used, the segmentation results obtained with the proposed framework were significantly better than with the original resolution with respect to both accuracy and time
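One of the two feature sets named above, Local Binary Patterns, can be sketched as follows. This is the basic 3x3 LBP with a normalised 256-bin histogram. The neighbour ordering (clockwise from the top-left) and the `>=` comparison are conventions assumed here, and the abstract does not say which LBP variant the thesis uses.

```python
def lbp_codes(image):
    """Basic 3x3 LBP code for every interior pixel of a 2D list of numbers."""
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for i in range(1, rows - 1):
        row = []
        for j in range(1, cols - 1):
            center = image[i][j]
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[i + di][j + dj] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

A histogram like this, computed per image, is the kind of fixed-length vector that a classifier could map to a resolution class.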

    Using facial expression recognition for crowd monitoring.

    Master of Science in Engineering. University of KwaZulu-Natal, Durban 2017. In recent years, Crowd Monitoring techniques have attracted emerging interest in the field of computer vision due to their ability to monitor groups of people in crowded areas, where conventional image processing methods would not suffice. Existing Crowd Monitoring techniques focus heavily on analyzing a crowd as a single entity, usually in terms of their density and movement pattern. While these techniques are well suited for the task of identifying dangerous and emergency situations, such as a large group of people exiting a building at once, they are very limited when it comes to identifying emotion within a crowd. By isolating different types of emotion within a crowd, we aim to predict the mood of a crowd even in scenes of non-panic. In this work, we propose a novel Crowd Monitoring system based on estimating crowd emotion using Facial Expression Recognition (FER). In the past decade, both FER and activity recognition have been proposed for human emotion detection. However, facial expression is arguably more descriptive when identifying emotion and is less likely to be obscured in crowded environments compared to body posture. Given a crowd image, the popular Viola and Jones face detection algorithm is used to detect and extract unobscured faces from individuals in the crowd. A robust and efficient appearance based method of FER, such as Gradient Local Ternary Pattern (GLTP), is used together with a machine learning algorithm, Support Vector Machine (SVM), to extract and classify each facial expression as one of seven universally accepted emotions (joy, surprise, anger, fear, disgust, sadness or neutral emotion). Crowd emotion is estimated by isolating groups of similar emotion based on their relative size and weighting.
To validate the effectiveness of the proposed system, a series of cross-validation tests is performed using a novel Crowd Emotion dataset with known ground-truth emotions. The results show that the system is able to accurately and efficiently predict multiple classes of crowd emotion even in non-panic situations where movement and density information may be incomplete. In the future, this type of system could be used for many security applications, such as helping to alert authorities to potentially aggressive crowds of people in real time

    Content-based image retrieval of museum images

    Content-based image retrieval (CBIR) is becoming more and more important with the advance of multimedia and imaging technology. Among the many retrieval features associated with CBIR, texture retrieval is one of the most difficult. This is mainly because no satisfactory quantitative definition of texture exists at this time, and also because of the complex nature of texture itself. Another difficult problem in CBIR is query by low-quality images, that is, retrieving images using a poor-quality image as the query. Few content-based retrieval systems have addressed this problem. Wavelet analysis is a relatively new and promising tool for signal and image analysis. Its time-scale representation provides both spatial and frequency information, thus giving extra information compared to other image representation schemes. This research aims to address some of the problems of query by texture and query by low-quality images by exploiting the advantages that wavelet analysis has to offer, particularly in the context of museum image collections. A novel query by low-quality images algorithm is presented as a solution to the problem of poor retrieval performance using conventional methods. For the query by texture problem, this thesis provides a comprehensive evaluation of wavelet-based texture methods as well as a comparison with other techniques. A novel automatic texture segmentation algorithm and an improved block-oriented decomposition are proposed for use in query by texture. Finally, all the proposed techniques are integrated in a content-based image retrieval application for museum image collections
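To make the idea of wavelet-based texture features concrete, the sketch below computes one level of a 2D Haar decomposition and uses the energy of each subband as a four-number texture signature. This is a generic illustration, assuming an orthonormal Haar filter and an image with even dimensions. It is not the thesis's algorithm, and the function names are hypothetical.

```python
def haar_level(image):
    """One level of the orthonormal 2D Haar transform on 2x2 blocks.

    image: 2D list with an even number of rows and columns.
    Returns (approx, col_detail, row_detail, diag_detail) subbands.
    """
    rows, cols = len(image), len(image[0])
    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for i in range(0, rows, 2):
        ra, rc, rr, rd = [], [], [], []
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            ra.append((a + b + c + d) / 2)  # local average (low-pass)
            rc.append((a - b + c - d) / 2)  # change between the two columns
            rr.append((a + b - c - d) / 2)  # change between the two rows
            rd.append((a - b - c + d) / 2)  # diagonal change
        approx.append(ra)
        col_detail.append(rc)
        row_detail.append(rr)
        diag_detail.append(rd)
    return approx, col_detail, row_detail, diag_detail


def subband_energies(image):
    """Sum of squared coefficients in each subband, as a simple texture signature."""
    return tuple(sum(v * v for row in band for v in row)
                 for band in haar_level(image))
```

Because the transform is orthonormal, the four energies add up to the energy of the input image, so the signature describes how that energy is split between smooth content and directional detail. Two textures can then be compared by a distance between their signatures.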