182 research outputs found

    Automatic human face detection in color images

    Automatic human face detection in digital images has been an active area of research over the past decade. Among its numerous applications, face detection plays a key role in face recognition systems for biometric personal identification, face tracking for intelligent human-computer interfaces (HCI), and face segmentation for object-based video coding. Despite significant progress in the field in recent years, detecting human faces in unconstrained and complex images remains a challenging problem in computer vision. An automatic system that matches the capability of the human vision system in detecting faces is still a far-reaching goal. This thesis focuses on the problem of detecting human faces in color images. Although many early face detection algorithms were designed to work on gray-scale images, strong evidence suggests that face detection can be done more efficiently by taking into account the color characteristics of the human face. In this thesis, we present a complete and systematic face detection algorithm that combines the strengths of both analytic and holistic approaches to face detection. The algorithm is developed to detect quasi-frontal faces in complex color images. This face class, which represents typical detection scenarios in most practical applications of face detection, covers a wide range of face poses, including all in-plane rotations and some out-of-plane rotations. The algorithm is organized into a number of cascading stages, including skin region segmentation, face candidate selection, and face verification. In each of these stages, various visual cues are used to narrow the search space for faces. In this thesis, we present a comprehensive analysis of skin detection using color pixel classification, and of the effects of factors such as the color space and the color classification algorithm on segmentation performance.
We also propose a novel and efficient face candidate selection technique that is based on color-based eye region detection and a geometric face model. This candidate selection technique eliminates the computation-intensive step of window scanning often employed in holistic face detection, and simplifies the task of detecting rotated faces. Besides various heuristic techniques for face candidate verification, we develop face/nonface classifiers based on the naive Bayesian model, and investigate three feature extraction schemes, namely intensity, projection on face subspace, and edge-based. Techniques for improving face/nonface classification are also proposed, including bootstrapping, classifier combination, and the use of contextual information. On a test set of face and nonface patterns, the combination of three Bayesian classifiers has a correct detection rate of 98.6% at a false positive rate of 10%. Extensive testing results have shown that the proposed face detector achieves good performance in terms of both detection rate and alignment between the detected faces and the true faces. On a test set of 200 images containing 231 faces taken from the ECU face detection database, the proposed face detector has a correct detection rate of 90.04% and makes 10 false detections. We have found that the proposed face detector is more robust in detecting in-plane rotated faces, compared to existing face detectors.

    Color Image Edge Detection and Segmentation: A Comparison of the Vector Angle and the Euclidean Distance Color Similarity Measures

    This work is based on Shafer's Dichromatic Reflection Model as applied to color image formation. The color spaces RGB, XYZ, CIELAB, CIELUV, rgb, l1l2l3, and the new h1h2h3 color space are discussed from this perspective. Two color similarity measures are studied: the Euclidean distance and the vector angle. The work in this thesis is motivated from a practical point of view by several shortcomings of current methods. The first is the inability of all known methods to properly segment objects from the background without interference from object shadows and highlights. The second is that the vector angle has not been examined as a distance measure capable of directly evaluating hue similarity without considering intensity, especially in RGB. Finally, there is inadequate research on combining hue- and intensity-based similarity measures to improve color similarity calculations, given the advantages of each color distance measure. These distance measures were used for two image understanding tasks: edge detection, and one strategy for color image segmentation, namely color clustering. Edge detection algorithms using Euclidean distance and vector angle similarity measures, as well as their combinations, were examined. The algorithms comprise the modified Roberts operator, the Sobel operator, the Canny operator, the vector gradient operator, and the 3x3 difference vector operator. Pratt's Figure of Merit is used for a quantitative comparison of edge detection results. Color clustering was examined using the k-means (based on the Euclidean distance) and Mixture of Principal Components (based on the vector angle) algorithms. A new quantitative image segmentation evaluation procedure is introduced to assess the performance of both algorithms.
Quantitative and qualitative results on many color images (artificial, staged scenes and natural scene images) indicate good edge detection performance using a vector version of the Sobel operator on the h1h2h3 color space. The results using combined hue- and intensity-based difference measures show a slight qualitative improvement over using each measure independently in RGB. Quantitative and qualitative results for image segmentation on the same set of images suggest that the best image segmentation results are obtained using the Mixture of Principal Components algorithm on the RGB, XYZ and rgb color spaces. Finally, poor color clustering results in the h1h2h3 color space suggest that some assumptions in deriving a simplified version of the Dichromatic Reflection Model might have been violated.
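
The two similarity measures compared in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names and the sample RGB triples are assumptions chosen for illustration, and treating the angle to a black pixel as zero is an arbitrary convention adopted here because the angle is undefined for a zero vector.

```python
import math

def euclidean_distance(c1, c2):
    """Euclidean distance between two colors; sensitive to intensity changes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def vector_angle(c1, c2):
    """Angle in radians between two color vectors; ignores vector length."""
    dot = sum(a * b for a, b in zip(c1, c2))
    n1 = math.sqrt(sum(a * a for a in c1))
    n2 = math.sqrt(sum(b * b for b in c2))
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption: the angle is undefined for black, so return 0
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))  # guard against rounding
    return math.acos(cos_theta)

# Hypothetical sample colors (R, G, B).
bright_red = (200, 40, 40)
dark_red = (100, 20, 20)   # same direction as bright_red, half the intensity
green = (40, 200, 40)

print(euclidean_distance(bright_red, dark_red), vector_angle(bright_red, dark_red))
print(euclidean_distance(bright_red, green), vector_angle(bright_red, green))
```

In this example the two reds differ only in intensity, so their vector angle is essentially zero while their Euclidean distance is large; the red and green pair is far apart under both measures.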

    Survey of contemporary trends in color image segmentation


    Study and Development of Some Novel Image Segmentation Techniques

    In this thesis, several fuzzy-technique-based segmentation methods are studied and implemented, and several fuzzy c-means (FCM) clustering-based segmentation algorithms are developed to suppress high and low uniform random noise. Fuzzy rule-based segmentation methods are not developed because they are application dependent. On many occasions, real-life images are affected by noise. FCM clustering-based segmentation does not give good segmentation results under such conditions. Various extensions of the FCM method for segmentation are present in the literature, but most of them modify the objective function, thereby changing the basic FCM algorithm present in MATLAB toolboxes. Hence efforts have been made to develop FCM algorithms that give better segmentation without modifying the objective function. The fuzzy-technique-based segmentation methods that are studied and developed are summarized here. (A) Fuzzy edge detection based segmentation: Two fuzzy edge detection methods are studied and implemented for segmentation: (i) FIS based edge detection and (ii) Fast Multilevel Fuzzy Edge Detector (FMFED). (i) The Fuzzy Inference System (FIS) based edge detector consists of fuzzy inference rules defined in such a way that the FIS output ("edges") is high only for those pixels belonging to edges in the input image. Robustness to contrast and lighting variations was also taken into consideration while developing these rules. The output of the FIS based edge detector is then compared with the existing Sobel, LoG and Canny edge detector results. The algorithm is seen to be application dependent and time consuming. (ii) Fast Multilevel Fuzzy Edge Detector: To realise fast and accurate detection of edges, the FMFED algorithm is proposed. It first enhances the image contrast by means of a fast multilevel fuzzy enhancement algorithm using a simple transformation function based on two image thresholds.
Second, the edges are extracted from the enhanced image by a two-stage edge detector operator that identifies the edge candidates based on local characteristics of the image and then determines the true edge pixels using an edge detector operator based on the extremum of the gradient values. Finally, the edge image is segmented by edge linking with a morphological operator. (B) FCM based segmentation: Two fuzzy clustering based segmentation methods are developed: (i) Modified Spatial Fuzzy c-Means (MSFCM) and (ii) Neighbourhood Attraction Fuzzy c-Means (NAFCM). (i) Contrast-Limited Adaptive Histogram Equalization Fuzzy c-Means (CLAHEFCM): This proposed algorithm presents a color segmentation process for low contrast or unevenly illuminated images. The algorithm first enhances the contrast of the image by using contrast limited adaptive histogram equalization. After the enhancement, the method divides the color space into a given number of clusters; the number of clusters is fixed initially. The image is converted from RGB color space to LAB color space before the clustering process. Clustering is done using the fuzzy c-means algorithm. The image is segmented based on the color of a region, that is, areas having the same color are grouped together. Each pixel is assigned to the cluster in which it has the highest membership. The method has been applied to a number of color test images and is observed to give good segmentation results. (ii) Modified Spatial Fuzzy c-Means (MSFCM): The proposed algorithm divides the color space into a given number of clusters; the number of clusters is fixed initially. The image is converted from RGB color space to LAB color space before the clustering process. A robust segmentation technique based on an extension of the traditional fuzzy c-means (FCM) clustering algorithm is proposed.
The spatial information of each pixel in an image is taken into consideration to get a noise-free segmentation result. The image is segmented based on the color of a region, that is, areas having the same color are grouped together. Each pixel is assigned to the cluster in which it has the highest membership. The method has been applied to some color test images, and its performance has been compared to FCM and FCM based methods to show its superiority over them. The proposed technique is observed to be an efficient and easy method for segmentation of noisy images. (iii) Neighbourhood Attraction Fuzzy c-Means Algorithm: A new algorithm based on the IFCM neighbourhood attraction is used without changing the distance function of the FCM, hence avoiding an extra neural network optimization step for adjusting the parameters of the distance function; it is called Neighbourhood Attraction FCM (NAFCM). During clustering, each pixel attempts to attract its neighbouring pixels towards its own cluster. This neighbourhood attraction depends on two factors: the pixel intensities or feature attraction, and the spatial position of the neighbours or distance attraction, which also depends on neighbourhood structure. The NAFCM algorithm is tested on a synthetic image (chapter 6, figure 6.3-6.6) and a number of skin tumor images. It is observed to produce excellent clustering results under high-noise conditions when compared with the other FCM based clustering methods.
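
For orientation, the baseline that the methods above extend can be sketched as follows. This is a minimal sketch of standard fuzzy c-means on scalar intensities, not of MSFCM, NAFCM or CLAHEFCM, and not the MATLAB toolbox routine; the function names, the fuzzifier `m = 2`, the fixed iteration count and the sample data are all assumptions made for illustration.

```python
def fcm_memberships(values, centers, m=2.0):
    """Membership of each value in each cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The value coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                         for d_i in dists])
    return rows

def fuzzy_c_means(values, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = fcm_memberships(values, centers, m)
        centers = [
            sum((row[i] ** m) * x for row, x in zip(u, values))
            / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    return centers, fcm_memberships(values, centers, m)

# Hypothetical gray levels forming a dark group and a bright group.
pixels = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means(pixels, [0.0, 255.0])
# Hard segmentation: assign each pixel to its highest-membership cluster.
labels = [row.index(max(row)) for row in memberships]
print(centers, labels)
```

The final labelling step mirrors the rule stated in the abstract, where each pixel goes to the cluster to which it belongs the most. The sketch uses no spatial or neighbourhood information, which is the gap the noise-robust variants are meant to address.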

    Modelling and tracking objects with a topology preserving self-organising neural network

    Human gestures form an integral part of our everyday communication. We use gestures not only to reinforce meaning, but also to describe the shape of objects, to play games, and to communicate in noisy environments. Vision systems that exploit gestures are often limited by inaccuracies inherent in handcrafted models. These models are generated from a collection of training examples, which requires segmentation and alignment. Segmentation in gesture recognition typically involves manual intervention, a time-consuming process that is feasible only for a limited set of gestures. Ideally, gesture models should be automatically acquired via a learning scheme that enables the acquisition of detailed behavioural knowledge only from topological and temporal observation. The research described in this thesis is motivated by a desire to provide a framework for the unsupervised acquisition and tracking of gesture models. In any learning framework, the initialisation of the shapes is crucial. Hence, it would be beneficial to have a robust model, not prone to noise, that can automatically establish correspondences across the set of shapes. In the first part of this thesis, we develop a framework for building statistical 2D shape models by extracting, labelling and corresponding landmark points using only topological relations derived from competitive Hebbian learning. The method is based on the assumption that correspondences can be addressed as an unsupervised classification problem where landmark points are the cluster centres (nodes) in a high-dimensional vector space. The approach is novel in that the network can be used in cases where the topological structure of the input pattern is not known a priori; thus no topology of fixed dimensionality is imposed onto the network.
In the second part, we propose an approach to minimise user intervention in the adaptation process, which otherwise requires specifying a priori the number of nodes needed to represent an object, by utilising an automatic criterion for maximum node growth. Furthermore, this model is used to represent motion in image sequences by initialising a suitable segmentation that separates the object of interest from the background. The segmentation system assumes some illumination tolerance, input images from ordinary cameras and webcams, backgrounds with low to medium clutter (extremely cluttered backgrounds are avoided), and objects at close range from the camera. In the final part, we extend the framework to the automatic modelling and unsupervised tracking of 2D hand gestures in a sequence of k frames. The aim is to use the tracked frames as training examples in order to build the model and maintain correspondences. To do that, we add an active step to the Growing Neural Gas (GNG) network, which we call Active Growing Neural Gas (A-GNG), that takes into consideration not only the geometrical position of the nodes, but also the underlying local feature structure of the image and the distance vector between successive images. The quality of our model is measured through the calculation of the topographic product, our topology-preserving measure, which quantifies the neighbourhood preservation. In our system we have applied specific restrictions on the velocity and the appearance of the gestures to simplify the motion analysis in the gesture representation. The proposed framework has been validated on applications related to sign language. The work has great potential in Virtual Reality (VR) applications, where the learning and the representation of gestures become natural without the need for expensive wearable sensors.
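
The competitive Hebbian learning rule that the abstract above relies on for topological relations can be sketched in a few lines: for every input sample, the two nearest nodes are connected by an edge. This is only an illustration under stated assumptions. The node positions are kept fixed here, whereas a Growing Neural Gas also moves, inserts and removes nodes, and the function name and sample points are hypothetical.

```python
import math

def competitive_hebbian_edges(nodes, samples):
    """Connect the nearest and second-nearest node for every input sample."""
    edges = set()
    for sample in samples:
        ranked = sorted(range(len(nodes)),
                        key=lambda i: math.dist(nodes[i], sample))
        first, second = ranked[0], ranked[1]
        edges.add((min(first, second), max(first, second)))
    return edges

# Hypothetical 2D landmark nodes and input points along a contour.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [(0.4, 0.0), (1.6, 0.1)]
print(competitive_hebbian_edges(nodes, samples))
```

Because edges appear only where the input data actually lies, the resulting graph follows the shape of the input without a fixed topology being imposed in advance.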

    Visual region understanding: unsupervised extraction and abstraction

    The ability to gain a conceptual understanding of the world in uncontrolled environments is the ultimate goal of vision-based computer systems. Technological societies today are heavily reliant on surveillance and security infrastructure, robotics, medical image analysis, visual data categorisation and search, and smart device user interaction, to name a few. Out of all the complex problems tackled by computer vision today in the context of these technologies, that which lies closest to the original goals of the field is the subarea of unsupervised scene analysis or scene modelling. However, its common use of low level features does not provide a good balance between generality and discriminative ability, both a result and a symptom of the sensory and semantic gaps existing between low level computer representations and high level human descriptions. In this research we explore a general framework that addresses the fundamental problem of universal unsupervised extraction of semantically meaningful visual regions and their behaviours. For this purpose we address issues related to (i) spatial and spatiotemporal segmentation for region extraction, (ii) region shape modelling, and (iii) the online categorisation of visual object classes and the spatiotemporal analysis of their behaviours. Under this framework we propose (a) a unified region merging method and spatiotemporal region reduction, (b) shape representation by the optimisation and novel simplification of contour-based growing neural gases, and (c) a foundation for the analysis of visual object motion properties using a shape and appearance based nearest-centroid classification algorithm and trajectory plots for the obtained region classes. Specifically, we formulate a region merging spatial segmentation mechanism that combines and adapts features shown previously to be individually useful, namely parallel region growing, the best merge criterion, a time adaptive threshold, and region reduction techniques.
For spatiotemporal region refinement we consider both scalar intensity differences and vector optical flow. To model the shapes of the visual regions thus obtained, we adapt the growing neural gas for rapid region contour representation and propose a contour simplification technique. A fast unsupervised nearest-centroid online learning technique next groups observed region instances into classes, for which we are then able to analyse spatial presence and spatiotemporal trajectories. The analysis results show semantic correlations to real world object behaviour. Performance evaluation of all steps across standard metrics and datasets validates their performance.
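
A nearest-centroid online grouping step of the kind mentioned above might look like the following minimal sketch. It does not reproduce the thesis method: the class name, the distance threshold for opening a new class, the running-mean update and the toy feature vectors are all assumptions made for illustration.

```python
import math

class OnlineNearestCentroid:
    """Assign each new feature vector to the closest centroid, or open a new class."""

    def __init__(self, new_class_distance):
        # Assumed rule: a sample farther than this from every centroid starts a class.
        self.new_class_distance = new_class_distance
        self.centroids = []
        self.counts = []

    def observe(self, feature):
        best, best_dist = None, float("inf")
        for idx, centroid in enumerate(self.centroids):
            dist = math.dist(feature, centroid)
            if dist < best_dist:
                best, best_dist = idx, dist
        if best is None or best_dist > self.new_class_distance:
            self.centroids.append(list(feature))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Running-mean update of the winning centroid.
        self.counts[best] += 1
        n = self.counts[best]
        self.centroids[best] = [c + (x - c) / n
                                for c, x in zip(self.centroids[best], feature)]
        return best

# Hypothetical 2D shape/appearance features for successive region instances.
model = OnlineNearestCentroid(new_class_distance=2.0)
stream = [(1.0, 1.0), (1.2, 0.9), (8.0, 8.0), (0.9, 1.1), (8.1, 7.9)]
assigned = [model.observe(f) for f in stream]
print(assigned)  # [0, 0, 1, 0, 1]
```

Each observation is processed once and only the centroids and counts are stored, which is what makes this style of grouping suitable for online use.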

    Unsupervised colour image segmentation by low-level perceptual grouping

    This paper proposes a new unsupervised approach for colour image segmentation. A hierarchy of image partitions is created on the basis of a function that merges spatially connected regions according to primary perceptual criteria. Likewise, a global function that measures the goodness of each defined partition is used to choose the best low-level perceptual grouping in the hierarchy. Contributions also include a comparative study with five unsupervised colour image segmentation techniques that have frequently been used as a reference in other comparisons. The results obtained by each method have been systematically evaluated using four well-known unsupervised measures for judging segmentation quality. Our methodology has shown the best overall performance, obtaining better results in three of the four segmentation quality measures. Experiments also show that our proposal finds low-level perceptual solutions that are highly correlated with the ones provided by humans.