
    Illumination-robust Pattern Matching Using Distorted Color Histograms

    It is argued that global illumination should be modeled separately from other incidents that change the appearance of objects. The effects of intensity variations of the global illumination are discussed, and constraints are deduced that restrict the shape of a function that maps the histogram of a template to the histogram of an image location. This approach is illustrated for simple pattern matching and for a combination with a PCA (Eigenface) model of the grey-level appearance.
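
    The monotone-mapping constraint can be suggested with a toy sketch: if only the global illumination intensity changes, the pixel values of a matching region are approximately a monotone function of the template's values, so their rank correlation stays near 1. The sketch below is our illustration of that constraint on raw pixels, not the paper's histogram-based method; all names are ours.

        import numpy as np

        def monotone_match_score(template, patch):
            # Under a purely global illumination change, patch intensities
            # are a monotone function of the template intensities at the
            # corresponding pixels, so the rank (Spearman) correlation of
            # the two regions should stay close to 1.
            t = template.ravel().astype(float)
            p = patch.ravel().astype(float)
            rt = np.argsort(np.argsort(t)).astype(float)  # template ranks
            rp = np.argsort(np.argsort(p)).astype(float)  # patch ranks
            rt = (rt - rt.mean()) / (rt.std() + 1e-12)
            rp = (rp - rp.mean()) / (rp.std() + 1e-12)
            return float(np.mean(rt * rp))  # 1.0 = perfect monotone map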

    Eye gaze based reading detection

    Master's thesis (Master of Engineering).

    Facial expression recognition in the wild: from individual to group

    The progress in computing technology has increased the demand for smart systems capable of understanding human affect and emotional manifestations. One of the crucial factors in designing systems equipped with such intelligence is having accurate automatic Facial Expression Recognition (FER) methods. In computer vision, automatic facial expression analysis has been an active field of research for over two decades, yet many questions remain unanswered. The research presented in this thesis addresses some of the key issues of FER in challenging conditions: 1) creating a facial expressions database representing real-world conditions; 2) devising Head Pose Normalisation (HPN) methods that are independent of facial parts location; and 3) creating automatic methods for analysing the mood of a group of people. The central hypothesis of the thesis is that extracting close-to-real-world data from movies and performing facial expression analysis on it is a stepping stone towards moving the analysis of faces to real-world, unconstrained conditions.

    A temporal facial expressions database, Acted Facial Expressions in the Wild (AFEW), is proposed. The database is constructed and labelled using a semi-automatic process based on closed-caption subtitle keyword search. Currently, AFEW is the largest facial expressions database representing challenging conditions available to the research community. To provide a common platform on which researchers can evaluate and extend their state-of-the-art FER methods, the first Emotion Recognition in the Wild (EmotiW) challenge, based on AFEW, is proposed. An image-only facial expressions database, Static Facial Expressions In The Wild (SFEW), extracted from AFEW, is also proposed.

    Furthermore, the thesis focuses on HPN for real-world images. Earlier methods were based on fiducial points; however, as fiducial point detection is itself an open problem for real-world images, such HPN can be error-prone. An HPN method based on response maps generated from part-detectors is proposed. The proposed shape-constrained method requires neither fiducial points nor head pose information, which makes it suitable for real-world images.

    Data from movies and the internet, representing real-world conditions, poses another major challenge: the presence of multiple subjects. This defines another focus of this thesis, in which a novel approach for modeling the perceived mood of a group of people in an image is presented. A new database is constructed from Flickr based on keywords related to social events. Three models are proposed: an averaging-based Group Expression Model (GEM), a Weighted Group Expression Model (GEM_w) and an Augmented Group Expression Model (GEM_LDA). GEM_w is based on social contextual attributes, which are used as weights on each person's contribution towards the overall group's mood, while GEM_LDA is based on a topic model and feature augmentation. The proposed framework is applied to group candid shot selection and event summarisation. Finally, the Structural SIMilarity (SSIM) index metric is explored for finding similar facial expressions, and is applied to creating image albums based on facial expressions and to finding corresponding expressions for training facial performance transfer algorithms.
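
    As a rough sketch of the weighting idea behind GEM_w (the abstract does not list the exact social-context attributes or weighting scheme, so the names and values below are illustrative only), the group mood can be written as a weighted mean of per-face expression scores:

        import numpy as np

        def group_mood(face_scores, context_weights=None):
            # Averaging GEM: every face contributes equally.
            # Weighted GEM_w: social-context attributes (e.g. relative face
            # size, distance from the group centre) act as per-face weights.
            s = np.asarray(face_scores, dtype=float)
            w = (np.ones_like(s) if context_weights is None
                 else np.asarray(context_weights, dtype=float))
            return float(np.sum(w * s) / np.sum(w))

        # Three faces with happiness scores; the larger, central face
        # dominates the perceived group mood.
        print(group_mood([0.9, 0.4, 0.7], context_weights=[1.5, 0.8, 1.0]))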

    Emotional State Analysis Upon Image Patterns

    This dissertation thesis deals with an automatic system for recognising basic emotional facial expressions in static images. The system is divided into three independent but interconnected parts. The first part deals with automatic face detection in colour images: a custom face detector based on human skin colour was designed, together with methods for localising eye and lip positions in the detected faces using colour maps. This part also includes a modified Viola-Jones face detector, which was additionally used experimentally for eye detection. The reliability of these detectors was tested on the Georgia Tech Face Database. The second part of the system is the feature extraction process, which consists of two statistical methods and one method based on filtering the image with a set of Gabor filters; various combinations of features extracted by these methods were also evaluated experimentally. The last part of the system is a mathematical classifier represented by a feed-forward neural network. The whole system is further complemented by accurate localisation of the individual facial features using an active shape model. The reliability of the complete automatic system was benchmarked on recognising basic emotional facial expressions using the Japanese Female Facial Expression database.
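
    A minimal sketch of the Gabor-filter-bank stage might look as follows (parameter values are illustrative, not those used in the thesis): the face crop is filtered at several wavelengths and orientations, and simple response statistics form the feature vector fed to the neural network.

        import cv2
        import numpy as np

        def gabor_features(face_gray, wavelengths=(4, 8), orientations=8):
            # Filter the grayscale face crop with a bank of Gabor kernels
            # and keep the mean and standard deviation of each response.
            feats = []
            for lambd in wavelengths:
                for k in range(orientations):
                    theta = k * np.pi / orientations
                    kernel = cv2.getGaborKernel((31, 31), sigma=4.0,
                                                theta=theta, lambd=lambd,
                                                gamma=0.5, psi=0.0)
                    resp = cv2.filter2D(face_gray, cv2.CV_32F, kernel)
                    feats.extend([resp.mean(), resp.std()])
            return np.array(feats)  # input to the feed-forward classifier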

    Methods and Applications of Eye Blink Detection with Digital Image Processing

    The thesis deals with eye blink detection, a part of the complex field of face detection and recognition. The work focuses on digital image processing. It includes an analysis of the problem and a description of the available image databases suitable for testing. Two main chapters present designs of eye blink detection methods based on digital image processing, both with and without IR technology.
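
    The abstract does not detail the individual detectors, but the flavour of a purely image-processing approach can be suggested with a toy frame-differencing sketch over a cropped eye region (everything here, including the threshold, is illustrative and not taken from the thesis):

        import numpy as np

        def blink_frames(eye_frames, thresh=12.0):
            # A blink produces a short burst of intensity change in the eye
            # region as the eyelid closes and reopens; flag frames whose
            # mean absolute difference from the previous frame exceeds a
            # threshold.
            diffs = [np.abs(b.astype(float) - a.astype(float)).mean()
                     for a, b in zip(eye_frames, eye_frames[1:])]
            return [i + 1 for i, d in enumerate(diffs) if d > thresh]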

    Learning from one example in machine vision by sharing probability densities

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002, by Erik G. Miller. Includes bibliographical references (p. 125-130).

    Human beings exhibit rapid learning when presented with a small number of images of a new object. A person can identify an object under a wide variety of visual conditions after having seen only a single example of that object. This ability can be partly explained by the application of previously learned statistical knowledge to a new setting. This thesis presents an approach to acquiring knowledge in one setting and using it in another. Specifically, we develop probability densities over common image changes. Given a single image of a new object and a model of change learned from a different object, we form a model of the new object that can be used for synthesis, classification, and other visual tasks.

    We start by modeling spatial changes. We develop a framework for learning statistical knowledge of spatial transformations in one task and using that knowledge in a new task. By sharing a probability density over spatial transformations learned from a sample of handwritten letters, we develop a handwritten digit classifier that achieves 88.6% accuracy using only a single hand-picked training example from each class. The classification scheme includes a new algorithm, congealing, for the joint alignment of a set of images using an entropy minimization criterion. We investigate properties of this algorithm and compare it to other methods of addressing spatial variability in images. We illustrate its application to binary images, gray-scale images, and a set of 3-D neonatal magnetic resonance brain volumes.

    Next, we extend the method of change modeling from spatial transformations to color transformations. By measuring statistically common joint color changes of a scene in an office environment, and then applying standard statistical techniques such as principal components analysis, we develop a probabilistic model of color change. We show that these color changes, which we call color flows, can be shared effectively between certain types of scenes. That is, a probability density over color change developed by observing one scene can provide useful information about the variability of another scene. We demonstrate a variety of applications including image synthesis, image matching, and shadow detection.
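
    Congealing, restricted here for illustration to integer translations (the thesis uses a richer family of transforms), repeatedly re-aligns each image so that the entropy of the pixel stack drops. A minimal sketch for binary images, with all names ours:

        import numpy as np

        def congeal_shifts(images, iters=5, max_shift=3):
            # images: list of equally sized binary (0/1) arrays.
            def stack_entropy(stack):
                # Sum of per-pixel Bernoulli entropies across the stack.
                p = np.clip(stack.mean(axis=0), 1e-6, 1 - 1e-6)
                return float(np.sum(-p * np.log(p) - (1 - p) * np.log(1 - p)))

            shifts = [(0, 0)] * len(images)
            for _ in range(iters):
                for i in range(len(images)):
                    best_e, best_s = np.inf, shifts[i]
                    for dy in range(-max_shift, max_shift + 1):
                        for dx in range(-max_shift, max_shift + 1):
                            trial = shifts[:i] + [(dy, dx)] + shifts[i + 1:]
                            stack = np.stack([np.roll(im, s, axis=(0, 1))
                                              for im, s in zip(images, trial)])
                            e = stack_entropy(stack)
                            if e < best_e:
                                best_e, best_s = e, (dy, dx)
                    shifts[i] = best_s
            return shifts  # per-image (dy, dx) joint alignment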

    Visual Tracking Algorithms using Different Object Representation Schemes

    Visual tracking, one of the most fundamental, important and challenging areas in computer vision, has attracted much attention in the research community during the past decade due to its broad range of real-life applications. Even after three decades of research, it remains a challenging problem in view of the complexities involved in searching for the target under intrinsic and extrinsic appearance variations of the object. Existing trackers fail when there is a considerable amount of appearance variation and when the object undergoes severe occlusion, scale change, out-of-plane rotation, motion blur, fast motion, in-plane rotation, out-of-view or illumination variation, either individually or simultaneously. For reliable and improved tracking performance, the appearance variations should be handled carefully: the appearance model should adapt to the intrinsic appearance variations and be robust to the extrinsic ones. The objective of this thesis is to develop visual object tracking algorithms that address the deficiencies of existing algorithms and enhance tracking performance by investigating different object representation schemes for modelling the object appearance, and then devising mechanisms to update the observation models.

    First, a tracking algorithm based on a global appearance model using robust coding, in collaboration with a local model, is proposed. A global PCA subspace is used to model the global appearance of the object, and the optimum PCA basis coefficients and the global weight matrix are estimated with an iteratively reweighted robust coding (IRRC) technique. This global model is combined with the local model to exploit their individual merits. Global and local robust coding distances are introduced to find the candidate sample most similar in appearance to the sample reconstructed from the subspace, and these distances are used to define the observation likelihood. A robust occlusion map generation scheme and a mechanism to update both the global and local observation models are developed. Quantitative and qualitative performance evaluations on OTB-50 and VOT2016, two popular benchmark datasets, demonstrate that the proposed algorithm with histogram of oriented gradients (HOG) features generally performs better than the state-of-the-art methods considered.

    Despite this good performance, there is still a need to improve the tracking performance on some of the challenging attributes of OTB-50 and VOT2016. A second tracking algorithm is therefore developed for these challenging attributes, based on a structural local 2DDCT sparse appearance model and an occlusion handling mechanism. In the structural local 2DDCT sparse appearance model, the energy compaction property of the transform is exploited to reduce the size of the dictionary, as well as that of the candidate samples, so that the computational cost of the l_1-minimisation can be reduced; this strategy is in contrast to existing models that use raw pixels. A holistic image reconstruction procedure from the overlapped local patches obtained from the dictionary and the sparse codes is presented, and the reconstructed holistic image is then used for robust occlusion detection and occlusion map generation. The occlusion map thus obtained drives a novel observation model update mechanism that avoids model degradation, and a patch occlusion ratio is employed in the calculation of the confidence score to improve the tracking performance. Quantitative and qualitative evaluations on the two benchmark datasets demonstrate that this second algorithm generally performs better than several state-of-the-art methods and the first proposed method.

    Despite the improved performance of the second algorithm, some challenging attributes of OTB-50 and VOT2016 still leave room for improvement. Finally, a third tracking algorithm is proposed based on a scheme for collaboration between discriminative and generative appearance models. The discriminative model is used to estimate the position of the target, and a new generative model is used to find the remaining affine parameters. In the generative model, robust coding is extended to two dimensions and employed in the bilateral two-dimensional PCA (2DPCA) reconstruction procedure to handle non-Gaussian or non-Laplacian residuals via an IRRC technique. A 2D robust coding distance is introduced to compare a candidate sample with the one reconstructed from the subspace, and is used to compute the observation likelihood in the generative model. A method for generating a robust occlusion map from the weights obtained during the IRRC technique, together with a novel update mechanism of the observation model for both the kernelized correlation filters and the bilateral 2DPCA subspace, is developed. Evaluations on the two datasets demonstrate that this algorithm with HOG features generally outperforms the state-of-the-art methods and the other two proposed algorithms on most of the challenging attributes.
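
    The core of the IRRC idea can be sketched as a weighted least-squares coding whose per-pixel weights are re-derived from the residuals, so that outlier pixels (e.g. occlusions) are progressively down-weighted. The weight function below is a generic robust (Cauchy-type) choice, not necessarily the one derived in the thesis:

        import numpy as np

        def irrc(y, D, iters=10, eps=1e-6):
            # y: observed sample (n,); D: PCA basis / dictionary (n, k).
            w = np.ones_like(y, dtype=float)
            x = np.zeros(D.shape[1])
            for _ in range(iters):
                sw = np.sqrt(w)
                # Weighted least-squares coding of y over D.
                x, *_ = np.linalg.lstsq(D * sw[:, None], sw * y, rcond=None)
                r = y - D @ x                       # coding residual
                sigma = np.median(np.abs(r)) + eps  # robust scale estimate
                w = 1.0 / (1.0 + (r / sigma) ** 2)  # down-weight outliers
            return x, w  # coefficients and per-pixel weights (occlusion cue)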

    Multi-camera object segmentation in dynamically textured scenes using disparity contours

    This thesis presents a stereo-based object segmentation system that combines the simplicity and efficiency of the background subtraction approach with the capacity to deal with dynamic lighting, dynamic background texture, and large textureless regions. The method proposed here does not rely on full stereo reconstruction or empirical parameter tuning, but employs disparity-based hypothesis verification to separate multiple objects at different depths.

    The proposed stereo-based segmentation system uses a pair of calibrated cameras with a small baseline and factors the segmentation problem into two stages: a well-understood offline stage and a novel online one. Based on the calibrated parameters, the offline stage models the 3D geometry of the background by constructing a complete disparity map. The online stage compares corresponding new frames, synchronously captured by the two cameras, against the background disparity map in order to falsify the hypothesis that the scene contains only background. The resulting object boundary contours possess a number of useful features that can be exploited for object segmentation.

    Three different approaches to contour extraction and object segmentation were experimented with, and their advantages and limitations analysed. The system demonstrates its ability to extract multiple objects from a complex scene with near real-time performance. The algorithm also has the potential of providing precise object boundaries rather than just bounding boxes, and is extensible to 2D and 3D object tracking and online background update.
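
    The online hypothesis-verification step can be sketched roughly as follows (rectified grayscale images and our own variable names assumed): warp the right image by the offline background disparity map, and mark the pixels where the background-only hypothesis fails photometrically.

        import numpy as np

        def foreground_mask(left, right, bg_disparity, thresh=20.0):
            # Under the background-only hypothesis, pixel (y, x) of the left
            # image matches pixel (y, x - d) of the right image, where d is
            # the offline background disparity. Large mismatches falsify the
            # hypothesis and yield candidate object (contour) pixels.
            h, w = left.shape
            ys, xs = np.indices((h, w))
            xr = np.clip(xs - bg_disparity.astype(int), 0, w - 1)
            warped = right[ys, xr]                  # right image warped by d
            return np.abs(left.astype(float) - warped.astype(float)) > thresh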

    A Framework for Modeling Appearance Change in Image Sequences

    Image “appearance” may change over time due to a variety of causes, such as 1) object or camera motion; 2) generic photometric events, including variations in illumination (e.g. shadows) and specular reflections; and 3) “iconic changes”, which are specific to the objects being viewed and include complex occlusion events and changes in the material properties of the objects. We propose a general framework for representing and recovering these “appearance changes” in an image sequence as a “mixture” of different causes. The approach generalizes previous work on optical flow to provide a richer description of image events and more reliable estimates of image motion.
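
    Schematically (with our notation, not necessarily the paper's), such a framework treats the change observed at each pixel as generated by one of a small set of causes, with mixture weights and cause parameters estimated jointly, e.g. by an EM-style procedure:

        % Mixture-of-causes likelihood (illustrative notation): c ranges over
        % causes such as motion, illumination change, specularity, and iconic
        % change.
        p\bigl(I_{t+1}(\mathbf{x}) \mid I_t\bigr)
            = \sum_{c=1}^{C} w_c \, p_c\bigl(I_{t+1}(\mathbf{x}) \mid I_t, \theta_c\bigr),
        \qquad \sum_{c=1}^{C} w_c = 1 .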
