14 research outputs found

    Enhancement of the Adaptive Shape Variants Average Values by Using Eight Movement Directions for Multi-Features Detection of Facial Sketch

    This paper aims to detect multiple features of a facial sketch using a novel approach. Multi-feature detection for facial sketches has been studied by several researchers, but they mainly considered frontal face sketches as object samples. In practice, detecting the features of a facial sketch seen at an angle is very important for helping police describe a criminal's face when it is only available at a certain angle. An integration of maximum line gradient value enhancement and level set methods was previously implemented to detect facial feature sketches with tilt angles of up to 15 degrees. However, these methods tend to move towards non-features when there is a lot of graffiti around the shape. To overcome this weakness, the author proposes a novel approach that moves the shape by adding a parameter to control the movement, based on enhancement of the adaptive shape variants' average values with eight movement directions. The experimental results show that the proposed method achieves a detection accuracy of up to 92.74%.
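    As a rough illustration of the eight-direction movement idea, consider the greedy sketch below. The paper's actual update rule and fitness measure are not given here; `fitness`, `alpha` and the stopping rule are assumptions made for the sketch, with `alpha` standing in for the movement-control parameter the abstract mentions.

```python
# Hypothetical sketch: at each step, shift the shape toward whichever of
# the eight neighbouring offsets best improves a fitness score.
import numpy as np

# The eight movement directions (N, NE, E, SE, S, SW, W, NW).
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
              (1, 0), (1, -1), (0, -1), (-1, -1)]

def move_shape(shape_points, fitness, alpha=1.0, steps=50):
    """Greedily translate shape_points (N x 2 array) along the best of
    eight directions; fitness scores a candidate shape (higher = better),
    alpha scales the step size (the movement-control parameter)."""
    points = np.asarray(shape_points, dtype=float)
    for _ in range(steps):
        scores = [fitness(points + alpha * np.array(d)) for d in DIRECTIONS]
        best = int(np.argmax(scores))
        if scores[best] <= fitness(points):  # no direction improves: stop
            break
        points = points + alpha * np.array(DIRECTIONS[best])
    return points
```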

    Fast and Accurate Algorithm for Eye Localization for Gaze Tracking in Low Resolution Images

    Iris centre localization in low-resolution visible-spectrum images is a challenging problem in the computer vision community due to noise, shadows, occlusions, pose variations, eye blinks, etc. This paper proposes an efficient method for determining the iris centre in low-resolution images in the visible spectrum, so that even low-cost consumer-grade webcams can be used for gaze tracking without any additional hardware. A two-stage algorithm is proposed for iris centre localization that uses the geometrical characteristics of the eye. In the first stage, a fast convolution-based approach is used to obtain the coarse location of the iris centre (IC). The IC location is then refined in the second stage using boundary tracing and ellipse fitting. The algorithm has been evaluated on public databases such as BioID and Gi4E and is found to outperform state-of-the-art methods. Comment: 12 pages, 10 figures, IET Computer Vision, 201
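    A minimal sketch of the two-stage idea, assuming an 8-bit grayscale eye patch and OpenCV; the circular template, Otsu thresholding and contour choices below are illustrative assumptions, not the authors' implementation.

```python
# Stage 1: convolve with a dark-circle template for a coarse iris centre.
# Stage 2: trace the iris boundary and fit an ellipse to refine it.
import cv2
import numpy as np

def coarse_iris_centre(eye_gray, radius=10):
    """Stage 1: correlate the inverted patch with a filled circular
    template; the iris is dark, so the peak response marks the centre."""
    template = np.zeros((2 * radius + 1, 2 * radius + 1), np.float32)
    cv2.circle(template, (radius, radius), radius, 1.0, -1)
    response = cv2.filter2D(255.0 - eye_gray.astype(np.float32), -1, template)
    _, _, _, max_loc = cv2.minMaxLoc(response)
    return max_loc  # (x, y)

def refine_iris_centre(eye_gray, coarse_xy):
    """Stage 2: threshold the dark iris region, trace the boundary of the
    largest contour and fit an ellipse; its centre is the refined IC."""
    _, mask = cv2.threshold(eye_gray, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return coarse_xy
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:  # cv2.fitEllipse needs at least 5 points
        return coarse_xy
    (cx, cy), _, _ = cv2.fitEllipse(largest)
    return (cx, cy)
```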

    A motion-based approach for audio-visual automatic speech recognition

    The research work presented in this thesis introduces novel approaches for both visual region-of-interest extraction and visual feature extraction for use in audio-visual automatic speech recognition. In particular, the speaker's movement that occurs during speech is used to isolate the mouth region in video sequences, and motion-based features obtained from this region are used to provide new visual features for audio-visual automatic speech recognition. The mouth region extraction approach proposed in this work is shown to give superior performance compared with existing colour-based lip segmentation methods. The new features are obtained from three separate representations of motion in the region of interest, namely the difference in luminance between successive images, block-matching-based motion vectors and optical flow. The new visual features are found to improve visual-only and audio-visual speech recognition performance when compared with the commonly used appearance-feature-based methods. In addition, a novel approach is proposed for visual feature extraction from either the discrete cosine transform or discrete wavelet transform representations of the mouth region of the speaker. In this work, the image transform is explored from a new viewpoint of data discrimination, in contrast to the more conventional data preservation viewpoint. The main findings of this work are that audio-visual automatic speech recognition systems using the new features, extracted from frequency bands selected according to their discriminatory abilities, generally outperform those using features designed for data preservation. To establish the noise robustness of the new features proposed in this work, their performance has been studied in the presence of a range of different types of noise and at various signal-to-noise ratios. In these experiments, the audio-visual automatic speech recognition systems based on the new approaches were found to give superior performance both to audio-visual systems using appearance-based features and to audio-only speech recognition systems.
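    A hedged sketch of two of the three motion representations named above (luminance difference and dense optical flow), using standard OpenCV calls; the parameter values are illustrative and the thesis' feature pipelines built on top of these maps are not reproduced.

```python
# Two of the motion representations: frame differencing and optical flow.
# Block-matching motion vectors would be computed analogously over
# fixed-size blocks of the mouth region.
import cv2

def motion_representations(prev_gray, curr_gray):
    """Return (luminance difference, dense optical flow) for two
    successive grayscale frames of the mouth region."""
    # 1. Difference in luminance between successive images.
    frame_diff = cv2.absdiff(curr_gray, prev_gray)
    # 2. Dense optical flow via the Farneback algorithm.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5,
                                        poly_sigma=1.2, flags=0)
    return frame_diff, flow
```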

    Expression recognition based on facial anatomy

    The geometric approaches to facial expression recognition commonly focus on the displacements of feature points selected by the researchers, or on the activation levels of the action units defined by the Facial Action Coding System (FACS). In both approaches the feature points are carefully located on the lips, the sides of the nose and the forehead, where an expression is observed at its full strength. Since these regions are under the influence of multiple muscles, distinct muscular activities can result in similar displacements of the feature points. Hence, the analysis of complex expressions through a limited set of specific feature points is quite difficult. In this project we propose to extract facial muscle activity levels by tracking multiple points distributed over the muscles' regions of influence. The proposed algorithm consists of six stages: (1) semi-automatic customization of the face model to a subject; (2) identification and tracking, across successive frames, of all feature points that reside in the region of influence of a muscle; (3) estimation of head orientation and alignment of the face model with the observed face; (4) estimation of the new coordinates of the model vertices from the displacements of the facial feature points; (5) solving the vertex displacements for muscle forces; and (6) classification of the facial expression using the resulting muscle force features. Our algorithm requires manual intervention only for the selection of feature points during model customization. We evaluated the representative power of these muscle-based features on classification problems of seven basic and subtle expressions. On the seven basic expressions we obtained 76% accuracy with an SVM classifier, a result close to human performance in facial expression recognition. On frames where the seven expressions are observed only subtly, our best accuracy, again with an SVM classifier, was 55%. This performance suggests that muscle-based features are also good candidates for detecting involuntary expressions, which are often subtle and instantaneous. Muscle forces are features that reflect the underlying physical reality of how facial expressions are produced. Reliable estimation of muscle activity will enable the detection of subtle expression changes and facilitate the classification of complex expressions. Moreover, since this approach is not limited to a predefined set of heuristic feature displacements chosen by researchers or experts, it may help reveal unknown relationships between emotions and facial expressions. TÜBİTAK
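    Stage (5), solving vertex displacements for muscle forces, can be illustrated as a least-squares problem, assuming each muscle's unit activation produces a known vertex-displacement pattern; the matrix `B` and vector `d` below are placeholders for the sketch, not the project's actual face model.

```python
# Hedged sketch: if column m of B holds the vertex displacements produced
# by unit activation of muscle m, the observed displacements d can be
# solved for muscle forces f in a least-squares sense.
import numpy as np

def solve_muscle_forces(B, d):
    """B: (3V x M) basis of per-muscle vertex displacements;
    d: (3V,) observed vertex displacements; returns M muscle forces,
    clipped to be non-negative since muscles only pull."""
    f, *_ = np.linalg.lstsq(B, d, rcond=None)
    return np.clip(f, 0.0, None)
```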

    An Efficient Boosted Classifier Tree-Based Feature Point Tracking System for Facial Expression Analysis

    The study of facial movement and expression has been a prominent area of research since the early work of Charles Darwin. The Facial Action Coding System (FACS), developed by Paul Ekman, introduced the first universal method of coding and measuring facial movement. Human-Computer Interaction seeks to make human interaction with computer systems more effective, easier, safer, and more seamless. Facial expression recognition can be broken down into three distinctive subsections: facial feature localization, facial action recognition, and facial expression classification. The first and most important stage in any facial expression analysis system is the localization of key facial features. Localization must be accurate and efficient to ensure reliable tracking, and it must leave time for computation and comparisons to learned facial models while maintaining real-time performance. Two possible methods for localizing facial features are discussed in this dissertation. The Active Appearance Model is a statistical model describing an object's parameters through the use of both shape and texture models, resulting in appearance. Statistical model-based training for object recognition takes multiple instances of the object class of interest, or positive samples, and multiple negative samples, i.e., images that do not contain objects of interest. Viola and Jones present a highly robust real-time face detection system: a statistically boosted attentional detection cascade composed of many weak feature detectors. A basic algorithm for the elimination of unnecessary sub-frames during Viola-Jones face detection is presented to further reduce image search time. A real-time emotion detection system is presented which is capable of identifying seven affective states (agreeing, concentrating, disagreeing, interested, thinking, unsure, and angry) from a near-infrared video stream. The Active Appearance Model is used to place 23 landmark points around key areas of the eyes, brows, and mouth. A prioritized binary decision tree then detects, based on the actions of these key points, whether one of the seven emotional states occurs as frames pass. The completed system runs accurately and achieves a real-time frame rate of approximately 36 frames per second. A novel facial feature localization technique utilizing a nested cascade classifier tree is proposed. A coarse-to-fine search is performed in which the regions of interest are defined by the responses of the Haar-like features comprising the cascade classifiers. The individual responses of the Haar-like features are also used to activate finer-level searches. A specially cropped training set derived from the Cohn-Kanade AU-coded database is also developed and tested. Extensions of this research include further testing to verify the novel facial feature localization technique presented for a full 26-point face model, and implementation of a real-time, intensity-sensitive, automated Facial Action Coding System.
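    A minimal sketch of Viola-Jones detection with OpenCV's pretrained cascade, plus a simple stand-in for the sub-frame elimination idea (searching only an expanded window around the previous detection); the margin and cascade parameters are assumptions, not the dissertation's exact algorithm.

```python
# Viola-Jones face detection, restricted to a region of interest around
# the previous detection to skip unnecessary sub-frames.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(gray, prev_box=None, margin=40):
    """Search only around prev_box = (x, y, w, h) when available;
    fall back to a full-frame search otherwise."""
    if prev_box is not None:
        x, y, w, h = prev_box
        x0, y0 = max(0, x - margin), max(0, y - margin)
        roi = gray[y0:y + h + margin, x0:x + w + margin]
        faces = cascade.detectMultiScale(roi, scaleFactor=1.1, minNeighbors=5)
        if len(faces) > 0:
            fx, fy, fw, fh = faces[0]
            return (x0 + fx, y0 + fy, fw, fh)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return tuple(faces[0]) if len(faces) > 0 else None
```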

    Evaluation and Hardware Realization for a Face Recognition System

    Facial recognition from an image or a video sequence attracts the attention of many image processing researchers owing to its myriad applications in the real world as well as in computer vision, human-computer interaction and intelligent systems. Facial structures have unique features which can be extracted using mathematical tools. We have used Principal Component Analysis (PCA) and Local Binary Patterns (LBP) to extract them and have stored them in a database. When a query image is given, its facial features are extracted and compared with the previously stored results using sparse face recognition. Detailed test methods have been defined and extensive testing of the algorithm has been performed on various standard databases. The results have been tabulated with the corresponding graphs. The proposed algorithm has been compared with several other algorithms and shows a significant improvement in results with a small number of training samples. Finally, the algorithm was integrated into a hardware system so that it can be used as a self-sufficient portable system.
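    A brief sketch of the two feature extractors named above, using scikit-learn and scikit-image; the sparse-recognition matching stage is omitted, and all parameter choices here are illustrative rather than the paper's.

```python
# PCA (eigenface) projections and a uniform-LBP histogram as two
# complementary face descriptors.
import numpy as np
from sklearn.decomposition import PCA
from skimage.feature import local_binary_pattern

def pca_features(face_vectors, n_components=50):
    """face_vectors: (num_faces x num_pixels) matrix of flattened,
    aligned face images; returns eigenface projections and the model."""
    pca = PCA(n_components=n_components)
    return pca.fit_transform(face_vectors), pca

def lbp_histogram(gray_face, points=8, radius=1):
    """Uniform LBP codes pooled into a normalised histogram
    (uniform LBP with P points yields P + 2 distinct codes)."""
    codes = local_binary_pattern(gray_face, points, radius, method="uniform")
    hist, _ = np.histogram(codes, bins=points + 2, range=(0, points + 2))
    return hist / max(hist.sum(), 1)
```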

    Timing is everything: A spatio-temporal approach to the analysis of facial actions

    This thesis presents a fully automatic facial expression analysis system based on the Facial Action Coding System (FACS). FACS is the best known and most commonly used system for describing facial activity in terms of facial muscle actions (i.e., action units, AUs). We present our research on the analysis of the morphological, spatio-temporal and behavioural aspects of facial expressions. In contrast with most other researchers in the field, who use appearance-based techniques, we use a geometric feature-based approach, and we argue that this approach is more suitable for analysing the temporal dynamics of facial expressions. Our system is capable of explicitly exploring the temporal aspects of facial expressions from an input colour video in terms of their onset (start), apex (peak) and offset (end). The fully automatic system presented here detects 20 facial points in the first frame and tracks them throughout the video. From the tracked points we compute geometry-based features which serve as the input to the remainder of our systems. The AU activation detection system uses GentleBoost feature selection and a Support Vector Machine (SVM) classifier to find which AUs were present in an expression. Temporal dynamics of active AUs are recognised by a hybrid GentleBoost-SVM-Hidden Markov Model classifier. The system is capable of analysing 23 out of the 27 existing AUs with high accuracy. The main contributions of the work presented in this thesis are the following: we have created a method for fully automatic AU analysis with state-of-the-art recognition results; we have proposed, for the first time, a method for recognition of the four temporal phases of an AU; we have built the largest comprehensive database of facial expressions to date; and we present, for the first time in the literature, two studies on the automatic distinction between posed and spontaneous expressions.
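    A toy sketch of how the temporal phases (neutral, onset, apex, offset) might be segmented from a per-frame AU intensity signal; the thesis uses a hybrid GentleBoost-SVM-HMM classifier for this, so the simple derivative-thresholding rule below is only an illustrative stand-in, with `eps` an assumed threshold.

```python
# Label each frame of an AU intensity signal with a temporal phase,
# based on the sign of the frame-to-frame change.
import numpy as np

def temporal_phases(intensity, eps=0.01):
    """intensity: sequence of per-frame AU intensities;
    returns one phase label per frame."""
    phases = []
    for t in range(len(intensity)):
        delta = intensity[t] - intensity[t - 1] if t > 0 else 0.0
        if intensity[t] < eps:
            phases.append("neutral")   # AU not active
        elif delta > eps:
            phases.append("onset")     # intensity rising
        elif delta < -eps:
            phases.append("offset")    # intensity falling
        else:
            phases.append("apex")      # active and roughly constant
    return phases
```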