44 research outputs found

    Evaluation of Data Augmentation Techniques for Facial Expression Recognition Systems

    Most Facial Expression Recognition (FER) systems rely on machine learning approaches that require large databases for effective training. As these are not easily available, a good solution is to augment the databases with appropriate data augmentation (DA) techniques, which are typically based on either geometric transformations or oversampling augmentations (e.g., generative adversarial networks (GANs)). However, it is not always easy to understand which DA technique is more convenient for FER systems, because most state-of-the-art experiments use different settings, which makes the impact of the DA techniques hard to compare. To advance in this respect, in this paper we evaluate and compare the impact of well-established DA techniques on the emotion recognition accuracy of a FER system based on the well-known VGG16 convolutional neural network (CNN). In particular, we consider both geometric transformations and a GAN to increase the amount of training images. We performed cross-database evaluations: training with the "augmented" KDEF database and testing with two different databases (CK+ and ExpW). The best results were obtained by combining horizontal reflection, translation, and GAN-generated images, bringing an accuracy increase of approximately 30%. This outperforms the alternative approaches, except for one technique, which, however, could rely on a considerably larger database.
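    The best-performing geometric transformations above (horizontal reflection and translation) are simple array operations. A minimal NumPy sketch, assuming greyscale images and zero-padding for the vacated pixels (the paper's exact padding strategy is not stated):

```python
import numpy as np

def horizontal_reflection(img):
    # Mirror the image left-to-right: a label-preserving transform for faces.
    return img[:, ::-1]

def translate(img, dx, dy):
    # Shift the image by (dx, dy) pixels, zero-padding the vacated area.
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    src_y = slice(max(-dy, 0), min(h - dy, h))
    src_x = slice(max(-dx, 0), min(w - dx, w))
    dst_y = slice(max(dy, 0), min(h + dy, h))
    dst_x = slice(max(dx, 0), min(w + dx, w))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

img = np.array([[1, 2],
                [3, 4]])
flipped = horizontal_reflection(img)   # [[2, 1], [4, 3]]
shifted = translate(img, dx=1, dy=0)   # [[0, 1], [0, 3]]
```

    In a real pipeline each training image would yield several augmented copies (flipped, shifted by a few random offsets, and both), multiplying the effective database size.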

    Data Augmentation and Transfer Learning Approaches Applied to Facial Expressions Recognition

    The facial expression is the first thing we pay attention to when we want to understand a person's state of mind. Thus, the ability to recognize facial expressions automatically is a very interesting research field. In this paper, because of the small size of the available training datasets, we propose a novel data augmentation technique that improves performance in the recognition task. We apply geometric transformations and build, from scratch, GAN models able to generate new synthetic images for each emotion type. Then, on the augmented datasets, we fine-tune pretrained convolutional neural networks with different architectures. To measure the generalization ability of the models, we apply an extra-database protocol: we train the models on the augmented versions of the training dataset and test them on two different databases. The combination of these techniques allows average accuracy values on the order of 85% to be reached for the InceptionResNetV2 model.

    Data Augmentation Techniques and Transfer Learning Approaches Applied to Facial Expressions Recognition Systems

    The facial expression is the first thing we pay attention to when we want to understand a person's state of mind. Thus, the ability to recognize facial expressions automatically is a very interesting research field. In this paper, because of the small size of the available training datasets, we propose a novel data augmentation technique that improves performance in the recognition task. We apply geometric transformations and build, from scratch, GAN models able to generate new synthetic images for each emotion type. Then, on the augmented datasets, we fine-tune pretrained convolutional neural networks with different architectures. To measure the generalization ability of the models, we apply an extra-database protocol: we train the models on the augmented versions of the training dataset and test them on two different databases. The combination of these techniques allows average accuracy values on the order of 85% to be reached for the InceptionResNetV2 model.
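    The extra-database protocol described above can be illustrated with a toy nearest-centroid classifier standing in for the fine-tuned CNN: the model is fit on one dataset and its accuracy is reported only on a second, disjoint one (all data here is synthetic and illustrative).

```python
import numpy as np

def fit_centroids(X, y):
    # One centroid per "emotion" class in feature space.
    labels = np.unique(y)
    return labels, np.stack([X[y == c].mean(axis=0) for c in labels])

def predict(labels, centroids, X):
    # Assign each sample to the label of its nearest centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return labels[d.argmin(axis=1)]

rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [4.0, 4.0]])  # two toy classes

# "Database A" is used for training only.
Xa = np.concatenate([rng.normal(m, 0.3, (20, 2)) for m in means])
ya = np.repeat([0, 1], 20)
# "Database B" is never seen during fitting; only its accuracy is reported.
Xb = np.concatenate([rng.normal(m, 0.3, (10, 2)) for m in means])
yb = np.repeat([0, 1], 10)

labels, cents = fit_centroids(Xa, ya)
acc = (predict(labels, cents, Xb) == yb).mean()
```

    The point of the protocol is that `Xb`/`yb` never influence fitting, so `acc` measures cross-database generalization rather than memorization.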

    Analysis of Differences in the Basic Facial Expressions of Westerners and East Asians for Automatic Facial Expression Recognition

    Facial Expression Recognition (FER) has been one of the main targets of the well-known Human-Computer Interaction (HCI) research field. Recent developments on this topic have attained high recognition rates under controlled and "in-the-wild" environments, overcoming some of the main problems of FER systems, such as illumination changes, individual differences, and partial occlusion. However, to the best of the author's knowledge, all of those proposals have taken the cultural universality of basic facial expressions of emotion for granted. This hypothesis has recently been questioned, and to some degree refuted, by part of the research community from the psychological viewpoint. In this dissertation, an analysis of the differences between Western-Caucasian (WSN) and East-Asian (ASN) prototypic facial expressions is presented in order to assess cultural universality from an HCI viewpoint. In addition, a fully automated FER system is proposed for this analysis. The system is based on hybrid features of specific facial regions (forehead, eyes-eyebrows, mouth, and nose), which are described by Fourier coefficients calculated individually from appearance and geometric features. The proposal takes advantage of the static structure of individual faces, with final classification by Support Vector Machines. The culture-specific analysis comprises automatic facial expression recognition and visual analysis of facial expression images from different standard databases divided into two cultural datasets. Additionally, a human study with 40 subjects from both ethnic groups is presented as a baseline. Evaluation results help identify culture-specific facial expression differences based on individual and combined facial regions. Finally, two possible solutions for handling these differences are proposed.
The first builds on early ethnicity detection, based on the extraction of colour, shape, and texture features representative of each culture. The second independently considers the culture-specific basic expressions in the final classification process. In summary, the main contributions of this dissertation are: 1) a qualitative and quantitative analysis of appearance and geometric feature differences between Western-Caucasian and East-Asian facial expressions; 2) a fully automated FER system based on facial region segmentation and hybrid features; 3) prior considerations for working with multicultural databases in FER; and 4) two possible solutions for FER in multicultural environments. This dissertation is organized as follows. Chapter 1 introduces the motivation, objectives, and contributions. Chapter 2 presents the background of FER in detail and reviews related work from the psychological viewpoint, along with the HCI proposals that work with multicultural databases for FER. Chapter 3 explains the proposed FER method based on facial region segmentation. The automatic segmentation focuses on four facial regions, and the proposal is capable of recognizing the six basic expressions using only one part of the face, which makes it useful for dealing with the problem of partial occlusion. Finally, a modal-value approach is proposed for unifying the different results obtained from facial regions of the same face image. Chapter 4 describes the proposed fully automated FER method based on Fourier coefficients of hybrid features. This method takes advantage of information extracted from pixel intensities (appearance features) and facial shapes (geometric features) of three different facial regions, and hence also overcomes the problem of partial occlusion.
The proposal combines Local Fourier Coefficients (LFC) and Facial Fourier Descriptors (FFD) of appearance and geometric information, respectively. In addition, the method takes into account the effect of the static structure of the faces by subtracting the neutral face from the expressive face at the feature-extraction level. Chapter 5 introduces the proposed analysis of differences between Western-Caucasian (WSN) and East-Asian (ASN) basic facial expressions; it comprises FER and visual analyses divided by appearance, geometric, and hybrid features. The FER analysis focuses on in-group and out-group performance as well as multicultural tests. The proposed human study, which shows cultural differences in perceiving the basic facial expressions, is also described in this chapter. Finally, the two possible solutions for working with multicultural environments are detailed, based on early ethnicity detection and on the previously found culture-specific expressions, respectively. Chapter 6 draws the conclusions and outlines future work. (The University of Electro-Communications, 201)
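    The geometric side of the hybrid features above relies on Fourier descriptors of facial shapes. As a minimal sketch of the general technique (not the dissertation's exact LFC/FFD formulation), a closed boundary can be encoded as a complex sequence and normalized for translation and scale via its DFT:

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=8):
    """Magnitude Fourier descriptors of a closed 2-D contour.

    contour: (N, 2) array of (x, y) boundary points, ordered along the shape.
    Dropping the DC term gives translation invariance; dividing by |F[1]|
    gives scale invariance; taking magnitudes discards the starting point
    and rotation phase.
    """
    z = contour[:, 0] + 1j * contour[:, 1]
    F = np.fft.fft(z)
    F[0] = 0.0                      # translation invariance
    F = F / np.abs(F[1])            # scale invariance
    return np.abs(F[1:n_coeffs + 1])

# A circle sampled at 64 points, then translated and scaled: the
# descriptors should be identical for both versions of the shape.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
moved = 3.0 * circle + np.array([10.0, -4.0])
```

    In an FER setting the contour would come from a landmarked facial region (e.g., the mouth outline), and the descriptor vector would feed the SVM classifier.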

    Analysis of facial expressions: experiments on multiple databases

    This master's thesis compares different face descriptors, using classification techniques to classify emotions in facial images of people of different ethnicities and ages, male and female. The comparison is between hand-crafted features, such as LBP and HOG, and more modern features extracted by pre-trained neural networks. The proposed methods were evaluated on different databases, using different image sizes and cropping and standardizing all the images. The experimental results showed that some of the hand-crafted features performed better than the pre-trained neural networks. To facilitate replication of our experiments, the MATLAB source code will be available at https://github.com/nagwlei/FaceEmotions
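    LBP, one of the hand-crafted descriptors compared above, encodes each pixel as an 8-bit code by thresholding its 3x3 neighbourhood against the centre. A minimal NumPy sketch of the basic operator (the thesis may use a different radius or a variant such as uniform LBP):

```python
import numpy as np

def lbp_image(img):
    """Basic 3x3 Local Binary Pattern; border pixels are skipped.

    Each neighbour >= centre contributes one bit to an 8-bit code,
    following a fixed clockwise sampling order.
    """
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(center.shape, dtype=np.int64)
    h, w = img.shape
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (nb >= center).astype(np.int64) << bit
    return code

def lbp_histogram(img):
    # The descriptor actually fed to a classifier is the histogram of codes.
    return np.bincount(lbp_image(img).ravel(), minlength=256)
```

    In practice the histogram is often computed per cell of a grid over the face and the cell histograms concatenated, so the descriptor keeps some spatial layout.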

    Implementation of a deep learning-based interest detection system based on facial analysis

    Full-text access has been granted pursuant to the law published in the Official Gazette No. 30352 of 06.03.2018 and the Directive of 18.06.2018 on the collection, organization, and open access of graduate theses in electronic form. In marketing research, one of the most exciting, innovative, and promising trends is the quantification of customer interest. The customer satisfaction survey, a traditional approach to quantifying customer interest, has come to be considered invasive in recent years. Another approach is for a salesperson to observe and record customer behaviour during the advertisement-watching or shopping phase; however, this requires specific skills, and since each observer may interpret customer behaviour differently, the results may not be objective. Consequently, there is a critical need for non-invasive, objective, and quantitative tools for monitoring customer interest. This study presents a deep learning-based system for monitoring customer behaviour, specifically for the detection of interest. The proposed system first measures customer attention through head pose estimation. For those customers whose heads are oriented toward the advertisement or the product of interest, the system further analyzes the facial expressions and reports the customers' interest.
The proposed system starts by detecting frontal face poses; the facial components important for facial expression recognition (mouth, eyes, and eyebrows) are then segmented and an iconized face image is generated; finally, facial expressions are analyzed using the confidence values of the iconized face image combined with the raw facial images. This two-stage approach fuses local part-based features with holistic facial information for robust facial expression recognition. The system also tracks human faces across video frames by labeling them. The facial expressions of each customer are stored for a certain period of time; at the end of this period, the system reports whether the customer is interested in the product or advertisement. With the proposed processing pipeline, head pose estimation and facial expression recognition are possible using a basic imaging device such as a webcam. The pipeline can also be used to monitor the emotional response of focus groups to various ideas, pictures, sounds, words, and other stimuli
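    The two-stage decision above (an attention gate via head pose, then expression-based interest) can be sketched as follows; the yaw threshold and the set of "interest" expressions are illustrative assumptions, not the thesis's exact values:

```python
def detect_interest(yaw_deg, expr_probs, yaw_thresh_deg=30.0):
    """Return True if the customer appears interested.

    yaw_deg: estimated head yaw in degrees (0 = facing the display).
    expr_probs: dict mapping expression label -> probability.
    The 30-degree gate and the chosen expressions are illustrative only.
    """
    # Stage 1: attention gate. Skip faces not oriented toward the display.
    if abs(yaw_deg) > yaw_thresh_deg:
        return False
    # Stage 2: expression-based interest (hypothetical label set).
    positive = expr_probs.get("happy", 0.0) + expr_probs.get("surprise", 0.0)
    return positive > 0.5
```

    Per the abstract, this decision would be accumulated over a time window per tracked face before the final interested/not-interested report is emitted.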

    Estimation of the QoE for video streaming services based on facial expressions and gaze direction

    As multimedia technologies evolve, the need to control their quality becomes ever more important, making Quality of Experience (QoE) measurement a key priority. Machine Learning (ML) can support this task by providing models that analyse the information extracted from the multimedia. ML model applications can be divided into the following categories: 1) QoE modelling: ML is used to define QoE models which provide an output (e.g., a perceived QoE score) for any given input (e.g., a QoE influence factor). 2) QoE monitoring in the case of encrypted traffic: ML is used to analyse passively monitored traffic data to gain insight into the degradations perceived by end users. 3) Big data analytics: ML is used to extract meaningful and useful information from the collected data, which can then be converted into actionable knowledge and used to manage QoE. The QoE estimation task can be carried out using two approaches: an objective one and a subjective one. As the names suggest, they refer to the kind of information the model analyses. The objective approach analyses objective features extracted from the network connection and from the media used; among objective parameters, the state of the art also includes approaches that use features extracted from human behaviour. The subjective approach, instead, derives from the rating approach, in which participants are asked to rate the perceived quality on different scales. Rating is time-consuming, and for this reason not all users agree to complete the questionnaire; the natural evolution of this approach is therefore the adoption of an ML model, which can substitute for the questionnaire and estimate QoE from the data it analyses.
In modelling the human response to perceived multimedia quality, QoE researchers have found that various signals can be extracted from users, such as the electroencephalogram (EEG) and the electrocardiogram (ECG). The main problem with these techniques is the hardware: the user must wear electrodes for ECG and EEG, so even if the results obtained with these methods are relevant, their use in a real context may not be feasible. For this reason, my studies have focused on developing a completely unobtrusive Machine Learning framework based on facial reactions.
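    A model that substitutes for the questionnaire, as above, is essentially a learned mapping from observed features to a quality score. A toy illustration with synthetic data and ordinary least squares (the feature names and the linear model are assumptions for illustration, not the framework's actual design):

```python
import numpy as np

# Hypothetical per-session facial-reaction features, e.g. smile rate,
# frown rate, gaze-away ratio (all synthetic here).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.5, -2.0, -0.5])
y = X @ true_w + 3.0                      # synthetic MOS-like target

# Fit a linear QoE predictor with an intercept via ordinary least squares.
A = np.hstack([X, np.ones((50, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ w                              # predicted QoE scores
```

    With noiseless synthetic data the fit recovers the generating weights exactly; with real facial features one would of course evaluate against held-out subjective ratings.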

    Facial expression recognition on partial and full face images with machine learning methods

    Facial expressions are an important part of interpersonal communication and also play an important role in human-machine interaction. Facial expression recognition supports decision-making in areas such as criminal detection, driver attention monitoring, and patient monitoring. Therefore, automatic facial expression recognition is a popular machine learning research area. In this thesis, facial expression classification studies are performed; they can be grouped under two topics: analysis of partial face images with classical machine learning methods, and analysis of whole face images with deep learning methods. In the first application, unlike previous studies in the literature, facial expressions were classified using only the eye and eyebrow regions, and a high success rate was achieved. With this approach, the proposed system is robust to lower-face occlusions and to mouth motion during speech, and, through the selection of robust features, it can run with fewer features on resource-constrained devices. Furthermore, comparative experiments showed that the generalization ability of the proposed system is high. The remaining facial expression recognition applications in the thesis were developed with whole face images using deep learning techniques. One of the proposed methods is segmentation of facial parts with a CNN. The resulting segmented images preserve the features relevant to facial expression while containing no personal data, so personal privacy is also protected.
Moreover, the classification success rate was increased by combining the original raw image with the segmented image, because the eyes, eyebrows, and mouth are crucial for facial expression recognition and the segmented images contain exactly these areas. The proposed CNN architecture for classification thus forces the earlier layers of the network to learn to detect and localize the facial regions, providing decoupled and guided training
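    Combining the raw image with the segmented image, as described above, amounts to giving the network both views of the face. One plausible minimal sketch is channel-wise stacking (an illustrative fusion; the thesis may combine the two inputs at a different stage of the network):

```python
import numpy as np

def fuse_inputs(raw, segmented):
    """Stack a raw face crop and its segmentation map channel-wise.

    raw: (H, W, 3) RGB image; segmented: (H, W, 1) region-label map
    (e.g., eyes, eyebrows, mouth, background). The fused (H, W, 4)
    tensor lets the early CNN layers attend to the expression-relevant
    regions alongside the raw pixels.
    """
    return np.concatenate([raw, segmented], axis=-1)

fused = fuse_inputs(np.zeros((64, 64, 3)), np.ones((64, 64, 1)))
```

    The fused tensor would then be the input to the classification CNN in place of the plain 3-channel image.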