    Automatic recognition of fingerspelled words in British Sign Language

    We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task, since the BSL alphabet involves both hands occluding each other and contains signs that are ambiguous from the observer's viewpoint. The main contributions of our work are: (i) recognition based on hand shape alone, without motion cues; (ii) robust visual features for hand-shape recognition; (iii) scalability to large-lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
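
    A minimal sketch of how large-lexicon recognition without re-training can work, assuming a per-frame hand-shape classifier already outputs letter posteriors; the function names and the simple Viterbi alignment below are illustrative, not the authors' implementation:

        import numpy as np

        def word_score(posteriors, word, letters="abcdefghijklmnopqrstuvwxyz"):
            """Best monotonic alignment of a word against frame-wise letter
            posteriors (a left-to-right Viterbi pass with no skips).
            posteriors: (n_frames, 26) array of per-frame letter probabilities."""
            idx = [letters.index(c) for c in word]
            logp = np.log(posteriors + 1e-12)
            dp = np.full(len(idx), -np.inf)   # dp[j]: best log-prob ending at letter j
            dp[0] = logp[0, idx[0]]
            for t in range(1, posteriors.shape[0]):
                stay = dp + logp[t, idx]                                       # repeat letter
                advance = np.concatenate(([-np.inf], dp[:-1])) + logp[t, idx]  # next letter
                dp = np.maximum(stay, advance)
            return dp[-1]

        def recognize(posteriors, lexicon):
            # Growing the lexicon requires no re-training of the hand-shape classifier.
            return max(lexicon, key=lambda w: word_score(posteriors, w))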

    Facial Emotional Classifier For Natural Interaction

    The recognition of emotional information is a key step toward giving computers the ability to interact more naturally and intelligently with people. We present a simple and computationally feasible method for automatic emotional classification of facial expressions. We propose the use of a set of characteristic facial points (a subset of the MPEG-4 feature points) to extract relevant emotional information: essentially five distances, the presence of wrinkles in the eyebrow region, and the mouth shape. The method defines and detects the six basic emotions (plus neutral) in terms of this information and has been fine-tuned on a database of more than 1,500 images. The system has been integrated into a 3D engine for managing virtual characters, allowing the exploration of new forms of natural interaction.
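
    As a concrete illustration of such distance-based features, here is a minimal sketch assuming 68-point dlib-style landmarks rather than the exact MPEG-4 feature points used in the paper; the five distances are illustrative stand-ins for those the abstract describes:

        import numpy as np

        def emotion_features(pts):
            """pts: (68, 2) array of facial landmark coordinates."""
            d = np.linalg.norm
            face_w = d(pts[16] - pts[0])      # normalize by face width
            return np.array([
                d(pts[37] - pts[41]),         # left-eye opening
                d(pts[19] - pts[37]),         # eyebrow-to-eye distance
                d(pts[51] - pts[57]),         # mouth opening
                d(pts[48] - pts[54]),         # mouth width
                d(pts[21] - pts[22]),         # inner-eyebrow separation
            ]) / face_w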

    Development of human-robot interaction based on multimodal emotion recognition

    The electronic version of this thesis does not include the publications. Automatic multimodal emotion recognition is a fundamental subject of interest in affective computing, with its main applications in human-computer interaction. Systems developed for this purpose combine different modalities based on vocal and visual cues. This thesis takes both modalities into account in order to develop an automatic multimodal emotion recognition system, exploiting information extracted from speech and face signals. From speech signals, Mel-frequency cepstral coefficients, filter-bank energies and prosodic features are extracted. Two different strategies are considered for analyzing the facial data. First, geometric relations between facial landmarks, i.e. distances and angles, are computed. Second, each emotional video is summarized into a reduced set of key-frames, and a convolutional neural network is trained on these key-frames to visually discriminate between the emotions. The output confidence values of all the classifiers from both modalities (one acoustic, two visual) are then used to define a new feature space, on which a final classifier is trained to predict the emotion label, in a late fusion. Experiments are conducted on the SAVEE, Polish, Serbian, eNTERFACE'05 and RML datasets. The results show significant performance improvements over existing alternatives, defining the current state of the art on all the datasets. Additionally, we review the emotional body gesture recognition systems proposed in the literature, in order to help identify future research directions: incorporating data representing gestures, another major component of the visual modality, could yield a still more effective framework.
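
    A minimal late-fusion sketch following the abstract's description: the confidence vectors of one acoustic and two visual classifiers are concatenated into a new feature space on which a final classifier is trained. The classifier choice below is an assumption, not the thesis implementation:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def train_fusion(p_audio, p_geom, p_cnn, labels):
            """Each p_* is an (n_samples, n_emotions) matrix of confidences."""
            X = np.hstack([p_audio, p_geom, p_cnn])   # new feature space
            # final-stage classifier learned on the stacked confidences
            return LogisticRegression(max_iter=1000).fit(X, labels)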

    Biometric fusion methods for adaptive face recognition in computer vision

    PhD thesis. Face recognition is a biometric method that uses different techniques to identify individuals based on facial information extracted from digital image data. Face recognition systems are widely used for security purposes but present challenging problems, and this study proposes solutions to some of the most important of them. The aim of this thesis is to investigate face recognition across pose based on the image parameters of camera calibration. Three novel methods are derived to address the challenges of face recognition, inferring the camera parameters from images using a geometric approach based on perspective projection. Two techniques, the camera measurement technique (CMT) and Face Quadtree Decomposition (FQD), are combined to develop the Face Camera Measurement Technique (FCMT) for human facial recognition, together with a feature-extraction and identity-matching algorithm. The success and efficacy of the proposed algorithm are analysed in terms of robustness to noise, accuracy of distance measurement, and face recognition. To estimate the intrinsic and extrinsic camera calibration parameters, a novel technique based on perspective projection uses different geometrical shapes to calibrate the camera. CMT enables the system to infer the real distance to regular and irregular objects from 2-D images, and its output feeds into FQD to measure the distances between facial points. Quadtree decomposition enhances the representation of edges and other singularities along curves of the face, and thus improves directional features for face detection across pose. The proposed FCMT system combines CMT and FQD to recognise faces in various poses. The theoretical foundation of the proposed solutions is developed and discussed in detail. The results show that the proposed algorithms outperform existing algorithms in face recognition, with a 2.5% improvement in mean recognition error rate compared with recent studies.
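
    The distance-inference idea behind CMT rests on the pinhole projection relation; a minimal sketch under that textbook model (not the thesis's full calibration procedure):

        def distance_from_projection(focal_px, real_size_m, pixel_size_px):
            """Pinhole model: pixel_size = focal * real_size / distance,
            so distance = focal * real_size / pixel_size."""
            return focal_px * real_size_m / pixel_size_px

        # e.g. a 0.20 m calibration square imaged 80 px wide by a camera with an
        # 800 px focal length lies at distance_from_projection(800, 0.20, 80) = 2.0 m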

    Emotion and Stress Recognition Related Sensors and Machine Learning Technologies

    This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality assurance, and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts; and (x) emotion and stress estimation and forecasting from a nonlinear dynamical systems perspective.

    A deep learning approach to monitoring workers stress at office

    Identifying stress in people is not a trivial or straightforward task, as several factors are involved in detecting its presence or absence. Since there are few tools on the market that companies can use, new models have been created and developed to detect stress. In this study, we propose a stress detection application that uses deep learning models to analyze images obtained in the workplace, providing the resulting information to the company for use in occupational health management. The proposed solution uses deep learning algorithms to create prediction models and analyze images. The new non-invasive application is designed to help detect stress and to educate people in managing their health. The trained model achieved an F1 score of 79.9% on a binary stress/non-stress dataset with an imbalance ratio of 0.49.
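
    A small sketch of the reported evaluation, i.e. the F1 score on an imbalanced binary stress/non-stress dataset; the labels below are placeholders, not the study's data:

        from sklearn.metrics import f1_score

        y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # 1 = stress, 0 = non-stress
        y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
        print(f1_score(y_true, y_pred))            # harmonic mean of precision and recall

        # imbalance ratio = minority-class count / majority-class count
        ratio = min(y_true.count(0), y_true.count(1)) / max(y_true.count(0), y_true.count(1))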

    Classification and Characterization of Bodily Expression of Emotions in Daily Actions

    The work conducted in this thesis can be summarized in four main steps. First, we proposed a multi-level body movement notation system that allows the description of expressive body movement across various body actions. Second, we collected a new database of emotional body expression in daily actions. This database constitutes a large repository of bodily expression of emotions, covering the expression of 8 emotions in 7 actions, combining video and motion-capture recordings, and resulting in more than 8,000 sequences of expressive behavior. Third, we explored the classification of emotions based on our multi-level body movement notation system, using a Random Forest approach. The advantage of Random Forests in our work is twofold: 1) the reliability of the classification model, and 2) the possibility of selecting a subset of relevant features based on their relevance measures. We also compared the automatic classification of emotions with human perception of emotions expressed in different actions. Finally, we extracted the most relevant features capturing the expressive content of the motion, based on the feature relevance measures returned by the Random Forest model, and used this subset of features to explore the characterization of emotional body expression across different actions with a Decision Tree model.
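
    A minimal sketch of the two-stage analysis described above: a Random Forest both classifies the emotions and ranks the movement features by its relevance measure, and the top-ranked subset then feeds the Decision Tree used for characterization. Parameter values here are assumptions:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.tree import DecisionTreeClassifier

        def classify_and_characterize(X, y, k=10):
            rf = RandomForestClassifier(n_estimators=500).fit(X, y)
            top_k = np.argsort(rf.feature_importances_)[::-1][:k]   # most relevant features
            tree = DecisionTreeClassifier(max_depth=4).fit(X[:, top_k], y)
            return rf, top_k, tree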

    Robust Modeling of Epistemic Mental States and Their Applications in Assistive Technology

    This dissertation presents the design and implementation of EmoAssist: Emotion-Enabled Assistive Tool to Enhance Dyadic Conversation for the Blind. The key functionalities of the system are to recognize behavioral expressions, to predict 3-D affective dimensions from visual cues, and to provide audio feedback to the visually impaired in a natural environment. Prior to describing EmoAssist, this dissertation identifies and advances research challenges in the analysis of facial features and their temporal dynamics with epistemic mental states in dyadic conversation. A number of statistical analyses and simulations were performed to answer important research questions about the complex interplay between facial features and mental states; non-linear relations were found to be far more prevalent than linear ones. Based on this analysis, a portable assistive-technology prototype was designed to help a blind individual understand his or her interlocutor's mental states. A number of challenges related to the system, communication protocols, error-free face tracking, and robust modeling of behavioral expressions and affective dimensions were addressed to make EmoAssist effective in real-world scenarios. In addition, orientation-sensor information from the phone was used to correct image alignment, improving robustness in real-life deployment. EmoAssist predicts affective dimensions with acceptable accuracy in natural conversation (maximum correlation coefficients of 0.76 for valence, 0.78 for arousal, and 0.76 for dominance). The overall minimum and maximum response times are 64.61 ms and 128.22 ms, respectively. Integrating sensor information to correct orientation improved the accuracy of recognizing behavioral expressions by 16% on average. A user study with ten blind people shows that EmoAssist is highly acceptable to them in social interaction (average rating of 6.0 on a 7-point Likert scale, where 1 and 7 are the lowest and highest possible ratings).
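
    A small sketch of the reported evaluation metric, assuming Pearson correlation between predicted and annotated values for each affective dimension (valence, arousal, dominance); the variable names are placeholders:

        import numpy as np

        def dimension_correlation(pred, truth):
            """Pearson correlation coefficient between two 1-D arrays."""
            return np.corrcoef(pred, truth)[0, 1]

        # e.g. dimension_correlation(valence_pred, valence_truth) would yield
        # a value like the reported 0.76 for valence on matched data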

    Implementation of Artificial Intelligence in Food Science, Food Quality, and Consumer Preference Assessment

    In recent years, new and emerging digital technologies applied to food science have been gaining attention and increased interest from researchers and the food and beverage industries, particularly digital technologies that can be used throughout the food value chain and that are accurate, easy to implement, affordable, and user-friendly. Hence, this Special Issue (SI) is dedicated to novel sensor technologies and machine/deep learning modeling strategies for implementing artificial intelligence (AI) in food and beverage production and in consumer assessment. This SI published quality papers from researchers in Australia, New Zealand, the United States, Spain, and Mexico, covering food and beverage products such as grapes and wine, chocolate, honey, whiskey, and avocado pulp, among a variety of other food products.

    Affective Brain-Computer Interfaces
