    Toward Accountable and Explainable Artificial Intelligence Part Two: The Framework Implementation

    This paper builds upon the theoretical foundations of the Accountable eXplainable Artificial Intelligence (AXAI) capability framework presented in part one of this paper. We demonstrate incorporation of the AXAI capability in the real-time Affective State Assessment Module (ASAM) of a robotic system. We show that adhering to eXtreme Programming (XP) practices helps in understanding user behavior and in systematically incorporating the AXAI capability in AI systems. We further show that a collaborative software design and development process (SDDP) facilitates identification of ethical, technical, functional, and domain-specific system requirements. Meeting these requirements would increase user confidence in AI systems. Our results show that the ASAM can synthesize discrete and continuous models of affective state expressions and classify them in real time. The ASAM continuously shares important inputs, processed data, and output information with users via a graphical user interface (GUI). The GUI thus provides the reasons behind system decisions and disseminates information about local reasoning, data handling, and decision-making. Through this demonstrated work, we expect to enhance AI systems' acceptability and utility, and to establish a chain of responsibility if a system fails. We hope this work will initiate further investigations into developing the AXAI capability and into using a suitable SDDP to incorporate it in AI systems.
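
    The paper's interface itself is not reproduced here, but the idea of continuously disseminating inputs, processed data, and decisions to the user is easy to illustrate. Below is a minimal Python sketch of the kind of decision record such a GUI could publish; all names (`ExplanationRecord`, `publish_to_gui`) and fields are hypothetical, not taken from the ASAM implementation.

```python
from dataclasses import dataclass, field
import time

@dataclass
class ExplanationRecord:
    """One GUI-facing record of a single decision (hypothetical schema)."""
    inputs: dict            # raw sensor readings shown to the user
    processed: dict         # intermediate features the system derived
    decision: str           # predicted affective state label
    confidence: float       # classifier confidence in [0, 1]
    reasons: list = field(default_factory=list)  # human-readable rationale
    timestamp: float = field(default_factory=time.time)

def publish_to_gui(record: ExplanationRecord) -> None:
    """Stand-in for the GUI channel; a real system would render widgets."""
    print(f"[{record.timestamp:.0f}] {record.decision} "
          f"({record.confidence:.0%}) because: {'; '.join(record.reasons)}")

# Example: share one classification step with the user.
publish_to_gui(ExplanationRecord(
    inputs={"face_frame": "cam0/0412.png"},
    processed={"valence": 0.61, "arousal": 0.34},
    decision="content",
    confidence=0.82,
    reasons=["valence above neutral band", "low arousal"],
))
```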

    Multimodal Approach to Emotion Recognition for Enhancing Human Machine Interaction - A Survey

    Emotions are defined as mental states that occur instinctively rather than through voluntary effort. They are strong feelings triggered by experiences such as joy, hate, fear, and love, and are accompanied by physiological changes. Emotions play a vital role in social interactions and facilitate decision making and perception in human beings. They are conveyed through speech, facial expressions, or physiological signals. Six emotions are commonly treated as universal: anger, happiness, sadness, disgust, surprise, and fear. This paper surveys different emotion recognition systems that aim at enhancing human-machine interaction. The techniques and systems used in emotion detection vary depending on the features inspected; this paper explores them in a descriptive and comparative manner. Further, the various applications that adopt these systems to reduce the difficulties of implementing the models in real time are considered. Finally, a multimodal system combining both speech and facial features is proposed for emotion recognition, through which it is possible to obtain enhanced accuracy compared with existing systems.
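
    The survey proposes, but does not implement, a combined speech-and-face system. As a rough illustration of one common approach, decision-level (late) fusion, the sketch below averages the per-class probabilities of two hypothetical unimodal classifiers over the six universal emotions; the weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

EMOTIONS = ["anger", "happiness", "sadness", "disgust", "surprise", "fear"]

def fuse_predictions(p_speech: np.ndarray, p_face: np.ndarray,
                     w_speech: float = 0.4) -> str:
    """Decision-level fusion: weighted average of per-class probabilities."""
    fused = w_speech * p_speech + (1.0 - w_speech) * p_face
    return EMOTIONS[int(np.argmax(fused))]

# Example: the face model is confident in "happiness", speech leans "surprise".
p_speech = np.array([0.05, 0.20, 0.05, 0.05, 0.55, 0.10])
p_face   = np.array([0.02, 0.70, 0.03, 0.05, 0.15, 0.05])
print(fuse_predictions(p_speech, p_face))  # -> "happiness"
```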

    Developing Human-Robot Interaction Based on Multimodal Emotion Recognition

    The electronic version of this dissertation does not include the publications. Automatic multimodal emotion recognition is a fundamental subject of interest in affective computing, with its main applications in human-computer interaction. Systems developed for this purpose combine different modalities based on vocal and visual cues. This thesis draws on both modalities to develop an automatic multimodal emotion recognition system, taking advantage of information extracted from speech and face signals. From speech signals, Mel-frequency cepstral coefficients, filter-bank energies, and prosodic features are extracted. Two different strategies are used to analyze the facial data. First, geometric relations between facial landmarks, i.e. distances and angles, are computed. Second, each emotional video is summarized into a reduced set of key-frames, and a convolutional neural network is applied to these key-frames to visually discriminate between the emotions. The output confidence values of all the classifiers from both modalities (one acoustic, two visual) are then used to define a new feature space, and these values are learned for the final emotion label prediction in a late fusion. Experiments are conducted on the SAVEE, Polish, Serbian, eNTERFACE'05, and RML datasets. The results show significant performance improvements by the proposed system over the existing alternatives, defining the current state-of-the-art on all the datasets. Additionally, we provide a review of the emotional body gesture recognition systems proposed in the literature, with the aim of identifying future research directions for enhancing the performance of the proposed system. Specifically, we suggest that incorporating data representing gestures, which constitute another major component of the visual modality, could result in a more effective framework.
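
    The fusion stage described above, in which the confidence outputs of one acoustic and two visual classifiers define a new feature space for a final learner, can be sketched as stacking. The snippet below uses random placeholder confidences and an assumed logistic-regression meta-classifier; the thesis abstract does not name the final-stage model, so that choice is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_train, n_classes = 200, 6  # six emotion classes; sizes are illustrative

# Confidence outputs of the three base classifiers (1 acoustic, 2 visual),
# concatenated into an 18-dimensional meta-feature vector per sample.
# Here they are random stand-ins for real classifier outputs.
acoustic = rng.dirichlet(np.ones(n_classes), n_train)
geo_face = rng.dirichlet(np.ones(n_classes), n_train)
cnn_face = rng.dirichlet(np.ones(n_classes), n_train)
X_meta = np.hstack([acoustic, geo_face, cnn_face])
y = rng.integers(0, n_classes, n_train)  # ground-truth emotion labels

# The final-stage learner maps the fused confidence space to emotion labels.
meta_clf = LogisticRegression(max_iter=1000).fit(X_meta, y)
print(meta_clf.predict(X_meta[:3]))
```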

    Multimodal Emotion Recognition among Couples from Lab Settings to Daily Life using Smartwatches

    Couples generally manage chronic diseases together, and the management takes an emotional toll on both patients and their romantic partners. Consequently, recognizing the emotions of each partner in daily life could provide insight into their emotional well-being in chronic disease management. The emotions of partners are currently inferred in the lab and in daily life using self-reports, which are not practical for continuous emotion assessment, or observer reports, which are manual, time-intensive, and costly. Currently, there exists no comprehensive overview of works on emotion recognition among couples. Furthermore, approaches for emotion recognition among couples have (1) focused on English-speaking couples in the U.S., (2) used data collected from the lab, and (3) performed recognition using observer ratings rather than partners' self-reported / subjective emotions. In the body of work contained in this thesis (8 papers: 5 published and 3 currently under review in various journals), we fill the current literature gap on couples' emotion recognition, develop emotion recognition systems using 161 hours of data from a total of 1,051 individuals, and make contributions toward taking couples' emotion recognition from the lab, which is the status quo, to daily life. This thesis contributes toward building automated emotion recognition systems that would eventually enable partners to monitor their emotions in daily life and enable the delivery of interventions to improve their emotional well-being.
    Comment: PhD Thesis, 2022 - ETH Zurich

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems, and in other speech processing applications able to operate in real-world environments, such as mobile communication services and smart homes.
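
    As a concrete illustration of the speech-feature extraction covered in the first part, the sketch below computes a classic 39-dimensional MFCC-plus-deltas front end with librosa; the file path and frame parameters are illustrative choices, not taken from the book.

```python
import librosa
import numpy as np

# Load an utterance at a typical ASR sampling rate (path is a placeholder).
y, sr = librosa.load("utterance.wav", sr=16000)

# 13 Mel-frequency cepstral coefficients per ~25 ms frame with a 10 ms hop,
# plus first- and second-order deltas: a common 39-dim ASR front end.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=400, hop_length=160)
features = np.vstack([mfcc,
                      librosa.feature.delta(mfcc),
                      librosa.feature.delta(mfcc, order=2)])
print(features.shape)  # (39, n_frames)
```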