Analysis of Information and Mathematical Support for the Recognition of Human Affective States
This article presents an analytical survey of research in the field of affective computing. This area, a branch of artificial intelligence, studies methods, algorithms, and systems for analyzing the affective states of a person interacting with other people, computer systems, or robots. In data mining, affect is understood as the manifestation of psychological reactions to a triggering event; these reactions may unfold over a short or a long period and vary in the intensity of the experience. Affects in this field are divided into four types: affective emotions, basic emotions, moods, and affective disorders. Affective states manifest themselves in verbal data and in nonverbal characteristics of behavior: the acoustic and linguistic features of speech, facial expressions, gestures, and body postures. The survey offers a comparative analysis of the existing information resources for the automatic recognition of human affective states, using emotions, sentiment, aggression, and depression as examples. The few existing Russian-language affective databases are still substantially inferior in volume and quality to electronic resources in other world languages. This makes it necessary to consider a wide range of additional approaches, methods, and algorithms applicable under conditions of limited training and test data, and it poses the problem of developing new approaches to data augmentation, transfer learning of models, and adaptation of foreign-language resources. The article describes methods for analyzing unimodal visual, acoustic, and linguistic information, as well as multimodal approaches to the recognition of affective states. A multimodal approach to the automatic analysis of affective states improves recognition accuracy relative to unimodal solutions. The survey also notes a trend in current research: neural network methods are gradually displacing classical deterministic methods thanks to better recognition quality and efficient processing of large volumes of data. An advantage of multi-task hierarchical approaches is the ability to extract new types of knowledge, including knowledge about the influence, correlation, and interaction of several affective states with one another, which can potentially improve recognition quality. Finally, potential requirements for affective state analysis systems under development and the main directions of further research are presented.
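As a concrete illustration of the multi-task hierarchical approaches mentioned above, the following minimal PyTorch sketch shows a shared encoder with one classification head per affective state. All dimensions, task names, and class counts are illustrative assumptions, not details taken from any system described in the survey.

```python
# Minimal sketch of a multi-task affective-state classifier: one shared
# encoder, one head per affect type. Dimensions and tasks are assumptions.
import torch
import torch.nn as nn

class MultiTaskAffectModel(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=64,
                 n_emotions=7, n_sentiment=3, n_aggression=2):
        super().__init__()
        # Shared representation learned jointly across all affect tasks
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Separate heads predict each affective state from the shared code
        self.emotion_head = nn.Linear(hidden_dim, n_emotions)
        self.sentiment_head = nn.Linear(hidden_dim, n_sentiment)
        self.aggression_head = nn.Linear(hidden_dim, n_aggression)

    def forward(self, x):
        z = self.encoder(x)
        return {
            "emotion": self.emotion_head(z),
            "sentiment": self.sentiment_head(z),
            "aggression": self.aggression_head(z),
        }

model = MultiTaskAffectModel()
features = torch.randn(4, 128)   # batch of 4 utterance-level feature vectors
logits = model(features)         # dict of per-task logits
# Training would minimize a weighted sum of per-task cross-entropy losses
```

Training such a model against a joint loss forces the shared encoder to capture correlations between affective states, which is the kind of cross-affect knowledge the survey refers to.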
A Neural Network Architecture for Children's Audio-Visual Emotion Recognition
Detecting and understanding emotions are critical for our daily activities. As emotion recognition (ER) systems develop, we start looking at more difficult cases than just acted adult audio-visual speech. In this work, we investigate the automatic classification of the audio-visual emotional speech of children, which presents several challenges, including the lack of publicly available annotated datasets and the low performance of state-of-the-art audio-visual ER systems. In this paper, we present a new corpus of children's audio-visual emotional speech that we collected. We then propose a neural network solution that improves the utilization of the temporal relationships between the audio and video modalities in cross-modal fusion for children's audio-visual emotion recognition. We select a state-of-the-art neural network architecture as a baseline and present several modifications focused on deeper learning of the cross-modal temporal relationships using attention. In experiments with our proposed approach and the selected baseline model, we observe a relative improvement in performance of 2%. Finally, we conclude that focusing more on the cross-modal temporal relationships may be beneficial for building ER systems for child-machine communication and for environments where qualified professionals work with children.
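The following is a minimal sketch of the kind of attention-based cross-modal fusion described above, where frames of one modality attend over the other. The layer sizes, sequence lengths, and the audio-queries-video direction are assumptions for illustration, not the paper's actual architecture.

```python
# Illustrative cross-modal attention between audio and video sequences.
# Shapes and the fusion direction are assumptions, not the paper's design.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, dim=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, audio_seq, video_seq):
        # Query: audio frames; key/value: video frames.
        # Each audio frame is enriched with temporally relevant visual context.
        fused, _ = self.attn(audio_seq, video_seq, video_seq)
        return self.norm(audio_seq + fused)  # residual connection

audio = torch.randn(2, 100, 256)  # (batch, audio frames, feature dim)
video = torch.randn(2, 25, 256)   # (batch, video frames, feature dim)
out = CrossModalAttention()(audio, video)  # shape: (2, 100, 256)
```

A symmetric block with video as the query could be stacked alongside, so that each modality learns which moments of the other one matter.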
Automatic Speech Emotion Recognition of Younger School Age Children
This paper provides an extended description of a database of emotional speech in the Russian language from younger school age (8–12-year-old) children and reports the results of validating the database with classical machine learning algorithms, such as Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP). The validation follows standard procedures and scenarios similar to those used for other well-known databases of children's acted emotional speech. Performance evaluation of automatic multiclass recognition on the four emotion classes (Neutral/Calm, Joy, Sadness, Anger) shows that both SVM and MLP outperform the results of perceptual tests. Moreover, the results of automatic recognition on the test dataset that was used in the perceptual test are even better. These results prove that the emotions in the database can be reliably recognized both by experts and automatically, using classical machine learning algorithms such as SVM and MLP, which can serve as baselines for comparing emotion recognition systems based on more sophisticated modern machine learning methods and deep neural networks. The results also confirm that this database can be a valuable resource for researchers studying affective reactions in speech communication during child-computer interaction in the Russian language and can be used to develop various applications in edutainment, health care, and other areas.
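Below is a hedged sketch of the classical validation setup described above, with SVM and MLP classifiers over four emotion classes. The synthetic data and the 88-dimensional feature vectors (an eGeMAPS-style size) stand in for the real acoustic features and corpus, which are not reproduced here.

```python
# Sketch of an SVM/MLP baseline for 4-class emotion recognition on
# utterance-level acoustic features. Data here is synthetic placeholder.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 88))      # placeholder acoustic feature vectors
y = rng.integers(0, 4, size=400)    # 0=Neutral(Calm), 1=Joy, 2=Sadness, 3=Anger

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for name, clf in [
    ("SVM", make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))),
    ("MLP", make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(64,),
                                        max_iter=500, random_state=0))),
]:
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", clf.score(X_te, y_te))
```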
Strategies of Speech Interaction between Adults and Preschool Children with Typical and Atypical Development
The goal of this research is to study the speech strategies of adults' interactions with 4–7-year-old children. The participants are “mother–child” dyads with typically developing (TD, n = 40) children, children with autism spectrum disorders (ASD, n = 20) and Down syndrome (DS, n = 10), and “experimenter–orphan” pairs (n = 20). Spectrographic, linguistic, phonetic, and perceptual analyses (n = 465 listeners) of children's speech and mothers' speech (MS) are performed. Listeners (n = 10) analyze the audio recordings, and experts (n = 5) analyze elements of nonverbal behavior from the video recordings. Differences are revealed in the speech behavior strategies of mothers during interactions with TD children, children with ASD, and children with DS. Different strategies of “mother–child” interaction are described, depending on the severity of the child's developmental disorders and the child's age. The features of MS addressed to TD children with low levels of speech formation are also used in MS directed to children with atypical development. The acoustic features of MS that correlate with a high level of speech development in TD children do not show a similar correlation in dyads with ASD and DS children. The perceptual and phonetic features of the speech of children in all groups are described.
Emotion, age, and gender classification in children's speech by humans and machines
In this article, we present the first child emotional speech corpus in Russian, called EmoChildRu, collected from 3- to 7-year-old children. The base corpus includes over 20,000 recordings (approximately 30 h) collected from 120 children. Audio recordings were carried out in three controlled settings that create different emotional states in children: playing with a standard set of toys; repeating words after a toy parrot in a game-store setting; and watching a cartoon and retelling the story. The corpus is designed to study how emotional state is reflected in the characteristics of voice and speech and to support studies of the formation of emotional states in ontogenesis. A portion of the corpus is annotated for three emotional states (comfort, discomfort, neutral). Additional data include the results of adult listeners' analysis of child speech, questionnaires, and annotation for gender and age in months. We also provide several baselines comparing human and machine estimation on this corpus for the prediction of age, gender, and comfort state. While the acoustics-based automatic systems show higher performance in age estimation, they do not reach human perception levels in comfort state and gender classification. The comparative results indicate the importance and necessity of developing further linguistic models for discrimination. The work was supported by the Russian Foundation for Basic Research (grant nos. 10-00-000.24, 15-06-07852, and 16-37-60100), the Russian Foundation for Basic Research DHSS (grant no. 17-06-00503), a grant of the President of Russia (project no. MD-254.2017.8), the Government of Russia (grant no. 074-U01), Bogazici University (project BAP 16A01P4), and the BAGEP Award of the Science Academy.
Bridging Social Sciences and AI for Understanding Child Behaviour
Child behaviour is a topic of wide scientific interest among many different disciplines, including the social and behavioural sciences and artificial intelligence (AI). In this workshop, we aimed to connect researchers from these fields to address topics such as the use of AI to better understand and model child behavioural and developmental processes, challenges and opportunities for AI in large-scale child behaviour analysis, and implementing explainable ML/AI on sensitive child data. The workshop served as a successful first step towards this goal and attracted contributions from different research disciplines on the analysis of child behaviour. This paper provides a summary of the activities of the workshop and of the accepted papers and abstracts.