Machine Understanding of Human Behavior
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing, which we will call human computing, should be about anticipatory user interfaces that are human-centered, built for humans and based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions, including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, from enabling computers to understand human behavior.
Initiation of communication from users of AAC and preceding communication partner's utterances
This study examined the effect communication partners have on the initiations produced by users of augmentative and alternative communication (AAC). The data were drawn from a larger study and included transcripts and videos of four students from an elementary school classroom in the Midwest. The students had a wide range of abilities. Both student and teacher utterances were analyzed for different types of communication functions, environmental factors, and conversational factors. It was hypothesized that the communicative function of the previous utterance and the level of aided input used would affect the number of initiations. The findings support the concept that the preceding utterance and communication partner can increase or decrease the number of student initiations. This suggests that communication partners could adapt their own speech and language, as well as the environment, to maximize therapy and the student's skills.
A review of parent training interventions for children with autism spectrum disorder and proposed guidelines for choosing best practices
The purpose of this project is to critically analyze and review parent training interventions published between 2000 and 2013 that focused on enhancing social and communicative behaviors in young children, ages 3 to 10, with autism spectrum disorder. All studies involved a form of parent training in combination with an intervention type such as pivotal response training, the milieu approach, or naturalistic approaches. Overall, each study yielded positive outcomes for children with ASD, but data collection strategies, target goals, and outcome measures were variable. This review includes an in-depth analysis of 16 studies of parent intervention programs, evaluated based on their goals, methodology, and the effectiveness of parent training on the language skills of children with ASD. The review presents a set of guidelines for parents and professionals to use when deciding on the most effective and efficient parent training therapy for families who have children with ASD. Critically evaluating the available empirical research can help parents, therapists, and researchers more effectively consider viable options for parent training programs tailored to support the needs of children with ASD. Tables summarize the findings to make the information more accessible. Implications for future research follow the literature review.
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
We present JVNV, a Japanese emotional speech corpus with verbal content and nonverbal vocalizations whose scripts are generated by a large-scale language model. Existing emotional speech corpora lack not only proper emotional scripts but also nonverbal vocalizations (NVs), which are essential expressions in spoken language for conveying emotion. We propose an automatic script generation method that produces emotional scripts by providing seed words with sentiment polarity and phrases of nonverbal vocalizations to ChatGPT using prompt engineering. We select 514 scripts with balanced phoneme coverage from the generated candidates with the assistance of emotion confidence scores and language fluency scores. We demonstrate the effectiveness of JVNV by showing that it has better phoneme coverage and emotion recognizability than previous Japanese emotional speech corpora. We then benchmark JVNV on emotional text-to-speech synthesis using discrete codes to represent NVs. We show that a gap still exists between the performance of synthesizing read-aloud speech and emotional speech, and that adding NVs makes the task even harder, which brings new challenges and makes JVNV a valuable resource for future work. To the best of our knowledge, JVNV is the first speech corpus whose scripts are generated automatically using large language models.
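As a rough illustration of the pipeline the abstract describes (LLM prompting from seed words plus NV phrases, followed by score-based selection), here is a minimal Python sketch. The prompt wording, the `Candidate` fields, and the ranking criterion are hypothetical stand-ins rather than the authors' implementation, and the real corpus additionally balances phoneme coverage during selection.

```python
# Illustrative sketch only: `build_prompt`, `Candidate`, and the ranking
# criterion are hypothetical stand-ins, not the JVNV authors' code.
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    emotion_confidence: float  # score from an emotion classifier (assumed)
    fluency: float             # e.g., an LM-based fluency score (assumed)

def build_prompt(emotion: str, seed_words: list[str], nv_phrases: list[str]) -> str:
    """Compose an LLM prompt from sentiment-bearing seed words and NV phrases."""
    return (
        f"Write a short Japanese utterance expressing {emotion}. "
        f"Use some of these words: {', '.join(seed_words)}. "
        f"Insert one of these nonverbal vocalizations where natural: "
        f"{', '.join(nv_phrases)}."
    )

def select_scripts(candidates: list[Candidate], k: int) -> list[Candidate]:
    """Keep the k scripts that jointly score highest on emotion confidence
    and fluency. The real corpus additionally balances phoneme coverage,
    which would need a phonemizer and a coverage-aware greedy pass."""
    ranked = sorted(candidates,
                    key=lambda c: c.emotion_confidence * c.fluency,
                    reverse=True)
    return ranked[:k]
```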
Detection of nonverbal vocalizations using Gaussian Mixture Models: looking for fillers and laughter in conversational speech
In this paper, we analyze acoustic profiles of fillers (i.e., filled pauses, FPs) and laughter with the aim of automatically localizing these nonverbal vocalizations in a stream of audio. Among other features, we use voice quality features to capture the distinctive production modes of laughter, and spectral similarity measures to capture the stability of the oral tract that is characteristic of FPs. Classification experiments with Gaussian Mixture Models and various sets of features are performed. We find that Mel-Frequency Cepstrum Coefficients perform relatively well in comparison to other features for both FPs and laughter. To address the large variation in the frame-wise decision scores (e.g., log-likelihood ratios) observed across sequences of frames, we apply a median filter to these scores, which yields large performance improvements. Our analyses and results are presented within the framework of this year's Interspeech Computational Paralinguistics Social Signals Sub-Challenge.
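A minimal sketch of the frame-wise approach outlined above, assuming labelled training audio for each class: MFCC features, one GMM per class, a frame-wise log-likelihood ratio, and a median filter over the scores. The library choices (librosa, scikit-learn, SciPy) and every hyperparameter (mixture components, filter kernel, decision threshold) are illustrative assumptions rather than the authors' configuration.

```python
# Minimal sketch, assuming labelled training audio per class; all
# hyperparameters here are illustrative assumptions.
import numpy as np
import librosa
from scipy.signal import medfilt
from sklearn.mixture import GaussianMixture

def mfcc_frames(path: str, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Return a (frames x coefficients) MFCC matrix for one audio file."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_gmm(files: list[str], n_components: int = 16) -> GaussianMixture:
    """Fit one diagonal-covariance GMM on all frames of a class."""
    X = np.vstack([mfcc_frames(f) for f in files])
    return GaussianMixture(n_components=n_components,
                           covariance_type="diag").fit(X)

def detect(path: str, gmm_target: GaussianMixture,
           gmm_background: GaussianMixture, kernel: int = 11) -> np.ndarray:
    """Frame-wise detection: log-likelihood ratio, then a median filter
    over the noisy frame scores (the smoothing step the paper highlights)."""
    X = mfcc_frames(path)
    llr = gmm_target.score_samples(X) - gmm_background.score_samples(X)
    return medfilt(llr, kernel_size=kernel) > 0.0

# Usage (hypothetical file lists):
#   gmm_laugh = train_gmm(laughter_files); gmm_bg = train_gmm(speech_files)
#   laughter_frames = detect("dialog.wav", gmm_laugh, gmm_bg)
```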
Human roars communicate upper-body strength more effectively than do screams or aggressive and distressed speech
Despite widespread evidence that nonverbal components of human speech (e.g., voice pitch) communicate information about the physical attributes of vocalizers, and that listeners can judge traits such as strength and body size from speech, few studies have examined the communicative functions of human nonverbal vocalizations (such as roars, screams, grunts and laughs). Critically, no previous study has examined the acoustic correlates of strength in nonverbal vocalizations, including roars, nor identified reliable vocal cues to strength in human speech. In addition to being less acoustically constrained than articulated speech, agonistic nonverbal vocalizations function primarily to express motivation and emotion, such as threat, and may therefore communicate strength and body size more effectively than speech. Here, we investigated acoustic cues to strength and size in roars compared to screams and speech sentences produced in both aggressive and distress contexts. Using playback experiments, we then tested whether listeners can reliably infer a vocalizer's actual strength and height from roars, screams, and valenced speech equivalents, and which acoustic features predicted listeners' judgments. While there were no consistent acoustic cues to strength in any vocal stimuli, listeners accurately judged inter-individual differences in strength, and did so most effectively from aggressive voice stimuli (roars and aggressive speech). In addition, listeners judged strength more accurately from roars than from aggressive speech. In contrast, listeners' judgments of height were most accurate for speech stimuli. These results support the prediction that vocalizers maximize impressions of physical strength in aggressive compared to distress contexts, and that inter-individual variation in strength may only be honestly communicated in vocalizations that function to communicate threat, particularly roars. Thus, in continuity with nonhuman mammals, the acoustic structure of human aggressive roars may have been selected to communicate, and to some extent exaggerate, functional cues to physical formidability.
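Where the abstract relates acoustic features of each stimulus to listeners' judgments, the following is a minimal sketch of that kind of feature-to-rating regression, assuming a list of stimulus files and averaged listener strength ratings. The feature set (mean f0, f0 range, mean RMS energy), the pitch bounds, and the linear model are all illustrative assumptions; the study's actual measurements and statistics are not reproduced here.

```python
# Hedged sketch: regress listeners' strength ratings on simple acoustic
# features. Feature choices and the linear model are illustrative only.
import numpy as np
import librosa
from sklearn.linear_model import LinearRegression

def acoustic_features(path: str, sr: int = 16000) -> list[float]:
    y, sr = librosa.load(path, sr=sr)
    f0, _, _ = librosa.pyin(y, fmin=60, fmax=500, sr=sr)  # frame-wise pitch
    f0 = f0[~np.isnan(f0)]                                # voiced frames only
    rms = librosa.feature.rms(y=y).ravel()                # intensity proxy
    return [float(f0.mean()), float(f0.max() - f0.min()), float(rms.mean())]

def fit_judgment_model(stimuli: list[str], ratings: list[float]) -> LinearRegression:
    """stimuli: audio paths; ratings: mean listener strength judgments."""
    X = np.array([acoustic_features(p) for p in stimuli])
    return LinearRegression().fit(X, np.array(ratings))
```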
Speech, laughter and everything in between: A modulation spectrum-based analysis
Ludusan B, Wagner P. Speech, laughter and everything in between: A modulation spectrum-based analysis. In: Proceedings of the 10th International Conference on Speech Prosody 2020, 25-28 May 2020, Tokyo, Japan. ISCA; 2020: 995-999.

Laughter and speech-laughs are pervasive phenomena in conversational speech. Nevertheless, few previous studies have compared their acoustic realization to speech. In this work, we investigated the suprasegmental characteristics of these two phenomena in relation to speech by means of a modulation spectrum analysis. Two types of modulation spectra were considered: one encoding the variation of the signal's envelope and the other its temporal fine structure. Using a corpus of spontaneous dyadic interactions, we computed the modulation index spectrum and the f0 spectrum of the three classes of vocalizations and fitted separate generalized additive mixed models for them. The results for the former showed a clear separation between speech, on the one hand, and laughter and speech-laughs, on the other, while the f0 spectrum was able to discriminate between all three classes. We conclude with a discussion of the importance of these findings and their implications for laughter detection.
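To make the envelope-based modulation spectrum concrete, here is a small Python sketch that extracts an amplitude envelope via the analytic signal and takes its magnitude spectrum at low modulation rates. The paper's exact pipeline, and its f0-based counterpart, differ in detail; treat this as an assumption-laden illustration.

```python
# Illustrative envelope modulation spectrum, assuming a Hilbert-envelope
# front end; not the authors' exact pipeline.
import numpy as np
import librosa
from scipy.signal import hilbert

def envelope_modulation_spectrum(path: str, sr: int = 16000, fmax: float = 32.0):
    """Return (modulation frequencies, magnitudes) up to `fmax` Hz."""
    y, sr = librosa.load(path, sr=sr)
    env = np.abs(hilbert(y))          # amplitude envelope via analytic signal
    env = env - env.mean()            # remove DC so slow modulations dominate
    spec = np.abs(np.fft.rfft(env))   # magnitude spectrum of the envelope
    freqs = np.fft.rfftfreq(len(env), d=1.0 / sr)
    keep = freqs <= fmax              # speech modulations live at low rates
    return freqs[keep], spec[keep]
```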