1,376 research outputs found
Looking at the Body: Automatic Analysis of Body Gestures and Self-Adaptors in Psychological Distress
Psychological distress is a significant and growing issue in society. Automatic detection, assessment, and analysis of such distress is an active area of research. Compared to modalities such as the face, head, and voice, research investigating the use of the body modality for these tasks is relatively sparse. This is, in part, due to the limited available datasets and the difficulty of automatically extracting useful body features. Recent advances in pose estimation and deep learning have enabled new approaches to this modality and domain. To enable this research, we have collected and analyzed a new dataset containing full-body videos of short interviews and self-reported distress labels. We propose a novel method to automatically detect self-adaptors and fidgeting, a subset of self-adaptors that has been shown to be correlated with psychological distress. We perform analysis on statistical body gestures and fidgeting features to explore how distress levels affect participants' behaviors. We then propose a multi-modal approach that combines different feature representations using Multi-modal Deep Denoising Auto-Encoders and Improved Fisher Vector Encoding. We demonstrate that our proposed model, combining audio-visual features with automatically detected fidgeting behavioral cues, can successfully predict distress levels in a dataset labeled with self-reported anxiety and depression levels.
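As a rough illustration of how fidgeting might be detected from pose-estimation output, the sketch below flags windows in which a hand keypoint moves with strong periodicity; the function names, window length, and threshold are hypothetical assumptions, not the authors' implementation.

```python
import numpy as np

def periodicity_score(signal: np.ndarray) -> float:
    """Strength of the largest non-zero-lag autocorrelation peak (0..1)."""
    x = signal - signal.mean()
    if np.allclose(x, 0):
        return 0.0
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac /= ac[0]                      # normalise so lag 0 equals 1
    return float(ac[1:].max())       # best repetition at any positive lag

def detect_fidgeting(hand_xy: np.ndarray, fps: int = 30,
                     win_s: float = 2.0, threshold: float = 0.6) -> np.ndarray:
    """hand_xy: (T, 2) trajectory of one hand keypoint from a pose estimator.
    Returns one boolean per window: True where motion looks repetitive."""
    speed = np.linalg.norm(np.diff(hand_xy, axis=0), axis=1)  # per-frame speed
    win = int(win_s * fps)
    flags = []
    for start in range(0, len(speed) - win + 1, win):
        flags.append(periodicity_score(speed[start:start + win]) > threshold)
    return np.array(flags)
```

Autocorrelation of the speed signal is only one plausible cue; a detector along these lines would still need tuning against annotated fidgeting episodes.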
Automatic Detection of Self-Adaptors for Psychological Distress
Psychological distress is a significant and growing issue in society. Automatic detection, assessment, and analysis of such distress is an active area of research. Compared to modalities such as the face, head, and voice, research investigating the use of the body modality for these tasks is relatively sparse. This is, in part, due to the lack of available datasets and the difficulty of automatically extracting useful body features. Recent advances in pose estimation and deep learning have enabled new approaches to this modality and domain. We propose a novel method to automatically detect self-adaptors and fidgeting, a subset of self-adaptors that has been shown to be correlated with psychological distress. We also propose a multi-modal approach that combines different feature representations using Multi-modal Deep Denoising Auto-Encoders and Improved Fisher Vector encoding. We demonstrate that our proposed model, combining audio-visual features with automatically detected fidgeting behavioral cues, can successfully predict distress levels in a dataset labeled with self-reported anxiety and depression levels. To enable this research, we introduce a new dataset containing full-body videos of short interviews and self-reported distress labels.
King's College, Cambridge
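For readers unfamiliar with the fusion architecture named above, the following is a minimal PyTorch sketch of the general idea behind a multi-modal deep denoising auto-encoder: per-modality encoders feed a shared latent layer, and noisy inputs are reconstructed to clean targets. Layer sizes and feature dimensions are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MultiModalDDAE(nn.Module):
    """Two modality-specific encoders feed a shared latent layer; decoders
    reconstruct each clean modality from the fused representation."""
    def __init__(self, dim_audio=128, dim_visual=256, dim_latent=64):
        super().__init__()
        self.enc_audio  = nn.Sequential(nn.Linear(dim_audio, 96),  nn.ReLU())
        self.enc_visual = nn.Sequential(nn.Linear(dim_visual, 96), nn.ReLU())
        self.fuse       = nn.Sequential(nn.Linear(192, dim_latent), nn.ReLU())
        self.dec_audio  = nn.Linear(dim_latent, dim_audio)
        self.dec_visual = nn.Linear(dim_latent, dim_visual)

    def forward(self, audio, visual):
        z = self.fuse(torch.cat([self.enc_audio(audio),
                                 self.enc_visual(visual)], dim=-1))
        return self.dec_audio(z), self.dec_visual(z), z

# Denoising objective: corrupt the inputs, reconstruct the clean features.
model = MultiModalDDAE()
audio, visual = torch.randn(8, 128), torch.randn(8, 256)
noisy_a = audio + 0.1 * torch.randn_like(audio)
noisy_v = visual + 0.1 * torch.randn_like(visual)
rec_a, rec_v, fused = model(noisy_a, noisy_v)
loss = nn.functional.mse_loss(rec_a, audio) + nn.functional.mse_loss(rec_v, visual)
```

The fused representation `fused` is the kind of joint feature that could then be passed to a downstream distress regressor or classifier.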
Towards a socially adaptive digital playground
We are working towards a socially adaptive digital playground for children. To this end, we are looking into nonverbal synchrony and other social signals as measures of social behaviour, and into ways to alter game dynamics to trigger and inhibit certain social behaviours. Our first results indicate that we can indeed influence social behaviours in a digital playground by changing game dynamics. They also show that we will be able to sense some of these social behaviours using only computer vision techniques. Finally, I propose an iterative method for working towards a socially adaptive digital playground.
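One common way to quantify nonverbal synchrony from video alone is to correlate the players' motion-energy time series over a small range of temporal lags; the sketch below is a generic, assumed formulation and not the project's actual pipeline.

```python
import numpy as np

def movement_energy(frames: np.ndarray) -> np.ndarray:
    """Per-frame motion energy from a grayscale video array of shape (T, H, W):
    mean absolute difference between consecutive frames."""
    return np.abs(np.diff(frames.astype(float), axis=0)).mean(axis=(1, 2))

def synchrony(energy_a: np.ndarray, energy_b: np.ndarray, max_lag: int = 15) -> float:
    """Peak Pearson correlation between two children's motion-energy series
    over a range of lags, allowing slightly offset reactions."""
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = energy_a[lag:], energy_b[:len(energy_b) - lag]
        else:
            a, b = energy_a[:lag], energy_b[-lag:]
        n = min(len(a), len(b))
        if n > 1:
            best = max(best, float(np.corrcoef(a[:n], b[:n])[0, 1]))
    return best
```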
MUSICAL INSTRUMENTS, BODY MOVEMENT, SPACE, AND MOTION DATA: MUSIC AS AN EMERGENT MULTIMODAL CHOREOGRAPHY
www.humantechnology.jyu.f
What does not happen: quantifying embodied engagement using NIMI and self-adaptors
Previous research into the quantification of embodied intellectual and emotional engagement using non-verbal movement parameters has not yielded consistent results across different studies. Our research introduces NIMI (Non-Instrumental Movement Inhibition) as an alternative parameter. We propose that the absence of certain types of possible movements can be a more holistic proxy for cognitive engagement with media (in seated persons) than searching for the presence of other movements. Rather than analyzing total movement as an indicator of engagement, our research team distinguishes between instrumental movements (i.e. physical movements serving a direct purpose in the given situation) and non-instrumental movements, and investigates them in the context of the narrative rhythm of the stimulus. We demonstrate that NIMI occurs by showing that viewers' movement levels entrain (i.e. synchronise) to the repeating narrative rhythm of a timed computer-presented quiz. Finally, we discuss the role of objective metrics of engagement in future context-aware analysis of human behaviour in audience research, interactive media, and responsive system and interface design.
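A minimal sketch of the kind of measurement NIMI implies, under the assumption that a viewer's movement level is summarized per stimulus phase of the quiz; the function and index below are hypothetical, not the study's metric.

```python
import numpy as np

def phase_movement_profile(movement: np.ndarray, phase_labels: np.ndarray) -> dict:
    """movement: per-second movement level of a seated viewer (e.g. from frame
    differencing); phase_labels: stimulus phase per second, such as 'question'
    or 'answer' in a timed quiz. Returns the mean movement level per phase."""
    return {p: float(movement[phase_labels == p].mean())
            for p in np.unique(phase_labels)}

# Entrainment reading: if movement is consistently suppressed during the
# engaging phase ('question') relative to the rest phase ('answer'), that
# suppression (NIMI) is taken as a proxy for engagement.
movement = np.random.rand(600)                        # placeholder data
phases = np.array(['question', 'answer'] * 300)
profile = phase_movement_profile(movement, phases)
nimi_index = profile['answer'] - profile['question']  # > 0 -> inhibition during questions
```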
Automated Analysis of Synchronization in Human Full-body Expressive Movement
The research presented in this thesis is focused on the creation of computational models for the study of human full-body movement, in order to investigate human behavior and non-verbal communication. In particular, the research concerns the analysis of synchronization of expressive movements and gestures. Synchronization can be computed both on a single user (intra-personal), e.g., to measure the degree of coordination between the joints' velocities of a dancer, and on multiple users (inter-personal), e.g., to detect the level of coordination between multiple users in a group. Through a set of experiments and results, the thesis contributes to the investigation of both intra-personal and inter-personal synchronization applied to support the study of movement expressivity, and improves on the state of the art of the available methods by presenting a new algorithm for the analysis of synchronization.
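As an illustration of what intra- and inter-personal synchronization measures can look like in practice, the sketch below correlates joint-speed time series; it is a generic formulation, not the thesis's proposed algorithm.

```python
import numpy as np

def joint_velocities(positions: np.ndarray) -> np.ndarray:
    """positions: (T, J, 3) motion-capture joint positions.
    Returns (T-1, J) per-joint speed."""
    return np.linalg.norm(np.diff(positions, axis=0), axis=2)

def intra_personal_sync(velocities: np.ndarray) -> float:
    """Mean pairwise correlation between one dancer's joint-speed series:
    a rough intra-personal coordination index."""
    corr = np.corrcoef(velocities.T)            # (J, J) correlation matrix
    j = corr.shape[0]
    return float(corr[np.triu_indices(j, k=1)].mean())

def inter_personal_sync(vel_a: np.ndarray, vel_b: np.ndarray) -> float:
    """Correlation between two users' overall movement (mean joint speed)."""
    return float(np.corrcoef(vel_a.mean(axis=1), vel_b.mean(axis=1))[0, 1])
```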
Methodological considerations concerning manual annotation of musical audio in function of algorithm development
In research on musical audio-mining, annotated music databases are needed which allow the development of computational tools that extract from the musical audio stream the kind of high-level content that users can deal with in Music Information Retrieval (MIR) contexts. The notion of musical content, and therefore the notion of annotation, is ill-defined, however, in both the syntactic and the semantic sense. As a consequence, annotation has been approached from a variety of perspectives (though mainly linguistic-symbolic oriented), and a general methodology is lacking. This paper is a step towards the definition of a general framework for manual annotation of musical audio in function of a computational approach to musical audio-mining that is based on algorithms that learn from annotated data.
The dancer in the eye: Towards a multi-layered computational framework of qualities in movement
This paper presents a conceptual framework for the analysis of expressive qualities of movement. Our perspective is to model an observer of a dance performance. The conceptual framework is made of four layers, ranging from the physical signals that sensors capture to the qualities that movement communicates (e.g., in terms of emotions). The framework aims to provide a conceptual background that the development of computational systems can build upon, with particular reference to systems that analyze a vocabulary of expressive movement qualities and translate them to other sensory channels, such as the auditory modality. Such systems enable their users to "listen to a choreography" or to "feel a ballet", in a new kind of cross-modal mediated experience.
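The four layers could be organized as a simple processing pipeline; the sketch below is purely schematic, with layer names paraphrased from the abstract and placeholder computations that are assumptions rather than part of the framework itself.

```python
import numpy as np

def layer1_physical_signals(raw):
    """Layer 1: signals captured by sensors, e.g. joint positions (T, J, 3)."""
    return {"positions": np.asarray(raw)}

def layer2_low_level_features(data):
    """Layer 2: low-level descriptors such as per-joint speed."""
    data["speed"] = np.linalg.norm(np.diff(data["positions"], axis=0), axis=2)
    return data

def layer3_movement_qualities(data):
    """Layer 3: mid-level expressive qualities; 'fluidity' here is a stand-in
    (inverse variability of speed), not a validated definition."""
    data["fluidity"] = 1.0 / (1.0 + data["speed"].std())
    return data

def layer4_communicated_qualities(data):
    """Layer 4: qualities communicated to an observer, e.g. a coarse label
    that a sonification stage could translate to the auditory channel."""
    data["label"] = "calm" if data["fluidity"] > 0.5 else "agitated"
    return data

frames = np.random.rand(100, 15, 3)          # toy motion-capture input
out = layer4_communicated_qualities(
    layer3_movement_qualities(
        layer2_low_level_features(
            layer1_physical_signals(frames))))
```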
Fusion of Multimodal Information in Music Content Analysis
Music is often processed through its acoustic realization. This is restrictive in the sense that music is clearly a highly multimodal concept, where various types of heterogeneous information can be associated with a given piece of music (a musical score, musicians' gestures, lyrics, user-generated metadata, etc.). This has recently led researchers to apprehend music through its various facets, giving rise to "multimodal music analysis" studies. This article gives a synthetic overview of methods that have been successfully employed in multimodal signal analysis. In particular, their use in music content processing is discussed in more detail through five case studies that highlight different multimodal integration techniques. The case studies include an example of cross-modal correlation for music video analysis, an audiovisual drum transcription system, a description of the concept of informed source separation, a discussion of multimodal dance-scene analysis, and an example of user-interactive music analysis. In the light of these case studies, some perspectives on multimodality in music processing are finally suggested.
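Cross-modal correlation, the first integration technique listed among the case studies, is often implemented with canonical correlation analysis between per-frame audio and visual descriptors; the sketch below uses scikit-learn on toy data and is only an assumed illustration, not the article's system.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Toy stand-ins for per-frame descriptors of the same music video:
# audio features (e.g. spectral descriptors) and visual features (e.g. motion).
rng = np.random.default_rng(0)
audio_feats  = rng.normal(size=(500, 20))
visual_feats = rng.normal(size=(500, 30))

cca = CCA(n_components=2)
audio_c, visual_c = cca.fit_transform(audio_feats, visual_feats)

# Correlation of the paired canonical variates: a simple measure of how
# strongly the two modalities co-vary over time.
corrs = [np.corrcoef(audio_c[:, k], visual_c[:, k])[0, 1] for k in range(2)]
```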