Automatic evaluation of eye gestural reactions to sound in video sequences
© 2019. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. This version of the article: Fernández, A., Ortega, M., de Moura, J., Novo, J., & Penedo, M. G. (2019). “Automatic evaluation of eye gestural reactions to sound in video sequences”, has been accepted for publication in Engineering Applications of Artificial Intelligence, 85, 164–174. The Version of Record is available online at: https://doi.org/10.1016/j.engappai.2019.06.009. [Abstract]: Hearing loss is a common disorder that often intensifies with age. In some cases, especially in the elderly population, hearing loss may diminish physical, mental and social well-being. In particular, patients with signs of cognitive impairment typically present specific clinical–pathological conditions, which complicates the analysis and diagnosis of the type and severity of hearing loss by clinical specialists. In these patients, unconscious changes in gaze direction may indicate a certain perception of sound through the auditory system. In this context, this work presents a new system that supports clinical experts in the identification and classification of eye gestures that are associated with reactions to auditory stimuli by patients with different levels of cognitive impairment. The proposed system was validated using the public Video Audiometry Sequence Test (VAST) dataset, providing a global accuracy of 97.12% for the classification of eye gestures and 100% for gestural reactions to auditory stimuli.
The proposed system offers a complete analysis of audiometric video sequences, making it applicable in daily clinical practice and improving the well-being and quality of life of patients. This work is supported by the Instituto de Salud Carlos III, Government of Spain, and FEDER funds of the European Union through the DTS18/00136 research project, and by the Ministerio de Economía y Competitividad, Government of Spain, through the DPI2015-69948-R research project. This work has also received financial support from the European Union (European Regional Development Fund, ERDF) and the Xunta de Galicia: Centro singular de investigación de Galicia accreditation 2016–2019, Ref. ED431G/01; and Grupos de Referencia Competitiva, Ref. ED431C 2016-047.
Computer aided hearing assessment: towards an automated audiometric tool
[Abstract]: Hearing loss is a partial or full decrease in the ability to detect or understand sounds
which affects a wide range of the population and has a negative impact on their daily activities. Pure Tone Audiometry is the standard test for the evaluation of hearing capacity. During this hearing assessment the audiologist also tries to identify patients with abnormally slow responsiveness by means of their response times to the perceived sounds. This identification is relevant since it could be a symptom of a medical condition that should be studied. The other main
target is the evaluation of patients with cognitive decline or severe communication
disorders, since when evaluating this specific group of patients it is not possible to
maintain a normal question-answer interaction. In these cases the expert must focus their attention on the detection of unconscious gestural reactions to the sound.
The subjectivity involved in the interpretation of both aims may affect the classification, introduce imprecision, limit reproducibility and produce a high degree of inter- and intra-observer variability. For this reason, the development of a systematic, objective computerized method for the analysis and classification of response times and gestural reactions to the sound is highly desirable, allowing for homogeneous diagnosis and relieving the experts from this tedious task.
The purpose of this research is the design of an automatic system to assess the gestural reactions to the sound and the patient's response times by analyzing video sequences recorded during the audiometric evaluations. On the one hand, the response times are measured by detecting the delivery of auditory stimuli and the patient's hand raising (which corresponds to a positive response). On the other hand, the gestural reactions to the sound are identified by analyzing the eye movements using two different approaches. The different automated assessments proposed save time for experts, improve precision and provide unbiased
results which are not affected by subjective factors.
Gesture and Speech in Interaction - 4th edition (GESPIN 4)
The fourth edition of Gesture and Speech in Interaction (GESPIN) was held in Nantes, France. With more than 40 papers, these proceedings show just what a flourishing field of enquiry gesture studies continues to be. The keynote speeches of the conference addressed three different aspects of multimodal interaction: gesture and grammar, gesture acquisition, and gesture and social interaction. In a talk entitled Qualities of event construal in speech and gesture: Aspect and tense, Alan Cienki presented an ongoing research project on narratives in French, German and Russian, a project that focuses especially on the verbal and gestural expression of grammatical tense and aspect in narratives in the three languages. Jean-Marc Colletta's talk, entitled Gesture and Language Development: towards a unified theoretical framework, described the joint acquisition and development of speech and early conventional and representational gestures. In Grammar, deixis, and multimodality between code-manifestation and code-integration, or why Kendon's Continuum should be transformed into a gestural circle, Ellen Fricke proposed a revisited grammar of noun phrases that integrates gestures as part of the semiotic and typological codes of individual languages. From a pragmatic and cognitive perspective, Judith Holler explored the use of gaze and hand gestures as means of organizing turns at talk as well as establishing common ground in a presentation entitled On the pragmatics of multi-modal face-to-face communication: Gesture, speech and gaze in the coordination of mental states and social interaction. Among the talks and posters presented at the conference, the vast majority of topics related, quite naturally, to gesture and speech in interaction, understood both in terms of mapping of units in different semiotic modes and of the use of gesture and speech in social interaction.
Several presentations explored the effects of impairments (such as diseases or the natural ageing process) on gesture and speech. The communicative relevance of gesture and speech and audience design in natural interactions, as well as in more controlled settings like television debates and reports, was another topic addressed during the conference. Some participants also presented research on first and second language learning, while others discussed the relationship between gesture and intonation. While most participants presented research on gesture and speech from an observer's perspective, be it in semiotics or pragmatics, some nevertheless focused on another important aspect: the cognitive processes involved in language production and perception. Last but not least, participants also presented talks and posters on the computational analysis of gestures, whether involving external devices (e.g. mocap, Kinect) or concerning the use of specially designed computer software for the post-treatment of gestural data. Importantly, new links were made between semiotics and mocap data.
Learners' perceptions of teachers' non-verbal behaviours in the foreign language class
This study explores the meanings that participants in a British ELT setting give to
teachers' non-verbal behaviours. It is a qualitative, descriptive study of the perceived functions that gestures and other non-verbal behaviours perform in the foreign language classroom, viewed mainly from the language learners' perspective. The thesis presents the stages of the research process, from the initial development of the research
questions to the discussion of the research findings that summarise and discuss the
participants' views.
There are two distinct research phases presented in the thesis. The pilot study
explores the perceptions of 18 experienced language learners of teachers' non-verbal
behaviours. The data is collected in interviews based on videotaped extracts of
classroom interaction, presented to the participants in two experimental conditions,
with and without sound. The findings of this initial study justify the later change of
method from the experimental design to a more exploratory framework. In the main
study, 22 learners explain, in interviews based on stimulated recall, their perceptions of their teachers' verbal and non-verbal behaviours as occurring within the immediate classroom context. Finally, learners' views are complemented by 20 trainee teachers' written reports of classroom observation and their opinions expressed in focus group interviews. The data for the main study were thus collected through a combination of methods, ranging from direct classroom observations and videotaped recordings to semi-structured interviews with language learners.
The research findings indicate that participants generally believe that gestures
and other non-verbal behaviours play a key role in the language learning and teaching
process. Learners identify three types of functions that non-verbal behaviours play in
the classroom interaction: (i) cognitive, i.e. non-verbal behaviours which work as
enhancers of the learning processes, (ii) emotional, i.e. non-verbal behaviours that
function as reliable communicative devices of teachers' emotions and attitudes and (iii)
organisational, i.e. non-verbal behaviours which serve as tools of classroom
management and control.
The findings suggest that learners interpret teachers' non-verbal behaviours in a
functional manner and use these messages and cues in their learning and social
interaction with the teacher. The trainee teachers similarly value the roles that non-verbal behaviours play in language teaching and learning. However, they
seem to prioritise the cognitive and managerial functions of teachers' non-verbal
behaviours over the emotional ones and do not consider the latter as important as the
learners did.
This study is original in relation to previous studies of language classroom
interaction in that it:
• describes the kinds of teachers' behaviours which all teachers and learners are familiar with, but which have seldom been foregrounded in classroom-based
research;
• unlike previous studies of non-verbal behaviour, investigates the perceiver's
view of the others' non-verbal behaviour rather than its production;
• documents these processes of perception through an innovative methodology of
data collection and analysis;
• explores the teachers' non-verbal behaviours as perceived by the learners
themselves, suggesting that their viewpoint can be one window on the reality of
language classrooms;
• provides explanations and functional interpretations for the many spontaneous
and apparently unimportant actions that teachers use on a routine basis;
• identifies a new area which needs consideration in any future research and
pedagogy of language teaching and learning.
Exploring the Affective Loop
Research in psychology and neurology shows that both body and mind are
involved when experiencing emotions (Damasio 1994, Davidson et al.
2003). People are also very physical when they try to communicate their
emotions. Somewhere in between being consciously and unconsciously
aware of it ourselves, we produce both verbal and physical signs to make
other people understand how we feel. Simultaneously, this production of
signs involves us in a stronger personal experience of the emotions we
express.
Emotions are also communicated in the digital world, but there is little
focus on users' personal as well as physical experience of emotions in
the available digital media. In order to explore whether and how we can
expand existing media, we have designed, implemented and evaluated
/eMoto/, a mobile service for sending affective messages to others. With
eMoto, we explicitly aim to address both cognitive and physical
experiences of human emotions. Through combining affective gestures for
input with affective expressions that make use of colors, shapes and
animations for the background of messages, the interaction "pulls" the
user into an /affective loop/. In this thesis we define what we mean by
affective loop and present a user-centered design approach expressed
through four design principles inspired by previous work within Human
Computer Interaction (HCI) but adjusted to our purposes; /embodiment/
(Dourish 2001) as a means to address how people communicate emotions in
real life, /flow/ (Csikszentmihalyi 1990) to reach a state of
involvement that goes further than the current context, /ambiguity/ of
the designed expressions (Gaver et al. 2003) to allow for open-ended
interpretation by the end-users instead of simplistic, one-emotion
one-expression pairs and /natural but designed expressions/ to address
people's natural couplings between cognitively and physically
experienced emotions. We also present results from an end-user study of
/eMoto/ indicating that subjects got both physically and emotionally
involved in the interaction, and that the designed "openness" and
ambiguity of the expressions were appreciated and understood by our
subjects. Through the user study, we identified four potential design
problems that have to be tackled in order to achieve an affective loop
effect: the extent to which users /feel in control/ of the interaction,
/harmony and coherence/ between cognitive and physical expressions,
/timing/ of expressions and feedback in a communicational setting, and
effects of users' /personality/ on their emotional expressions and
experiences of the interaction.
Incrementar la presencia en entornos virtuales en primera persona a través de interfaces auditivas: un acercamiento analítico al sonido y la música adaptativos [Increasing presence in first-person virtual environments through auditory interfaces: an analytical approach to adaptive sound and music]
Thesis of the Universidad Complutense de Madrid, Facultad de Informática, defended on 25-11-2019. [Abstract]: The popularisation of virtual reality devices has brought with it an increased need for telepresence and player immersion in video games. These goals are often pursued through more realistic computer graphics and sound; however, invasive graphical user interfaces are still present in industry-standard products for VR, even though previous research has advised against them in order to reach better results in immersion. Non-visual, multimodal communication channels are explored throughout this thesis as a means of reducing the amount of graphical elements needed in head-up displays while increasing telepresence. Thus, the main goals of this research are to find the optimal channels that allow for semantic communication without resorting to visual interfaces, while reducing the general number of extra-diegetic elements in a video game, and to develop a total of six software applications in order to validate the obtained knowledge in real-life scenarios. The central piece of software produced as a result of this process is called LitSens, and consists of an adaptive music generator which takes human emotions as inputs...
The effects of nonverbal behaviors exhibited by multiple conductors on the timbre, intonation, and perceptions of three university choirs, and assessed relationships between time spent in selected conductor behaviors and analysis of the choirs' performances
This investigation examined the effects of aggregate nonverbal behaviors exhibited by 10 videotaped conductors on the choral sound and perceptions of 3 university choirs (N = 61 choristers) as they sang from memory the same a cappella motet. It then assessed relationships between time spent in selected nonverbal conducting behaviors and the choirs' sung performances and perceptions. Examined nonverbal conductor behaviors were: (a) height of vertical gestural plane; (b) width of lateral gestural plane; (c) hand shape; and (d) emotional face expression. Dependent measures included Long Term Average Spectra (LTAS) data, pitch analyses, and singer questionnaires. Among primary findings: (a) aggregate singer ratings yielded significant differences among the 10 conductors with respect to perceived gestural clarity and singing efficiency; (b) each of the 3 choirs responded similarly in timbre and pitch to the 10, counter-balanced conductor videos presented; (c) significantly strong, positive correlations between LTAS and pitch results suggested that those conductors whose nonverbal behaviors evoked more spectral energy in the choirs' sound tended also to elicit more in tune singing; (d) the 10 conductors exhibited significantly different amounts of aggregate time spent in the gestural planes and hand shapes analyzed; (e) above shoulder vertical gestures related significantly to less timbral energy, while gestures below shoulder level related significantly to increased timbral energy; (f) significantly strong, positive correlations between singer questionnaire responses and both pitch and LTAS data suggested that the choirs' timbre and pitch tended to vary according to whether or not the singers perceived a conductor's nonverbal communication as clear and whether or not they perceived they sang efficiently while following a particular conductor; (g) moderately strong, though not significant, associations between lateral gestures within the torso area and both pitch (more 
in tune) and timbre (more spectral energy), and between lateral gestures beyond the torso area and both pitch (less in tune) and timbre (less spectral energy); and (h) weak, non-significant correlations between aggregate time spent in various hand postures and the choirs' timbre and intonation, and between identified emotional face expressions and analyses of the choirs' sound.
Towards affective computing that works for everyone
Missing diversity, equity, and inclusion elements in affective computing
datasets directly affect the accuracy and fairness of emotion recognition
algorithms across different groups. A literature review reveals how affective
computing systems may work differently for different groups due to, for
instance, mental health conditions impacting facial expressions and speech or
age-related changes in facial appearance and health. Our work analyzes existing
affective computing datasets and highlights a disconcerting lack of diversity
in current affective computing datasets regarding race, sex/gender, age, and
(mental) health representation. By emphasizing the need for more inclusive
sampling strategies and standardized documentation of demographic factors in
datasets, this paper provides recommendations and calls for greater attention
to inclusivity and consideration of societal consequences in affective
computing research to promote ethical and accurate outcomes in this emerging
field. Comment: 8 pages, 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII).
Design and semantics of form and movement (DeSForM 2006)
Design and Semantics of Form and Movement (DeSForM) grew from applied research exploring emerging design methods and practices to support new generation product and interface design. The products and interfaces are concerned with the context of ubiquitous computing and ambient technologies and the need for greater empathy in the pre-programmed behaviour of the ‘machines’ that populate our lives. Such explorative research in the CfDR has been led by Young, supported by Kyffin, Visiting Professor from Philips Design, and sponsored by Philips Design over a period of four years (research funding £87k). DeSForM1 was the first of a series of three conferences that enabled the presentation and debate of international work within this field: • 1st European conference on Design and Semantics of Form and Movement (DeSForM1), Baltic, Gateshead, 2005, Feijs L., Kyffin S. & Young R.A. eds. • 2nd European conference on Design and Semantics of Form and Movement (DeSForM2), Evoluon, Eindhoven, 2006, Feijs L., Kyffin S. & Young R.A. eds. • 3rd European conference on Design and Semantics of Form and Movement (DeSForM3), New Design School Building, Newcastle, 2007, Feijs L., Kyffin S. & Young R.A. eds. Philips sponsorship of practice-based enquiry led to research by three teams of research students over three years and on-going sponsorship of research through the Northumbria University Design and Innovation Laboratory (nuDIL). Young has been invited on the steering panel of the UK Thinking Digital Conference concerning the latest developments in digital and media technologies. Informed by this research is the work of PhD student Yukie Nakano, who examines new technologies in relation to eco-design textiles.