
    Automatic evaluation of eye gestural reactions to sound in video sequences

    © 2019. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. This version of the article: Fernández, A., Ortega, M., de Moura, J., Novo, J., & Penedo, M. G. (2019). "Automatic evaluation of eye gestural reactions to sound in video sequences", has been accepted for publication in Engineering Applications of Artificial Intelligence, 85, 164–174. The Version of Record is available online at: https://doi.org/10.1016/j.engappai.2019.06.009.
    [Abstract]: Hearing loss is a common disorder that often intensifies with age. In some cases, especially in the elderly population, hearing loss may lead to a decline in physical, mental and social well-being. In particular, patients with signs of cognitive impairment typically present specific clinical–pathological conditions, which complicates the analysis and diagnosis of the type and severity of hearing loss by clinical specialists. In these patients, unconscious changes in gaze direction may indicate a certain perception of sound through their auditory system. In this context, this work presents a new system that supports clinical experts in the identification and classification of eye gestures that are associated with reactions to auditory stimuli by patients with different levels of cognitive impairment. The proposed system was validated using the public Video Audiometry Sequence Test (VAST) dataset, providing a global accuracy of 97.12% for the classification of eye gestures and 100% for gestural reactions to auditory stimuli. The proposed system offers a complete analysis of audiometric video sequences, is applicable in daily clinical practice, and improves the well-being and quality of life of the patients.
    This work is supported by the Instituto de Salud Carlos III, Government of Spain and FEDER funds of the European Union, Spain, through the DTS18/00136 research project, and by the Ministerio de Economía y Competitividad, Government of Spain, through the DPI2015-69948-R research project. This work has also received financial support from the European Union (European Regional Development Fund — ERDF) and the Xunta de Galicia (Centro singular de investigación de Galicia accreditation 2016–2019, Ref. ED431G/01; Grupos de Referencia Competitiva, Ref. ED431C 2016-047).
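    The headline figure reported above (a global accuracy of 97.12% over the eye-gesture classes) is simply the fraction of samples whose predicted label matches the ground truth. A minimal sketch of that metric, using hypothetical gesture labels rather than the paper's actual classes:

    ```python
    def global_accuracy(y_true, y_pred):
        """Fraction of eye-gesture labels predicted correctly."""
        if len(y_true) != len(y_pred):
            raise ValueError("label sequences must have equal length")
        correct = sum(t == p for t, p in zip(y_true, y_pred))
        return correct / len(y_true)

    # Hypothetical gesture labels for a handful of video frames.
    true_labels = ["left", "right", "up", "centre", "left"]
    pred_labels = ["left", "right", "up", "centre", "right"]

    print(global_accuracy(true_labels, pred_labels))  # 0.8
    ```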

    Computer aided hearing assessment: towards an automated audiometric tool

    [Abstract] Hearing loss is a partial or full decrease in the ability to detect or understand sounds, which affects a wide range of the population and has a negative impact on their daily activities. Pure Tone Audiometry is the standard test for the evaluation of hearing capacity. During this hearing assessment the audiologist also tries to identify patients with abnormally slow responsiveness by means of their response times to the perceived sounds. This identification is relevant since it could be a symptom of a medical condition that should be studied. The other main target is the evaluation of patients with cognitive decline or severe communication disorders, since when evaluating this specific group of patients it is not possible to maintain a normal question-answer interaction. In these cases the expert must focus his attention on the detection of unconscious gestural reactions to the sound. The subjectivity involved in the interpretation of both aims may affect the classification, introduce imprecision, limit reproducibility and produce a high degree of inter- and intra-observer variability. The development of a systematic, objective, computerized method for the analysis and classification of response times and gestural reactions to sound is thus highly desirable, allowing for homogeneous diagnosis and relieving the experts from this tedious task. The purpose of this research is the design of an automatic system to assess the gestural reactions to sound and the patient's response times by analyzing video sequences recorded during the performance of the audiometric evaluations.
    On the one hand, the response times are measured by detecting the delivery of auditory stimuli and the patient's hand raising (which corresponds to a positive response). On the other hand, the gestural reactions to sound are identified by analyzing the eye movements using two different approaches. The automated assessments proposed save time for experts, improve precision and provide unbiased results that are not affected by subjective factors.
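    The response-time measurement described above pairs each detected auditory stimulus with the patient's next hand raise. A minimal sketch of that pairing, assuming sorted event timestamps in seconds and a hypothetical max_wait cutoff beyond which a stimulus counts as missed:

    ```python
    def response_times(stimuli, responses, max_wait=5.0):
        """Pair each auditory stimulus with the first hand-raise that follows
        it within max_wait seconds. Both lists are assumed sorted ascending.
        Returns one response time per stimulus; None marks a missed stimulus."""
        times = []
        ri = 0
        for s in stimuli:
            # Skip hand raises that happened before this stimulus.
            while ri < len(responses) and responses[ri] < s:
                ri += 1
            if ri < len(responses) and responses[ri] - s <= max_wait:
                times.append(responses[ri] - s)
                ri += 1  # each hand raise answers at most one stimulus
            else:
                times.append(None)
        return times

    stimuli = [1.0, 10.0, 20.0]
    responses = [1.8, 27.0]
    print(response_times(stimuli, responses))  # first answered in 0.8 s; rest missed
    ```

    Flagging abnormally slow responders then reduces to comparing the non-None times against a clinical threshold.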

    Gesture and Speech in Interaction - 4th edition (GESPIN 4)

    The fourth edition of Gesture and Speech in Interaction (GESPIN) was held in Nantes, France. With more than 40 papers, these proceedings show just what a flourishing field of enquiry gesture studies continues to be. The keynote speeches of the conference addressed three different aspects of multimodal interaction: gesture and grammar, gesture acquisition, and gesture and social interaction. In a talk entitled Qualities of event construal in speech and gesture: Aspect and tense, Alan Cienki presented an ongoing research project on narratives in French, German and Russian, a project that focuses especially on the verbal and gestural expression of grammatical tense and aspect in narratives in the three languages. Jean-Marc Colletta's talk, entitled Gesture and Language Development: towards a unified theoretical framework, described the joint acquisition and development of speech and early conventional and representational gestures. In Grammar, deixis, and multimodality between code-manifestation and code-integration or why Kendon's Continuum should be transformed into a gestural circle, Ellen Fricke proposed a revisited grammar of noun phrases that integrates gestures as part of the semiotic and typological codes of individual languages. From a pragmatic and cognitive perspective, Judith Holler explored the use of gaze and hand gestures as means of organizing turns at talk as well as establishing common ground in a presentation entitled On the pragmatics of multi-modal face-to-face communication: Gesture, speech and gaze in the coordination of mental states and social interaction.
    Among the talks and posters presented at the conference, the vast majority of topics related, quite naturally, to gesture and speech in interaction, understood both in terms of mapping of units in different semiotic modes and of the use of gesture and speech in social interaction. Several presentations explored the effects of impairments (such as diseases or the natural ageing process) on gesture and speech. The communicative relevance of gesture and speech and audience design in natural interactions, as well as in more controlled settings like television debates and reports, was another topic addressed during the conference. Some participants also presented research on first and second language learning, while others discussed the relationship between gesture and intonation. While most participants presented research on gesture and speech from an observer's perspective, be it in semiotics or pragmatics, some nevertheless focused on another important aspect: the cognitive processes involved in language production and perception. Last but not least, participants also presented talks and posters on the computational analysis of gestures, whether involving external devices (e.g. mocap, Kinect) or concerning the use of specially-designed computer software for the post-treatment of gestural data. Importantly, new links were made between semiotics and mocap data.

    Learners' perceptions of teachers' non-verbal behaviours in the foreign language class

    This study explores the meanings that participants in a British ELT setting give to teachers' non-verbal behaviours. It is a qualitative, descriptive study of the perceived functions that gestures and other non-verbal behaviours perform in the foreign language classroom, viewed mainly from the language learners' perspective. The thesis presents the stages of the research process, from the initial development of the research questions to the discussion of the research findings that summarise and discuss the participants' views. There are two distinct research phases presented in the thesis. The pilot study explores the perceptions of 18 experienced language learners of teachers' non-verbal behaviours. The data is collected in interviews based on videotaped extracts of classroom interaction, presented to the participants in two experimental conditions, with and without sound. The findings of this initial study justify the later change of method from the experimental design to a more exploratory framework. In the main study, 22 learners explain, in interviews based on stimulated recall, their perceptions of their teachers' verbal and non-verbal behaviours as occurring within the immediate classroom context. Finally, learners' views are complemented by 20 trainee teachers' written reports of classroom observation and their opinions expressed in focus group interviews. The data for the main study were thus collected through a combination of methods, ranging from direct classroom observations and videotaped recordings to semi-structured interviews with language learners. The research findings indicate that participants generally believe that gestures and other non-verbal behaviours play a key role in the language learning and teaching process. Learners identify three types of functions that non-verbal behaviours play in classroom interaction: (i) cognitive, i.e. non-verbal behaviours which work as enhancers of the learning processes; (ii) emotional, i.e. non-verbal behaviours that function as reliable communicative devices of teachers' emotions and attitudes; and (iii) organisational, i.e. non-verbal behaviours which serve as tools of classroom management and control. The findings suggest that learners interpret teachers' non-verbal behaviours in a functional manner and use these messages and cues in their learning and social interaction with the teacher. The trainee teachers value in a similar manner the roles that non-verbal behaviours play in language teaching and learning. However, they seem to prioritise the cognitive and managerial functions of teachers' non-verbal behaviours over the emotional ones and do not consider the latter as important as the learners do. This study is original in relation to previous studies of language classroom interaction in that it:
    • describes the kinds of teachers' behaviours which all teachers and learners are familiar with, but which have seldom been foregrounded in classroom-based research;
    • unlike previous studies of non-verbal behaviour, investigates the perceiver's view of the others' non-verbal behaviour rather than its production;
    • documents these processes of perception through an innovative methodology of data collection and analysis;
    • explores the teachers' non-verbal behaviours as perceived by the learners themselves, suggesting that their viewpoint can be one window on the reality of language classrooms;
    • provides explanations and functional interpretations for the many spontaneous and apparently unimportant actions that teachers use on a routine basis;
    • identifies a new area which needs consideration in any future research and pedagogy of language teaching and learning.

    Exploring the Affective Loop

    Research in psychology and neurology shows that both body and mind are involved when experiencing emotions (Damasio 1994, Davidson et al. 2003). People are also very physical when they try to communicate their emotions. Somewhere in between being consciously and unconsciously aware of it ourselves, we produce both verbal and physical signs to make other people understand how we feel. Simultaneously, this production of signs involves us in a stronger personal experience of the emotions we express. Emotions are also communicated in the digital world, but there is little focus on users' personal as well as physical experience of emotions in the available digital media. In order to explore whether and how we can expand existing media, we have designed, implemented and evaluated /eMoto/, a mobile service for sending affective messages to others. With eMoto, we explicitly aim to address both cognitive and physical experiences of human emotions. Through combining affective gestures for input with affective expressions that make use of colors, shapes and animations for the background of messages, the interaction "pulls" the user into an /affective loop/. In this thesis we define what we mean by affective loop and present a user-centered design approach expressed through four design principles inspired by previous work within Human Computer Interaction (HCI) but adjusted to our purposes: /embodiment/ (Dourish 2001) as a means to address how people communicate emotions in real life; /flow/ (Csikszentmihalyi 1990) to reach a state of involvement that goes further than the current context; /ambiguity/ of the designed expressions (Gaver et al. 2003) to allow for open-ended interpretation by the end-users instead of simplistic, one-emotion one-expression pairs; and /natural but designed expressions/ to address people's natural couplings between cognitively and physically experienced emotions.
    We also present results from an end-user study of eMoto indicating that subjects got both physically and emotionally involved in the interaction and that the designed "openness" and ambiguity of the expressions was appreciated and understood by our subjects. Through the user study, we identified four potential design problems that have to be tackled in order to achieve an affective loop effect: the extent to which users /feel in control/ of the interaction; /harmony and coherence/ between cognitive and physical expressions; /timing/ of expressions and feedback in a communicational setting; and effects of users' /personality/ on their emotional expressions and experiences of the interaction.

    Increasing presence in first-person virtual environments through auditory interfaces: an analytical approach to adaptive sound and music [Incrementar la presencia en entornos virtuales en primera persona a través de interfaces auditivas: un acercamiento analítico al sonido y la música adaptativos]

    Thesis of the Universidad Complutense de Madrid, Facultad de Informática, defended on 25-11-2019.
    The popularisation of virtual reality devices has brought with it an increased need for telepresence and player immersion in video games. These goals are often pursued through more realistic computer graphics and sound; however, invasive graphical user interfaces are still present in industry-standard products for VR, even though previous research has advised against them in order to reach better results in immersion. Non-visual, multimodal communication channels are explored throughout this thesis as a means of reducing the amount of graphical elements needed in head-up displays while increasing telepresence. Thus, the main goals of this research are to find the optimal channels that allow for semantic communication without resorting to visual interfaces, while reducing the general number of extra-diegetic elements in a video game, and to develop a total of six software applications in order to validate the obtained knowledge in real-life scenarios. The central piece of software produced as a result of this process is called LitSens, and consists of an adaptive music generator which takes human emotions as inputs...
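    An adaptive music generator of the kind described takes an emotional state as input and steers musical parameters from it. The sketch below uses a valence/arousal input and maps it to tempo and mode; the function name, parameter ranges and mapping are illustrative assumptions, not LitSens's actual design:

    ```python
    def music_parameters(valence, arousal):
        """Map an emotion in valence/arousal space (each in [-1, 1]) to
        simple musical controls: a tempo in BPM and a major/minor mode."""
        if not (-1 <= valence <= 1 and -1 <= arousal <= 1):
            raise ValueError("valence and arousal must lie in [-1, 1]")
        # Higher arousal -> faster tempo (60-120 BPM, an arbitrary range).
        tempo = 60 + 60 * (arousal + 1) / 2
        # Positive valence -> major mode, negative -> minor.
        mode = "major" if valence >= 0 else "minor"
        return {"tempo_bpm": round(tempo), "mode": mode}

    print(music_parameters(0.5, 1.0))  # {'tempo_bpm': 120, 'mode': 'major'}
    ```

    A real generator would drive many more parameters (instrumentation, harmony, dynamics) and smooth transitions over time, but the input-to-parameter mapping is the core idea.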

    The effects of nonverbal behaviors exhibited by multiple conductors on the timbre, intonation, and perceptions of three university choirs, and assessed relationships between time spent in selected conductor behaviors and analysis of the choirs' performances

    This investigation examined the effects of aggregate nonverbal behaviors exhibited by 10 videotaped conductors on the choral sound and perceptions of 3 university choirs (N = 61 choristers) as they sang from memory the same a cappella motet. It then assessed relationships between time spent in selected nonverbal conducting behaviors and the choirs' sung performances and perceptions. Examined nonverbal conductor behaviors were: (a) height of vertical gestural plane; (b) width of lateral gestural plane; (c) hand shape; and (d) emotional face expression. Dependent measures included Long Term Average Spectra (LTAS) data, pitch analyses, and singer questionnaires. Among primary findings: (a) aggregate singer ratings yielded significant differences among the 10 conductors with respect to perceived gestural clarity and singing efficiency; (b) each of the 3 choirs responded similarly in timbre and pitch to the 10, counter-balanced conductor videos presented; (c) significantly strong, positive correlations between LTAS and pitch results suggested that those conductors whose nonverbal behaviors evoked more spectral energy in the choirs' sound tended also to elicit more in tune singing; (d) the 10 conductors exhibited significantly different amounts of aggregate time spent in the gestural planes and hand shapes analyzed; (e) above shoulder vertical gestures related significantly to less timbral energy, while gestures below shoulder level related significantly to increased timbral energy; (f) significantly strong, positive correlations between singer questionnaire responses and both pitch and LTAS data suggested that the choirs' timbre and pitch tended to vary according to whether or not the singers perceived a conductor's nonverbal communication as clear and whether or not they perceived they sang efficiently while following a particular conductor; (g) moderately strong, though not significant, associations between lateral gestures within the torso area and both pitch 
    (more in tune) and timbre (more spectral energy), and between lateral gestures beyond the torso area and both pitch (less in tune) and timbre (less spectral energy); and (h) weak, non-significant correlations between aggregate time spent in various hand postures and the choirs' timbre and intonation, and between identified emotional face expressions and analyses of the choirs' sound.
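    The Long Term Average Spectra (LTAS) measure used as a dependent variable above averages short-time magnitude spectra over an entire performance, giving one spectral-energy profile per recording. A minimal numpy sketch of that idea (the frame length and hop size here are arbitrary choices, not the study's analysis settings):

    ```python
    import numpy as np

    def ltas(signal, fs, frame_len=1024, hop=512):
        """Long-Term Average Spectrum: the average magnitude spectrum over
        overlapping Hann-windowed frames of a recording."""
        window = np.hanning(frame_len)
        frames = [signal[i:i + frame_len] * window
                  for i in range(0, len(signal) - frame_len + 1, hop)]
        spectra = np.abs(np.fft.rfft(frames, axis=1))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
        return freqs, spectra.mean(axis=0)

    # A 440 Hz test tone: the LTAS should peak within one FFT bin of 440 Hz.
    fs = 8000
    t = np.arange(fs) / fs
    freqs, spectrum = ltas(np.sin(2 * np.pi * 440 * t), fs)
    print(freqs[np.argmax(spectrum)])
    ```

    Comparing choirs' LTAS curves then amounts to comparing how energy is distributed across these frequency bins, e.g. in the "singer's formant" region.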

    Towards affective computing that works for everyone

    Missing diversity, equity, and inclusion elements in affective computing datasets directly affect the accuracy and fairness of emotion recognition algorithms across different groups. A literature review reveals how affective computing systems may work differently for different groups due to, for instance, mental health conditions impacting facial expressions and speech, or age-related changes in facial appearance and health. Our work analyzes existing affective computing datasets and highlights a disconcerting lack of diversity in current affective computing datasets regarding race, sex/gender, age, and (mental) health representation. By emphasizing the need for more inclusive sampling strategies and standardized documentation of demographic factors in datasets, this paper provides recommendations and calls for greater attention to inclusivity and consideration of societal consequences in affective computing research to promote ethical and accurate outcomes in this emerging field. Comment: 8 pages, 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII).
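    At its core, the dataset analysis described above amounts to tallying how each demographic attribute is represented across a dataset's records and looking for gaps. A minimal sketch with hypothetical annotation records (the field names and values are assumptions for illustration):

    ```python
    from collections import Counter

    def demographic_coverage(records, fields=("sex", "age_group", "race")):
        """Tally how each demographic field is represented in a dataset,
        exposing the gaps that can bias an emotion-recognition model."""
        return {field: Counter(r.get(field, "unreported") for r in records)
                for field in fields}

    # Hypothetical affective-computing dataset annotations.
    records = [
        {"sex": "F", "age_group": "18-29", "race": "White"},
        {"sex": "M", "age_group": "18-29", "race": "White"},
        {"sex": "F", "age_group": "18-29"},  # race left unreported
    ]
    coverage = demographic_coverage(records)
    print(coverage["race"])  # Counter({'White': 2, 'unreported': 1})
    ```

    Standardized documentation of exactly these counts (as the paper recommends) makes such under-representation visible before a model is ever trained.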

    Design and semantics of form and movement (DeSForM 2006)

    Design and Semantics of Form and Movement (DeSForM) grew from applied research exploring emerging design methods and practices to support new generation product and interface design. The products and interfaces are concerned with the context of ubiquitous computing and ambient technologies and the need for greater empathy in the pre-programmed behaviour of the ‘machines’ that populate our lives. Such explorative research in the CfDR has been led by Young, supported by Kyffin, Visiting Professor from Philips Design, and sponsored by Philips Design over a period of four years (research funding £87k). DeSForM1 was the first of a series of three conferences that enable the presentation and debate of international work within this field:
    • 1st European conference on Design and Semantics of Form and Movement (DeSForM1), Baltic, Gateshead, 2005, Feijs L., Kyffin S. & Young R.A. eds.
    • 2nd European conference on Design and Semantics of Form and Movement (DeSForM2), Evoluon, Eindhoven, 2006, Feijs L., Kyffin S. & Young R.A. eds.
    • 3rd European conference on Design and Semantics of Form and Movement (DeSForM3), New Design School Building, Newcastle, 2007, Feijs L., Kyffin S. & Young R.A. eds.
    Philips sponsorship of practice-based enquiry led to research by three teams of research students over three years and ongoing sponsorship of research through the Northumbria University Design and Innovation Laboratory (nuDIL). Young has been invited onto the steering panel of the UK Thinking Digital Conference concerning the latest developments in digital and media technologies. Informed by this research is the work of PhD student Yukie Nakano, who examines new technologies in relation to eco-design textiles.