1,151 research outputs found

    Sensing, interpreting, and anticipating human social behaviour in the real world

    Get PDF
    Low-level nonverbal social signals like glances, utterances, facial expressions, and body language are central to human communication and have been shown to be connected to important high-level constructs such as emotions, turn-taking, rapport, or leadership. A prerequisite for creating social machines that can support humans in, e.g., education, psychotherapy, or human resources is the ability to automatically sense, interpret, and anticipate human nonverbal behaviour. While promising results have been shown in controlled settings, automatically analysing unconstrained situations, e.g. in daily-life settings, remains challenging. Furthermore, the anticipation of nonverbal behaviour in social situations is still largely unexplored. The goal of this thesis is to move closer to the vision of social machines in the real world. It makes fundamental contributions along the three dimensions of sensing, interpreting, and anticipating nonverbal behaviour in social interactions. First, robust recognition of low-level nonverbal behaviour lays the groundwork for all further analysis steps. Advancing human visual behaviour sensing is especially relevant, as the current state of the art is still not satisfactory in many daily-life situations. While many social interactions take place in groups, current methods for unsupervised eye contact detection can only handle dyadic interactions. We propose a novel unsupervised method for multi-person eye contact detection that exploits the connection between gaze and speaking turns. Furthermore, we make use of mobile device engagement to address the calibration drift that occurs in daily-life usage of mobile eye trackers. Second, we improve the interpretation of social signals in terms of higher-level social behaviours. In particular, we propose the first dataset and method for emotion recognition from the bodily expressions of freely moving, unaugmented dyads. Furthermore, we are the first to study low rapport detection in group interactions, and we investigate a cross-dataset evaluation setting for the emergent leadership detection task. Third, human visual behaviour is special because it functions as a social signal and also determines what a person is seeing at a given moment in time. Being able to anticipate human gaze opens up the possibility for machines to share attention with humans more seamlessly, or to intervene in a timely manner if humans are about to overlook important aspects of the environment. We are the first to propose methods for the anticipation of eye contact in dyadic conversations, as well as in the context of mobile device interactions during daily life, thereby paving the way for interfaces that can proactively intervene and support interacting humans.
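
A minimal, hypothetical sketch of the idea behind the unsupervised multi-person eye contact detection described above: since speakers tend to attract gaze, a listener's gaze-direction cluster that co-occurs most often with a given person's speaking turns can be taken as "looking at that person". The function, the 2-D gaze features, and the toy data are illustrative assumptions, not the thesis's actual pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans

def label_eye_contact(gaze_feats, speaking, n_targets):
    """gaze_feats: (T, 2) gaze directions of one listener per frame.
    speaking: (T,) index of the active speaker per frame (-1 = nobody).
    Returns a (T,) boolean array: frames where the listener looks at the speaker."""
    clusters = KMeans(n_clusters=n_targets, n_init=10).fit_predict(gaze_feats)
    # Map each gaze cluster to the person whose speaking turns it co-occurs with most.
    cluster_to_person = {}
    for c in range(n_targets):
        turns = speaking[(clusters == c) & (speaking >= 0)]
        cluster_to_person[c] = np.bincount(turns).argmax() if len(turns) else -1
    looked_at = np.array([cluster_to_person[c] for c in clusters])
    return (looked_at == speaking) & (speaking >= 0)

# Toy usage: gaze loosely follows whoever is speaking.
T, people = 600, 3
rng = np.random.default_rng(0)
speaking = rng.integers(0, people, T)
gaze = rng.normal(scale=0.3, size=(T, 2)) + speaking[:, None]
print(label_eye_contact(gaze, speaking, people).mean())
```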

    Deep Emotion Recognition in Textual Conversations: A Survey

    Full text link
    While Emotion Recognition in Conversations (ERC) has seen tremendous advancement in the last few years, new applications and implementation scenarios present novel challenges and opportunities. These range from leveraging conversational context and modelling speaker and emotion dynamics, to interpreting common-sense expressions, informal language, and sarcasm; addressing the challenges of real-time ERC; recognizing emotion causes; handling different taxonomies across datasets; and supporting multilingual ERC and interpretability. This survey starts by introducing ERC and elaborating on the challenges and opportunities pertaining to this task. It proceeds with a description of emotion taxonomies and a variety of ERC benchmark datasets employing them. This is followed by descriptions of the most prominent works in ERC, with explanations of the deep learning architectures employed. It then provides advisable ERC practices towards better frameworks, elaborating on methods to deal with subjectivity in annotations and modelling, and with the typically unbalanced ERC datasets. Finally, it presents systematic review tables comparing several works with regard to the methods used and their performance. The survey highlights the advantage of leveraging techniques to address unbalanced data, the exploration of mixed emotions, and the benefits of incorporating annotation subjectivity in the learning phase.
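
One common remedy for the unbalanced ERC datasets the survey discusses is inverse-frequency class weighting in the loss. The sketch below is a generic illustration of that technique, not code from the survey; the label distribution is made up.

```python
import torch
from collections import Counter

# Made-up label distribution: neutral dominates, as is typical in ERC corpora.
labels = ["neutral"] * 800 + ["joy"] * 120 + ["anger"] * 50 + ["sadness"] * 30
classes = sorted(set(labels))
counts = Counter(labels)
# Inverse-frequency weights: rare emotions contribute more to the loss.
weights = torch.tensor([len(labels) / (len(classes) * counts[c]) for c in classes])
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(4, len(classes))   # stand-in model outputs
target = torch.tensor([0, 1, 2, 3])     # one sample per class
print(loss_fn(logits, target))
```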

    Emotion-aware voice interfaces based on speech signal processing

    Get PDF
    Voice interfaces (VIs) will become increasingly widespread in daily life as AI techniques progress. VIs can be incorporated into smart devices like smartphones and integrated into cars, home automation systems, computer operating systems, and home appliances, among other things. Current speech interfaces, however, are unaware of users’ emotional states and hence cannot support natural communication. To overcome this limitation, it is necessary to implement emotional awareness in future VIs. This thesis focuses on how speech signal processing (SSP) and speech emotion recognition (SER) can enable VIs to gain emotional awareness. Following an explanation of what emotion is and how neural networks are implemented, this thesis presents the results of several user studies and surveys. Emotions are complicated, and they are typically characterized using categorical and dimensional models. They can be expressed verbally or nonverbally. Although existing voice interfaces are unaware of users’ emotional states and cannot support natural conversations, future VIs could perceive users’ emotions from speech based on SSP. One section of this thesis, based on SSP, investigates mental restorative effects on humans and how they can be measured from speech signals. SSP is less intrusive and more accessible than traditional measures such as attention scales or response tests, and it can provide a reliable assessment of attention and mental restoration. SSP can be implemented in future VIs and utilized in future HCI user research. The thesis then moves on to present a novel attention neural network based on sparse correlation features. Its accuracy in detecting emotions in continuous speech was demonstrated in a user study utilizing recordings from a real classroom, with promising results. In SER research, it is unknown whether existing emotion detection methods detect acted emotions or the genuine emotions of the speaker. Another section of this thesis is therefore concerned with humans’ ability to act out emotions. In a user study, participants were instructed to imitate five fundamental emotions. The results revealed that they struggled with this task; nevertheless, certain emotions were easier to replicate than others. A further research question is how VIs should respond to users’ emotions once SER techniques are implemented in VIs and can recognize users’ emotions. The thesis includes research on ways of dealing with users’ emotions. In a user study, users were instructed to make sad, angry, and terrified VI avatars happy and were asked whether they would like to be treated the same way if the situation were reversed. According to the results, the majority of participants tended to respond to these unpleasant emotions with a neutral emotion, but there were differences between genders in emotion selection. For a human-centered design approach, it is important to understand the users’ preferences for future VIs. A questionnaire-based survey on users’ attitudes towards and preferences for emotion-aware VIs was conducted in three distinct cultures. It revealed almost no gender differences. Cluster analysis found three fundamental user types that exist in all cultures: Enthusiasts, Pragmatists, and Sceptics. As a result, future VI development should consider these diverse sorts of users.
In conclusion, future VI systems should be designed for various sorts of users and should be able to detect users’ disguised or actual emotions using SER and SSP technologies. Furthermore, many other applications, such as restorative effect assessments, can be included in VI systems.
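
A hypothetical sketch of the kind of SSP front end such a thesis builds on: extracting fixed-length MFCC statistics from a speech clip as input features for an SER classifier. The synthetic signal and feature choices are illustrative assumptions, not the thesis's pipeline.

```python
import numpy as np
import librosa

sr = 16000
y = np.random.randn(sr * 3).astype(np.float32)       # stand-in for a 3 s recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape (13, frames)
# Summarize frame-level MFCCs into one fixed-length vector per clip.
feats = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
print(feats.shape)  # (26,) -> input to any downstream emotion classifier
```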

    Research Topic: Typical and Atypical Processing of Gaze

    Get PDF

    Attention Restraint, Working Memory Capacity, and Mind Wandering: Do Emotional Valence or Intentionality Matter?

    Get PDF
    Attention restraint appears to mediate the relationship between working memory capacity (WMC) and mind wandering (Kane et al., 2016). Prior work has identified two dimensions of mind wandering: emotional valence and intentionality. However, less is known about how WMC and attention restraint correlate with these dimensions. The current study examined the relationship between WMC, attention restraint, and mind wandering by emotional valence and intentionality. A confirmatory factor analysis demonstrated that WMC and attention restraint were strongly correlated, but only attention restraint was related to overall mind wandering, consistent with prior findings. However, when examining the emotional valence of mind wandering, attention restraint and WMC were related to negatively and positively valenced, but not neutral, mind wandering. Attention restraint was also related to intentional but not unintentional mind wandering. These results suggest that WMC and attention restraint predict some, but not all, types of mind wandering.
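
The study's actual modelling used confirmatory factor analysis; the simpler regression sketch below, on simulated scores, only illustrates the headline pattern: WMC and attention restraint correlate strongly, yet once both are entered as predictors, only attention restraint relates to overall mind wandering. All numbers are invented.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
restraint = rng.normal(size=n)
wmc = 0.7 * restraint + rng.normal(scale=0.7, size=n)  # strongly correlated with restraint
mw = -0.5 * restraint + rng.normal(size=n)             # mind wandering driven by restraint only

X = sm.add_constant(np.column_stack([wmc, restraint]))
# Expect: restraint coefficient clearly negative, WMC coefficient near zero.
print(sm.OLS(mw, X).fit().summary2().tables[1])
```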

    Cognitive Abilities in Hearing Loss: Perceived and Performance Abilities of Adults Related to Attention, Memory, and Social Cognition

    Get PDF
    Hearing loss is the most common sensory deficit noted in aging adults. It is commonly known to reduce an individual’s ability to detect, identify, and localize sounds and speech, and to cause issues in communication. However, hearing loss has other, less commonly discussed impacts beyond the auditory system. The literature suggests a correlation between hearing loss and cognition in aging adults. Like hearing, the domains of cognition undergo performance and functional changes across the life span, and in aging adults such changes are also suggested to be associated with hearing loss. This study aimed to add to the corpus of literature on the relationship between hearing loss and cognition, specifically memory, attention, executive functioning, and social cognition, in adults with and without hearing loss. The purpose of this multi-methods study was to describe whether group differences existed between the perceived and performance-related cognitive abilities of adults with and without hearing loss. The study focused on twenty-eight adults between the ages of 50 and 69 years; fourteen had normal hearing, while fourteen had hearing loss in the mild to moderate sensorineural range. Based on age and hearing status, adults were separated into four distinct groups: normal hearing aged 50-59 years, hearing loss aged 50-59 years, normal hearing aged 60-69 years, and hearing loss aged 60-69 years. Performance-related cognitive abilities were assessed through five different cognitive assessments: the Wechsler Memory Scale, the Wechsler Adult Intelligence Scale, Faux Pas stories, Advanced Clinical Solutions, and the Bluegrass Short-Term Memory task. Perceived abilities were addressed through structured, open-ended questions centred on the impacts of hearing and hearing loss and an individual’s communication abilities. The first aim examined how adults described the impacts of hearing loss and their communicative abilities. Individual responses highlighted what impacts adults thought hearing loss had beyond communication and on their communicative abilities. The majority of adults expressed that they did not have any communication errors and could accurately express their own thoughts, viewpoints, and emotions and understand those of others. The second aim determined that group differences were present on memory subtests from the Wechsler Memory Scale and on a subtest from the Wechsler Adult Intelligence Scale. While there was no significant difference between responses on the Bluegrass Short-Term Memory task, there was a group interaction in left frontal theta oscillations (related to memory and decision-making) and right frontal beta frequency (related to attention) during resting-state, eyes-open EEG recording. The final aim determined that there were group differences on the social cognitive assessment. Auditory and cognitive processing have previously been viewed as separate and distinct factors that are crucial for communication, yet the growing body of literature suggests that these elements are intimately coupled. This research yielded evidence that even a mild hearing loss in adults between the ages of 50 and 69 is associated with changes in cognitive functioning, specifically in memory, attention, and social cognition.
Singularly, the auditory system and the cognitive domains are each complex, yet they must be assessed as factors that have the potential to influence each other. The open-ended questions revealed that researchers and clinicians need to continue to address the wide-ranging impacts of hearing loss among adults. While adults did recognize impacts of hearing loss beyond communication, some participants also reported no thoughts on impacts beyond communication. This strongly suggests that adults need to be further educated about hearing loss as a critically prevalent public health matter.
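
A minimal sketch, not the study's pipeline, of how frontal theta (4-8 Hz) and beta (13-30 Hz) power are typically quantified from resting-state EEG: a Welch power spectral density estimate, then integration within each band. The single channel, sampling rate, and noise signal are stand-ins.

```python
import numpy as np
from scipy.signal import welch
from scipy.integrate import trapezoid

fs = 250                            # Hz, assumed sampling rate
eeg = np.random.randn(fs * 60)      # 60 s of one frontal channel (stand-in data)
freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)

def band_power(lo, hi):
    # Integrate the PSD over the band of interest.
    mask = (freqs >= lo) & (freqs < hi)
    return trapezoid(psd[mask], freqs[mask])

print("theta (4-8 Hz):", band_power(4, 8))
print("beta (13-30 Hz):", band_power(13, 30))
```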

    Coaching Imagery to Athletes with Aphantasia

    Get PDF
    We administered the Plymouth Sensory Imagery Questionnaire (Psi-Q), which tests multi-sensory imagery, to athletes (n=329) from 9 different sports to locate poor/aphantasic imagers (baseline scores <4.2/10), with the aim of subsequently enhancing imagery ability. The low imagery sample (n=27) was randomly split into two groups who received the intervention, Functional Imagery Training (FIT), either immediately or delayed by one month, at which point the delayed group was tested again on the Psi-Q. All participants were tested after FIT delivery and six months post intervention. The delayed group showed no significant change between baseline and the start of FIT delivery, but both groups’ imagery scores improved significantly (p=0.001) after the intervention, and the improvement was maintained six months post intervention. This indicates that imagery can be trained, even in those who identify as having aphantasia (although one participant did not improve on visual scores), and that improvements are maintained in poor imagers. Follow-up interviews (n=22) on sporting application revealed that the majority now use imagery daily on process goals. Recommendations are given for ways to assess and train imagery in an applied sport setting.
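
A hypothetical sketch of the screening and pre/post comparison: flag poor imagers by the <4.2/10 Psi-Q cutoff the study used, then test baseline against post-FIT scores with a paired test. The scores and effect size are simulated; the paper's exact statistics may differ.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(3)
baseline = rng.uniform(0, 10, size=329)      # Psi-Q means for the full sample
low_imagers = baseline[baseline < 4.2]       # screening cutoff from the study
# Simulated post-intervention gain for the low-imagery group.
post_fit = np.clip(low_imagers + rng.normal(2.5, 1.0, size=low_imagers.size), 0, 10)

t, p = ttest_rel(post_fit, low_imagers)
print(f"mean gain {np.mean(post_fit - low_imagers):.2f}, t={t:.2f}, p={p:.4f}")
```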

    MINDFULNESS, SELF-COMPASSION AND THREAT RELATED ATTENTIONAL BIAS: IMPLICATIONS FOR SOCIAL ANXIETY AND LONELINESS IN LATE ADOLESCENT COLLEGE STUDENTS.

    Get PDF
    The present study examined associations among mindfulness, self-compassion, social anxiety, loneliness and attention bias toward threat in a community sample of undergraduate university students (n = 176). Of interest was whether threat bias would mediate the independent effects of mindfulness and self-compassion on social anxiety and loneliness. Threat bias is characterized by the display of exaggerated attention toward threatening cues in the environment and has been repeatedly linked to the origin and continuation of anxious symptoms. Given the focus of contemplative practices on attentional awareness and control, this study examined whether mindfulness and self-compassion may contribute to decreased levels of anxiety and loneliness, in part, by influencing attentional mechanisms that may lower threat bias. The specificity of individual facets of mindfulness and self-compassion was also examined in relation to threat bias, social anxiety and loneliness. Overall, findings confirm previous studies, demonstrating strong predictive associations between mindfulness, self-compassion, social anxiety and loneliness in college students. Results also highlight the indirect effect of attention bias in the relationship between self-compassion, loneliness and social anxiety. The implications of these findings are discussed in relation to contemplative intervention programs to promote healthy social adjustment and relations in late adolescent college students.
Keywords: Attentional biases, Anxiety disorders, Dot probe, Mindfulness, Self-Compassion, Adolescence, College Students
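
A hedged sketch of testing an indirect (mediation) effect like the one reported, e.g. whether threat bias mediates the self-compassion to social anxiety path, via a bootstrap confidence interval on the a*b product. The data are simulated and the study's actual modelling may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 176
self_comp = rng.normal(size=n)
threat_bias = -0.4 * self_comp + rng.normal(size=n)                   # a path
anxiety = -0.3 * self_comp + 0.5 * threat_bias + rng.normal(size=n)   # b and c' paths

def indirect(idx):
    x, m, y = self_comp[idx], threat_bias[idx], anxiety[idx]
    a = np.polyfit(x, m, 1)[0]                    # slope of m ~ x
    X = np.column_stack([x, m, np.ones_like(x)])
    b = np.linalg.lstsq(X, y, rcond=None)[0][1]   # slope of y ~ m, controlling x
    return a * b

boots = [indirect(rng.integers(0, n, n)) for _ in range(2000)]
print("95% bootstrap CI for a*b:", np.percentile(boots, [2.5, 97.5]))
```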

    The role of phonology in visual word recognition: evidence from Chinese

    Get PDF
    The hypothesis of bidirectional coupling of orthography and phonology predicts that phonology plays a role in visual word recognition, as observed in the effects of feedforward and feedback spelling-to-sound consistency on lexical decision. However, because orthography and phonology are closely related in alphabetic languages (homophones in alphabetic languages are usually orthographically similar), it is difficult to exclude an influence of orthography on phonological effects in visual word recognition. Chinese contains many written homophones that are orthographically dissimilar, allowing a test of the claim that phonological effects can be independent of orthographic similarity. We report a study of visual word recognition in Chinese based on a mega-analysis of lexical decision performance with 500 characters. The results from multiple regression analyses, after controlling for orthographic frequency, stroke number, and radical frequency, showed main effects of feedforward and feedback consistency, as well as interactions between these variables and phonological frequency and number of homophones. Implications of these results for resonance models of visual word recognition are discussed.
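
A minimal sketch, with simulated data, of the mega-analysis style described: regress lexical-decision latency on the two consistency measures after entering orthographic frequency, stroke number, and radical frequency as controls. Variable names and effect sizes are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500  # characters, matching the study's item count
df = pd.DataFrame({
    "freq": rng.normal(size=n), "strokes": rng.normal(size=n),
    "radical_freq": rng.normal(size=n),
    "ff_consistency": rng.normal(size=n), "fb_consistency": rng.normal(size=n),
})
# Simulated RTs: faster for frequent and consistent characters.
df["rt"] = (600 - 20 * df.freq - 8 * df.ff_consistency
            - 5 * df.fb_consistency + rng.normal(scale=30, size=n))

model = smf.ols(
    "rt ~ freq + strokes + radical_freq + ff_consistency + fb_consistency",
    data=df).fit()
print(model.params[["ff_consistency", "fb_consistency"]])
```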