
    ACII 2009: Affective Computing and Intelligent Interaction. Proceedings of the Doctoral Consortium 2009


    Real-time generation and adaptation of social companion robot behaviors

    Social robots will be part of our future homes. They will assist us in everyday tasks, entertain us, and provide helpful advice. However, the technology still faces challenges that must be overcome to equip the machine with social competencies and make it a socially intelligent and accepted housemate. An essential skill of every social robot is verbal and non-verbal communication. In contrast to voice assistants, smartphones, and smart home technology, which are already part of many people's lives today, social robots have an embodiment that raises expectations towards the machine. Their anthropomorphic or zoomorphic appearance suggests they can communicate naturally with speech, gestures, or facial expressions and understand corresponding human behaviors. In addition, robots need to consider individual users' preferences: everybody is shaped by their culture, social norms, and life experiences, resulting in different expectations towards communication with a robot. However, robots do not have human intuition; they must be equipped with corresponding algorithmic solutions to these problems. This thesis investigates the use of reinforcement learning to adapt the robot's verbal and non-verbal communication to the user's needs and preferences. Such non-functional adaptation of the robot's behaviors primarily aims to improve the user experience and the robot's perceived social intelligence. The literature has not yet provided a holistic view of the overall challenge: real-time adaptation requires control over the robot's multimodal behavior generation, an understanding of human feedback, and an algorithmic basis for machine learning. This thesis therefore develops a conceptual framework for designing real-time, non-functional social robot behavior adaptation with reinforcement learning. It provides a higher-level view from the system designer's perspective and guidance from start to finish, illustrating the process of modeling, simulating, and evaluating such adaptation processes. Specifically, it guides the integration of human feedback and social signals to equip the machine with social awareness. The conceptual framework is put into practice for several use cases, resulting in technical proofs of concept and research prototypes, which are evaluated in laboratory and in-situ studies. These approaches address typical activities in domestic environments, focusing on the robot's expression of personality, persona, politeness, and humor. Within this scope, the robot adapts its spoken utterances, prosody, and animations based on explicit or implicit human feedback.
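
    To make the adaptation loop concrete, here is a minimal sketch in the spirit of the thesis's approach, not its actual system: an epsilon-greedy bandit that chooses among discrete behavior variants (politeness or humor styles, say) and updates its value estimates from scalar rewards derived from explicit or implicit user feedback. All names and the simulated reward scheme are illustrative assumptions.

    import random
    from collections import defaultdict

    class BehaviorAdapter:
        """Epsilon-greedy bandit over discrete behavior variants.

        Each arm is one way of rendering the robot's next utterance
        (e.g. a politeness level or humor style); the reward is a scalar
        derived from the user's explicit or implicit feedback.
        """

        def __init__(self, variants, epsilon=0.1):
            self.variants = variants
            self.epsilon = epsilon
            self.value = defaultdict(float)  # running mean reward per variant
            self.count = defaultdict(int)

        def select(self):
            # Explore a random variant occasionally; otherwise exploit the
            # variant with the highest estimated reward so far.
            if random.random() < self.epsilon:
                return random.choice(self.variants)
            return max(self.variants, key=lambda v: self.value[v])

        def update(self, variant, reward):
            # Incremental mean update: V <- V + (r - V) / n
            self.count[variant] += 1
            self.value[variant] += (reward - self.value[variant]) / self.count[variant]

    if __name__ == "__main__":
        adapter = BehaviorAdapter(["formal", "casual", "humorous"])
        for _ in range(200):
            variant = adapter.select()
            # Stand-in for real feedback: pretend this user prefers humor.
            reward = 1.0 if variant == "humorous" else random.choice([0.0, 0.5])
            adapter.update(variant, reward)
        print(sorted(adapter.value.items(), key=lambda kv: -kv[1]))

    In a deployed system the reward would instead be derived in real time from detected social signals such as a smile, laughter, or a verbal correction.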

    Detection of affective and attentional markers in elderly people interacting with a robot

    This thesis focuses on the audio-visual detection of emotional (laugh and smile) and attentional markers in elderly people during social interaction with a robot. Understanding and modeling the behavior of very old people in the presence of a robot requires relevant data, so I participated in the collection of a corpus of elderly participants, in particular the recording of visual data. The robot was controlled with a Wizard-of-Oz setup, and several everyday conversation scenarios were used to encourage people to interact with it. These scenarios were developed as part of the ROMEO2 project with the Approche association. We first describe the collected corpus, which contains 27 subjects with an average age of 85 for a total of 9 hours of recordings, together with its annotations, and discuss the results obtained from analyzing the annotations and two questionnaires. My research then focuses on attention detection and on laughter and smile detection. The motivation for attention detection is to detect when the subject is not addressing the robot and to adjust the robot's behavior to the situation. After considering the difficulties specific to elderly people and the analytical results obtained from the corpus annotations, we focus on head rotation as the visual cue, and on energy and voice quality as the audio cues, for detecting the addressee of speech. Laughter and smile detection can be used to study the speaker's profile and emotions. My work concentrates on laughter and smile detection in the visual modality and on the fusion of audio-visual information to improve the performance of the automatic system. Spontaneous expressions differ from posed or acted expressions in both appearance and timing. Designing a system that works on realistic data from elderly people is even more difficult because of several challenges that must be considered, such as the lack of data for training the statistical models, the influence of facial texture and smiling style on visual detection, the influence of voice quality on auditory detection, the variety of reaction times, the level of listening comprehension, loss of sight in elderly people, etc. The head-turning, attention, and laughter-and-smile detection systems are evaluated on the ROMEO2 corpus, and the visual detectors are additionally evaluated on the standard Pointing04 and GENKI-4K corpora for comparison with state-of-the-art methods. We also found a negative correlation between laughter and smile detection performance and the number of laughter and smile events, for both the visual and the audio-visual systems. This can be explained by the fact that elderly people who are more interested in the experiment laugh more often and therefore produce a greater variety of poses. This variety of poses, and the lack of corresponding training data, makes laughter and smile recognition difficult for our statistical systems. The experiments show that head turning can be used effectively to detect the loss of the subject's attention during interaction with the robot. For attention detection, we show the potential of a cascade method that uses both modalities in a complementary manner; this method gives better results than the audio system alone. For laughter and smile detection, under the same leave-one-out protocol, the fusion of the two monomodal systems significantly improves performance in the segmental evaluation.
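
    As a concrete illustration of the score-level fusion of the two monomodal detectors, the following minimal sketch combines per-segment probabilities from the audio and visual classifiers. The weights and threshold are assumptions for illustration, not the tuned values from the thesis.

    def fuse_scores(p_audio, p_visual, w_audio=0.4, w_visual=0.6, threshold=0.5):
        """Late (score-level) fusion of two monomodal laugh/smile detectors.

        p_audio and p_visual are per-segment probabilities from the audio and
        visual classifiers; in practice the weights would be tuned on held-out
        data, e.g. under the same leave-one-subject-out protocol.
        """
        fused = w_audio * p_audio + w_visual * p_visual
        return fused >= threshold, fused

    if __name__ == "__main__":
        # A segment where the face is partly occluded but the audio is clear:
        decision, score = fuse_scores(p_audio=0.82, p_visual=0.35)
        print(decision, round(score, 3))  # True 0.538

    The point of the fusion is visible in the example: neither modality alone clears the threshold in every situation, but their weighted combination can.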

    Multipar-T: Multiparty-Transformer for Capturing Contingent Behaviors in Group Conversations

    As we move closer to real-world AI systems, AI agents must be able to deal with multiparty (group) conversations. Recognizing and interpreting multiparty behaviors is challenging, as the system must recognize individual behavioral cues, deal with the complexity of multiple streams of data from multiple people, and recognize the subtle contingent social exchanges that take place amongst group members. To tackle this challenge, we propose the Multiparty-Transformer (Multipar-T), a transformer model for multiparty behavior modeling. The core component of our proposed approach is Crossperson Attention, which is specifically designed to detect contingent behavior between pairs of people. We verify the effectiveness of Multipar-T on a publicly available video-based group engagement detection benchmark, where it outperforms state-of-the-art approaches in average F-1 score by 5.2% and in individual class F-1 scores by up to 10.0%. Through qualitative analysis, we show that our Crossperson Attention module is able to discover contingent behavior.
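
    The abstract names Crossperson Attention as the core module. The following PyTorch sketch illustrates the general idea of pairwise cross-person attention, with queries drawn from one person's behavior stream and keys and values from another's, so the output encodes how person i's behavior attends to person j's. The dimensions and layer layout are assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class CrossPersonAttention(nn.Module):
        """Sketch of pairwise cross-person attention, assuming each person's
        behavior is a sequence of per-frame feature vectors."""

        def __init__(self, dim=128, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, person_i, person_j):
            # person_i, person_j: (batch, time, dim) behavior features.
            # Queries come from person i; keys/values come from person j.
            attended, weights = self.attn(query=person_i, key=person_j, value=person_j)
            # Residual connection keeps person i's own stream in the output.
            return self.norm(person_i + attended), weights

    if __name__ == "__main__":
        x_i = torch.randn(2, 50, 128)  # person i: 50 frames of 128-d features
        x_j = torch.randn(2, 50, 128)  # person j
        layer = CrossPersonAttention()
        out, w = layer(x_i, x_j)
        print(out.shape, w.shape)  # (2, 50, 128) and (2, 50, 50)

    The (time x time) attention weights are what make contingency inspectable: a strong weight at (t1, t2) suggests person i's behavior at t1 is attending to person j's behavior at t2.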

    Artificial Emotional Intelligence in Socially Assistive Robots

    Artificial Emotional Intelligence (AEI) bridges the gap between humans and machines by evaluating the emotional state of human users, adapting the machine's behavior to them, and hence giving an appropriate response to those emotions. AEI is part of a larger field of study called affective computing, which integrates artificial intelligence, psychology, robotics, biometrics, and many other disciplines. The main component in AEI and affective computing is emotion: how we can utilize emotion to create a more natural and productive relationship between humans and machines. An area in which AEI can be particularly beneficial is building machines and robots for healthcare applications. Socially Assistive Robotics (SAR) is a subfield of robotics that aims to develop robots that provide companionship and assist people with social interaction. For example, residents living in housing designed for older adults often feel lonely, isolated, and depressed; therefore, social interaction and mental stimulation are critical to improving their well-being. Socially assistive robots are designed to address these needs by monitoring and improving the quality of life of patients with depression and dementia. Nevertheless, developing robots with AEI that understand users' emotions and can reply to them naturally and effectively is in its infancy, and much more research needs to be carried out in this field. This dissertation presents the results of my work in developing a social robot, called Ryan, equipped with AEI for effective and engaging dialogue with older adults with depression and dementia. Over the course of this research there have been three versions of Ryan, each created using the lessons learned from the studies presented in this dissertation. First, two human-robot interaction studies were conducted showing the validity of using a rear-projected robot to convey emotion and intent. Then the feasibility of using Ryan to interact with older adults was studied, investigating possible improvements in their quality of life. Ryan the Companionbot used in this project is a rear-projected, lifelike conversational robot equipped with many features such as games, music, video, reminders, and general conversation; it engages users in cognitive games and reminiscence activities. A pilot study was conducted with six older adults with early-stage dementia and/or depression living in a senior living facility. Each individual had 24/7 access to Ryan in their room for a period of 4-6 weeks. Observations of these individuals, interviews with them and their caregivers, and analysis of their interactions during this period revealed that they established rapport with the robot and greatly valued and enjoyed having a companionbot in their room. A multimodal emotion recognition algorithm and a multimodal emotion expression system were developed and integrated into Ryan. To engage the subjects in a more empathic interaction with Ryan, a corpus of dialogues on different topics was created by English-major students, and the emotion recognition algorithm was integrated into the dialogue management system so that Ryan could empathize with users based on their perceived emotion. This study investigated the effects of this emotionally intelligent robot on older adults in the early stages of depression and dementia. The results suggest that Ryan equipped with AEI is more engaging, likable, and attractive to users than Ryan without AEI. The long-term effect of the latest version (Ryan V3.0) was studied with 17 subjects from 5 different senior care facilities; the participants experienced a general improvement in their cognitive and depression scores.
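
    As a rough illustration of how a perceived emotion could steer an empathic dialogue, the sketch below fuses per-emotion probabilities from face and speech classifiers and picks a matching response. The fusion weights, emotion labels, and canned responses are hypothetical, not Ryan's actual dialogue manager.

    def fuse_emotion(face_probs, speech_probs, w_face=0.6, w_speech=0.4):
        """Combine per-emotion probabilities from face and speech classifiers
        and return the emotion with the highest fused score."""
        fused = {e: w_face * face_probs[e] + w_speech * speech_probs.get(e, 0.0)
                 for e in face_probs}
        return max(fused, key=fused.get)

    # Hypothetical empathic responses keyed by perceived emotion.
    EMPATHIC_RESPONSES = {
        "sad": "I'm sorry to hear that. Would you like to look at some photos together?",
        "happy": "That's wonderful! Tell me more about it.",
        "neutral": "How has your day been so far?",
    }

    if __name__ == "__main__":
        face = {"sad": 0.7, "happy": 0.1, "neutral": 0.2}
        speech = {"sad": 0.5, "happy": 0.2, "neutral": 0.3}
        emotion = fuse_emotion(face, speech)
        print(emotion, "->", EMPATHIC_RESPONSES[emotion])  # sad -> ...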

    Machine Medical Ethics

    In medical settings, machines are in close proximity to human beings: to patients who are in vulnerable states of health, who have disabilities of various kinds, who are very young or very old, and to medical professionals. Machines in these contexts undertake important medical tasks that require emotional sensitivity, knowledge of medical codes, and respect for human dignity and privacy. As machine technology advances, ethical concerns become more urgent: should medical machines be programmed to follow a code of medical ethics? What theory or theories should constrain medical machine conduct? What design features are required? Should machines share responsibility with humans for the ethical consequences of medical actions? How ought clinical relationships involving machines be modeled? Is a capacity for empathy and emotion detection necessary? What about consciousness? The essays in this collection, by researchers from both the humanities and the sciences, describe theoretical and experimental approaches to adding medical ethics to a machine, the design features necessary to achieve this, philosophical and practical questions concerning justice, rights, decision-making, and responsibility, and ways of accurately modeling essential physician-machine-patient relationships. This collection is the first book to address these 21st-century concerns.

    Socially intelligent robots that understand and respond to human touch

    Touch is an important nonverbal form of interpersonal interaction used to communicate emotions and other social messages. As interactions with social robots are likely to become more common in the near future, these robots should also be able to engage in tactile interaction with humans. The aim of the research presented in this dissertation is therefore to work towards socially intelligent robots that can understand and respond to human touch. To become a socially intelligent actor, a robot must be able to sense, classify, and interpret human touch and respond to it in an appropriate manner. To this end we present work that addresses different parts of this interaction cycle. The contributions of this dissertation are the following. We have made a touch gesture dataset available to the research community and have presented benchmark results. Furthermore, we have sparked interest in the new field of social touch recognition by organizing a machine learning challenge, and have pinpointed directions for further research. We have also exposed potential difficulties for the recognition of social touch in more naturalistic settings. Moreover, the findings presented in this dissertation can help inform the design of a behavioral model for robot pet companions that can understand and respond to human touch. Additionally, we have focused on the requirements for tactile interaction with robot pets in healthcare applications.
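
    To illustrate the sense-and-classify part of this interaction cycle, the sketch below extracts simple hand-crafted features from a sequence of pressure-sensor frames and trains an off-the-shelf classifier. The 8x8 sensor grid, feature set, and gesture labels are illustrative assumptions, and the data is synthetic.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def touch_features(frames):
        """Summarize one touch gesture recording into a feature vector.

        frames: array of shape (time, rows, cols) holding pressure readings
        from a grid of touch sensors. Statistics like these (intensity, peak,
        contact spread, duration) are typical hand-crafted features for
        social touch gesture recognition.
        """
        return [
            frames.mean(),                    # average pressure
            frames.max(),                     # peak pressure
            (frames > frames.mean()).mean(),  # fraction of above-average readings
            frames.shape[0],                  # gesture duration in frames
        ]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Synthetic stand-in data: 40 random "recordings" with made-up labels.
        X = [touch_features(rng.random((rng.integers(10, 60), 8, 8)))
             for _ in range(40)]
        y = rng.choice(["pat", "stroke", "tickle", "poke"], size=40)
        clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
        print(clf.predict([X[0]]))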