37 research outputs found

    Impact of Iris Size and Eyelids Coupling on the Estimation of the Gaze Direction of a Robotic Talking Head by Human Viewers

    Primates, and in particular humans, are very sensitive to the eye direction of conspecifics. Estimating the gaze of others is one of the basic skills for inferring the goals, intentions and desires of social agents, whether they are humans or avatars. When building robots, one should not only equip them with gaze trackers but also check that their own gaze is readable by human partners. We conducted experiments that demonstrate the strong impact of the iris size and the eyelid position of an iCub humanoid robot on gaze-reading performance by human observers. We comment on the importance of assessing a robot's ability to display its intentions through clearly legible and readable gestures.

    Conversational AI and Knowledge Graphs for Social Robot Interaction

    The paper describes an approach that combines work from three fields with previously separate research communities: social robotics, conversational AI, and graph databases. The aim is to develop a generic framework in which a variety of social robots can provide high-quality information to users by accessing semantically rich knowledge graphs about multiple different domains. An example implementation uses a Furhat robot with Rasa open-source conversational AI and knowledge graphs in Neo4j graph databases.
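
    The paper itself includes no code, but the kind of wiring it describes can be sketched roughly as follows: a Rasa custom action that answers a user request by querying a Neo4j knowledge graph. The action name, slot name, Cypher query and connection details below are illustrative assumptions, not taken from the paper.

    # Illustrative sketch only: a hypothetical Rasa custom action that looks up
    # an entity in a Neo4j knowledge graph. Names, query and credentials are
    # assumptions, not taken from the paper.
    from neo4j import GraphDatabase
    from rasa_sdk import Action, Tracker
    from rasa_sdk.executor import CollectingDispatcher

    # Connection details would normally come from configuration.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    class ActionLookupEntity(Action):
        def name(self) -> str:
            # Action name that would be referenced from the Rasa domain file (assumed).
            return "action_lookup_entity"

        def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: dict):
            # Slot assumed to be filled by Rasa NLU from the user's utterance.
            topic = tracker.get_slot("topic")
            with driver.session() as session:
                record = session.run(
                    "MATCH (e {name: $name}) RETURN e.description AS description LIMIT 1",
                    name=topic,
                ).single()
            if record:
                dispatcher.utter_message(text=record["description"])
            else:
                dispatcher.utter_message(text=f"I could not find anything about {topic}.")
            return []

    A robot front end such as Furhat would then speak the text produced by the dispatcher; how that bridge is realized is not specified here.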

    Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition

    This paper presents a self-supervised method for visual detection of the active speaker in a multi-person spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive system attempting to acquire language in social settings. The proposed method is intended to complement acoustic detection of the active speaker, thus improving the system's robustness in noisy conditions. The method can detect an arbitrary number of possibly overlapping active speakers based exclusively on visual information about their faces. Furthermore, the method does not rely on external annotations and thereby remains consistent with cognitive development. Instead, it uses information from the auditory modality to support learning in the visual domain. This paper reports an extensive evaluation of the proposed method on a large multi-person face-to-face interaction dataset. The results show good performance in a speaker-dependent setting, whereas in a speaker-independent setting the method yields significantly lower performance. We believe that the proposed method represents an essential component of any artificial cognitive system or robotic platform engaging in social interactions. (10 pages; IEEE Transactions on Cognitive and Developmental Systems.)
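
    The central idea, using the audio channel to produce training targets for a purely visual model, can be sketched as follows. This is a minimal PyTorch illustration under assumed names, shapes and thresholds; it is not the authors' architecture or training procedure.

    # Minimal sketch of audio-supervised training of a visual active-speaker
    # classifier. Model, threshold and tensor shapes are assumptions, not the
    # method described in the paper.
    import torch
    import torch.nn as nn

    class FaceSpeakingClassifier(nn.Module):
        """Tiny CNN mapping a face crop to a speaking/not-speaking score."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, 1)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    def audio_pseudo_labels(frame_energy, threshold=0.1):
        # Self-supervision: frames whose audio energy exceeds a threshold are
        # treated as "someone is speaking"; no manual annotation is used.
        return (frame_energy > threshold).float()

    model = FaceSpeakingClassifier()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.BCEWithLogitsLoss()

    def train_step(face_crops, frame_energy):
        # face_crops: (batch, 3, H, W) crops of a candidate speaker's face;
        # frame_energy: (batch,) audio energy aligned with each video frame.
        labels = audio_pseudo_labels(frame_energy)
        logits = model(face_crops).squeeze(1)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    At inference time only the visual branch is needed, which is what would allow such a detector to keep working when the acoustic channel is too noisy to be trusted.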

    ICMI'12: Proceedings of the ACM SIGCHI 14th International Conference on Multimodal Interaction

    A multimodal multiparty human-robot dialogue corpus for real world interaction

    Kyoto University / Honda Research Institute Japan Co., Ltd. LREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-09.
    We have developed the MPR multimodal dialogue corpus and describe research activities using the corpus, aimed at enabling multiparty human-robot verbal communication in real-world settings. While that is the final goal, the immediate focus of our project and of the corpus is non-verbal communication, especially social signal processing by machines as the foundation of human-machine verbal communication. The MPR corpus stores annotated audio-visual recordings of dialogues between one robot and one or multiple (up to three) participants. The annotations include speech segments, addressee of speech, transcripts, interaction states, and dialogue act types. Our research on multiparty dialogue management, boredom recognition, response obligation recognition, surprise detection and repair detection using the corpus is briefly introduced, and an analysis of repair in multi-user situations is presented. The analysis shows richer repair behaviors in multi-user situations, which demand more sophisticated repair handling by machines.
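
    As a purely hypothetical illustration of what one annotated utterance in a corpus of this kind might look like in code, the record below mirrors the annotation layers listed above; the field names, types and example values are assumptions, not the actual MPR schema.

    # Hypothetical record type mirroring the annotation layers listed in the
    # abstract; field names, types and values are assumptions, not the actual
    # MPR annotation schema.
    from dataclasses import dataclass

    @dataclass
    class UtteranceAnnotation:
        speaker_id: str         # which participant (or the robot) is speaking
        start_time: float       # speech segment start, in seconds
        end_time: float         # speech segment end, in seconds
        addressee: str          # who the speech is directed at
        transcript: str         # manual transcription of the utterance
        interaction_state: str  # e.g. engaged or bored (labels assumed)
        dialogue_act: str       # dialogue act type, e.g. question or repair

    # Invented example record for illustration only.
    example = UtteranceAnnotation(
        speaker_id="participant_1", start_time=12.4, end_time=14.1,
        addressee="robot", transcript="Can you say that again?",
        interaction_state="engaged", dialogue_act="repair",
    )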

    Evaluation of artificial mouths in social robots

    The external aspects of a robot affect how people behave toward it and how they perceive it while interacting. In this paper, we study the importance of the mouth displayed by a social robot and explore how different designs of an artificial LED-based mouth alter participants' judgments of the robot's attributes and their attention to the robot's message. We evaluated participants' judgments of a speaking robot under four conditions: 1) without a mouth; 2) with a static smile; 3) with a vibrating, wave-shaped mouth; and 4) with a moving, human-like mouth. A total of 79 participants evaluated their perceptions of an on-video robot showing one of the four conditions. The results show that the presence of a mouth, as well as its design, alters the perception of the robot. In particular, the presence of a mouth makes the robot appear more lifelike and less sad. The human-like mouth was the one participants liked most and, together with the static smile, was rated the friendliest. In contrast, participants rated the mouthless robot and the one with the wave-shaped mouth as the most dangerous.
    Funding: Ministerio de Economia y Competitividad (DPI2014-57684-R); in part by the MOnarCH project, funded by the European Commission (Grant Agreement 601033); and in part by RoboCity2030-III-CM, funded by the Comunidad de Madrid and co-funded by the Structural Funds of the EU (S2013/MIT-2748).

    Building Embodied Conversational Agents: Observations on human nonverbal behaviour as a resource for the development of artificial characters

    "Wow this is so cool!" This is what I most probably yelled, back in the 90s, when my first computer program on our MSX computer turned out to do exactly what I wanted it to do. The program contained the following instruction: COLOR 10(1.1) After hitting enter, it would change the screen color from light blue to dark yellow. A few years after that experience, Microsoft Windows was introduced. Windows came with an intuitive graphical user interface that was designed to allow all people, so also those who would not consider themselves to be experienced computer addicts, to interact with the computer. This was a major step forward in human-computer interaction, as from that point forward no complex programming skills were required anymore to perform such actions as adapting the screen color. Changing the background was just a matter of pointing the mouse to the desired color on a color palette. "Wow this is so cool!". This is what I shouted, again, 20 years later. This time my new smartphone successfully skipped to the next song on Spotify because I literally told my smartphone, with my voice, to do so. Being able to operate your smartphone with natural language through voice-control can be extremely handy, for instance when listening to music while showering. Again, the option to handle a computer with voice instructions turned out to be a significant optimization in human-computer interaction. From now on, computers could be instructed without the use of a screen, mouse or keyboard, and instead could operate successfully simply by telling the machine what to do. In other words, I have personally witnessed how, within only a few decades, the way people interact with computers has changed drastically, starting as a rather technical and abstract enterprise to becoming something that was both natural and intuitive, and did not require any advanced computer background. Accordingly, while computers used to be machines that could only be operated by technically-oriented individuals, they had gradually changed into devices that are part of many people’s household, just as much as a television, a vacuum cleaner or a microwave oven. The introduction of voice control is a significant feature of the newer generation of interfaces in the sense that these have become more "antropomorphic" and try to mimic the way people interact in daily life, where indeed the voice is a universally used device that humans exploit in their exchanges with others. The question then arises whether it would be possible to go even one step further, where people, like in science-fiction movies, interact with avatars or humanoid robots, whereby users can have a proper conversation with a computer-simulated human that is indistinguishable from a real human. An interaction with a human-like representation of a computer that behaves, talks and reacts like a real person would imply that the computer is able to not only produce and understand messages transmitted auditorily through the voice, but also could rely on the perception and generation of different forms of body language, such as facial expressions, gestures or body posture. At the time of writing, developments of this next step in human-computer interaction are in full swing, but the type of such interactions is still rather constrained when compared to the way humans have their exchanges with other humans. It is interesting to reflect on how such future humanmachine interactions may look like. 
    When we consider other products that have been created in history, it is sometimes striking to see that some of them have been inspired by things that can be observed in our environment, yet at the same time do not have to be exact copies of those phenomena. For instance, an airplane has wings, just as birds do, yet the wings of an airplane do not make the typical flapping movements a bird produces to fly. Moreover, an airplane has wheels, whereas a bird has legs. At the same time, the airplane has made it possible for humans to cover long distances in a fast and smooth manner in a way that was unthinkable before it was invented. The example of the airplane shows how new technologies can have "unnatural" properties, but can nonetheless be very beneficial and impactful for human beings.

    This dissertation centers on the practical question of how virtual humans can be programmed to act more human-like. The four studies presented in this dissertation all share the underlying question of how aspects of human behavior can be captured such that computers can use them to become more human-like. Each study differs in method, perspective and specific questions, but they all aim to provide insights and directions that help push forward the development of human-like behavior in computers and to investigate (the simulation of) human conversational behavior. The rest of this introductory chapter gives a general overview of virtual humans (also known as embodied conversational agents), their potential uses and the engineering challenges involved, followed by an overview of the four studies.