7 research outputs found

    Modelling emotional valence and arousal of non-linguistic utterances for sound design support

    Non-Linguistic Utterances (NLUs), produced for popular media, computers, robots, and public spaces, can quickly and wordlessly convey emotional characteristics of a message. They have been studied in terms of their ability to convey affect in robot communication. The objective of this research is to develop a model that correctly infers the emotional Valence and Arousal of an NLU. On a Likert scale, 17 subjects evaluated the relative Valence and Arousal of 560 sounds collected from popular movies, TV shows, and video games, including NLUs and other character utterances. Three audio feature sets were used to extract features including spectral energy, spectral spread, zero-crossing rate (ZCR), Mel Frequency Cepstral Coefficients (MFCCs), and audio chroma, as well as pitch, jitter, formant, shimmer, loudness, and Harmonics-to-Noise Ratio, among others. After feature reduction by Factor Analysis, the best-performing models inferred average Valence with a Mean Absolute Error (MAE) of 0.107 and Arousal with an MAE of 0.097 on audio samples withheld from the training stages. These results suggest the model infers the Valence and Arousal of most NLUs to within less than the difference between successive rating points on the 7-point Likert scale (0.14). This inference system is applicable to the development of novel NLUs to augment robot-human communication, or to the design of sounds for other systems, machines, and settings.
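The pipeline the abstract describes (extracted audio features, Factor Analysis reduction, a regression model, and MAE evaluation on held-out samples) can be sketched as follows. This is a minimal illustration with a synthetic feature matrix and an assumed regressor choice; the thesis's actual features come from real audio and its model may differ.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for extracted audio features (MFCCs, ZCR, spectral energy, ...):
# 560 sounds x 40 features, with synthetic Valence ratings normalized to [0, 1].
X = rng.normal(size=(560, 40))
valence = rng.uniform(0.0, 1.0, size=560)

# Reduce the feature space by Factor Analysis, as in the abstract.
fa = FactorAnalysis(n_components=10, random_state=0)
X_reduced = fa.fit_transform(X)

# Hold out samples that never enter training, then report MAE.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_reduced, valence, test_size=0.2, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"Valence MAE on held-out samples: {mae:.3f}")
```

On real data the 0.14 threshold quoted above corresponds to one step between successive points of the 7-point Likert scale after normalization (1/7 ≈ 0.143), which is why an MAE of 0.107 sits below a single rating step.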

    Designing Sound for Social Robots: Advancing Professional Practice through Design Principles

    Sound is one of the core modalities social robots can use to communicate with the humans around them in rich, engaging, and effective ways. While a robot's auditory communication happens predominantly through speech, a growing body of work demonstrates the various ways non-verbal robot sound can affect humans, and researchers have begun to formulate design recommendations that encourage using the medium to its full potential. However, formal strategies for successful robot sound design have so far not emerged: current frameworks and principles are largely untested, and no effort has been made to survey creative robot sound design practice. In this dissertation, I combine creative practice, expert interviews, and human-robot interaction studies to advance our understanding of how designers can best ideate, create, and implement robot sound. In a first step, I map out a design space that combines established sound design frameworks with insights from interviews with robot sound design experts. I then systematically traverse this space across three robot sound design explorations, investigating (i) the effect of artificial movement sound on how robots are perceived, (ii) the benefits of applying compositional theory to robot sound design, and (iii) the role and potential of spatially distributed robot sound. Finally, I implement the designs from prior chapters into the humanoid robot Diamandini, and deploy it as a case study. Based on a synthesis of the data collection and design practice conducted across the thesis, I argue that the creation of robot sound is best guided by four design perspectives: fiction (sound as a means to convey a narrative), composition (sound as its own separate listening experience), plasticity (sound as something that can vary and adapt over time), and space (spatial distribution of sound as a separate communication channel).
The conclusion of the thesis presents these four perspectives and proposes eleven design principles across them, supported by detailed examples. This work contributes an extensive body of design principles, process models, and techniques, providing researchers and designers with new tools to enrich the way robots communicate with humans.

    Machine Learning Driven Emotional Musical Prosody for Human-Robot Interaction

    This dissertation presents a method for non-anthropomorphic human-robot interaction using a newly developed concept entitled Emotional Musical Prosody (EMP). EMP consists of short expressive musical phrases capable of conveying emotions, which can be embedded in robots to accompany mechanical gestures. The main objective of EMP is to improve human engagement with, and trust in, robots while avoiding the uncanny valley. We contend that music - one of the most emotionally meaningful human experiences - can serve as an effective medium to support human-robot engagement and trust. EMP allows for the development of personable, emotion-driven agents, capable of giving subtle cues to collaborators while presenting a sense of autonomy. We present four research areas aimed at developing and understanding the potential role of EMP in human-robot interaction. The first research area focuses on collecting and labeling a new EMP dataset from vocalists, and using this dataset to generate prosodic emotional phrases through deep learning methods. Through extensive listening tests, the collected dataset and generated phrases were validated with a high level of accuracy by a large subject pool. The second research effort focuses on understanding the effect of EMP in human-robot interaction with industrial and humanoid robots. Here, significant results were found for improved trust, perceived intelligence, and likeability of EMP-enabled robotic arms, but not for humanoid robots. We also found significant results for improved trust in a social robot, as well as perceived intelligence, creativity, and likeability in a robotic musician. The third and fourth research areas shift to broader use cases and potential methods to use EMP in HRI. The third research area explores the effect of robotic EMP on different personality types, focusing on extraversion and neuroticism.
For robots, personality traits offer a unique way to implement custom responses, individualized to human collaborators. We discovered that humans prefer robots with emotional responses based on high extraversion and low neuroticism, with some correlation with the human collaborators' own personality traits. The fourth and final research question focused on scaling up EMP to support interaction between groups of robots and humans. Here, we found that improvements in trust and likeability carried across from single robots to groups of industrial arms. Overall, the thesis suggests EMP is useful for improving trust and likeability for industrial arms, social robots, and robotic musicians, but not for humanoid robots. The thesis bears future implications for HRI designers, showing the extensive potential of careful audio design and the wide range of outcomes audio can have on HRI.
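The core idea of EMP, short pre-generated musical phrases cued by a robot's emotional state, can be sketched as a simple nearest-phrase lookup over a valence/arousal space. The emotion labels, file names, and mapping below are illustrative assumptions for this sketch, not the dissertation's implementation.

```python
from dataclasses import dataclass

# Hypothetical library of pre-generated EMP phrases, each tagged with the
# valence/arousal it is meant to convey. Labels and file names are invented.
@dataclass(frozen=True)
class EMPPhrase:
    emotion: str
    valence: float   # -1 (negative) .. 1 (positive)
    arousal: float   # 0 (calm) .. 1 (excited)
    audio_file: str

LIBRARY = [
    EMPPhrase("joy", 0.9, 0.8, "emp_joy.wav"),
    EMPPhrase("calm", 0.5, 0.2, "emp_calm.wav"),
    EMPPhrase("frustration", -0.7, 0.7, "emp_frustration.wav"),
    EMPPhrase("sadness", -0.6, 0.2, "emp_sadness.wav"),
]

def select_phrase(valence: float, arousal: float) -> EMPPhrase:
    """Pick the phrase closest to the target emotional state."""
    return min(
        LIBRARY,
        key=lambda p: (p.valence - valence) ** 2 + (p.arousal - arousal) ** 2,
    )

# A robot signaling a failed grasp might cue a negative, high-arousal phrase
# alongside its mechanical gesture:
print(select_phrase(-0.8, 0.6).emotion)  # -> frustration
```

In practice the generated phrases would come from the deep-learning models trained on the vocalist dataset described above; the lookup only illustrates how such phrases could accompany robot gestures as subtle emotional cues.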

    Affective Expressions in Conversational Agents for Learning Environments: Effects of curiosity, humour, and expressive auditory gestures

    Conversational agents -- systems that imitate natural language discourse -- are becoming an increasingly prevalent human-computer interface, being employed in various domains including healthcare, customer service, and education. In education, conversational agents, also known as pedagogical agents, can be used to encourage interaction, which is considered crucial to the learning process. Though pedagogical agents have been designed for learners of diverse age groups and subject matter, they retain the overarching goal of eliciting learning outcomes, which can be broken down into cognitive, skill-based, and affective outcomes. Motivation is a particularly important affective outcome, as it can influence what, when, and how we learn. Understanding, supporting, and designing for motivation is therefore of great importance for the advancement of learning technologies. This thesis investigates how pedagogical agents can promote motivation in learners. Prior research has explored various features of the design of pedagogical agents and their effects on learning outcomes, and suggests that agents using social cues can adapt the learning environment to enhance both affective and cognitive outcomes. One social cue suggested to be important for enhancing learner motivation is the expression or simulation of affect in the agent. Informed by research and theory across multiple domains, three affective expressions are investigated: curiosity, humour, and expressive auditory gestures -- each aimed at enhancing motivation by adapting the learning environment in a different way, i.e., eliciting contagion effects, creating a positive learning experience, and strengthening the learner-agent relationship, respectively. Three studies are presented in which each expression was implemented in a separate type of agent (physically embodied, text-based, and voice-based), with all agents taking on the role of a companion or less knowledgeable peer to the learner.
The overall focus is on how each expression can be displayed, what effects it has on perception of the agent, and how it influences behaviour and learning outcomes. The studies result in theoretical contributions that add to our understanding of conversational agent design for learning environments. The findings support the simulation of curiosity, the use of certain humour styles, and the addition of expressive auditory gestures as strategies for enhancing motivation in learners interacting with conversational agents, while indicating a need for further exploration of these strategies in future work.

    KEER2022

    Additional title: KEER2022. Diversities. Resource description: 25 July 202