
    Expressive movement generation with machine learning

    Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with and consumption of computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing – developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute to the field with the datasets, tools, and libraries that we have developed during our research.

    We start our research by reviewing work on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, feature representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive agent walking movement controller based on neural networks. The expressivity of virtual, animated agents plays an essential role in their believability. Therefore, WalkNet integrates control over the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control generation in real time based on the valence and arousal levels of affect, the movement’s walking direction, and the mover’s movement signature.

    Following WalkNet, we look at controlling movement generation using more complex stimuli such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes the movement modelling problem more challenging. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation.
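
    As a purely illustrative sketch of the kind of controllable generator the abstract describes, the Python snippet below shows an autoregressive network that predicts the next pose from the previous poses plus control signals for valence, arousal, walking direction, and a mover-signature embedding. The class name, layer sizes, and conditioning scheme are assumptions for illustration only and do not reproduce WalkNet's actual architecture.

    # Hypothetical sketch of a conditioned walking-movement generator (not WalkNet's
    # real architecture): a recurrent network predicts the next pose from past poses
    # and broadcast control signals.
    import torch
    import torch.nn as nn

    class ConditionedWalkGenerator(nn.Module):
        def __init__(self, pose_dim=63, n_signatures=8, sig_dim=16, hidden=256):
            super().__init__()
            # Learned embedding for each mover's movement signature (assumed discrete IDs).
            self.signature = nn.Embedding(n_signatures, sig_dim)
            ctrl_dim = 1 + 1 + 2 + sig_dim  # valence + arousal + 2-D direction + signature
            self.rnn = nn.GRU(pose_dim + ctrl_dim, hidden, batch_first=True)
            self.out = nn.Linear(hidden, pose_dim)

        def forward(self, poses, valence, arousal, direction, signature_id, state=None):
            # poses: (batch, time, pose_dim); valence/arousal: (batch, 1); direction: (batch, 2)
            b, t, _ = poses.shape
            sig = self.signature(signature_id)                        # (batch, sig_dim)
            ctrl = torch.cat([valence, arousal, direction, sig], -1)  # (batch, ctrl_dim)
            ctrl = ctrl.unsqueeze(1).expand(b, t, ctrl.shape[-1])     # broadcast over time
            h, state = self.rnn(torch.cat([poses, ctrl], -1), state)
            return self.out(h), state                                 # next-pose predictions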

    Machine perception of natural musical conducting gestures

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. 53-54). By Matthew Wayne Krom. M.S.

    Tools for expressive gesture recognition and mapping in rehearsal and performance

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 97-101).

    As human movement is an incredibly rich mode of communication and expression, performance artists working with digital media often use performers' movement and gestures to control and shape that digital media as part of a theatrical, choreographic, or musical performance. In my own work, I have found that strong, semantically-meaningful mappings between gesture and sound or visuals are necessary to create compelling performance interactions. However, the existing systems for developing mappings between incoming data streams and output media have extremely low-level concepts of "gesture." The actual programming process focuses on low-level sensor data, such as the voltage values of a particular sensor, which limits the user in his or her thinking process, requires users to have significant programming experience, and loses the expressive, meaningful, and metaphor-rich content of the movement. To remedy these difficulties, I have created a new framework and development environment for gestural control of media in rehearsal and performance, allowing users to create clear and intuitive mappings in a simple and flexible manner by using high-level descriptions of gestures and of gestural qualities. This approach, the Gestural Media Framework, recognizes continuous gesture and translates Laban Effort Notation into the realm of technological gesture analysis, allowing for the abstraction and encapsulation of sensor data into movement descriptions. As part of the evaluation of this system, I choreographed four performance pieces that use this system throughout the performance and rehearsal process to map dancers' movements to manipulation of sound and visual elements. This work has been supported by the MIT Media Laboratory. By Elena Naomi Jessop. S.M.
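
    To make the idea of mapping high-level movement qualities, rather than raw sensor values, onto media concrete, the sketch below classifies a window of acceleration samples into a crude Laban Effort "Time" quality (sudden vs. sustained) and dispatches a media action from that label. The function names, threshold, and mapping table are hypothetical illustrations and are not the Gestural Media Framework's actual API or analysis method.

    # Hypothetical illustration of abstracting sensor data into a movement quality
    # and mapping that quality (not raw voltages) to a media action.
    from statistics import fmean

    def effort_time_quality(accel_samples, sudden_threshold=2.5):
        """Label a window of acceleration magnitudes as 'sudden' or 'sustained'."""
        jerk = [abs(b - a) for a, b in zip(accel_samples, accel_samples[1:])]
        return "sudden" if fmean(jerk) > sudden_threshold else "sustained"

    # High-level mapping: movement quality -> media action (placeholder actions).
    mappings = {
        "sudden": lambda: print("trigger percussive sound cue"),
        "sustained": lambda: print("fade lighting intensity"),
    }

    window = [0.1, 0.2, 3.9, 0.4, 4.2, 0.3]   # one window of sensor data
    mappings[effort_time_quality(window)]()    # -> "trigger percussive sound cue"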

    16th Biennial Symposium on Arts & Technology Proceedings


    A Study of Non-Linguistic Utterances for Social Human-Robot Interaction

    The world of animation has painted an inspiring image of what the robots of the future could be. Taking the robots R2D2 and C3PO from the Star Wars films as representative examples, these robots are portrayed as being more than just machines; rather, they are presented as intelligent and capable social peers, exhibiting many of the traits that people themselves have. These robots have the ability to interact with people, understand us, and even relate to us in very personal ways through a wide repertoire of social cues. As robotic technologies continue to make their way into society at large, there is a growing trend toward making social robots. The field of Human-Robot Interaction concerns itself with studying, developing and realising these socially capable machines, equipping them with a very rich variety of capabilities that allow them to interact with people in natural and intuitive ways, ranging from the use of natural language, body language and facial gestures, to more unique ways such as expression through colours and abstract sounds.

    This thesis studies the use of abstract, expressive sounds, like those used iconically by the robot R2D2. These are termed Non-Linguistic Utterances (NLUs) and are a means of communication with a rich history in film and animation. However, very little is understood about how such expressive sounds may be utilised by social robots, and how people respond to them. This work presents a series of experiments aimed at understanding how NLUs can be utilised by a social robot in order to convey affective meaning to people both young and old, and what factors impact on the production and perception of NLUs.

    Firstly, it is shown that not all robots should use NLUs. The morphology of the robot matters: people perceive NLUs differently across different robots, and not always in a desired manner. Next, it is shown that people readily project affective meaning onto NLUs, though not in a coherent manner. Furthermore, people's affective inferences are not subtle; rather, they are drawn to well-established, basic affect prototypes. Moreover, it is shown that the valence of the situation in which an NLU is made overrides the initial valence of the NLU itself: situational context biases how people perceive utterances made by a robot, and through this, coherence between people in their affective inferences is found to increase. Finally, it is uncovered that NLUs are best not used as a replacement for natural language (as they are by R2D2); rather, people show a preference for them being used alongside natural language, where they can play a supportive role by providing essential social cues.