
    Take the Lead: Toward a Virtual Video Dance Partner

    My work takes a single person as input and predicts the intentional movement of one dance partner from the other partner's movement. Human pose estimation has been applied to dance and computer vision, but most existing applications focus on a single performer or on multiple individuals performing independently; very few works specifically address dance couples combined with pose prediction. This thesis is applicable to the entertainment and gaming industries, where it could train people to dance with a virtual dance partner. Many existing interactive or virtual dance partners require a motion capture system, multiple cameras, or a robot, all of which are expensive. This thesis does not use a motion capture system; instead, it combines OpenPose with swing dance YouTube videos to create a virtual dance partner. Taking the current dancer's moves as input, the system predicts the dance partner's corresponding moves in the video frames. Creating a virtual dance partner requires datasets that contain skeleton keypoints from which a partner's pose can be predicted. Existing dance datasets cover specific dance styles but not swing, and the few datasets that do include swing contain a limited number of videos. The contribution of this thesis is a large swing dataset covering three types of swing dance: East Coast, Lindy Hop, and West Coast. I also provide a basic framework for extending the work to a real-time, interactive dance partner.
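    A minimal sketch (not the thesis code) of the kind of partner-pose predictor the abstract describes: a small PyTorch GRU that maps a window of the leader's OpenPose keypoints to the follower's pose for the same frame. The BODY_25 keypoint layout, window length, and all layer sizes are assumptions for illustration.

    # Hypothetical sketch: predict the follower's 2D pose from the leader's pose
    # sequence, assuming OpenPose BODY_25 output (25 (x, y) keypoints per person).
    import torch
    import torch.nn as nn

    NUM_KEYPOINTS = 25              # OpenPose BODY_25 layout (assumption)
    FEATURES = NUM_KEYPOINTS * 2    # flattened (x, y) coordinates per frame

    class PartnerPosePredictor(nn.Module):
        """GRU that maps a window of leader poses to the follower's current pose."""
        def __init__(self, hidden: int = 256):
            super().__init__()
            self.encoder = nn.GRU(FEATURES, hidden, batch_first=True)
            self.head = nn.Linear(hidden, FEATURES)

        def forward(self, leader_seq: torch.Tensor) -> torch.Tensor:
            # leader_seq: (batch, frames, FEATURES)
            _, last_hidden = self.encoder(leader_seq)
            return self.head(last_hidden[-1])       # (batch, FEATURES)

    if __name__ == "__main__":
        model = PartnerPosePredictor()
        leader = torch.randn(8, 30, FEATURES)       # 8 clips, 30 frames each
        print(model(leader).shape)                  # torch.Size([8, 50])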

    Pathway to Future Symbiotic Creativity

    This report presents a comprehensive view of our vision for the development path of human-machine symbiotic art creation. We propose a classification of creative systems into a hierarchy of five classes, showing the pathway of creativity evolving from a mimic-human artist (Turing Artist) to a machine artist in its own right. We begin with an overview of the limitations of Turing Artists and then focus on the top two levels, Machine Artists, emphasizing machine-human communication in art creation. In art creation, machines need to understand humans' mental states, including desires, appreciation, and emotions, and humans in turn need to understand machines' creative capabilities and limitations. The rapid development of immersive environments, and their further evolution into the new concept of the metaverse, enables symbiotic art creation through unprecedented flexibility of bi-directional communication between artists and art manifestation environments. By examining the latest sensor and XR technologies, we illustrate a novel way to collect art data that forms the base of a new kind of human-machine bidirectional communication and understanding in art creation. Based on such communication and understanding mechanisms, we propose a novel framework for building future Machine Artists, guided by the philosophy that a human-compatible AI system should follow the "human-in-the-loop" principle rather than the traditional "end-to-end" dogma. Proposing a new form of inverse reinforcement learning model, we outline the platform design of machine artists, demonstrate its functions, and showcase some of the technologies we have developed. We also provide a systematic exposition of the ecosystem for AI-based symbiotic art forms and communities, with an economic model built on NFT technology. Ethical issues in the development of machine artists are also discussed.

    Dance Gesture Recognition Using Space Component And Effort Component Of Laban Movement Analysis

    Dance is a collection of gestures, each carrying meaning. It is a form of culture found in every country, and every movement holds beauty or meaning. One obstacle in developing and preserving dance is recognizing dance moves. Information technology can help by recording motion data with a Kinect sensor, which produces motion data in the Biovision Hierarchy (BVH) file format; BVH motion data consist of (x, y, z) position components. Features are extracted from the recorded dance motion using Laban Movement Analysis (LMA), which has four main components: Body, Shape, Space, and Effort. After feature extraction, quantization, normalization, and classification are performed using a Hidden Markov Model (HMM). This study uses two LMA components, Space and Effort, to extract features for motion recognition. In testing, the resulting accuracy approaches 99% on the dance motion data.
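    A minimal sketch, under assumed feature definitions, of the described pipeline: simplified Space-like (movement direction) and Effort-like (speed, acceleration) features computed from (x, y, z) joint positions, with one Gaussian HMM per dance class (hmmlearn) and classification by highest log-likelihood. The feature proxies, state count, and data layout are assumptions, not the paper's exact formulation.

    # Hypothetical sketch: simplified Space/Effort-style features from joint
    # positions plus per-class Gaussian HMMs; not the paper's exact features.
    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    def space_effort_features(positions: np.ndarray) -> np.ndarray:
        """positions: (frames, joints, 3) -> (frames - 2, features).
        Space proxy: per-joint movement direction; Effort proxy: speed and
        acceleration magnitudes."""
        vel = np.diff(positions, axis=0)                       # (F-1, J, 3)
        acc = np.diff(vel, axis=0)                             # (F-2, J, 3)
        speed = np.linalg.norm(vel[1:], axis=2)                # Effort: speed
        accel = np.linalg.norm(acc, axis=2)                    # Effort: acceleration
        direction = vel[1:] / (np.linalg.norm(vel[1:], axis=2, keepdims=True) + 1e-8)
        return np.concatenate(
            [direction.reshape(len(acc), -1), speed, accel], axis=1)

    def train_per_class_hmms(sequences_by_class: dict, n_states: int = 5) -> dict:
        """Fit one HMM per dance label on lists of feature sequences."""
        models = {}
        for label, seqs in sequences_by_class.items():
            X = np.vstack(seqs)
            lengths = [len(s) for s in seqs]
            models[label] = GaussianHMM(n_components=n_states,
                                        covariance_type="diag",
                                        n_iter=50).fit(X, lengths)
        return models

    def classify(models: dict, seq: np.ndarray) -> str:
        # Pick the class whose HMM assigns the highest log-likelihood.
        return max(models, key=lambda label: models[label].score(seq))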

    Dance Gesture Recognition using Laban Movement Analysis with J48 Classification

    This study describes the recognition of classical dance movements using the Laban Movement Analysis (LMA) method, here using three main components: Body, Space, and Shape. Classical dance motion is captured with a Kinect sensor and read through Brekel Kinect, which produces motion data in the BVH (*.bvh) format. The LMA method is then applied to obtain numerical data for each joint along the (x, y, z) axes, and classification is carried out with the J48 classifier provided in the WEKA tools. With 50 training samples, 96% of the movements are correctly recognized; on 12 test samples held out from the training data, the average accuracy is 92%. This suggests the method can be used in dance training, especially for classical dance.
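    A hedged sketch of this kind of decision-tree classification over per-joint (x, y, z) features. scikit-learn's DecisionTreeClassifier (with the entropy criterion) is used here as a stand-in for WEKA's J48/C4.5; the feature dimensionality, class count, and the random placeholder data are purely illustrative and only mirror the 50-train / 12-test split sizes.

    # Hypothetical sketch: DecisionTreeClassifier stands in for WEKA's J48 (C4.5);
    # the random data below is placeholder, not real dance features.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(50, 60))     # 50 training samples, 60 joint features
    y_train = rng.integers(0, 4, size=50)   # 4 hypothetical gesture classes
    X_test = rng.normal(size=(12, 60))      # 12 held-out test samples
    y_test = rng.integers(0, 4, size=12)

    tree = DecisionTreeClassifier(criterion="entropy")  # entropy ~ C4.5 info gain
    tree.fit(X_train, y_train)
    print("test accuracy:", accuracy_score(y_test, tree.predict(X_test)))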

    Being the center of attention: A Person-Context CNN framework for Personality Recognition

    This paper proposes a novel study on personality recognition using video data from different scenarios. Our goal is to jointly model nonverbal behavioral cues with contextual information for a robust, multi-scenario personality recognition system. To this end, we build a novel multi-stream Convolutional Neural Network (CNN) framework that considers multiple sources of information. From a given scenario, we extract spatio-temporal motion descriptors for every individual in the scene, spatio-temporal motion descriptors encoding social group dynamics, and proxemics descriptors encoding the interaction with the surrounding context. All the proposed descriptors are mapped to the same feature space, facilitating the overall learning effort. Experiments on two public datasets demonstrate the effectiveness of jointly modeling mutual Person-Context information, outperforming state-of-the-art results for personality recognition in two different scenarios. Lastly, we present CNN class activation maps for each personality trait, shedding light on behavioral patterns linked with personality attributes.
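    A minimal sketch, under assumed input shapes, of a multi-stream CNN in the spirit of the described framework: separate stems for the person, group, and context descriptors, each projected into a shared 128-dimensional feature space and fused for Big Five trait prediction. Layer sizes, input channels, and the regression head are assumptions, not the paper's architecture.

    # Hypothetical sketch: three conv streams projected to a shared feature space
    # and fused; input shapes and layer sizes are assumptions.
    import torch
    import torch.nn as nn

    def stream(in_ch: int, out_dim: int = 128) -> nn.Sequential:
        """Small conv stem + global pooling, one per information source."""
        return nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, out_dim), nn.ReLU(),
        )

    class PersonContextNet(nn.Module):
        def __init__(self, traits: int = 5):
            super().__init__()
            self.person = stream(in_ch=3)    # individual spatio-temporal descriptor
            self.group = stream(in_ch=3)     # group-dynamics descriptor
            self.context = stream(in_ch=1)   # proxemics/context descriptor
            self.head = nn.Linear(128 * 3, traits)

        def forward(self, person, group, context):
            fused = torch.cat([self.person(person),
                               self.group(group),
                               self.context(context)], dim=1)
            return self.head(fused)

    if __name__ == "__main__":
        net = PersonContextNet()
        out = net(torch.randn(4, 3, 64, 64),
                  torch.randn(4, 3, 64, 64),
                  torch.randn(4, 1, 64, 64))
        print(out.shape)                     # torch.Size([4, 5])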

    Attention-Based Recurrent Autoencoder for Motion Capture Denoising

    To address the massive loss of MoCap data in optical motion capture, we propose a novel network architecture based on an attention mechanism and a recurrent network. Its advantage is that the encoder-decoder enables automatic learning of the human motion manifold, capturing the hidden spatial-temporal relationships in motion sequences. In addition, the multi-head attention mechanism makes it possible to identify the most relevant corrupted frames, with their position information, to recover the missing markers, leading to more accurate motion reconstruction. Simulation experiments demonstrate that the proposed network model can effectively handle the large-scale missing-marker problem with better robustness, smaller errors, and more natural recovered motion sequences compared with the reference method.
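    A minimal sketch, with assumed marker counts and layer sizes, of an attention-based recurrent autoencoder for missing-marker recovery: a GRU encoder, multi-head self-attention over the encoded frames so each frame can draw on the most relevant ones, and a GRU decoder that outputs reconstructed marker positions. Missing markers are assumed to be zero-filled in the input; this is not the paper's exact network.

    # Hypothetical sketch: recurrent encoder-decoder with multi-head self-attention
    # for missing-marker recovery; marker count and layer sizes are assumptions.
    import torch
    import torch.nn as nn

    MARKERS, DIMS = 41, 3
    FEAT = MARKERS * DIMS

    class AttentionRecurrentAE(nn.Module):
        def __init__(self, hidden: int = 256, heads: int = 4):
            super().__init__()
            self.encoder = nn.GRU(FEAT, hidden, batch_first=True)
            self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
            self.decoder = nn.GRU(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, FEAT)

        def forward(self, corrupted: torch.Tensor) -> torch.Tensor:
            # corrupted: (batch, frames, FEAT) with missing markers zero-filled
            enc, _ = self.encoder(corrupted)
            # Self-attention lets each frame draw on the most relevant frames.
            attended, _ = self.attn(enc, enc, enc)
            dec, _ = self.decoder(attended)
            return self.out(dec)             # reconstructed marker positions

    if __name__ == "__main__":
        model = AttentionRecurrentAE()
        clip = torch.randn(2, 120, FEAT)     # 2 clips, 120 frames each
        print(model(clip).shape)             # torch.Size([2, 120, 123])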