175 research outputs found

    QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars

    Full text link
    Real-time tracking of human body motion is crucial for interactive and immersive experiences in AR/VR. However, very limited sensor data about the body is available from standalone wearable devices such as HMDs (Head Mounted Devices) or AR glasses. In this work, we present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers, and simulates plausible and physically valid full body motions. Using high quality full body motion as dense supervision during training, a simple policy network can learn to output appropriate torques for the character to balance, walk, and jog, while closely following the input signals. Our results demonstrate surprisingly similar leg motions to ground truth without any observations of the lower body, even when the input is only the 6D transformations of the HMD. We also show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments

    MOVIN: Real-time Motion Capture using a Single LiDAR

    Full text link
    Recent advancements in technology have brought forth new forms of interactive applications, such as the social metaverse, where end users interact with each other through their virtual avatars. In such applications, precise full-body tracking is essential for an immersive experience and a sense of embodiment with the virtual avatar. However, current motion capture systems are not easily accessible to end users due to their high cost, the requirement for special skills to operate them, or the discomfort associated with wearable devices. In this paper, we present MOVIN, the data-driven generative method for real-time motion capture with global tracking, using a single LiDAR sensor. Our autoregressive conditional variational autoencoder (CVAE) model learns the distribution of pose variations conditioned on the given 3D point cloud from LiDAR.As a central factor for high-accuracy motion capture, we propose a novel feature encoder to learn the correlation between the historical 3D point cloud data and global, local pose features, resulting in effective learning of the pose prior. Global pose features include root translation, rotation, and foot contacts, while local features comprise joint positions and rotations. Subsequently, a pose generator takes into account the sampled latent variable along with the features from the previous frame to generate a plausible current pose. Our framework accurately predicts the performer's 3D global information and local joint details while effectively considering temporally coherent movements across frames. We demonstrate the effectiveness of our architecture through quantitative and qualitative evaluations, comparing it against state-of-the-art methods. Additionally, we implement a real-time application to showcase our method in real-world scenarios. MOVIN dataset is available at \url{https://movin3d.github.io/movin_pg2023/}

    {Physical Inertial Poser (PIP)}: {P}hysics-Aware Real-Time Human Motion Tracking From Sparse Inertial Sensors

    Get PDF

    An inertial motion capture framework for constructing body sensor networks

    Get PDF
    Motion capture is the process of measuring and subsequently reconstructing the movement of an animated object or being in virtual space. Virtual reconstructions of human motion play an important role in numerous application areas such as animation, medical science, ergonomics, etc. While optical motion capture systems are the industry standard, inertial body sensor networks are becoming viable alternatives due to portability, practicality and cost. This thesis presents an innovative inertial motion capture framework for constructing body sensor networks through software environments, smartphones and web technologies. The first component of the framework is a unique inertial motion capture software environment aimed at providing an improved experimentation environment, accompanied by programming scaffolding and a driver development kit, for users interested in studying or engineering body sensor networks. The software environment provides a bespoke 3D engine for kinematic motion visualisations and a set of tools for hardware integration. The software environment is used to develop the hardware behind a prototype motion capture suit focused on low-power consumption and hardware-centricity. Additional inertial measurement units, which are available commercially, are also integrated to demonstrate the functionality the software environment while providing the framework with additional sources for motion data. The smartphone is the most ubiquitous computing technology and its worldwide uptake has prompted many advances in wearable inertial sensing technologies. Smartphones contain gyroscopes, accelerometers and magnetometers, a combination of sensors that is commonly found in inertial measurement units. This thesis presents a mobile application that investigates whether the smartphone is capable of inertial motion capture by constructing a novel omnidirectional body sensor network. This thesis proposes a novel use for web technologies through the development of the Motion Cloud, a repository and gateway for inertial data. Web technologies have the potential to replace motion capture file formats with online repositories and to set a new standard for how motion data is stored. From a single inertial measurement unit to a more complex body sensor network, the proposed architecture is extendable and facilitates the integration of any inertial hardware configuration. The Motion Cloud’s data can be accessed through an application-programming interface or through a web portal that provides users with the functionality for visualising and exporting the motion data

    Full-body human motion reconstruction with sparse joint tracking using flexible sensors

    Get PDF
    Human motion tracking is a fundamental building block for various applications including computer animation, human-computer interaction, healthcare, etc. To reduce the burden of wearing multiple sensors, human motion prediction from sparse sensor inputs has become a hot topic in human motion tracking. However, such predictions are non-trivial as i) the widely adopted data-driven approaches can easily collapse to average poses. ii) the predicted motions contain unnatural jitters. In this work, we address the aforementioned issues by proposing a novel framework which can accurately predict the human joint moving angles from the signals of only four flexible sensors, thereby achieving the tracking of human joints in multi-degrees of freedom. Specifically, we mitigate the collapse to average poses by implementing the model with a Bi-LSTM neural network that makes full use of short-time sequence information; we reduce jitters by adding a median pooling layer to the network, which smooths consecutive motions. Although being bio-compatible and ideal for improving the wearing experience, the flexible sensors are prone to aging which increases prediction errors. Observing that the aging of flexible sensors usually results in drifts of their resistance ranges, we further propose a novel dynamic calibration technique to rescale sensor ranges, which further improves the prediction accuracy. Experimental results show that our method achieves a low and stable tracking error of 4.51 degrees across different motion types with only four sensors

    Human Motion Analysis Using Very Few Inertial Measurement Units

    Get PDF
    Realistic character animation and human motion analysis have become major topics of research. In this doctoral research work, three different aspects of human motion analysis and synthesis have been explored. Firstly, on the level of better management of tens of gigabytes of publicly available human motion capture data sets, a relational database approach has been proposed. We show that organizing motion capture data in a relational database provides several benefits such as centralized access to major freely available mocap data sets, fast search and retrieval of data, annotations based retrieval of contents, entertaining data from non-mocap sensor modalities etc. Moreover, the same idea is also proposed for managing quadruped motion capture data. Secondly, a new method of full body human motion reconstruction using very sparse configuration of sensors is proposed. In this setup, two sensor are attached to the upper extremities and one sensor is attached to the lower trunk. The lower trunk sensor is used to estimate ground contacts, which are later used in the reconstruction process along with the low dimensional inputs from the sensors attached to the upper extremities. The reconstruction results of the proposed method have been compared with the reconstruction results of the existing approaches and it has been observed that the proposed method generates lower average reconstruction errors. Thirdly, in the field of human motion analysis, a novel method of estimation of human soft biometrics such as gender, height, and age from the inertial data of a simple human walk is proposed. The proposed method extracts several features from the time and frequency domains for each individual step. A random forest classifier is fed with the extracted features in order to estimate the soft biometrics of a human. The results of classification have shown that it is possible with a higher accuracy to estimate the gender, height, and age of a human from the inertial data of a single step of his/her walk

    Combining motion matching and orientation prediction to animate avatars for consumer-grade VR devices

    Get PDF
    The animation of user avatars plays a crucial role in conveying their pose, gestures, and relative distances to virtual objects or other users. Self-avatar animation in immersive VR helps improve the user experience and provides a Sense of Embodiment. However, consumer-grade VR devices typically include at most three trackers, one at the Head Mounted Display (HMD), and two at the handheld VR controllers. Since the problem of reconstructing the user pose from such sparse data is ill-defined, especially for the lower body, the approach adopted by most VR games consists of assuming the body orientation matches that of the HMD, and applying animation blending and time-warping from a reduced set of animations. Unfortunately, this approach produces noticeable mismatches between user and avatar movements. In this work we present a new approach to animate user avatars that is suitable for current mainstream VR devices. First, we use a neural network to estimate the user's body orientation based on the tracking information from the HMD and the hand controllers. Then we use this orientation together with the velocity and rotation of the HMD to build a feature vector that feeds a Motion Matching algorithm. We built a MoCap database with animations of VR users wearing a HMD and used it to test our approach on both self-avatars and other users’ avatars. Our results show that our system can provide a large variety of lower body animations while correctly matching the user orientation, which in turn allows us to represent not only forward movements but also stepping in any direction.This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 860768 (CLIPE project) and the Spanish Ministry of Science and Innovation (PID2021-122136OB-C21).Peer ReviewedPostprint (published version

    Survey of Motion Tracking Methods Based on Inertial Sensors: A Focus on Upper Limb Human Motion

    Get PDF
    Motion tracking based on commercial inertial measurements units (IMUs) has been widely studied in the latter years as it is a cost-effective enabling technology for those applications in which motion tracking based on optical technologies is unsuitable. This measurement method has a high impact in human performance assessment and human-robot interaction. IMU motion tracking systems are indeed self-contained and wearable, allowing for long-lasting tracking of the user motion in situated environments. After a survey on IMU-based human tracking, five techniques for motion reconstruction were selected and compared to reconstruct a human arm motion. IMU based estimation was matched against motion tracking based on the Vicon marker-based motion tracking system considered as ground truth. Results show that all but one of the selected models perform similarly (about 35 mm average position estimation error)

    Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments

    Full text link
    Synthesizing interaction-involved human motions has been challenging due to the high complexity of 3D environments and the diversity of possible human behaviors within. We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments. The key motivation of LAMA is to build a unified framework to encompass a series of everyday motions including locomotion, scene interaction, and object manipulation. Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as a test-time optimization by using human motion capture data only for synthesis. LAMA leverages a reinforcement learning framework coupled with a motion matching algorithm for optimization, and further exploits a motion editing framework via manifold learning to cover possible variations in interaction and manipulation. Throughout extensive experiments, we demonstrate that LAMA outperforms previous approaches in synthesizing realistic motions in various challenging scenarios. Project page: https://jiyewise.github.io/projects/LAMA/ .Comment: Accepted to ICCV 202
    • …
    corecore