MOVIN: Real-time Motion Capture using a Single LiDAR
Recent advancements in technology have brought forth new forms of interactive
applications, such as the social metaverse, where end users interact with each
other through their virtual avatars. In such applications, precise full-body
tracking is essential for an immersive experience and a sense of embodiment
with the virtual avatar. However, current motion capture systems are not easily
accessible to end users due to their high cost, the requirement for special
skills to operate them, or the discomfort associated with wearable devices. In
this paper, we present MOVIN, a data-driven generative method for real-time
motion capture with global tracking, using a single LiDAR sensor. Our
autoregressive conditional variational autoencoder (CVAE) model learns the
distribution of pose variations conditioned on the given 3D point cloud from
LiDAR. As a central factor for high-accuracy motion capture, we propose a novel
feature encoder to learn the correlation between the historical 3D point cloud
data and global and local pose features, resulting in effective learning of the
pose prior. Global pose features include root translation, rotation, and foot
contacts, while local features comprise joint positions and rotations.
Subsequently, a pose generator takes into account the sampled latent variable
along with the features from the previous frame to generate a plausible current
pose. Our framework accurately predicts the performer's 3D global information
and local joint details while effectively considering temporally coherent
movements across frames. We demonstrate the effectiveness of our architecture
through quantitative and qualitative evaluations, comparing it against
state-of-the-art methods. Additionally, we implement a real-time application to
showcase our method in real-world scenarios. The MOVIN dataset is available at
https://movin3d.github.io/movin_pg2023/
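
The autoregressive generation loop described above can be illustrated with a minimal sketch. This is not the MOVIN implementation: the dimensions, the max-pooled point encoder, the standard-normal latent prior, the residual pose update, and all names (`encode_point_cloud`, `generate_pose`, `rollout`) are hypothetical stand-ins, and random weights replace trained networks. It only shows how a latent sample, a point-cloud feature, and the previous frame's pose could be combined frame by frame.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are illustrative assumptions, not values from the paper.
POSE_DIM = 8      # stand-in for the global + local pose feature vector
COND_DIM = 16     # stand-in for the encoded point-cloud feature
LATENT_DIM = 4    # stand-in for the CVAE latent size

# Random weights stand in for trained networks (assumption).
W_embed = rng.normal(scale=0.1, size=(3, COND_DIM))
W_dec = rng.normal(scale=0.1, size=(LATENT_DIM + COND_DIM + POSE_DIM, POSE_DIM))


def encode_point_cloud(points):
    """Toy order-invariant encoder: per-point linear map, then max-pool."""
    return np.tanh(points @ W_embed).max(axis=0)


def generate_pose(z, cond, prev_pose):
    """Toy pose generator: predicts a pose variation added to the previous pose."""
    x = np.concatenate([z, cond, prev_pose])
    return prev_pose + np.tanh(x @ W_dec)


def rollout(point_clouds, init_pose):
    """Autoregressive inference: each generated pose conditions the next step."""
    poses = []
    prev = init_pose
    for points in point_clouds:
        cond = encode_point_cloud(points)
        z = rng.standard_normal(LATENT_DIM)  # assumed N(0, I) latent prior
        prev = generate_pose(z, cond, prev)
        poses.append(prev)
    return np.stack(poses)


# Five synthetic frames of 128 points each stand in for LiDAR input.
frames = [rng.normal(size=(128, 3)) for _ in range(5)]
poses = rollout(frames, np.zeros(POSE_DIM))
```

In this sketch each frame is encoded on its own, whereas the abstract describes a feature encoder that also uses historical point cloud data; a fuller version would pass a window of past frames to the encoder.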