1,219 research outputs found
Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening
Human motion synthesis is a long-standing problem with various applications
in digital twins and the Metaverse. However, modern deep-learning-based motion
synthesis approaches barely consider the physical plausibility of synthesized
motions and consequently often produce unrealistic human motions. To address
this problem, we propose a system, "Skeleton2Humanoid", which
performs physics-oriented motion correction at test time by regularizing
synthesized skeleton motions in a physics simulator. Concretely, our system
consists of three sequential stages: (I) test-time motion synthesis network
adaptation, (II) skeleton-to-humanoid matching and (III) motion imitation based
on reinforcement learning (RL). Stage I introduces a test-time adaptation
strategy, which improves the physical plausibility of synthesized human
skeleton motions by optimizing skeleton joint locations. Stage II applies an
analytical inverse kinematics strategy that converts the optimized human
skeleton motions to humanoid robot motions in a physics simulator; the
converted humanoid robot motions then serve as reference motions for the RL
policy to imitate. Stage III introduces a curriculum residual force control
policy, which drives the humanoid robot to mimic complex converted reference
motions in accordance with the laws of physics. We verify our system on a typical
human motion synthesis task, motion-in-betweening. Experiments on the
challenging LaFAN1 dataset show our system can outperform prior methods
significantly in terms of both physical plausibility and accuracy. Code will be
released for research purposes at:
https://github.com/michaelliyunhao/Skeleton2Humanoid
Comment: Accepted by ACMMM2022
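
Below is a minimal, hypothetical sketch of the Stage I idea: test-time optimization of synthesized joint locations against simple physical-plausibility penalties. The loss terms, joint indices, and optimizer settings are illustrative assumptions, not the authors' released implementation:

import torch

def test_time_adapt(joints, n_steps=50, lr=1e-3, floor_z=0.0):
    # joints: (T, J, 3) synthesized joint trajectories; z-up convention assumed.
    joints = joints.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([joints], lr=lr)
    foot_ids = [3, 7]  # assumed indices of the two toe joints

    for _ in range(n_steps):
        optimizer.zero_grad()
        feet = joints[:, foot_ids, :]                          # (T, 2, 3)
        # Foot-skating penalty: feet near the floor should not slide.
        vel_xy = (feet[1:] - feet[:-1])[..., :2].norm(dim=-1)  # (T-1, 2)
        contact = (feet[:-1, :, 2] < floor_z + 0.05).float()   # contact mask
        skate = (vel_xy * contact).mean()
        # Penetration penalty: no joint should sink below the floor plane.
        penetrate = torch.relu(floor_z - joints[..., 2]).mean()
        loss = skate + penetrate
        loss.backward()
        optimizer.step()
    return joints.detach()

In the full system, Stage I adapts the synthesis network at test time; optimizing the output joints directly, as above, is the simplest variant of the same idea, after which Stages II and III (analytical IK and RL imitation) run on the corrected skeletons.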
Enhancing egocentric 3D pose estimation with third person views
© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license.
We propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The main technical contribution consists of leveraging high-level features linking first- and third-views in a joint embedding space. To learn such an embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2000 videos depicting human activities captured from both first- and third-view perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful for extracting discriminative features from arbitrary single-view egocentric videos, with no need for any sort of domain adaptation or knowledge of camera parameters. An extensive evaluation demonstrates that we achieve significant improvement in egocentric 3D body pose estimation performance on two unconstrained datasets, over three supervised state-of-the-art approaches. The collected dataset and pre-trained model are available for research purposes.
This work has been partially supported by projects PID2020-120049RB-I00 and PID2019-110977GA-I00 funded by MCIN/AEI/10.13039/501100011033 and by the "European Union NextGenerationEU/PRTR", as well as by grant RYC-2017-22563 funded by MCIN/AEI/10.13039/501100011033 and by "ESF Investing in your future", and network RED2018-102511-T funded by MCIN/AEI.
Peer Reviewed. Postprint (published version).
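
As an illustration of the semi-Siamese, self-supervised training described above, here is a hypothetical sketch in which synchronized first-/third-view feature pairs are pulled together in a joint embedding space with an InfoNCE-style contrastive loss. The layer sizes, the contrastive objective, and all names are assumptions, not the First2Third-Pose implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiSiameseEmbedder(nn.Module):
    # Semi-Siamese: view-specific early branches, shared projection head.
    def __init__(self, feat_dim=2048, embed_dim=256):
        super().__init__()
        self.ego_branch = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU())
        self.exo_branch = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU())
        self.shared_head = nn.Linear(512, embed_dim)

    def forward(self, ego_feat, exo_feat):
        # L2-normalized embeddings so similarity is a plain dot product.
        z_ego = F.normalize(self.shared_head(self.ego_branch(ego_feat)), dim=-1)
        z_exo = F.normalize(self.shared_head(self.exo_branch(exo_feat)), dim=-1)
        return z_ego, z_exo

def info_nce(z_ego, z_exo, temperature=0.07):
    # Synchronized ego/exo clips are positives (the diagonal of the
    # similarity matrix); every other pairing in the batch is a negative.
    logits = z_ego @ z_exo.t() / temperature
    targets = torch.arange(z_ego.size(0), device=z_ego.device)
    return F.cross_entropy(logits, targets)

No pose labels are needed for this step: the synchronization of the paired videos supplies the positive pairs, which is what makes the training self-supervised.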
- …