19 research outputs found

Skeleton-aided Articulated Motion Generation

Author: Ni Bingbing
Xu Jingwei
Yan Yichao
Yang Xiaokang
Publication date: 14/09/2017

This work makes the first attempt to generate an articulated human motion sequence from a single image. On the one hand, we use paired inputs, comprising human skeleton information as the motion embedding and a single human image as the appearance reference, to generate novel motion frames within a conditional GAN framework. On the other hand, a triplet loss is employed to enforce appearance smoothness between consecutive frames. Because the proposed framework jointly exploits the image appearance space and the articulated/kinematic motion space, it generates realistic articulated motion sequences, in contrast to most previous video generation methods, which yield blurred motion effects. We test our model on two human action datasets, KTH and Human3.6M, and the proposed framework generates very promising results on both.

Comment: ACM MM 201
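The triplet loss mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how anchors, positives, and negatives are chosen, which distance is used, or what the margin is; the version below assumes the anchor is a frame feature, the positive is the feature of a neighbouring frame, the negative is the feature of a distant frame, and the distance is squared Euclidean. All function and variable names are hypothetical.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Standard hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy features (assumption: one small vector per frame).
frame_t = [0.0, 0.0]
frame_next = [0.1, 0.0]  # consecutive frame, should look similar
frame_far = [2.0, 0.0]   # distant frame, allowed to look different

# Neighbouring frame used as the positive: constraint satisfied, no penalty.
loss_smooth = triplet_loss(frame_t, frame_next, frame_far)

# Roles swapped: the "positive" is far away, so the loss is large.
loss_violated = triplet_loss(frame_t, frame_far, frame_next)
```

Under these assumptions, minimising the loss pulls features of consecutive frames together, which is one plausible reading of "appearance smoothness".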

arXiv.org e-Print Archive

Crossref

Deep Video Generation, Prediction and Completion of Human Action Sequences

Author: A Newell
A Odena
C Dong
C Ionescu
J Jia
J Johnson
J Walker
J-Y Zhu
L Wang
Olaf Ronneberger
R Zhang
RH Byrd
X Wang
Y Wexler
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/12/2017

Current deep learning results on video generation are limited, there are only a few initial results on video prediction, and there are no significant results on video completion. This is due to the severe ill-posedness inherent in these three problems. In this paper, we focus on human action videos and propose a general, two-stage deep framework to generate human action videos with no constraints or an arbitrary number of constraints, which uniformly addresses the three problems: video generation given no input frames, video prediction given the first few frames, and video completion given the first and last frames. To make the problem tractable, in the first stage we train a deep generative model that generates a human pose sequence from random noise. In the second stage, a skeleton-to-image network is trained, which is used to generate a human action video given the complete human pose sequence generated in the first stage. By introducing the two-stage strategy, we sidestep the original ill-posed problems while producing, for the first time, high-quality video generation/prediction/completion results of much longer duration. We present quantitative and qualitative evaluations to show that our two-stage approach outperforms state-of-the-art methods in video generation, prediction, and completion. Our video result demonstration can be viewed at https://iamacewhite.github.io/supp/index.html

Comment: Under review for CVPR 2018. Haoye and Chunyan have equal contributio
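The three tasks in this abstract differ only in which frames are supplied as constraints. As a rough illustration, the sketch below builds a per-frame mask marking the given frames for each task. The function name, the task strings, and the `n_given` parameter are hypothetical; the abstract does not describe how the authors encode constraints.

```python
from typing import List


def constraint_mask(task: str, length: int, n_given: int = 1) -> List[bool]:
    """Return a mask where True marks a frame supplied as input.

    generation: no input frames
    prediction: the first `n_given` frames are supplied
    completion: the first and last frames are supplied
    """
    if length < 2:
        raise ValueError("sequence must contain at least two frames")

    mask = [False] * length
    if task == "generation":
        return mask
    if task == "prediction":
        for i in range(min(n_given, length)):
            mask[i] = True
        return mask
    if task == "completion":
        mask[0] = True
        mask[-1] = True
        return mask
    raise ValueError(f"unknown task: {task!r}")


generation = constraint_mask("generation", 5)
prediction = constraint_mask("prediction", 5, n_given=2)
completion = constraint_mask("completion", 5)
```

In a two-stage setup like the one described, such a mask would presumably apply to the pose sequence in the first stage, with the skeleton-to-image network then rendering every frame; that placement is an assumption rather than something the abstract states.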

arXiv.org e-Print Archive

Crossref