Human from Blur: Human Pose Tracking from Blurry Images
We propose a method to estimate 3D human poses from substantially blurred
images. The key idea is to tackle the inverse problem of image deblurring by
modeling the forward problem with a 3D human model, a texture map, and a
sequence of poses to describe human motion. The blurring process is then
modeled by a temporal image aggregation step. Using a differentiable renderer,
we can solve the inverse problem by backpropagating the pixel-wise reprojection
error to recover the best human motion representation that explains one or
more input images. Since the image reconstruction loss alone is
insufficient, we present additional regularization terms. To the best of our
knowledge, we present the first method to tackle this problem. Our method
consistently outperforms other methods on significantly blurry inputs since
they lack one or more key functionalities that our method unifies, i.e.,
image deblurring with sub-frame accuracy and explicit 3D modeling of non-rigid
human motion.
Comment: typos and minor error fixes
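The forward model described in this abstract (a blurry image as a temporal aggregate of sharp renders along a motion) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: a 1D box stands in for the 3D human model and texture map, a single velocity stands in for the pose sequence, and a brute-force search stands in for backpropagation through a differentiable renderer. The names `render`, `blur`, and `fit_motion` are hypothetical and do not come from the paper.

```python
def render(position, width=32, size=4):
    """Render a sharp 1D 'image': a box of `size` bright pixels at `position`."""
    start = int(round(position))
    return [1.0 if start <= x < start + size else 0.0 for x in range(width)]


def blur(start_pos, velocity, n_subframes=8, width=32):
    """Forward model: average the sharp renders taken during the exposure."""
    acc = [0.0] * width
    for k in range(n_subframes):
        t = k / (n_subframes - 1)  # normalized time within the exposure, 0..1
        frame = render(start_pos + velocity * t, width)
        acc = [a + f for a, f in zip(acc, frame)]
    return [a / n_subframes for a in acc]


def reconstruction_error(observed, start_pos, velocity):
    """Pixel-wise squared error between the observed and the re-rendered blur."""
    predicted = blur(start_pos, velocity)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_motion(observed, candidate_positions, candidate_velocities):
    """Inverse problem by exhaustive search (a stand-in for gradient descent)."""
    candidates = ((p, v) for p in candidate_positions for v in candidate_velocities)
    return min(candidates, key=lambda pv: reconstruction_error(observed, *pv))


# Synthesize a blurry observation, then recover the motion that explains it.
observed = blur(5, 7)
best = fit_motion(observed, range(0, 20), range(0, 15))
print(best)
```

In this toy setting the reconstruction error alone identifies the motion. The abstract notes that for real images it does not, which is why the authors add regularization terms.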
Video Propagation Networks
We propose a technique that propagates information forward through video
data. The method is conceptually simple and can be applied to tasks that
require the propagation of structured information, such as semantic labels,
based on video content. We propose a 'Video Propagation Network' that processes
video frames in an adaptive manner. The model is applied online: it propagates
information forward without the need to access future frames. In particular we
combine two components: a temporal bilateral network for dense and
video-adaptive filtering, followed by a spatial network to refine features and
increase flexibility. We present experiments on video object segmentation and
semantic video segmentation and show increased performance compared to the
best previous task-specific methods, while having favorable runtime.
Additionally we demonstrate our approach on an example regression task of color
propagation in a grayscale video.
Comment: Appearing in Computer Vision and Pattern Recognition, 2017 (CVPR'17)
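The idea of propagating labels forward with content-adaptive filtering can be sketched with a plain bilateral-style weighted average. This is only a simplified illustration under stated assumptions: fixed Gaussian weights over position, intensity, and time replace the learned temporal bilateral network, there is no spatial refinement network, and frames are 1D lists of intensities. The function `propagate_labels` and its parameters are hypothetical.

```python
import math


def propagate_labels(prev_frames, prev_labels, cur_frame,
                     sigma_x=2.0, sigma_c=0.1, sigma_t=2.0):
    """Score each pixel of cur_frame using only past frames (online setting).

    prev_frames: list of past frames, oldest first, each a list of intensities.
    prev_labels: matching lists of foreground scores in [0, 1].
    """
    n_prev = len(prev_frames)
    scores = []
    for x, c in enumerate(cur_frame):
        num = den = 0.0
        for t, (frame, labels) in enumerate(zip(prev_frames, prev_labels)):
            dt = n_prev - t  # how many frames back in time this frame lies
            for y, (cy, ly) in enumerate(zip(frame, labels)):
                # Gaussian affinity in position, intensity, and time.
                w = math.exp(-((x - y) ** 2) / (2 * sigma_x ** 2)
                             - ((c - cy) ** 2) / (2 * sigma_c ** 2)
                             - (dt ** 2) / (2 * sigma_t ** 2))
                num += w * ly
                den += w
        scores.append(num / den if den > 0 else 0.0)
    return scores


# The bright object moves one pixel to the right between frames.
prev_frame = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
prev_label = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
cur_frame = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]
scores = propagate_labels([prev_frame], [prev_label], cur_frame)
print([round(s) for s in scores])
```

Because the weights depend on intensity as well as position, the label follows the object to its new location instead of staying at the old pixel coordinates.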
Deep Video Generation, Prediction and Completion of Human Action Sequences
Current deep learning results on video generation are limited, there are
only a few early results on video prediction, and there are no significant
results on video completion. This is due to the severe ill-posedness inherent
in these three problems. In this paper, we focus on human action videos, and
propose a general, two-stage deep framework to generate human action videos
with no constraints or an arbitrary number of constraints, which uniformly addresses
the three problems: video generation given no input frames, video prediction
given the first few frames, and video completion given the first and last
frames. To make the problem tractable, in the first stage we train a deep
generative model that generates a human pose sequence from random noise. In the
second stage, a skeleton-to-image network is trained, which is used to generate
a human action video given the complete human pose sequence generated in the
first stage. By introducing the two-stage strategy, we sidestep the original
ill-posed problems while producing for the first time high-quality video
generation/prediction/completion results of much longer duration. We present
quantitative and qualitative evaluations to show that our two-stage approach
outperforms state-of-the-art methods in video generation, prediction and video
completion. Our video result demonstration can be viewed at
https://iamacewhite.github.io/supp/index.html
Comment: Under review for CVPR 2018. Haoye and Chunyan have equal contribution