5,077 research outputs found
Human Motion Capture Data Tailored Transform Coding
Human motion capture (mocap) is a widely used technique for digitalizing
human movements. With growing usage, compressing mocap data has received
increasing attention, since compact data size enables efficient storage and
transmission. Our analysis shows that mocap data have some unique
characteristics that distinguish themselves from images and videos. Therefore,
directly borrowing image or video compression techniques, such as discrete
cosine transform, does not work well. In this paper, we propose a novel
mocap-tailored transform coding algorithm that takes advantage of these
features. Our algorithm segments the input mocap sequences into clips, which
are represented in 2D matrices. Then it computes a set of data-dependent
orthogonal bases to transform the matrices to frequency domain, in which the
transform coefficients have significantly less dependency. Finally, the
compression is obtained by entropy coding of the quantized coefficients and the
bases. Our method has low computational cost and can be easily extended to
compress mocap databases. It also requires neither training nor complicated
parameter setting. Experimental results demonstrate that the proposed scheme
significantly outperforms state-of-the-art algorithms in terms of compression
performance and speed
Low-latency compression of mocap data using learned spatial decorrelation transform
Due to the growing needs of human motion capture (mocap) in movie, video
games, sports, etc., it is highly desired to compress mocap data for efficient
storage and transmission. This paper presents two efficient frameworks for
compressing human mocap data with low latency. The first framework processes
the data in a frame-by-frame manner so that it is ideal for mocap data
streaming and time critical applications. The second one is clip-based and
provides a flexible tradeoff between latency and compression performance. Since
mocap data exhibits some unique spatial characteristics, we propose a very
effective transform, namely learned orthogonal transform (LOT), for reducing
the spatial redundancy. The LOT problem is formulated as minimizing square
error regularized by orthogonality and sparsity and solved via alternating
iteration. We also adopt a predictive coding and temporal DCT for temporal
decorrelation in the frame- and clip-based frameworks, respectively.
Experimental results show that the proposed frameworks can produce higher
compression performance at lower computational cost and latency than the
state-of-the-art methods.Comment: 15 pages, 9 figure
Learning from Synthetic Humans
Estimating human pose, shape, and motion from images and videos are
fundamental challenges with many applications. Recent advances in 2D human pose
estimation use large amounts of manually-labeled training data for learning
convolutional neural networks (CNNs). Such data is time consuming to acquire
and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion
is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL
tasks): a new large-scale dataset with synthetically-generated but realistic
images of people rendered from 3D sequences of human motion capture data. We
generate more than 6 million frames together with ground truth pose, depth
maps, and segmentation masks. We show that CNNs trained on our synthetic
dataset allow for accurate human depth estimation and human part segmentation
in real RGB images. Our results and the new dataset open up new possibilities
for advancing person analysis using cheap and large-scale synthetic data.Comment: Appears in: 2017 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2017). 9 page
Subspace procrustes analysis
Postprint (author's final draft
- …