Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis
Deep person generation has attracted extensive research attention due to its
wide applications in virtual agents, video conferencing, online shopping and
art/movie production. With the advancement of deep learning, visual appearances
(face, pose, cloth) of a person image can be easily generated or manipulated on
demand. In this survey, we first summarize the scope of person generation, and
then systematically review recent progress and technical trends in deep person
generation, covering three major tasks: talking-head generation (face),
pose-guided person generation (pose) and garment-oriented person generation
(cloth). More than two hundred papers are covered for a thorough overview, and
milestone works are highlighted to mark the major technical breakthroughs.
Building on these fundamental tasks, a number of applications are
investigated, e.g., virtual fitting, digital humans, and generative data
augmentation. We hope this survey can shed light on the future prospects of
deep person generation and provide a helpful foundation for building
full-fledged digital-human applications.
Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
Motion representation plays a vital role in human action recognition in
videos. In this study, we introduce a novel compact motion representation for
video action recognition, named Optical Flow guided Feature (OFF), which
enables the network to distill temporal information through a fast and robust
approach. The OFF is derived from the definition of optical flow and is
orthogonal to the optical flow. The derivation also provides theoretical
support for using the difference between two frames. By directly calculating
pixel-wise spatiotemporal gradients of the deep feature maps, the OFF can be
embedded in any existing CNN-based video action recognition framework at only
a slight additional cost. It enables the CNN to extract spatial and temporal
information simultaneously, in particular the temporal information between frames.
This simple but powerful idea is validated by experimental results. The network
with OFF, fed only RGB inputs, achieves a competitive accuracy of 93.3% on
UCF-101, comparable with the result obtained by two streams (RGB and
optical flow) while running 15 times faster. Experimental results also show
that OFF is complementary to other motion modalities such as optical flow. When
the proposed method is plugged into the state-of-the-art video action
recognition framework, it achieves 96.0% and 74.2% accuracy on UCF-101 and
HMDB-51, respectively. The code for this project is available at
https://github.com/kevin-ssy/Optical-Flow-Guided-Feature.
Comment: CVPR 2018.
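The OFF construction is simple enough to sketch: it stacks the spatial gradients of a frame's feature map with the temporal difference between the feature maps of two nearby frames. The PyTorch snippet below is a minimal illustration of that idea, not the authors' released implementation; the function name, the Sobel approximation of the spatial gradients, and the tensor shapes are all illustrative assumptions.

import torch
import torch.nn.functional as F

def optical_flow_guided_feature(feat_t, feat_t1):
    # feat_t, feat_t1: (N, C, H, W) feature maps of frames t and t + dt
    # from any backbone CNN. All names here are illustrative sketches,
    # not the authors' code.
    c = feat_t.shape[1]
    # Sobel filters approximate the spatial gradients df/dx and df/dy,
    # applied depthwise so each channel is differentiated independently.
    sobel_x = torch.tensor([[-1., 0., 1.],
                            [-2., 0., 2.],
                            [-1., 0., 1.]]).view(1, 1, 3, 3).repeat(c, 1, 1, 1)
    sobel_y = sobel_x.transpose(2, 3)
    fx = F.conv2d(feat_t, sobel_x, padding=1, groups=c)
    fy = F.conv2d(feat_t, sobel_y, padding=1, groups=c)
    # Temporal gradient: the frame-to-frame feature difference, the term
    # whose use the paper supports theoretically.
    ft = feat_t1 - feat_t
    # OFF stacks the three gradient components along the channel axis.
    return torch.cat([fx, fy, ft], dim=1)

# Usage with dummy features: the output has 3x the input channels.
f_t, f_t1 = torch.randn(2, 64, 28, 28), torch.randn(2, 64, 28, 28)
off = optical_flow_guided_feature(f_t, f_t1)
print(off.shape)  # torch.Size([2, 192, 28, 28])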