VIBE: Video Inference for Human Body Pose and Shape Estimation
Human motion is fundamental to understanding behavior. Despite progress on
single-image 3D pose and shape estimation, existing video-based
state-of-the-art methods fail to produce accurate and natural motion sequences
due to a lack of ground-truth 3D motion data for training. To address this
problem, we propose Video Inference for Body Pose and Shape Estimation (VIBE),
which makes use of an existing large-scale motion capture dataset (AMASS)
together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty
is an adversarial learning framework that leverages AMASS to discriminate
between real human motions and those produced by our temporal pose and shape
regression networks. We define a temporal network architecture and show that
adversarial training, at the sequence level, produces kinematically plausible
motion sequences without in-the-wild ground-truth 3D labels. We perform
extensive experimentation to analyze the importance of motion and demonstrate
the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving
state-of-the-art performance. Code and pretrained models are available at
https://github.com/mkocabas/VIBE.
Comment: CVPR-2020 camera ready.
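The sequence-level adversarial idea in this abstract can be illustrated with a small sketch. It assumes a least-squares GAN objective, which the abstract itself does not specify, and it takes discriminator scores as plain numbers rather than computing them from motion sequences. The function names `discriminator_loss` and `generator_adversarial_loss` are hypothetical and are not taken from the VIBE code base.

```python
from statistics import fmean


def discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss (assumed objective, not confirmed by the abstract).

    real_scores: scores the discriminator assigns to real motion sequences (e.g. from AMASS).
    fake_scores: scores it assigns to sequences produced by the regressor.
    Real scores are pushed toward 1, generated scores toward 0.
    """
    real_term = fmean([(s - 1.0) ** 2 for s in real_scores])
    fake_term = fmean([s ** 2 for s in fake_scores])
    return real_term + fake_term


def generator_adversarial_loss(fake_scores):
    """Adversarial term for the regressor: it wants its sequences scored as real (1)."""
    return fmean([(s - 1.0) ** 2 for s in fake_scores])


if __name__ == "__main__":
    # One score per whole sequence, so plausibility is judged at the sequence level.
    print(discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(generator_adversarial_loss([0.2, 0.1]))
```

Because each score summarizes an entire sequence rather than a single frame, the regressor is penalized for implausible motion even when every individual pose looks reasonable.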
Self Adversarial Training for Human Pose Estimation
This paper presents a deep learning based approach to the problem of human
pose estimation. We employ generative adversarial networks as our learning
paradigm in which we set up two stacked hourglass networks with the same
architecture, one as the generator and the other as the discriminator. The
generator is used as a human pose estimator after the training is done. The
discriminator distinguishes ground-truth heatmaps from generated ones, and
back-propagates the adversarial loss to the generator. This process enables the
generator to learn plausible human body configurations and is shown to be
useful for improving the prediction accuracy.
Comment: CVPR 2017 Workshop on Visual Understanding of Humans in Crowd Scene
and the 1st Look Into Person (LIP) Challenge
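As a rough illustration of how an adversarial term can be added to heatmap regression, the sketch below builds a Gaussian ground-truth heatmap for one keypoint and combines a mean-squared-error term with a weighted adversarial term. The binary cross-entropy form of the adversarial term, the weight `adv_weight`, and all function names are assumptions made for illustration. The abstract does not state the paper's exact loss, and the paper's discriminator may be formulated differently.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.0):
    """Ground-truth heatmap: a 2D Gaussian centred on the keypoint (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def mse(pred, target):
    """Mean squared error between two heatmaps of equal size."""
    n = len(pred) * len(pred[0])
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ) / n


def generator_loss(pred, target, disc_score, adv_weight=0.01):
    """Regression loss plus a weighted adversarial term (assumed form).

    disc_score: the discriminator's probability, in (0, 1], that `pred` is a
    ground-truth heatmap. The generator is rewarded when this is close to 1.
    """
    adversarial = -math.log(disc_score)
    return mse(pred, target) + adv_weight * adversarial


if __name__ == "__main__":
    target = gaussian_heatmap(5, 5, cx=2, cy=2)
    pred = gaussian_heatmap(5, 5, cx=3, cy=2)  # prediction off by one pixel
    print(generator_loss(pred, target, disc_score=0.3))
```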
Establishing a Fusion Model of Attention Mechanism and Generative Adversarial Network to Estimate Students' Attitudes in English Classes
With the rapid development of science and technology, artificial intelligence has been widely used in various fields and a new model of AI-aided education has been developed in the new era. In the education industry, AI-aided education can save teachers' energy, improve teaching efficiency and help to refine teaching methods. In order to estimate students' attitudes towards English teachers' lectures, this paper proposed an AI-aided feedback system. In the constructed system, DG-Net was used to expand the data sets of students, and combined with Attention's Alphapose model to collect students' listening poses. The whole model provided feedback of students' listening postures in English speaking and listening classes, assisting teachers to estimate students' attitudes through data analysis and realizing AI-aided education in English classes.
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works, whether based on traditional methods or on deep learning
networks. First, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and the corresponding challenges. Second, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Third, we
analyse the concepts of multi-task learning and multi-label learning, and
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures that
have been widely applied in the deep learning community. Fourth, we analyse
popular solutions for this task, such as attribute-group and part-based
approaches, etc. Fifth, we show some applications that take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at https://sites.google.com/view/ahu-pedestrianattributes/.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
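The multi-label view of attribute recognition mentioned in this abstract can be sketched as follows: each attribute gets its own sigmoid output, so several attributes can be active for the same pedestrian. This is a generic formulation and is not taken from the survey itself. The attribute names, the 0.5 threshold, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_attributes(logits, names, threshold=0.5):
    """Return every attribute whose independent probability reaches the threshold."""
    return [name for name, z in zip(names, logits) if sigmoid(z) >= threshold]


def multilabel_bce(logits, labels, eps=1e-12):
    """Binary cross-entropy averaged over attributes; labels are 0 or 1."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)


if __name__ == "__main__":
    names = ["backpack", "hat", "long_hair"]  # hypothetical attribute set
    logits = [2.0, -3.0, 0.5]
    print(predict_attributes(logits, names))
    print(multilabel_bce(logits, [1, 0, 1]))
```

Unlike a softmax over classes, the per-attribute sigmoid does not force attributes to compete, which matches the fact that one person can carry a backpack and have long hair at the same time.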