VIBE: Video Inference for Human Body Pose and Shape Estimation

Authors: Athanasiou Nikos; Black Michael J.; Kocabas Muhammed
Publication date: 01/01/2020

Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training. To address this problem, we propose Video Inference for Body Pose and Shape Estimation (VIBE), which makes use of an existing large-scale motion capture dataset (AMASS) together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty is an adversarial learning framework that leverages AMASS to discriminate between real human motions and those produced by our temporal pose and shape regression networks. We define a temporal network architecture and show that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels. We perform extensive experimentation to analyze the importance of motion and demonstrate the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving state-of-the-art performance. Code and pretrained models are available at https://github.com/mkocabas/VIBE.

Comment: CVPR-2020 camera ready.

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Self Adversarial Training for Human Pose Estimation

Authors: arjovsky; belagiannis; berthelot; bulat; cao; carreira; chen; chen; chu; gkioxari; gong; goodfellow; gulrajani; insafutdinov; isola; ledig; lifshitz; luc; mirza; newell; pan; pishchulin; radford; rafi; ramakrishna; tompson; wei; zhao
Publication date: 15/08/2017

This paper presents a deep-learning-based approach to the problem of human pose estimation. We employ generative adversarial networks as our learning paradigm, in which we set up two stacked hourglass networks with the same architecture, one as the generator and the other as the discriminator. The generator is used as a human pose estimator after training is done. The discriminator distinguishes ground-truth heatmaps from generated ones and back-propagates the adversarial loss to the generator. This process enables the generator to learn plausible human body configurations and is shown to be useful for improving the prediction accuracy.

Comment: CVPR 2017 Workshop on Visual Understanding of Humans in Crowd Scene and the 1st Look Into Person (LIP) Challenge
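As a rough illustration of the generator/discriminator interplay described in this abstract, the sketch below computes the two losses of a standard GAN objective (binary cross-entropy, non-saturating generator loss) from scalar discriminator scores. This is an assumption made for illustration only: the abstract does not state the exact loss the paper uses, and the function names and score values are hypothetical.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability in (0, 1)."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Discriminator aims to score ground-truth heatmaps as 1 and generated ones as 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_adversarial_loss(d_fake):
    """Generator aims to have its heatmaps scored as real (non-saturating form)."""
    return bce(d_fake, 1.0)


# Hypothetical discriminator outputs for one ground-truth and one generated heatmap.
d_real, d_fake = 0.9, 0.2
print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator adversarial loss:", generator_adversarial_loss(d_fake))
```

In a full training loop this adversarial term would typically be added to the usual heatmap regression loss for the generator; how the two are weighted is not given in the abstract.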

arXiv.org e-Print Archive

Crossref

Establishing a Fusion Model of Attention Mechanism and Generative Adversarial Network to Estimate Students' Attitudes in English Classes

Authors: Song Tianyi; Zhao Tong
Publication venue: 'Mechanical Engineering Faculty in Slavonski Brod'
Publication date: 01/01/2022

With the rapid development of science and technology, artificial intelligence has been widely used in various fields, and a new model of AI-aided education has been developed in the new era. In the education industry, AI-aided education can save teachers' energy, improve teaching efficiency and help to refine teaching methods. In order to estimate students' attitudes towards English teachers' lectures, this paper proposes an AI-aided feedback system. In the constructed system, DG-Net is used to expand the data sets of students and is combined with Attention's Alphapose model to collect students' listening poses. The whole model provides feedback on students' listening postures in English speaking and listening classes, assisting teachers in estimating students' attitudes through data analysis and realizing AI-aided education in English classes.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Pedestrian Attribute Recognition: A Survey

Authors: Luo Bin; Tang Jin; Wang Xiao; Yang Rui; Zheng Shaofei
Publication date: 22/01/2019

Recognizing pedestrian attributes is an important task in the computer vision community because it plays an important role in video surveillance. Many algorithms have been proposed to handle this task. The goal of this paper is to review existing works using traditional methods or based on deep learning networks. Firstly, we introduce the background of pedestrian attribute recognition (PAR, for short), including the fundamental concepts of pedestrian attributes and the corresponding challenges. Secondly, we introduce existing benchmarks, including popular datasets and evaluation criteria. Thirdly, we analyse the concepts of multi-task learning and multi-label learning, and also explain the relations between these two learning algorithms and pedestrian attribute recognition. We also review some popular network architectures which have been widely applied in the deep learning community. Fourthly, we analyse popular solutions for this task, such as attribute groups, part-based methods, etc. Fifthly, we show some applications which take pedestrian attributes into consideration and achieve better performance. Finally, we summarize this paper and give several possible research directions for pedestrian attribute recognition. The project page of this paper can be found at the following website: https://sites.google.com/view/ahu-pedestrianattributes/.

Comment: Check our project page for a high-resolution version of this survey: https://sites.google.com/view/ahu-pedestrianattributes

arXiv.org e-Print Archive