1,881 research outputs found

    VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera

    Full text link
    We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our approach the first monocular RGB method usable in real-time applications such as 3D character control; thus far, the only monocular methods for such applications employed specialized RGB-D cameras. Our method's accuracy is quantitatively on par with the best offline 3D monocular RGB pose estimation methods. Our results are qualitatively comparable to, and sometimes better than, results from monocular RGB-D approaches, such as the Kinect. However, we show that our approach is more broadly applicable than RGB-D solutions, i.e. it works for outdoor scenes, community videos, and low-quality commodity RGB cameras.
    Comment: Accepted to SIGGRAPH 2017
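
    The joint 2D/3D regression can be pictured with a small example. The sketch below is a loose PyTorch illustration of a location-map-style head, not the authors' actual network: per joint it emits a 2D heatmap plus X/Y/Z location maps and reads the 3D position at the heatmap maximum. Channel counts and joint numbers are assumptions.

```python
import torch
import torch.nn as nn

class PoseRegressor(nn.Module):
    """Toy fully-convolutional head: per joint, a 2D heatmap plus
    X/Y/Z location maps (4 channels per joint in total)."""
    def __init__(self, in_channels=256, num_joints=21):
        super().__init__()
        self.num_joints = num_joints
        self.head = nn.Conv2d(in_channels, num_joints * 4, kernel_size=1)

    def forward(self, features):
        maps = self.head(features)                       # (B, 4*J, H, W)
        B, _, H, W = maps.shape
        maps = maps.view(B, self.num_joints, 4, H, W)
        heatmaps = maps[:, :, 0]                         # 2D joint likelihoods
        loc_maps = maps[:, :, 1:]                        # X/Y/Z location maps
        # Read each joint's 3D coordinates at its 2D heatmap maximum.
        idx = heatmaps.reshape(B, self.num_joints, -1).argmax(dim=-1)
        loc_flat = loc_maps.reshape(B, self.num_joints, 3, -1)
        joints_3d = torch.gather(
            loc_flat, 3, idx[:, :, None, None].expand(-1, -1, 3, -1)
        ).squeeze(-1)                                    # (B, J, 3)
        return heatmaps, joints_3d

# Example: features from some backbone, 21 joints.
heat, xyz = PoseRegressor()(torch.randn(1, 256, 32, 32))  # xyz: (1, 21, 3)
```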

    Radar and RGB-depth sensors for fall detection: a review

    Get PDF
    This paper reviews recent works in the literature on the use of systems based on radar and RGB-Depth (RGB-D) sensors for fall detection, and discusses outstanding research challenges and trends related to this research field. Systems to reliably detect fall events and promptly alert carers and first responders have gained significant interest in the past few years in order to address the societal issue of an increasing number of elderly people living alone, with the associated risk of them falling and the consequences in terms of health treatments, reduced well-being, and costs. The interest in radar and RGB-D sensors is related to their capability to enable contactless and non-intrusive monitoring, which is an advantage for practical deployment and users’ acceptance and compliance, compared with other sensor technologies, such as video-cameras or wearables. Furthermore, the possibility of combining and fusing information from these heterogeneous types of sensors is expected to improve the overall performance of practical fall detection systems. Researchers from different fields can benefit from multidisciplinary knowledge and awareness of the latest developments in radar and RGB-D sensors that this paper discusses.
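
    As a concrete illustration of fusing the two sensor streams at the decision level, the hypothetical snippet below averages per-frame fall probabilities from a radar classifier and an RGB-D classifier. The function name, weights, and threshold are invented for the example and are not taken from any reviewed system.

```python
import numpy as np

def fuse_fall_scores(p_radar, p_rgbd, w_radar=0.5):
    """Hypothetical late fusion: weighted average of per-frame fall
    probabilities from a radar branch and an RGB-D branch."""
    p_radar = np.asarray(p_radar, dtype=float)
    p_rgbd = np.asarray(p_rgbd, dtype=float)
    return w_radar * p_radar + (1.0 - w_radar) * p_rgbd

# Example: the two sensors disagree mildly; fusion smooths the decision.
fused = fuse_fall_scores([0.2, 0.9, 0.8], [0.1, 0.7, 0.95])
alarm = fused > 0.5   # boolean per-frame fall decision
```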

    A multi-viewpoint feature-based re-identification system driven by skeleton keypoints

    Get PDF
    Thanks to the increasing popularity of 3D sensors, robotic vision has experienced huge improvements in a wide range of applications and systems in recent years. Besides the many benefits, this migration caused some incompatibilities with those systems that cannot be based on range sensors, like intelligent video surveillance systems, since the two kinds of sensor data lead to different representations of people and objects. This work goes in the direction of bridging that gap, and presents a novel re-identification system that takes advantage of multiple video flows in order to enhance the performance of a skeletal tracking algorithm, which is in turn exploited for driving the re-identification. A new, geometry-based method is introduced for joining together the detections provided by the skeletal tracker from multiple video flows; it is capable of dealing with many people in the scene and of coping with the errors introduced in each view by the skeletal tracker. This method has a high degree of generality and can be applied to any kind of body pose estimation algorithm. The system was tested on a public dataset for video surveillance applications, demonstrating the improvements achieved by the multi-viewpoint approach in the accuracy of both body pose estimation and re-identification. The proposed approach was also compared with a skeletal tracking system working on 3D data; the comparison confirmed the good performance of the multi-viewpoint approach. This means that the lack of the rich information provided by 3D sensors can be compensated for by the availability of more than one viewpoint.
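
    To make the geometry-based joining idea concrete, here is a deliberately simplified Python sketch: skeleton centroids from each view are assumed to be already mapped into a shared world frame, and detections falling within a distance threshold are merged into one person. The paper's actual method is more elaborate; all names and thresholds here are illustrative.

```python
import numpy as np

def join_detections(views, max_dist=0.5):
    """Toy multi-view association: merge per-view skeleton centroids
    (world coordinates, metres) that lie within `max_dist` of an
    already-formed person, keeping a running mean per person."""
    people = []                          # list of [coord_sum, count]
    for detections in views:             # detections: (N, 2) world coords
        for d in np.asarray(detections, dtype=float):
            for p in people:
                if np.linalg.norm(p[0] / p[1] - d) < max_dist:
                    p[0] += d            # accumulate into this person
                    p[1] += 1
                    break
            else:
                people.append([d.copy(), 1])   # start a new person
    return [p[0] / p[1] for p in people]       # fused centroids

# Two cameras see the same two people with small localization errors.
fused = join_detections([[(0.0, 0.0), (2.0, 1.0)],
                         [(0.1, -0.05), (1.95, 1.1)]])
```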

    An original framework for understanding human actions and body language by using deep neural networks

    Get PDF
    The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields: the former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first one, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs) is provided. The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements. All the modules were tested on challenging datasets that are well known in the state of the art, showing remarkable results compared to current literature methods.
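
    As an illustration of the second module's two-branch design, the PyTorch sketch below feeds raw 2D joint coordinates to one stacked LSTM and frame-to-frame joint motion to another, then concatenates the final hidden states for classification. Layer sizes, joint counts, and class counts are assumptions, not the thesis's actual configuration.

```python
import torch
import torch.nn as nn

class TwoBranchLSTM(nn.Module):
    """Illustrative two-branch stacked LSTM over 2D skeleton sequences:
    one branch consumes joint positions, the other their temporal
    differences; branch features are concatenated and classified."""
    def __init__(self, num_joints=18, hidden=128, num_classes=10):
        super().__init__()
        feat = num_joints * 2   # (x, y) per joint
        self.pose_branch = nn.LSTM(feat, hidden, num_layers=2, batch_first=True)
        self.motion_branch = nn.LSTM(feat, hidden, num_layers=2, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, poses):                    # poses: (B, T, J*2)
        motion = poses[:, 1:] - poses[:, :-1]    # frame-to-frame motion
        _, (h_pose, _) = self.pose_branch(poses)
        _, (h_mot, _) = self.motion_branch(motion)
        fused = torch.cat([h_pose[-1], h_mot[-1]], dim=-1)
        return self.classifier(fused)            # class logits

# Example: 4 clips, 30 frames, 18 joints with (x, y) each.
logits = TwoBranchLSTM()(torch.randn(4, 30, 36))
```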

    Human Body Posture Recognition Approaches: A Review

    Get PDF
    Human body posture recognition has become the focus of many researchers in recent years. Recognition of body posture is used in various applications, including surveillance, security, and health monitoring. However, systems that determine the body’s posture through video clips, images, or sensor data face many challenges when used in the real world. This paper provides an important review of how the most essential hardware technologies are used in posture recognition systems. These systems capture and collect datasets through accelerometer sensors or computer vision. In addition, this paper presents a comparison study with the state of the art in terms of accuracy. We also present the advantages and limitations of each system and suggest promising future ideas that can increase the efficiency of existing posture recognition systems. Finally, the most common datasets applied in these systems are described in detail. This review aims to serve as a resource for choosing among human body posture recognition methods and the techniques that suit each one. It analyzes more than 80 papers published between 2015 and 202
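
    For a flavour of the accelerometer side of such systems, the toy function below estimates trunk tilt from the gravity direction and thresholds it into coarse postures. The thresholds and labels are invented for illustration and do not come from any surveyed system.

```python
import numpy as np

def posture_from_accel(ax, ay, az):
    """Hypothetical accelerometer-only baseline: the angle between the
    sensor's z-axis and gravity gives a coarse trunk tilt, which is
    thresholded into posture labels."""
    tilt = np.degrees(np.arccos(az / np.sqrt(ax**2 + ay**2 + az**2)))
    if tilt < 30:
        return "upright"
    if tilt < 65:
        return "leaning"
    return "lying"

print(posture_from_accel(0.1, 0.2, 0.97))   # -> "upright"
```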

    Fusion of pose and head tracking data for immersive mixed-reality application development

    Get PDF
    This work addresses the creation of a development framework where application developers can create, in a natural way, immersive physical activities in which users experience a 3D first-person perception of full body control. The proposed framework is based on commercial motion sensors and a Head-Mounted Display (HMD), and uses Unity 3D as a unifying environment where user pose, virtual scene, and immersive visualization functions are coordinated. Our proposal is exemplified by the development of a toy application showing its practical use.
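
    The core fusion step can be sketched in a few lines: the body tracker supplies the head-joint position, the HMD supplies orientation, and composing the two yields the virtual camera pose. The snippet below is a minimal numpy illustration (the described framework itself runs in Unity 3D); coordinate-frame alignment and filtering, which a real system needs, are omitted.

```python
import numpy as np

def first_person_camera(head_pos, hmd_rotation):
    """Minimal sketch of the fusion step: build a 4x4 world transform
    for the virtual camera from the tracked head position and the
    HMD's 3x3 rotation matrix. Frame alignment is assumed done."""
    T = np.eye(4)
    T[:3, :3] = hmd_rotation   # orientation from the HMD
    T[:3, 3] = head_pos        # position from the body tracker
    return T

# Head 1.7 m up, looking straight ahead.
cam = first_person_camera(np.array([0.0, 1.7, 0.0]), np.eye(3))
```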

    Knowledge Representation for Robots through Human-Robot Interaction

    Full text link
    The representation of the knowledge needed by a robot to perform complex tasks is restricted by the limitations of perception. One possible way of overcoming this situation and designing "knowledgeable" robots is to rely on interaction with the user. We propose a multi-modal interaction framework that allows the robot to effectively acquire knowledge about the environment where it operates. In particular, in this paper we present a rich representation framework that can be automatically built from the metric map annotated with the indications provided by the user. Such a representation then allows the robot to ground complex referential expressions for motion commands and to devise topological navigation plans to reach the target locations.
    Comment: Knowledge Representation and Reasoning in Robotics Workshop at ICLP 201
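
    If the annotated metric map is reduced to a graph of labeled places, topological planning over it is straightforward. The sketch below, using networkx, is a minimal illustration under that assumption; the place names are invented and this is not the paper's actual representation or grounding procedure.

```python
import networkx as nx

# Toy topological map built from user annotations on the metric map:
# nodes are labeled places, edges are traversable connections.
topo = nx.Graph()
topo.add_edges_from([
    ("corridor", "kitchen"),
    ("corridor", "office"),
    ("office", "printer room"),
])

def plan(start, target):
    """Plan a topological route between two grounded place labels.
    (Illustrative; the paper's referential grounding is richer.)"""
    return nx.shortest_path(topo, source=start, target=target)

print(plan("kitchen", "printer room"))
# ['kitchen', 'corridor', 'office', 'printer room']
```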