
3,325 research outputs found

User Modelling for Personalised Dressing Assistance by Humanoid Robots

Author: Chang HJ
Demiris Y
Gao Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/06/2015

University of Birmingham Research Portal

Spiral - Imperial College Digital Repository

Enhancing egocentric 3D pose estimation with third person views

Author: Corona Puyane Enric
Dhamanaskar Ameya
Dimiccoli Mariella
Moreno-Noguer Francesc
Pumarola Peris Albert
Publication venue: Elsevier
Publication date: 01/06/2023

© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license.

We propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The main technical contribution consists of leveraging high-level features linking first- and third-views in a joint embedding space. To learn such an embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2000 videos depicting human activities captured from both first- and third-view perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedded space learned with our dataset is useful to extract discriminatory features from arbitrary single-view egocentric videos, with no need for any sort of domain adaptation or knowledge of camera parameters. An extensive evaluation demonstrates that we achieve significant improvement in egocentric 3D body pose estimation performance on two unconstrained datasets, over three supervised state-of-the-art approaches. The collected dataset and pre-trained model are available for research purposes.

This work has been partially supported by projects PID2020-120049RB-I00 and PID2019-110977GA-I00 funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”, as well as by grant RYC-2017-22563 funded by MCIN/AEI/10.13039/501100011033 and by “ESF Investing in your future”, and network RED2018-102511-T funded by MCIN/AEI.

Peer Reviewed

Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

Articulated Clinician Detection Using 3D Pictorial Structures on RGB-D Data

Author: Abdolrahim Kadkhodamohammadi
Afshin Gangi
Michel de Mathelin
Nicolas Padoy
Publication venue: 'Elsevier BV'
Publication date: 06/07/2016

Reliable human pose estimation (HPE) is essential to many clinical applications, such as surgical workflow analysis, radiation safety monitoring and human-robot cooperation. Proposed methods for the operating room (OR) rely either on foreground estimation using a multi-camera system, which is a challenge in real ORs due to color similarities and frequent illumination changes, or on wearable sensors or markers, which are invasive and therefore difficult to introduce in the room. Instead, we propose a novel approach based on Pictorial Structures (PS) and on RGB-D data, which can be easily deployed in real ORs. We extend the PS framework in two ways. First, we build robust and discriminative part detectors using both color and depth images. We also present a novel descriptor for depth images, called histogram of depth differences (HDD). Second, we extend PS to 3D by proposing 3D pairwise constraints and a new method that makes exact inference tractable. Our approach is evaluated for pose estimation and clinician detection on a challenging RGB-D dataset recorded in a busy operating room during live surgeries. We conduct a series of experiments to study the different part detectors in conjunction with the various 2D or 3D pairwise constraints. Our comparisons demonstrate that 3D PS with RGB-D part detectors significantly improves the results in a visually challenging operating environment.

Comment: The supplementary video is available at https://youtu.be/iabbGSqRSg

arXiv.org e-Print Archive

Crossref

HAL-Inserm

INRIA a CCSD electronic archive server

Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

Author: Black Michael J.
Pons-Moll Gerard
Rosenhahn Bodo
von Marcard Timo
Publication venue
Publication date: 24/03/2017

We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body. Since the problem is heavily under-constrained, previous methods either use a large number of sensors, which is intrusive, or require additional video input. We take a different approach and constrain the problem by: (i) making use of a realistic statistical body model that includes anthropometric constraints and (ii) using a joint optimization framework to fit the model to orientation and acceleration measurements over multiple frames. The resulting tracker, Sparse Inertial Poser (SIP), enables 3D human pose estimation using only 6 sensors (attached to the wrists, lower legs, back and head) and works for arbitrary human motions. Experiments on the recently released TNT15 dataset show that, using the same number of sensors, SIP achieves higher accuracy than the dataset baseline without using any video data. We further demonstrate the effectiveness of SIP on newly recorded challenging motions in outdoor scenarios such as climbing or jumping over a wall.

Comment: 12 pages, Accepted at Eurographics 201

arXiv.org e-Print Archive

MPG.PuRe

Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

Author: Horaud Radu
Lathuilière Stéphane
Massé Benoit
Mesejo Pablo
Publication venue: 'Elsevier BV'
Publication date: 23/04/2018

This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention onto groups of people from its own audio-visual experiences, independently of the number of people, their positions and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network using a simulated environment that mimics realistic scenarios involving speaking/silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust against parameter estimation, i.e. the parameter values yielded by the method do not have a decisive impact on the performance. The best results are obtained when both audio and visual information are jointly used. Experiments with the Nao robot indicate that our framework is a step forward towards the autonomous learning of socially acceptable gaze behavior.

Comment: Paper submitted to Pattern Recognition Letter

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1