RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.
RGB-D-based Action Recognition Datasets: A Survey
Human action recognition from RGB-D (Red, Green, Blue and Depth) data has
attracted increasing attention since the first work reported in 2010. Over this
period, many benchmark datasets have been created to facilitate the development
and evaluation of new algorithms. This raises the question of which dataset to
select and how to use it in providing a fair and objective comparative
evaluation against state-of-the-art methods. To address this issue, this paper
provides a comprehensive review of the most commonly used action recognition
related RGB-D video datasets, including 27 single-view datasets, 10 multi-view
datasets, and 7 multi-person datasets. The detailed information and analysis of
these datasets provide a useful resource for guiding the insightful selection of
datasets for future research. In addition, the issues with current algorithm
evaluation vis-à-vis the limitations of the available datasets and evaluation
protocols are also highlighted, resulting in a number of recommendations for the
collection of new datasets and the use of evaluation protocols.
Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot
We explore new aspects of assistive living centred on smart human-robot
interaction (HRI), involving automatic recognition and online validation of
speech and gestures in a natural interface and providing social features for
HRI. We introduce a complete framework and resources for a real-life scenario
in which elderly subjects are supported by an assistive bathing robot,
addressing health and hygiene care issues. We contribute a new dataset, a suite
of tools used for data acquisition, and a state-of-the-art pipeline for
multimodal learning within the framework of the I-Support bathing robot, with
emphasis on the audio and RGB-D visual streams. We address privacy concerns by
evaluating the depth visual stream alongside the RGB stream, both captured with
Kinect sensors. The audio-gestural recognition task on this new dataset yields
recognition rates of up to 84.5%, while online validation of the I-Support
system with elderly users achieves up to 84% when the two modalities are fused
together. These results are promising enough to support further research on
multimodal recognition for assistive social HRI, given the difficulty of the
specific task. Upon acceptance of the paper, part of the data will be made
publicly available.
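The gains reported when the two modalities are fused suggest score-level (late)
fusion; the minimal sketch below combines per-class scores from an audio model
and an RGB-D gesture model. All names, the weight, and the class count are
hypothetical illustrations, not the I-Support system's actual fusion scheme.

    import numpy as np

    def fuse_modalities(audio_scores: np.ndarray,
                        gesture_scores: np.ndarray,
                        audio_weight: float = 0.5) -> int:
        """Score-level (late) fusion of two modality classifiers.

        audio_scores / gesture_scores: per-class probability vectors from
        independent audio and RGB-D gesture recognisers (hypothetical models).
        Returns the index of the fused prediction.
        """
        # Weighted sum of the probability vectors; the 0.5 weight is an
        # illustrative assumption, not a value reported in the paper.
        fused = audio_weight * audio_scores + (1.0 - audio_weight) * gesture_scores
        return int(np.argmax(fused))

    # Example: three candidate commands, with the modalities disagreeing.
    audio = np.array([0.6, 0.3, 0.1])       # audio model favours command 0
    gesture = np.array([0.2, 0.7, 0.1])     # gesture model favours command 1
    print(fuse_modalities(audio, gesture))  # fused decision: command 1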
Skeleton-Based Human Action and Gesture Recognition for Human-Robot Collaboration
The continuous development of robotic and sensing technologies has led in recent years to increased interest in human-robot collaborative systems, in which humans and robots perform tasks in shared spaces and interact in close, direct contact. In these scenarios, it is fundamental for the robot to be aware of the behaviour of a person in its proximity, both to ensure that person's safety and to anticipate their actions when performing a shared, collaborative task. To this end, human activity recognition (HAR) techniques have often been applied in human-robot collaboration (HRC) settings. Works in this field usually focus on case-specific applications; in this thesis, instead, we propose a general framework for human action and gesture recognition in an HRC scenario.
In particular, a transfer-learning-enabled, skeleton-based approach that employs the Shift-GCN architecture as its backbone is used to classify general actions related to HRC scenarios. Pose-based body and hand features are exploited to recognise actions in a way that is independent of the environment in which they are performed and of the tools and objects involved in their execution. The fusion of small network modules, each dedicated to recognising either body or hand movements, is then explored; this both clarifies the importance of different body parts in recognising the actions and improves the classification outcomes.
For our experiments, we used the large-scale NTU RGB+D dataset to pre-train the networks. Moreover, a new HAR dataset, the IAS-Lab Collaborative HAR dataset, was collected, containing general actions and gestures related to HRC contexts. On this dataset, our approach reaches 76.54% accuracy.
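As a rough illustration of the body/hands module fusion described above, the
sketch below combines two per-stream feature extractors with a joint classifier.
The Shift-GCN backbones are replaced with plain linear layers so the example
stays self-contained; all names, input dimensions, and the class count are
illustrative assumptions, not values from the thesis.

    import torch
    import torch.nn as nn

    class BodyHandsFusion(nn.Module):
        """Late fusion of a body-stream and a hands-stream classifier.

        In the thesis the two streams are Shift-GCN backbones pre-trained on
        NTU RGB+D; here each backbone is a placeholder MLP so the sketch
        stays self-contained.
        """

        def __init__(self, feat_dim: int = 256, num_classes: int = 10):
            super().__init__()
            # 25 body joints x 3D coords = 75 inputs (NTU RGB+D skeleton layout).
            self.body_net = nn.Sequential(nn.Linear(75, feat_dim), nn.ReLU())
            # 2 hands x 21 joints x 3D coords = 126 inputs (assumed hand layout).
            self.hands_net = nn.Sequential(nn.Linear(126, feat_dim), nn.ReLU())
            # Joint classifier over the concatenated per-stream features.
            self.classifier = nn.Linear(2 * feat_dim, num_classes)

        def forward(self, body_pose: torch.Tensor,
                    hands_pose: torch.Tensor) -> torch.Tensor:
            fused = torch.cat([self.body_net(body_pose),
                               self.hands_net(hands_pose)], dim=-1)
            return self.classifier(fused)

    model = BodyHandsFusion()
    body = torch.randn(4, 75)    # batch of 4 flattened body skeletons
    hands = torch.randn(4, 126)  # matching flattened hand skeletons
    logits = model(body, hands)  # shape (4, 10): one score per action class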