27,025 research outputs found

    RGB-D-based Action Recognition Datasets: A Survey

    Get PDF
    Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. This raises the question of which dataset to select and how to use it in providing a fair and objective comparative evaluation against state-of-the-art methods. To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and 7 multi-person datasets. The detailed information and analysis of these datasets is a useful resource in guiding insightful selection of datasets for future research. In addition, the issues with current algorithm evaluation vis-\'{a}-vis limitations of the available datasets and evaluation protocols are also highlighted; resulting in a number of recommendations for collection of new datasets and use of evaluation protocols

    Fall Prediction and Prevention Systems: Recent Trends, Challenges, and Future Research Directions.

    Get PDF
    Fall prediction is a multifaceted problem that involves complex interactions between physiological, behavioral, and environmental factors. Existing fall detection and prediction systems mainly focus on physiological factors such as gait, vision, and cognition, and do not address the multifactorial nature of falls. In addition, these systems lack efficient user interfaces and feedback for preventing future falls. Recent advances in internet of things (IoT) and mobile technologies offer ample opportunities for integrating contextual information about patient behavior and environment along with physiological health data for predicting falls. This article reviews the state-of-the-art in fall detection and prediction systems. It also describes the challenges, limitations, and future directions in the design and implementation of effective fall prediction and prevention systems

    Semantic Pose using Deep Networks Trained on Synthetic RGB-D

    Full text link
    In this work we address the problem of indoor scene understanding from RGB-D images. Specifically, we propose to find instances of common furniture classes, their spatial extent, and their pose with respect to generalized class models. To accomplish this, we use a deep, wide, multi-output convolutional neural network (CNN) that predicts class, pose, and location of possible objects simultaneously. To overcome the lack of large annotated RGB-D training sets (especially those with pose), we use an on-the-fly rendering pipeline that generates realistic cluttered room scenes in parallel to training. We then perform transfer learning on the relatively small amount of publicly available annotated RGB-D data, and find that our model is able to successfully annotate even highly challenging real scenes. Importantly, our trained network is able to understand noisy and sparse observations of highly cluttered scenes with a remarkable degree of accuracy, inferring class and pose from a very limited set of cues. Additionally, our neural network is only moderately deep and computes class, pose and position in tandem, so the overall run-time is significantly faster than existing methods, estimating all output parameters simultaneously in parallel on a GPU in seconds.Comment: ICCV 2015 Submissio

    Fast and Robust Detection of Fallen People from a Mobile Robot

    Full text link
    This paper deals with the problem of detecting fallen people lying on the floor by means of a mobile robot equipped with a 3D depth sensor. In the proposed algorithm, inspired by semantic segmentation techniques, the 3D scene is over-segmented into small patches. Fallen people are then detected by means of two SVM classifiers: the first one labels each patch, while the second one captures the spatial relations between them. This novel approach showed to be robust and fast. Indeed, thanks to the use of small patches, fallen people in real cluttered scenes with objects side by side are correctly detected. Moreover, the algorithm can be executed on a mobile robot fitted with a standard laptop making it possible to exploit the 2D environmental map built by the robot and the multiple points of view obtained during the robot navigation. Additionally, this algorithm is robust to illumination changes since it does not rely on RGB data but on depth data. All the methods have been thoroughly validated on the IASLAB-RGBD Fallen Person Dataset, which is published online as a further contribution. It consists of several static and dynamic sequences with 15 different people and 2 different environments

    Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade

    Full text link
    Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achieve accurate results, but have traditionally needed to be trained offline on the target scene, preventing relocalisation in new environments. Recently, we showed how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly. The adapted forests achieved relocalisation performance that was on par with that of offline forests, and our approach was able to estimate the camera pose in close to real time. In this paper, we present an extension of this work that achieves significantly better relocalisation performance whilst running fully in real time. To achieve this, we make several changes to the original approach: (i) instead of accepting the camera pose hypothesis without question, we make it possible to score the final few hypotheses using a geometric approach and select the most promising; (ii) we chain several instantiations of our relocaliser together in a cascade, allowing us to try faster but less accurate relocalisation first, only falling back to slower, more accurate relocalisation as necessary; and (iii) we tune the parameters of our cascade to achieve effective overall performance. These changes allow us to significantly improve upon the performance our original state-of-the-art method was able to achieve on the well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional contributions, we present a way of visualising the internal behaviour of our forests and show how to entirely circumvent the need to pre-train a forest on a generic scene.Comment: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorshi

    Human behavioural analysis with self-organizing map for ambient assisted living

    Get PDF
    This paper presents a system for automatically classifying the resting location of a moving object in an indoor environment. The system uses an unsupervised neural network (Self Organising Feature Map) fully implemented on a low-cost, low-power automated home-based surveillance system, capable of monitoring activity level of elders living alone independently. The proposed system runs on an embedded platform with a specialised ceiling-mounted video sensor for intelligent activity monitoring. The system has the ability to learn resting locations, to measure overall activity levels and to detect specific events such as potential falls. First order motion information, including first order moving average smoothing, is generated from the 2D image coordinates (trajectories). A novel edge-based object detection algorithm capable of running at a reasonable speed on the embedded platform has been developed. The classification is dynamic and achieved in real-time. The dynamic classifier is achieved using a SOFM and a probabilistic model. Experimental results show less than 20% classification error, showing the robustness of our approach over others in literature with minimal power consumption. The head location of the subject is also estimated by a novel approach capable of running on any resource limited platform with power constraints
    • …
    corecore