6,761 research outputs found

    Computer vision based posture estimation and fall detection.

    Get PDF
    Falls are a major health problem, especially in the elderly population. Increasing fall events demands a high quality of service and dedicated medical treatment which is an economic burden. Serious injuries due to fall can cost lives in the absence of immediate care and support. There- fore, a monitoring system that can accurately detect fall events and generate instant alerts for immediate care is extremely necessary. To address this problem, this research aims to develop a computer vision-based fall detection system. This study proposes fall detection in three stages: (A) Detection of human silhouette and recognition of the pose, (B) Detection of the human as three regions for different postures including fall and (C) Recognise fall and non-fall using locations of human body regions as distinguishing features. The first stages of work comprise human silhouette detection and identification of activities in the form of different poses. Identifying a pose is important to understand a fall event where a change of pose defines its characteristics. A fall event comprises of sequential change of poses and ends up in a lying pose. Initial pose during a fall can be standing, sitting or bending but the final pose is usually a lying pose. It would, therefore, be beneficial if lying pose is recognised more accurately than other normal activities such as standing, sitting, bending or crawling to address a fall. Hence in the first stage, Background Subtraction (BS) is used to detect human silhouette. After background subtraction, the foreground images were used in a Convolutional Neural Network (CNN) to recognise different poses. The RGB and the Depth images were captured from a Kinect Sensor. The fusion of RGB and Depth images were explored for feeding to a convolutional neural net- work. Depth together with RGB complimented each other to overcome their weakness respectively and proved to be a significant strategy. The classification was performed using CNN to recognise different activities with 81% accuracy on validation. The other challenge in fall detection is the tracking of a person during a fall. Background Subtraction is not sufficient to track a fallen person especially when there are lighting and viewpoint variations in the environment and present of another object like furniture, a pet or even another person. Furthermore, tracking be- comes tougher during the fall in comparison to normal activities like walking or sitting because the rate of change pose is higher during a fall. To overcome this, the idea is to locate the regions in the body in every frame and consider it as a stable tracking strategy. The location of the body parts provides crucial information to distinguish falls from the other normal activities as the person is detected all the time during these activities. Hence the second stage of this research consists of posture detection using the pose estimation technique. This research proposes to use CNN based pose estimation using simplified human postures. The available joints are grouped according to three regions: Head, Torso and Leg and then finally fed to the CNN model with just three inputs instead of several available joints. This strategy added stability in pose detection and proved to be more effective against complex poses observed during a fall. To train the CNN model, transfer learning technique was used. The model was able to achieve 96.7% accuracy in detecting the three regions on different human postures on the publicly available dataset. A system which considers all the lying poses as falls can also generate a higher false alarm. Lying on bed or sofa can easily generate a fall alarm if they are recognised as falls. Hence, it is important to recognise actual fall by considering a sequence of frames that defines a fall and not just the lying pose. In the third and final stage, this study proposes Long Short-Term Memory (LSTM) recurrent networks-based fall detection. The proposed LSTM model uses the detected three region’s location as input features. LSTM is capable of using contextual information from the sequential input patterns. Therefore, the LSTM model was fed with location features of different postures in a sequence for training. The model was able to learn fall patterns and distinguish them from other activities with 88.33% accuracy. Furthermore, the precision of the fall class was 1.0. This is highly desirable in the case of fall detection as there is no false alarm and this means that the cost incurred in calling medical support for a false alarm can be completely avoided

    Radar and RGB-depth sensors for fall detection: a review

    Get PDF
    This paper reviews recent works in the literature on the use of systems based on radar and RGB-Depth (RGB-D) sensors for fall detection, and discusses outstanding research challenges and trends related to this research field. Systems to detect reliably fall events and promptly alert carers and first responders have gained significant interest in the past few years in order to address the societal issue of an increasing number of elderly people living alone, with the associated risk of them falling and the consequences in terms of health treatments, reduced well-being, and costs. The interest in radar and RGB-D sensors is related to their capability to enable contactless and non-intrusive monitoring, which is an advantage for practical deployment and users’ acceptance and compliance, compared with other sensor technologies, such as video-cameras, or wearables. Furthermore, the possibility of combining and fusing information from The heterogeneous types of sensors is expected to improve the overall performance of practical fall detection systems. Researchers from different fields can benefit from multidisciplinary knowledge and awareness of the latest developments in radar and RGB-D sensors that this paper is discussing

    LiveCap: Real-time Human Performance Capture from Monocular Video

    Full text link
    We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per-frame are solved with specially-tailored data-parallel Gauss-Newton solvers. In order to achieve real-time performance of over 25Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. Our method yields comparable accuracy with off-line performance capture techniques, while being orders of magnitude faster

    Automated Markerless Extraction of Walking People Using Deformable Contour Models

    No full text
    We develop a new automated markerless motion capture system for the analysis of walking people. We employ global evidence gathering techniques guided by biomechanical analysis to robustly extract articulated motion. This forms a basis for new deformable contour models, using local image cues to capture shape and motion at a more detailed level. We extend the greedy snake formulation to include temporal constraints and occlusion modelling, increasing the capability of this technique when dealing with cluttered and self-occluding extraction targets. This approach is evaluated on a large database of indoor and outdoor video data, demonstrating fast and autonomous motion capture for walking people

    New Fast Fall Detection Method Based on Spatio-Temporal Context Tracking of Head by Using Depth Images

    Get PDF
    © 2015 by the authors; licensee MDPI, Basel, Switzerland. In order to deal with the problem of projection occurring in fall detection with two-dimensional (2D) grey or color images, this paper proposed a robust fall detection method based on spatio-temporal context tracking over three-dimensional (3D) depth images that are captured by the Kinect sensor. In the pre-processing procedure, the parameters of the Single-Gauss-Model (SGM) are estimated and the coefficients of the floor plane equation are extracted from the background images. Once human subject appears in the scene, the silhouette is extracted by SGM and the foreground coefficient of ellipses is used to determine the head position. The dense spatio-temporal context (STC) algorithm is then applied to track the head position and the distance from the head to floor plane is calculated in every following frame of the depth image. When the distance is lower than an adaptive threshold, the centroid height of the human will be used as the second judgment criteria to decide whether a fall incident happened. Lastly, four groups of experiments with different falling directions are performed. Experimental results show that the proposed method can detect fall incidents that occurred in different orientations, and they only need a low computation complexity

    Robust recognition and segmentation of human actions using HMMs with missing observations

    Get PDF
    This paper describes the integration of missing observation data with hidden Markov models to create a framework that is able to segment and classify individual actions from a stream of human motion using an incomplete 3D human pose estimation. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inferencing. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inferencing, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimation that delegates the burden of handling undetected limbs onto the action recognition system. Findings show that the use of missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition can be accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time
    corecore