Comprehensive review of vision-based fall detection systems
Vision-based fall detection systems have developed rapidly in recent years. To chart the course of this evolution and to help new researchers, the main audience of this paper, a comprehensive review of all articles published in the main scientific databases in this area during the last five years has been made. After a selection process, detailed in the Materials and Methods Section, eighty-one systems were thoroughly reviewed. Their characterization and classification techniques were analyzed and categorized. Their performance data were also studied, and comparisons were made to determine which classification methods work best in this field. The evolution of artificial vision technology, strongly and positively influenced by the incorporation of artificial neural networks, has made fall characterization more robust to noise resulting from illumination phenomena or occlusion. Classification has also benefited from these networks, and the field is beginning to use robots to make these systems mobile. However, the datasets used to train them lack real-world data, raising doubts about their performance on real elderly falls. In addition, there is no evidence of strong connections between the elderly and the research community.
The basic assembly of skeletal models in the fall detection problem
This paper considers the application of a featureless approach to the human activity recognition problem, which excludes the direct anthropomorphic and visual characteristics of the human figure from further analysis and thus increases the privacy of the monitoring system. A generalized pairwise comparison function of two human skeletal models, invariant to the sensor type, is used to project the object of interest into a secondary feature space formed by a basic assembly of skeletons. A sequence of such projections in time forms an activity map, which allows deep learning methods based on convolutional neural networks to be applied to activity recognition. The proper ordering of the skeletal models in a basic assembly plays an important role in the design of the secondary space. A study of the ordering of the basic assembly by the shortest unclosed path algorithm, and the corresponding activity maps for video streams from the TST Fall Detection v2 database, is presented. The work was funded by the Ministry of Science and Higher Education of RF within the framework of the state task FEWG-2021-0012.
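The pipeline described in this abstract can be illustrated with a minimal Python sketch: a pairwise comparison function projects each observed skeleton onto an ordered basic assembly, and stacking the projections over time yields a 2-D activity map. The distance metric, the greedy nearest-neighbour ordering (a stand-in for the shortest unclosed path algorithm), and all dimensions below are illustrative assumptions, not the paper's actual definitions.

```python
import numpy as np

def skeleton_distance(a, b):
    """Pairwise comparison of two skeletal models: a joint-wise Euclidean
    distance after root-centering. The paper's comparison function is
    sensor-invariant; this metric is only a stand-in for illustration."""
    a = a - a.mean(axis=0)  # crude normalisation: centre the joints
    b = b - b.mean(axis=0)
    return np.linalg.norm(a - b, axis=1).mean()

def order_assembly(assembly):
    """Greedy nearest-neighbour ordering of the basic assembly, a simple
    approximation of a shortest-unclosed-path ordering."""
    remaining = list(range(len(assembly)))
    path = [remaining.pop(0)]
    while remaining:
        last = assembly[path[-1]]
        nxt = min(remaining, key=lambda i: skeleton_distance(last, assembly[i]))
        remaining.remove(nxt)
        path.append(nxt)
    return [assembly[i] for i in path]

def activity_map(frames, assembly):
    """Project each frame's skeleton onto the ordered assembly; stacking
    the projections over time yields a 2-D 'activity map' that a CNN can
    consume like an image."""
    ordered = order_assembly(assembly)
    return np.array([[skeleton_distance(s, b) for b in ordered]
                     for s in frames]).T  # shape: (K, T)

# Toy usage: 25-joint skeletons, a 60-frame clip, 32 reference skeletons.
rng = np.random.default_rng(0)
frames = rng.normal(size=(60, 25, 3))
assembly = rng.normal(size=(32, 25, 3))
print(activity_map(frames, assembly).shape)  # (32, 60)
```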
A skeleton features-based fall detection using Microsoft Kinect v2 with one-class classifier outlier removal
Real-time and robust fall detection is one of the key components of elderly care and monitoring systems. Depth sensors, as they have become more available, occupy an increasing place in event recognition systems. Some of them can directly produce a skeletal description of the human figure for a compact representation of a person’s posture. A skeleton description makes exporting the source video or detailed depth information from the system unnecessary and raises the privacy of the entire system. Based on a comparative study of different RGB-D cameras, the most promising model for further development was chosen: Microsoft Kinect v2. The TST Fall Detection Dataset v2 is used here as the basis for the experiments. The proposed algorithm is based on encoding skeleton features over a sequence of neighboring frames and a support vector machine classifier. A version of the cumulative sum method is applied to combine the individual decisions on consecutive frames. A one-class classifier is proposed for detecting low-quality skeletons. An accuracy of 0.958 for our fall detection procedure was obtained in a cross-validation procedure based on removing the records of a particular person from the database (leave-one-person-out).
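A rough sketch of the decision pipeline the abstract describes, assuming per-frame feature vectors are already extracted: a support vector machine scores each frame, a one-class SVM filters out low-quality skeletons, and a one-sided cumulative sum accumulates the frame-level evidence. The features, labels, and threshold values are placeholders, not values from the paper.

```python
import numpy as np
from sklearn.svm import SVC, OneClassSVM

# Hypothetical per-frame skeleton feature vectors (e.g. joint positions
# encoded over a window of neighbouring frames); label 1 is taken to
# mean "fall".
rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 40))
y_train = rng.integers(0, 2, size=500)

# One-class classifier to reject low-quality skeletons before scoring,
# trained here on the same features for simplicity.
quality = OneClassSVM(nu=0.05).fit(X_train)
clf = SVC().fit(X_train, y_train)

def detect_fall(stream, threshold=5.0, drift=0.1):
    """One-sided CUSUM over per-frame SVM decision scores: frame-level
    evidence is accumulated, and an alarm is raised once the cumulative
    sum exceeds a threshold. Parameter values are illustrative."""
    s = 0.0
    for t, x in enumerate(stream):
        if quality.predict(x.reshape(1, -1))[0] == -1:
            continue                      # skip low-quality skeletons
        score = clf.decision_function(x.reshape(1, -1))[0]
        s = max(0.0, s + score - drift)   # accumulate evidence of a fall
        if s > threshold:
            return t                      # frame index of the alarm
    return None

print(detect_fall(rng.normal(size=(100, 40))))
```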
Feature based dynamic intra-video indexing
A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy. With the advent of digital imagery and its widespread application in all vistas of life, it has become an important component in the world of communication. Video content ranging from broadcast news, sports, personal videos, surveillance, movies and entertainment and similar domains is increasing exponentially in quantity, and it is becoming a challenge to retrieve content of interest from the corpora. This has led to an increased interest amongst researchers in investigating concepts of video structure analysis, feature extraction, content annotation, tagging, video indexing, querying and retrieval to fulfil these requirements. However, most previous work is confined to specific domains and constrained by quality, processing and storage capabilities. This thesis presents a novel framework agglomerating the established approaches from feature extraction to browsing in one content-based video retrieval system. The proposed framework significantly fills the identified gap while satisfying the imposed constraints on processing, storage, quality and retrieval times. The output entails a framework, methodology and prototype application that allow the user to efficiently and effectively retrieve content of interest, such as age, gender and activity, by specifying the relevant query. Experiments have shown plausible results, with an average precision and recall of 0.91 and 0.92 respectively for face detection using a Haar wavelets based approach. Precision for age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. Gender recognition gives better precision with males (0.89) compared to females, while recall gives a higher value with females (0.92). The activity of the subject has been detected using the Hough transform and classified using a Hidden Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research process. A Graphical User Interface (GUI) providing a friendly and intuitive interface has been integrated into the developed system to facilitate the retrieval process. Comparison using the intraclass correlation coefficient (ICC) shows that the performance of the system closely resembles that of a human annotator. The performance has been optimised for time and error rate.
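The thesis's own detectors are not published; as an illustration of the Haar-based face detection it reports, the sketch below uses OpenCV's stock frontal-face cascade. The input file name `video.mp4` is a hypothetical placeholder.

```python
import cv2

# Load OpenCV's stock frontal-face Haar cascade; the thesis's own
# Haar-wavelet detector is not available, so this stands in for the
# general technique it describes.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame):
    """Return bounding boxes (x, y, w, h) of detected faces."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

cap = cv2.VideoCapture("video.mp4")  # hypothetical input file
ok, frame = cap.read()
if ok:
    for (x, y, w, h) in detect_faces(frame):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cap.release()
```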
Articulated human tracking and behavioural analysis in video sequences
Recently, there has been a dramatic growth of interest in the observation and tracking of human subjects through video sequences. Arguably, the principal impetus has come from the perceived demand for technological surveillance; however, applications in entertainment, intelligent domiciles and medicine are also increasing. This thesis examines human articulated tracking and the classification of human movement, first separately and then as a sequential process.

First, this thesis considers the development and training of a 3D model of human body structure and dynamics. To process video sequences, an observation model is also designed with a multi-component likelihood based on edge, silhouette and colour. This is defined on the articulated limbs, and visible from a single or multiple cameras, each of which may be calibrated from that sequence. Second, for behavioural analysis, we develop a methodology in which actions and activities are described by semantic labels generated from a Movement Cluster Model (MCM). Third, a Hierarchical Partitioned Particle Filter (HPPF) was developed for human tracking that allows multi-level parameter search consistent with the body structure. This tracker relies on the articulated motion prediction provided by the MCM at pose or limb level. Fourth, tracking and movement analysis are integrated to generate a probabilistic activity description with action labels.

The implemented algorithms for tracking and behavioural analysis are tested extensively and independently against ground truth on human tracking and surveillance datasets. Dynamic models are shown to predict and generate synthetic motion, while the MCM recovers both periodic and non-periodic activities, defined either on the whole body or at the limb level. Tracking results are comparable with the state of the art; however, the integrated behaviour analysis adds to the value of the approach. Overseas Research Students Awards Scheme (ORSAS)
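The HPPF's partitioned, multi-level search cannot be reconstructed from the abstract alone, but the underlying loop is a standard particle filter. Below is a minimal bootstrap-filter sketch in which a random-walk model stands in for the MCM's motion prediction and a Gaussian stands in for the multi-component likelihood; all parameters and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def likelihood(pose, observation):
    """Stand-in for the multi-component likelihood (edge, silhouette,
    colour) described in the thesis; a Gaussian around the observation
    is used here for illustration only."""
    return np.exp(-0.5 * np.sum((pose - observation) ** 2))

def particle_filter_step(particles, weights, observation, noise=0.1):
    """One bootstrap particle-filter step: resample, predict with a
    simple random-walk motion model (the thesis instead uses MCM-driven
    motion prediction), then reweight by the observation likelihood."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx] + rng.normal(0, noise, particles.shape)
    weights = np.array([likelihood(p, observation) for p in particles])
    return particles, weights / weights.sum()

# Toy usage: 200 particles over a 6-D pose subspace.
particles = rng.normal(size=(200, 6))
weights = np.full(200, 1 / 200)
for obs in rng.normal(size=(10, 6)):      # ten synthetic observations
    particles, weights = particle_filter_step(particles, weights, obs)
print(particles[np.argmax(weights)])      # most probable pose estimate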
Joint optimization of manifold learning and sparse representations for face and gesture analysis
Face and gesture understanding algorithms are powerful enablers in intelligent vision systems for surveillance, security, entertainment, and smart spaces. In the future, complex networks of sensors and cameras may dispense directions to lost tourists, perform directory lookups in the office lobby, or contact the proper authorities in case of an emergency. To be effective, these systems will need to embrace human subtleties while interacting with people in their natural conditions. Computer vision and machine learning techniques have recently become adept at solving face and gesture tasks using posed datasets in controlled conditions. However, spontaneous human behavior under unconstrained conditions, or in the wild, is more complex and is subject to considerable variability from one person to the next. Uncontrolled conditions such as lighting, resolution, noise, occlusions, pose, and temporal variations complicate the matter further. This thesis advances the field of face and gesture analysis by introducing a new machine learning framework based upon dimensionality reduction and sparse representations that is shown to be robust in posed as well as natural conditions. Dimensionality reduction methods take complex objects, such as facial images, and attempt to learn lower dimensional representations embedded in the higher dimensional data. These alternate feature spaces are computationally more efficient and often more discriminative. The performance of various dimensionality reduction methods on geometric and appearance-based facial attributes is studied, leading to robust facial pose and expression recognition models. The parsimonious nature of sparse representations (SR) has successfully been exploited for the development of highly accurate classifiers for various applications. Despite the successes of SR techniques, large dictionaries and high dimensional data can make these classifiers computationally demanding. Further, sparse classifiers are subject to the adverse effects of a phenomenon known as coefficient contamination, where, for example, variations in pose may affect identity and expression recognition. This thesis analyzes the interaction between dimensionality reduction and sparse representations to present a unified sparse representation classification framework that addresses both issues of computational complexity and coefficient contamination. Semi-supervised dimensionality reduction is shown to mitigate the coefficient contamination problems associated with SR classifiers. The combination of semi-supervised dimensionality reduction with SR systems forms the cornerstone of a new face and gesture framework called Manifold based Sparse Representations (MSR). MSR is shown to deliver state-of-the-art facial understanding capabilities. To demonstrate the applicability of MSR to new domains, MSR is expanded to include temporal dynamics. The joint optimization of dimensionality reduction and SRs for classification purposes is a relatively new field. The combination of both concepts into a single objective function produces a relation that is neither convex nor directly solvable. This thesis studies this problem to introduce a new jointly optimized framework. This framework, termed LGE-KSVD, utilizes variants of Linear extension of Graph Embedding (LGE) along with modified K-SVD dictionary learning to jointly learn the dimensionality reduction matrix, sparse representation dictionary, sparse coefficients, and sparsity-based classifier.
By injecting LGE concepts directly into the K-SVD learning procedure, this research removes the support constraints that K-SVD imposes on dictionary element discovery. Results are shown for facial recognition, facial expression recognition, and human activity analysis; with the addition of a concept called active difference signatures, the framework delivers robust gesture recognition from Kinect or similar depth cameras.
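A compact sketch of the sparse representation classification principle this thesis builds on, with PCA standing in for the LGE-based reduction and the raw training samples used as the dictionary in place of K-SVD learning; all data, labels, and dimensions are synthetic placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import OrthogonalMatchingPursuit

# Synthetic stand-in data: 120 "face" vectors from four identities.
rng = np.random.default_rng(3)
X_train = rng.normal(size=(120, 256))   # e.g. vectorised face crops
y_train = np.repeat(np.arange(4), 30)   # four identities

pca = PCA(n_components=30).fit(X_train)
D = pca.transform(X_train).T            # dictionary: atoms as columns
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms

def src_classify(x, n_nonzero=10):
    """Sparse-code the reduced sample over the dictionary, then assign
    the class whose atoms reconstruct it with the smallest residual."""
    z = pca.transform(x.reshape(1, -1)).ravel()
    omp = OrthogonalMatchingPursuit(
        n_nonzero_coefs=n_nonzero, fit_intercept=False).fit(D, z)
    coef = omp.coef_
    residuals = []
    for c in np.unique(y_train):
        mask = (y_train == c)
        residuals.append(np.linalg.norm(z - D[:, mask] @ coef[mask]))
    return np.unique(y_train)[np.argmin(residuals)]

# A noisy copy of a class-0 sample should most likely come back as 0.
print(src_classify(X_train[0] + 0.1 * rng.normal(size=256)))
```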
Intelligent Sensors for Human Motion Analysis
The book, "Intelligent Sensors for Human Motion Analysis," contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects related to the analysis of human movement. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems
Fuzzy transfer learning in human activity recognition.
Assisted living environments incorporate different technological solutions to improve quality of life and well-being. In recent years, there has been growing interest in the research community in how to develop evolving solutions to aid assisted living. Different techniques have been studied to address the need for technological systems that are intelligent enough to evolve their knowledge to solve tasks they have not previously encountered. One such approach is Transfer Learning (TL), for example, between humans and robots.
Humans excel at dealing with everyday activities, learning and adapting to different activities. This comprises different complex techniques which enable lifelong learning from observation of our environment. To obtain similar learning in assistive agents, TL is needed. The aim of the research reported in this thesis is to address the challenge associated with the learning and reuse of knowledge by assistive agents in an Ambient Assisted Living (AAL) environment. In this thesis, a novel approach to the transfer learning of human activities through the combination of three methods, TL, Fuzzy Systems (FS) and Human Activity Recognition (HAR), is presented. Through the incorporation of FS into the proposed approach, the uncertainty that is evident in the dynamic nature of human activities is embedded into the learning model.
This research is focused on applications in assistive robotics, with the purpose of enabling assistive robots in AAL environments to acquire knowledge of activities performed by humans. To achieve this, an extensive investigation into existing learning methods applied to human activities is conducted. The investigation encompasses the current state of the art of TL approaches employed in skill transfer across different but contextually related activities.
To address the research questions identified in the thesis, the contributions of the methodology employed fall into three main categories. 1) Firstly, a novel framework for learning human activity from observed information: experiments are conducted on selected human activities to acquire enough information for building the framework, and from the acquired information, relevant extracted features are used in a learning model to recognise different activities. 2) Secondly, the sequence of occurrence(s) of tasks in an activity needs to be considered in the learning process; therefore, in this research, a novel technique for the adaptive learning of activity sequences from acquired information is developed. 3) Finally, from the sequence obtained, a novel technique is developed for the transfer of human activity across the heterogeneous feature space existing between a human and an assistive robot. These categories form the basis of the TL framework modelled in this research.
The proposed framework is applied to the TL of human activity from experimentally generated data and benchmark datasets covering various classes of human activities. The results presented in this thesis show that exploring the process of human activity learning is an important aspect of the TL framework. The extracted features sufficiently distinguish the relevant patterns for each activity. The results also demonstrate the ability of the methodology to learn and predict human actions with a high degree of certainty. This encourages the use of TL in assisted living environments and other applications. This and many more applications of TL in technology could be a potential driver of the next revolution in artificial intelligence.
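As a minimal illustration of how a fuzzy system can embed the uncertainty of human activities into a learning model, the sketch below fuzzifies a single normalised motion-intensity feature with triangular membership functions. The fuzzy sets and their parameters are assumptions for illustration, not the thesis's actual rule base.

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular fuzzy membership function with support [a, c] and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Hypothetical fuzzy partition of a normalised motion-intensity feature.
sets = {"still":    (-0.3, 0.0, 0.3),
        "moderate": (0.1, 0.5, 0.9),
        "vigorous": (0.7, 1.0, 1.3)}

def fuzzify(feature):
    """Map a crisp feature value to membership degrees over the fuzzy
    sets, embedding the uncertainty of human activity into the model."""
    return {name: float(triangular(feature, *p)) for name, p in sets.items()}

print(fuzzify(0.8))  # partial membership in 'moderate' and 'vigorous'
```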