    Human Body Posture Recognition Approaches: A Review

    Human body posture recognition has become a focus of many researchers in recent years. Recognition of body posture is used in various applications, including surveillance, security, and health monitoring. However, systems that determine the body's posture from video clips, images, or sensor data face many challenges when used in the real world. This paper provides an important review of how the most essential hardware technologies are used in posture recognition systems. These systems capture and collect datasets through accelerometer sensors or computer vision. In addition, this paper presents a comparative study against the state of the art in terms of accuracy. We also present the advantages and limitations of each system and suggest promising future ideas that can increase the efficiency of existing posture recognition systems. Finally, the most common datasets applied in these systems are described in detail. The review aims to be a resource for choosing a method of recognizing the posture of the human body and the techniques that suit each method. It analyzes more than 80 papers published between 2015 and 202
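    As a hedged illustration of the accelerometer-based branch of such systems, the sketch below classifies a static posture from a single gravity reading. The chest-worn placement, axis convention, and angle thresholds are assumptions invented for the example, not values taken from the review.

        # Minimal sketch: posture from the tilt of a body-worn accelerometer.
        # Assumes the sensor's z-axis runs along the torso; thresholds are
        # illustrative, not from the reviewed paper.
        import numpy as np

        def classify_posture(accel_g):
            """accel_g: 3-vector (ax, ay, az) in units of g, chest-worn sensor."""
            # Tilt: angle between the torso-aligned axis and gravity.
            tilt = np.degrees(np.arccos(accel_g[2] / np.linalg.norm(accel_g)))
            if tilt < 30:
                return "upright"    # standing or sitting tall
            elif tilt < 65:
                return "leaning"
            return "lying"

        print(classify_posture(np.array([0.05, 0.10, 0.98])))  # -> upright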

    Deep Multi Temporal Scale Networks for Human Motion Analysis

    The movement of human beings appears to respond to a complex motor system that contains signals at different hierarchical levels. For example, an action such as "grasping a glass on a table" represents a high-level action, but to perform this task, the body needs several motor inputs that include the activation of different joints of the body (shoulder, arm, hand, fingers, etc.). Each of these joints/muscles has a different size, responsiveness, and precision, with a complex non-linearly stratified temporal dimension in which every muscle has its own temporal scale. Parts such as the fingers respond much faster to brain input than more voluminous body parts such as the shoulder. The coordination of these parts when we perform an action produces smooth, effective, and expressive movement in a complex multiple-temporal-scale cognitive task. Following this layered structure, the human body can be described as a kinematic tree consisting of connected joints. Although it is nowadays well known that human movement and its perception are characterised by multiple temporal scales, very few works in the literature focus on studying this particular property. In this thesis, we will focus on the analysis of human movement using data-driven techniques. In particular, we will focus on the non-verbal aspects of human movement, with an emphasis on full-body movements. Data-driven methods can interpret the information in the data by searching for rules, associations, or patterns that can represent the relationships between input (e.g. the human action acquired with sensors) and output (e.g. the type of action performed). Furthermore, these models may represent a new research frontier, as they can analyse large masses of data and focus on aspects that even an expert user might miss. The literature on data-driven models proposes two families of methods that can process time series and human movement. The first family, called shallow models, extracts features from the time series that can help the learning algorithm find associations in the data. These features are identified and designed by domain experts who can identify the best ones for the problem faced. The second family avoids this expert-driven extraction phase, since the models themselves can identify the best set of features to optimise learning. In this thesis, we will provide a method that can apply the multiple-temporal-scale property of the human motion domain to deep learning models, the only data-driven models that can be extended to handle this property. We ask ourselves two questions: what happens if we apply knowledge about how human movements are performed to deep learning models? Can this knowledge improve current automatic recognition standards? In order to prove the validity of our study, we collected data and tested our hypotheses in specially designed experiments. The results support both the proposal and the need for deep multi-scale models as a tool to better understand human movement and its multiple-time-scale nature.
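    To make the multiple-temporal-scale idea concrete, the sketch below shows one plausible (PyTorch) way to process the same joint time series at several temporal scales in parallel and fuse the results. It is an illustrative assumption, not the architecture proposed in the thesis; the input shape, kernel sizes, and layer widths are invented for the example.

        # Parallel 1-D convolutional branches, one per temporal scale: small
        # kernels capture fast parts (fingers), large kernels slow ones
        # (shoulder). Sizes are illustrative only.
        import torch
        import torch.nn as nn

        class MultiTemporalScaleNet(nn.Module):
            def __init__(self, n_channels, n_classes, scales=(3, 9, 27)):
                super().__init__()
                self.branches = nn.ModuleList(
                    nn.Sequential(
                        nn.Conv1d(n_channels, 32, kernel_size=k, padding=k // 2),
                        nn.ReLU(),
                        nn.AdaptiveAvgPool1d(1),
                    )
                    for k in scales
                )
                self.head = nn.Linear(32 * len(scales), n_classes)

            def forward(self, x):                  # x: (batch, channels, time)
                feats = [b(x).squeeze(-1) for b in self.branches]
                return self.head(torch.cat(feats, dim=1))

        # e.g. 25 joints x 3 coordinates = 75 channels, 10 action classes
        model = MultiTemporalScaleNet(n_channels=75, n_classes=10)
        scores = model(torch.randn(8, 75, 120))    # 8 clips of 120 frames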

    Review of three-dimensional human-computer interaction with focus on the Leap Motion Controller

    Modern hardware and software development has led to an evolution of user interfaces from command-line to natural user interfaces for virtual immersive environments. Gestures imitating real-world interaction tasks increasingly replace classical two-dimensional interfaces based on Windows/Icons/Menus/Pointers (WIMP) or touch metaphors. The purpose of this paper is therefore to survey state-of-the-art Human-Computer Interaction (HCI) techniques with a focus on the special field of three-dimensional interaction. This includes an overview of currently available interaction devices, their fields of application, and the underlying methods for gesture design and recognition. The focus is on interfaces based on the Leap Motion Controller (LMC) and corresponding methods of gesture design and recognition. Further, a review of evaluation methods for the proposed natural user interfaces is given.

    Dynamic deep learning for automatic facial expression recognition and its application in diagnosis of ADHD & ASD

    Neurodevelopmental conditions like Attention Deficit Hyperactivity Disorder (ADHD) and Autism Spectrum Disorder (ASD) impact a significant number of children and adults worldwide. Currently, the diagnosis of such conditions is carried out by experts, who employ standard questionnaires and look for certain behavioural markers through manual observation. Such methods are not only subjective, difficult to repeat, and costly but also extremely time-consuming. However, with the recent surge of research into automatic facial behaviour analysis and its varied applications, it could prove to be a potential way of tackling these diagnostic difficulties. Automatic facial expression recognition is one of the core components of this field, but it has always been challenging to do accurately in an unconstrained environment. This thesis presents a dynamic deep learning framework for robust automatic facial expression recognition. It also proposes an approach to apply this method to facial behaviour analysis, which can help in the diagnosis of conditions like ADHD and ASD. The proposed facial expression algorithm uses a deep Convolutional Neural Network (CNN) to learn models of facial Action Units (AU). It attempts to model three main distinguishing features of AUs jointly in a CNN: shape, appearance, and short-term dynamics. Appearance is modelled through local image regions relevant to each AU, shape is encoded using binary masks computed from automatically detected facial landmarks, and dynamics are encoded by using a short sequence of images as input to the CNN. In addition, the method employs Bi-directional Long Short-Term Memory (BLSTM) recurrent neural networks for modelling long-term dynamics. The proposed approach is evaluated on a number of databases, showing state-of-the-art performance for both AU detection and intensity estimation tasks. The AU intensities estimated using this approach, along with other 3D face tracking data, are used for encoding facial behaviour. The encoded facial behaviour is applied to learning models which can help in the detection of ADHD and ASD. This approach was evaluated on the KOMAA database, which was specially collected for this purpose. Experimental results show that facial behaviour encoded in this way provides high discriminative power for the classification of people with these conditions. The proposed system is thus a potentially useful, objective, and time-saving contribution to the clinical diagnosis of ADHD and ASD.
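    The following is a hedged sketch of the pipeline shape this abstract describes: a CNN scoring Action Units from short image stacks, followed by a bidirectional LSTM over the per-window features for long-term dynamics. All layer sizes, the 3-frame window, and the 64x64 crop are assumptions for illustration; the thesis's exact architecture is not reproduced here.

        # CNN over a short frame stack (short-term dynamics on the channel
        # axis), then a BLSTM over per-window features (long-term dynamics).
        import torch
        import torch.nn as nn

        class AUCNN(nn.Module):
            def __init__(self, n_aus, window=3):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(window, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Flatten(),
                )
                self.fc = nn.Linear(32 * 16 * 16, 128)   # for 64x64 inputs
                self.au_head = nn.Linear(128, n_aus)

            def forward(self, x):                  # x: (batch, window, 64, 64)
                h = torch.relu(self.fc(self.features(x)))
                return h, self.au_head(h)          # features + AU scores

        class LongTermBLSTM(nn.Module):
            def __init__(self, n_aus):
                super().__init__()
                self.blstm = nn.LSTM(128, 64, bidirectional=True, batch_first=True)
                self.out = nn.Linear(128, n_aus)

            def forward(self, seq_feats):          # (batch, n_windows, 128)
                h, _ = self.blstm(seq_feats)
                return self.out(h)                 # AU scores refined over time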

    Device-free indoor localisation with non-wireless sensing techniques: a thesis by publications presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Electronics and Computer Engineering, Massey University, Albany, New Zealand

    Global Navigation Satellite Systems provide accurate and reliable outdoor positioning to support a large number of applications across many sectors. Unfortunately, such systems do not operate reliably inside buildings due to the signal degradation caused by the absence of a clear line of sight with the satellites. The past two decades have therefore seen intensive research into the development of Indoor Positioning Systems (IPS). While considerable progress has been made in the indoor localisation discipline, there is still no widely adopted solution. The proliferation of Internet of Things (IoT) devices within the modern built environment provides an opportunity to localise human subjects by utilising such ubiquitous networked devices. This thesis presents the development, implementation, and evaluation of several passive indoor positioning systems using ambient Visible Light Positioning (VLP), capacitive flooring, and thermopile sensors (low-resolution thermal cameras). These systems position the human subject in a device-free manner (i.e., the subject is not required to be instrumented). The developed systems improve upon state-of-the-art solutions by offering superior position accuracy whilst also using more robust and generalised test setups. The developed passive VLP system is one of the first reported solutions making use of ambient light to position a moving human subject. The capacitive-floor-based system improves upon the accuracy of existing flooring solutions and demonstrates the potential for automated fall detection. The system also requires very little calibration, i.e., variations of the environment or subject have very little impact upon it. The thermopile positioning system is also shown to be robust to changes in the environment and subjects. Improvements are made over the current literature by testing across multiple environments and subjects whilst using a robust ground-truth system. Finally, advanced machine learning methods were implemented and benchmarked against a thermopile dataset, which has been made available for other researchers to use.
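    As a rough illustration of device-free sensing with a thermopile, the sketch below localises a subject in a single low-resolution thermal frame via background subtraction and an intensity-weighted centroid. The 8x8 grid, the threshold, and the overall approach are assumptions invented for the example; the thesis itself benchmarks considerably more advanced machine learning methods.

        # Device-free localisation from one thermopile frame (sketch).
        import numpy as np

        def locate_subject(frame, background, threshold=1.5):
            """frame, background: (8, 8) temperature grids in deg C."""
            diff = frame - background            # warmth left by the person
            mask = diff > threshold
            if not mask.any():
                return None                      # no subject detected
            ys, xs = np.nonzero(mask)
            weights = diff[mask]
            # Weighted centroid in pixel coordinates; a fixed mapping would
            # convert this to floor coordinates in a deployed system.
            return (np.average(xs, weights=weights),
                    np.average(ys, weights=weights))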

    Degradation Modeling and Remaining Useful Life Estimation: From Statistical Signal Processing to Deep Learning Models

    Aging critical infrastructures and valuable machineries, together with recent catastrophic incidents such as the collapse of the Morandi bridge or the Gulf of Mexico oil spill disaster, call for an urgent quest to design advanced and innovative prognostic solutions and to efficiently incorporate multi-sensor streaming data sources for industrial development. Prognostic health management (PHM) is among the most critical disciplines in this quest; it exploits the strong interdependency between signal processing and machine learning techniques to form a key enabling technology for the maintenance of complex industrial and safety-critical systems. Recent advancements in predictive analytics have empowered the PHM paradigm to move from traditional condition-based monitoring solutions and preventive maintenance programs to predictive maintenance, which provides an early warning of failure in several domains ranging from manufacturing and industrial systems to transportation and aerospace. PHM centers on two core dimensions: the first takes into account the behavior and evolution over time of a fault once it occurs, while the second aims at estimating/predicting the remaining useful life (RUL) during which a device can perform its intended function. The first dimension is degradation, which is usually determined by a degradation model derived from measurements of critical parameters of relevance to the system. Developing an accurate model of the degradation process is a primary objective in prognosis and health management. Extensive research has been conducted to develop new theories and methodologies for degradation modeling and to accurately capture the degradation dynamics of a system. However, a unified degradation framework has not yet been developed due to: (i) structural uncertainties in the state dynamics of the system, and (ii) the complex nature of the degradation process, which is often non-linear and difficult to model statistically. Thus, even for a single system, there is no consensus on the best degradation model. In this regard, this thesis tries to bridge this gap by proposing a general model that is able to capture the true degradation path without any prior knowledge of the system's true degradation model. Modeling and analysis of degradation behavior lead us to RUL estimation, which is the second dimension of PHM and the second part of the thesis. The RUL, the time a machine is expected to work before requiring repair or replacement, is the main pillar of preventive maintenance. Effective and accurate RUL estimation can avoid catastrophic failures, maximize operational availability, and consequently reduce maintenance costs. RUL estimation is therefore of paramount importance and has gained significant attention for its role in improving systems health management in complex fields including the automotive, nuclear, chemical, and aerospace industries, to name but a few. A vast body of research on different approaches to remaining useful life estimation has been produced, and it can be divided into three broad categories: (i) physics-based, (ii) data-driven, and (iii) hybrid (multiple-model) approaches. Each category has its own limitations and issues: physics-based approaches hardly adapt to different prognostic applications, while data-driven approaches suffer accuracy degradation when the learned models deviate from the real behavior of the system, and hardly sustain good generalization. Our thesis belongs to the third and most promising category; in particular, it proposes new hybrid models, based on two different architectures of deep neural networks, which have great potential to tackle complex prognostic issues associated with systems with complex and unknown degradation processes.
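    The sketch below ties the two dimensions together in the simplest possible form: fit an exponential degradation model to a health indicator, then read off the RUL as the time until the fitted path crosses a failure threshold. The exponential form, the threshold, and the synthetic data are illustrative assumptions, not the general model the thesis proposes.

        # Degradation modeling + RUL estimation, minimal sketch.
        import numpy as np

        def fit_exponential_degradation(t, y):
            """Fit y(t) = a * exp(b * t) via linear regression on log(y)."""
            b, log_a = np.polyfit(t, np.log(y), deg=1)
            return np.exp(log_a), b

        def estimate_rul(t_now, a, b, failure_threshold):
            """Time left until a * exp(b * t) reaches the threshold."""
            t_fail = np.log(failure_threshold / a) / b
            return max(t_fail - t_now, 0.0)

        t = np.arange(0, 50.0)                            # operating hours
        y = 0.1 * np.exp(0.05 * t) * np.random.uniform(0.95, 1.05, t.size)
        a, b = fit_exponential_degradation(t, y)
        print(estimate_rul(t_now=49.0, a=a, b=b, failure_threshold=2.0))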

    Artificial Intelligence for Cognitive Health Assessment: State-of-the-Art, Open Challenges and Future Directions

    The subjectivity and inaccuracy of in-clinic Cognitive Health Assessments (CHA) have led many researchers to explore ways to automate the process to make it more objective and to meet the needs of the healthcare industry. Artificial Intelligence (AI) and machine learning (ML) have emerged as the most promising approaches to automating the CHA process. In this paper, we explore the background of CHA and delve into the extensive research recently undertaken in this domain to provide a comprehensive survey of the state of the art. In particular, a careful selection of significant works published in the literature is reviewed to elaborate a range of enabling technologies and AI/ML techniques used for CHA, including conventional supervised and unsupervised machine learning, deep learning, reinforcement learning, natural language processing, and image processing techniques. Furthermore, we provide an overview of various means of data acquisition and the benchmark datasets. Finally, we discuss open issues and challenges in using AI and ML for CHA, along with some possible solutions. In summary, this paper presents CHA tools, lists various data acquisition methods for CHA, surveys technological advancements and the use of AI for CHA, and discusses open issues and challenges in the CHA domain. We hope this first-of-its-kind survey will significantly contribute to identifying research gaps in the complex and rapidly evolving interdisciplinary mental health field.

    Channel Phase Processing in Wireless Networks for Human Activity Recognition

    The phase of the channel state information (CSI) is underutilized as a source of information in wireless sensing due to its sensitivity to synchronization errors at signal reception. A linear transformation of the phase is commonly applied to correct linear offsets and, in a few cases, some filtering in time or frequency is carried out to smooth the data. This paper presents a novel processing method for the CSI phase to improve the accuracy of human activity recognition (HAR) in indoor environments. This new method, coined Time Smoothing and Frequency Rebuild (TSFR), first applies a CSI phase sanitization step that removes phase impairments using a linear regression and rotation method, then a time-domain filtering stage with a Savitzky-Golay (SG) filter for denoising and, finally, rebuilds the phase to eliminate the frequency distortions caused by SG filtering. The TSFR method has been tested on five datasets obtained from experimental measurements, using three different deep learning algorithms, and compared against five other types of CSI phase processing. The results show an accuracy improvement using TSFR in all cases. Concretely, the proposed solution achieves accuracy higher than 90% in most of the studied scenarios. In few-shot learning strategies, TSFR improves on state-of-the-art performance, raising accuracy from 35% to 85%.
    Comment: submitted to IEEE Transactions on Mobile Computing (under review)
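    Two of the concrete steps named above, linear-regression phase sanitization and Savitzky-Golay smoothing in time, can be sketched as follows. This is not the full TSFR algorithm (the rotation method and the frequency-rebuild stage are not reproduced); array shapes and filter parameters are assumptions for illustration.

        # CSI phase: linear detrending across subcarriers, then SG smoothing
        # in time. Sketch of two TSFR ingredients, not the full method.
        import numpy as np
        from scipy.signal import savgol_filter

        def sanitize_phase(phase, subcarrier_idx):
            """Remove the linear phase offset from one CSI snapshot."""
            unwrapped = np.unwrap(phase)         # avoid 2*pi jumps in the fit
            # Fit phase ~ a * subcarrier_index + b and subtract the line,
            # cancelling timing/frequency synchronization offsets.
            a, b = np.polyfit(subcarrier_idx, unwrapped, deg=1)
            return unwrapped - (a * subcarrier_idx + b)

        def sanitize_and_smooth(csi_phase):
            """csi_phase: (n_packets, n_subcarriers) raw phase in radians."""
            idx = np.arange(csi_phase.shape[1])
            clean = np.vstack([sanitize_phase(p, idx) for p in csi_phase])
            # Denoise each subcarrier's time series (needs >= 11 packets here).
            return savgol_filter(clean, window_length=11, polyorder=3, axis=0)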

    Analysis of Sign Language Facial Expressions and Deaf Students' Retention Using Machine Learning and Agent-based Modeling

    There are currently about 466 million people worldwide who have a hearing disability, and that number is expected to increase to 900 million by 2050. About 15% of adult Americans have hearing disabilities, and about three in every 1,000 U.S. children are born with hearing loss in one or both ears. The World Health Organization (WHO) estimates that unaddressed hearing loss poses an annual global cost of $980 billion, including the cost of educational support, loss of productivity, and societal costs. All of this is evidence that people with hearing loss experience difficulties of several kinds and levels. In this dissertation, we address two main challenges faced by hearing-impaired people: sign language recognition and post-secondary education. Both sign language recognition and reliable education systems that properly support the deaf community are essential global needs, and in this dissertation we aim to attack exactly these problems. For the first part, we introduce a novel dataset and methodology using machine learning, while for the second part, a novel agent-based model framework is proposed. Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this dissertation, we introduce FePh, an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public TV station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions: sad, surprise, fear, angry, neutral, disgust, and happy. We also considered a None class for images whose facial expression could not be described by any of these emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has wider application in gesture recognition and Human Computer Interaction (HCI) systems. In addition, post-secondary education persistence is the likelihood of a student remaining in post-secondary education. Although statistics show that post-secondary persistence for deaf students has increased recently, there are still many obstacles preventing students from completing their post-secondary degree goals. Therefore, increasing the persistence rate is crucial to advancing the education and work goals of deaf students. In this work, we present an agent-based model, built with the NetLogo software, of the persistence phenomenon among deaf students. We consider four non-cognitive factors that influence the departure decision of deaf students: having clear goals, social integration, social skills, and academic experience. Progress and results of this work suggest that agent-based modeling approaches promise to give a better understanding of what will increase persistence.
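    As a hedged illustration of the model's core loop, the Python sketch below mirrors the described NetLogo setup: each student agent carries the four non-cognitive factors, and their combined strength drives a per-semester departure decision. The factor scales, weights, and decision rule are invented for the example and are not the dissertation's calibrated model.

        # Agent-based persistence model, simplified re-sketch in Python.
        import random

        FACTORS = ["clear_goals", "social_integration", "social_skills",
                   "academic_experience"]

        class Student:
            def __init__(self):
                # Each factor in [0, 1]; higher values support persistence.
                self.factors = {f: random.random() for f in FACTORS}
                self.enrolled = True

            def step(self):
                # Departure is more likely the weaker the factors are.
                support = sum(self.factors.values()) / len(FACTORS)
                if self.enrolled and random.random() < (1.0 - support) * 0.2:
                    self.enrolled = False

        def persistence_rate(n_students=1000, semesters=8):
            cohort = [Student() for _ in range(n_students)]
            for _ in range(semesters):
                for student in cohort:
                    student.step()
            return sum(s.enrolled for s in cohort) / n_students

        print(persistence_rate())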