1,691 research outputs found

    Robust recognition and segmentation of human actions using HMMs with missing observations

    This paper describes the integration of missing observation data with hidden Markov models to create a framework that can segment and classify individual actions from a stream of human motion using incomplete 3D human pose estimates. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inference. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inference, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimator that delegates the burden of handling undetected limbs to the action recognition system. Findings show that using missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition remains accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time.
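    As an illustration of the core idea, here is a minimal sketch (not the authors' code) of HMM filtering with missing observations, assuming diagonal-Gaussian emissions: a missing dimension (NaN) is marginalised out simply by omitting it from the likelihood product.

    ```python
    # Minimal sketch of HMM forward filtering with missing observations.
    # Assumption (not from the paper): diagonal-Gaussian emissions, so a
    # missing dimension is marginalised by dropping it from the product.
    import numpy as np

    def emission_loglik(x, means, variances):
        """Log-likelihood of observation x under each state's diagonal
        Gaussian; NaN entries in x are treated as missing."""
        observed = ~np.isnan(x)                  # mask of observed dims
        d = x[observed]
        mu = means[:, observed]                  # (n_states, n_observed)
        var = variances[:, observed]
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (d - mu) ** 2 / var,
                             axis=1)

    def forward_logprob(obs, log_pi, log_A, means, variances):
        """Log forward variables for a sequence with missing entries."""
        T, n = len(obs), len(log_pi)
        log_alpha = np.zeros((T, n))
        log_alpha[0] = log_pi + emission_loglik(obs[0], means, variances)
        for t in range(1, T):
            trans = log_alpha[t - 1][:, None] + log_A    # (n, n)
            log_alpha[t] = (np.logaddexp.reduce(trans, axis=0)
                            + emission_loglik(obs[t], means, variances))
        return log_alpha
    ```

    In this reading, setting the action-label dimensions of each observation to NaN at inference time forces the forward pass to infer a posterior over labels at every frame, which is the segmentation mechanism the abstract describes.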

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of the fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has enabled the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, which people often use to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, in turn, plays a key role in the action recognition and affective computing fields: the former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements. Both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: the first proposes a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures; the second presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; the last provides a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, owing to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements. All the modules were tested on challenging datasets that are well known in the state of the art, showing remarkable results compared to current literature methods.
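    As an illustration of the second module's design, here is a hedged sketch, in PyTorch, of a two-branch stacked-LSTM classifier over 2D skeleton sequences; the branch inputs (raw joint coordinates and frame-to-frame velocities) and all layer sizes are assumptions, not the thesis architecture.

    ```python
    # Hedged sketch of a two-branch stacked-LSTM skeleton classifier.
    # Branch inputs and sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    class TwoBranchLSTM(nn.Module):
        def __init__(self, n_joints=18, hidden=128, n_classes=10):
            super().__init__()
            in_dim = n_joints * 2                # (x, y) per joint
            self.pose_branch = nn.LSTM(in_dim, hidden, num_layers=2,
                                       batch_first=True)
            self.motion_branch = nn.LSTM(in_dim, hidden, num_layers=2,
                                         batch_first=True)
            self.classifier = nn.Linear(2 * hidden, n_classes)

        def forward(self, poses):                # poses: (B, T, n_joints*2)
            motion = poses[:, 1:] - poses[:, :-1]  # per-frame velocities
            _, (h_pose, _) = self.pose_branch(poses)
            _, (h_motion, _) = self.motion_branch(motion)
            fused = torch.cat([h_pose[-1], h_motion[-1]], dim=1)
            return self.classifier(fused)
    ```

    The design choice illustrated here is the fusion point: each branch summarises one view of the sequence, and only the final hidden states are concatenated before classification.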

    Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation

    Recently, mid-level features have shown promising performance in computer vision. Mid-level features learned by incorporating class-level information are potentially more discriminative than traditional low-level local features. In this paper, an effective method is proposed to extract mid-level features from Kinect skeletons for 3D human action recognition. First, the orientations of limbs connecting two skeleton joints are computed, and each orientation is encoded into one of 27 states indicating the spatial relationship of the joints. Second, limbs are combined into parts and the limbs' states are mapped into part states. Finally, frequent pattern mining is employed to mine the most frequent and relevant (discriminative, representative and non-redundant) states of parts over several consecutive frames. These parts are referred to as Frequent Local Parts, or FLPs. The FLPs allow us to build a powerful bag-of-FLP action representation. This new representation yields state-of-the-art results on MSR DailyActivity3D and MSR ActionPairs3D.
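    A plausible reading of the 27-state encoding is that each axis of the limb's direction vector is quantised into three bins (negative, near-zero, positive), giving 3 x 3 x 3 = 27 states. The sketch below illustrates this interpretation; the dead-zone width eps is a hypothetical parameter, not taken from the paper.

    ```python
    # Sketch of a 27-state limb-orientation encoding (an interpretation,
    # not the paper's code): per-axis sign quantisation with a dead zone.
    import numpy as np

    def limb_state(joint_a, joint_b, eps=0.05):
        """Encode the orientation of the limb from joint_a to joint_b
        into one of 27 states (0..26)."""
        v = np.asarray(joint_b, float) - np.asarray(joint_a, float)
        v = v / (np.linalg.norm(v) + 1e-8)       # unit direction
        # per-axis bin: 0 if negative, 1 if near zero, 2 if positive
        bins = np.where(v > eps, 2, np.where(v < -eps, 0, 1))
        return int(bins[0] * 9 + bins[1] * 3 + bins[2])  # base-3 index
    ```

    Per-frame limb states computed this way would then be mapped to part states and passed to the frequent-pattern-mining stage the abstract describes.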

    Deep Learning Based Abnormal Gait Classification System Study with Heterogeneous Sensor Network

    Gait is one of the important biometric characteristics of the human body. Abnormal gait is mostly related to the lesion site and has been shown to play a guiding role in clinical research such as medical diagnosis and disease prevention. To promote research on automatic gait pattern recognition, this paper reviews the state of abnormal gait recognition and systematically analyses common gait recognition technologies. On this basis, two gait information extraction methods, sensor-based and vision-based, are studied, covering wearable system design and deep-neural-network-based algorithm design. In the sensor-based study, we propose a lower-limb data acquisition system and design an experiment to collect acceleration and sEMG signals under normal and pathological gaits. Specifically, wearable hardware based on the MSP430 and upper-computer software based on LabVIEW were designed; the hardware system consists of an sEMG foot ring, a high-precision IMU and a pressure-sensitive intelligent insole, while the software receives and unpacks the Bluetooth data and computes common gait parameters such as cadence and stride length. Walking data of 15 healthy persons and 15 hemiplegic patients were collected. Gait classification based on sEMG reached an average accuracy of 92.8% with a CNN. For the IMU signals, five kinds of abnormal gait were classified with three models: a BPNN, an LSTM and a CNN. The experimental results show that the system combined with the neural networks can classify different pathological gaits well, with an average accuracy of 93% on the six-class task. In the vision-based study, human keypoint detection is used: people are first detected in the image, a fully convolutional ResNet then produces heatmaps and offsets for the main joints within each bounding box, and the fusion of heatmaps and offsets yields precise keypoint locations, from which the spatio-temporal information of the keypoints is extracted. However, the results show that even state-of-the-art keypoint detection is not yet accurate enough to replace the IMU for gait analysis and classification. Encouragingly, rhythmic patterns can be observed within 2 m, which shows that the extracted spatio-temporal keypoint information is highly correlated with the acceleration information collected by the IMU, paving the way for a vision-based abnormal gait classification algorithm.
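    As a rough illustration of the signal-classification step, below is a hedged sketch, in PyTorch, of a small 1-D CNN over windows of multi-channel sEMG or acceleration signals; the channel count, class count and layer sizes are assumptions, not the thesis configuration.

    ```python
    # Hedged sketch of a 1-D CNN for multi-channel gait-signal windows.
    # All hyperparameters here are illustrative assumptions.
    import torch
    import torch.nn as nn

    class GaitCNN(nn.Module):
        def __init__(self, n_channels=8, n_classes=6):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
                nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.MaxPool1d(2),
                nn.AdaptiveAvgPool1d(1),         # length-independent pooling
            )
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):                    # x: (B, n_channels, T)
            return self.classifier(self.features(x).squeeze(-1))
    ```

    The adaptive pooling at the end lets windows of different lengths share one classifier head, a common choice when stride segmentation is imperfect.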

    A Depth Video-based Human Detection and Activity Recognition using Multi-features and Embedded Hidden Markov Models for Health Care Monitoring Systems

    The increasing number of elderly people living independently creates a need for special care in the form of healthcare monitoring systems. Recent advancements in depth video technologies have made human activity recognition (HAR) realizable for elderly healthcare applications. In this paper, a novel depth video-based method for HAR is presented, using robust multi-features and embedded Hidden Markov Models (HMMs) to recognize the daily-life activities of elderly people living alone in indoor environments such as smart homes. In the proposed HAR framework, depth maps are first analysed by a temporal motion identification method to segment human silhouettes from the noisy background, and the depth silhouette area is computed for each activity to track human movements in a scene. Several representative features, including invariant, multi-view differentiation and spatio-temporal body-joint features, are fused to capture gradient orientation change, intensity differentiation, temporal variation and local motion of specific body parts. These features are then modelled according to the dynamics of their respective class and trained and recognized with a class-specific embedded HMM using the active feature values. Furthermore, we construct a new online human activity dataset with a depth sensor to evaluate the proposed features. Experiments on three depth datasets demonstrate that the proposed multi-features are efficient and robust compared with state-of-the-art features for human action and activity recognition.
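    To make the silhouette step concrete, here is a minimal NumPy sketch of segmenting a silhouette from a depth map by differencing against a static background and tracking the per-frame silhouette area; the threshold and the background model are assumptions, not the paper's temporal motion identification method.

    ```python
    # Minimal sketch (an assumption, not the paper's method): depth
    # background subtraction plus per-frame silhouette-area tracking.
    import numpy as np

    def segment_silhouette(depth, background, motion_thresh=50):
        """Mark pixels whose depth differs from the static background."""
        diff = np.abs(depth.astype(np.int32) - background.astype(np.int32))
        return diff > motion_thresh              # boolean foreground mask

    def silhouette_areas(frames, background):
        """Per-frame silhouette area (pixel count) to track movement."""
        return [int(segment_silhouette(f, background).sum())
                for f in frames]
    ```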

    Expressive movement generation with machine learning

    Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with, and consumption of, computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing: developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute to the field with the datasets, tools, and libraries that we developed during our research. We start by reviewing work on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, feature representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive agent walking movement controller based on neural networks. The expressivity of virtual, animated agents plays an essential role in their believability; WalkNet therefore integrates control over the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control the generation in real time based on the valence and arousal levels of affect, the movement's walking direction, and the mover's movement signature. Following WalkNet, we look at controlling movement generation with more complex stimuli, such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes the movement modelling problem more challenging. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation.
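    To illustrate the kind of control a WalkNet-style model exposes, here is a hedged PyTorch sketch of an autoregressive pose predictor conditioned on a control vector (e.g., valence, arousal, walking direction); the architecture, dimensions and conditioning scheme are assumptions, not the thesis model.

    ```python
    # Hedged sketch of control-conditioned, autoregressive pose generation.
    # Sizes and conditioning are illustrative assumptions.
    import torch
    import torch.nn as nn

    class ConditionedPosePredictor(nn.Module):
        def __init__(self, pose_dim=63, ctrl_dim=4, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(pose_dim + ctrl_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, pose_dim),     # delta to the next pose
            )

        def forward(self, pose, ctrl):           # (B, pose_dim), (B, ctrl_dim)
            return pose + self.net(torch.cat([pose, ctrl], dim=1))

    @torch.no_grad()
    def rollout(model, pose, ctrl, steps=120):
        """Generate a motion sequence by feeding predictions back in."""
        frames = []
        for _ in range(steps):
            pose = model(pose, ctrl)
            frames.append(pose)
        return torch.stack(frames, dim=1)        # (B, steps, pose_dim)
    ```

    Changing ctrl between rollouts (e.g., raising the assumed arousal entry) is what real-time expressive control amounts to in this conditioned-generation framing.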