177 research outputs found

    Kinematic assessment for stroke patients in a stroke game and a daily activity recognition and assessment system

    Stroke is the leading cause of serious long-term disability, and deficits in the motor abilities of the arms or legs are its most common effects. Those who suffer a stroke can recover through effective rehabilitation that is carefully personalized. To achieve the best personalization, clinicians must monitor patients' health status and recovery progress accurately and consistently. Traditionally, rehabilitation involves patients performing exercises in clinics, where clinicians oversee the procedure and evaluate recovery progress. Following the in-clinic visits, additional home practices are tailored and assigned to patients. The in-clinic visits are important for evaluating recovery progress, and the information collected helps clinicians customize home practices for stroke patients. However, because the number of in-clinic sessions is limited by insurance policies, the recovery information collected in clinic is often insufficient. Meanwhile, home practice programs report low adherence rates based on historical data, and since clinicians rely on patients to self-report adherence, the actual adherence rate could be even lower. The feedback clinicians receive is not only limited but also subjective: in practice, classic clinical scales are mostly used for assessing the quality of movements and the recovery status of patients, yet these scales are scored subjectively, with only moderate inter-rater and intra-rater reliabilities. Taken together, clinicians lack a method to obtain sufficient and accurate feedback from patients, which limits the extent to which they can personalize treatment plans. This work aims to solve this problem. To help clinicians obtain abundant health information about patients' recovery in an objective way, I developed a novel kinematic assessment toolchain that consists of two parts.
The first part is a tool to evaluate motions that stroke patients perform in a rehabilitation game. This kinematic assessment tool utilizes body tracking in the game: a set of upper-body assessment measures was proposed and computed from skeletal joint data, and statistical analysis was applied to evaluate the quality of upper-body motions from the assessment outcomes. Second, to classify and quantify home activities for stroke patients objectively and accurately, I developed DARAS, a daily activity recognition and assessment system that evaluates daily motions in a home setting. DARAS consists of three main components: a daily action logger, an action recognition component, and an assessment component. The logger is implemented with a Foresite system that records daily activities as depth and skeletal joint data. Daily activity data in a realistic environment were collected from sixteen post-stroke participants; the collection period for each participant lasted three months. An ensemble network for activity recognition and temporal localization was developed to detect and segment the clinically relevant actions from the recorded data. The ensemble network fuses the prediction outputs of a customized 3D Convolutional-De-Convolutional network, a customized Region Convolutional 3D network, and a proposed Region Hierarchical Co-occurrence network that learns rich spatio-temporal features from either depth data or joint data. The per-frame precision and the per-action precision on the validation set were 0.819 and 0.838, respectively. For the recognized actions, kinematic and longitudinal assessments were performed using the skeletal joint data. The results showed that, compared with non-stroke participants, stroke participants had slower hand movements, were less active, and tended to perform fewer hand-manipulation actions.
The assessment outcomes from the proposed toolchain help clinicians provide more personalized rehabilitation plans that benefit patients.
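The score-level fusion and temporal segmentation performed by the ensemble network might be sketched as follows. This is a minimal illustration under the assumption that each component model emits per-frame class probabilities; the actual fusion rule and thresholds used in DARAS are not specified in the abstract, so the averaging and the confidence threshold here are hypothetical choices.

```python
import numpy as np

def fuse_and_segment(model_probs, threshold=0.5):
    """Fuse per-frame class probabilities from several models by averaging,
    then group contiguous confident frames of the same predicted class into
    action segments (start_frame, end_frame, class)."""
    # model_probs: list of arrays, each of shape (num_frames, num_classes)
    fused = np.mean(np.stack(model_probs), axis=0)
    labels = fused.argmax(axis=1)
    confident = fused.max(axis=1) >= threshold
    segments = []
    start = None
    for i, (lab, ok) in enumerate(zip(labels, confident)):
        if ok and (start is None or labels[start] != lab):
            if start is not None:                      # class changed: close segment
                segments.append((start, i, int(labels[start])))
            start = i                                  # open a new segment
        elif not ok and start is not None:             # confidence dropped: close segment
            segments.append((start, i, int(labels[start])))
            start = None
    if start is not None:                              # close a segment still open at the end
        segments.append((start, len(labels), int(labels[start])))
    return fused, segments
```

Averaging the class-probability outputs is one of the simplest late-fusion schemes; the per-action precision reported above would then be computed by matching these predicted segments against annotated ones.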

    Vision Based Activity Recognition Using Machine Learning and Deep Learning Architecture

    Human activity recognition, with wide applications in fields such as video surveillance, sports, human interaction, and elderly care, has greatly influenced people's standard of living. With the constant development of new architectures and models, and the increase in the computational capability of systems, machine learning and deep learning approaches to activity recognition have shown great improvement and high performance in recent years. My research goal in this thesis is to design and compare machine learning and deep learning models for activity recognition in videos collected from different media in the field of sports. Human activity recognition (HAR) aims to automatically recognize the actions performed by a human from data collected from different sources. Based on the literature review, most data collected for analysis are either time-series data gathered through different sensors or video data captured by cameras. Firstly, therefore, our research analyzes and compares different machine learning and deep learning architectures on sensor-based data collected from smartphone accelerometers placed at different positions on the human body. Without any hand-crafted feature extraction, we found that deep learning architectures outperform most machine learning architectures, and that using multiple sensors yields higher accuracy than a dataset collected from a single sensor. Secondly, since collecting sensor data in real time is not feasible in all fields, such as sports, we study activity recognition using video datasets. For this, we used two state-of-the-art deep learning architectures, previously trained on large annotated datasets, with transfer learning for activity recognition on three publicly available sports-related datasets.
Extending the study to the different activities performed in a single sport, and to avoid the current trend of using special cameras and expensive setups around the court for data collection, we developed our own video dataset from broadcast coverage of basketball games. A detailed analysis and experiments based on different criteria, such as the range of shots taken and scoring activities, are presented for 8 different activities using state-of-the-art deep learning architectures for video classification.
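The transfer-learning recipe described above, in which a backbone pretrained on a large annotated dataset is kept frozen while only a small classifier head is retrained on the target sport data, can be sketched roughly as follows. The "backbone" here is a hypothetical stand-in (a fixed random projection with ReLU), not the actual architectures used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained backbone: a fixed (frozen) random
# projection plus ReLU. In the thesis setting this would be a deep network
# pretrained on a large annotated video dataset.
W_backbone = rng.normal(size=(64, 16))

def extract_features(frames):
    """Pool frozen per-frame features (n_frames, 64) into one clip vector."""
    feats = np.maximum(frames @ W_backbone, 0.0)  # frozen layer + ReLU
    return feats.mean(axis=0)                     # average-pool over time

def train_head(clips, labels, n_classes, lr=0.1, epochs=200):
    """Train only a softmax classifier head on the frozen clip features."""
    X = np.stack([extract_features(c) for c in clips])
    Y = np.eye(n_classes)[labels]                 # one-hot targets
    W = np.zeros((X.shape[1], n_classes))
    for _ in range(epochs):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)         # softmax probabilities
        W -= lr * X.T @ (p - Y) / len(X)          # gradient step on the head only
    return W

def predict(clip, W):
    return int(np.argmax(extract_features(clip) @ W))
```

Freezing the backbone is what makes training feasible on the comparatively small sport-specific datasets mentioned above: only the head's parameters are fit to the new labels.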

    Understanding Human Actions in Video

    Full text link
    Understanding human behavior is crucial for any autonomous system that interacts with humans. For example, assistive robots need to know when a person is signaling for help, and autonomous vehicles need to know when a person is waiting to cross the street. However, identifying human actions in video is a challenging and unsolved problem. In this work, we address several of the key challenges in human action recognition. To enable better representations of video sequences, we develop novel deep learning architectures that improve representations both at the level of instantaneous motion and at the level of long-term context. In addition, to reduce reliance on fixed action vocabularies, we develop a compositional representation of actions that allows novel action descriptions to be represented as a sequence of sub-actions. Finally, we address the issue of data collection for human action understanding by creating a large-scale video dataset consisting of 70 million videos collected from internet video-sharing sites and their matched descriptions. We demonstrate that these contributions improve the generalization performance of human action recognition systems on several benchmark datasets.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162887/1/stroud_1.pd

    Deep Learning for Action and Gesture Recognition in Image Sequences: A Survey

    Interest in automatic action and gesture recognition has grown considerably in the last few years, due in part to the large number of application domains for this type of technology. As in many other computer vision areas, deep learning based methods have quickly become a reference methodology for obtaining state-of-the-art performance in both tasks. This chapter is a survey of current deep learning based methodologies for action and gesture recognition in sequences of images. The survey reviews both fundamental and cutting-edge methodologies reported in the last few years. We introduce a taxonomy that summarizes important aspects of deep learning for approaching both tasks, and we review details of the proposed architectures, fusion strategies, main datasets, and competitions. We also summarize and discuss the main works proposed so far, with particular interest in how they treat the temporal dimension of the data, their distinguishing features, and the opportunities and challenges for future research. To the best of our knowledge, this is the first survey on the topic. We foresee that this survey will become a reference in this ever-dynamic field of research.

    Action recognition from RGB-D data

    In recent years, action recognition based on RGB-D data has attracted increasing attention. Unlike traditional 2D action recognition, RGB-D data contain extra depth and skeleton modalities, each with its own characteristics. This thesis presents seven novel methods that take advantage of the three modalities for action recognition. First, effective handcrafted features are designed, and a frequent pattern mining method is employed to mine the most discriminative, representative, and non-redundant features for skeleton-based action recognition. Second, to take advantage of powerful Convolutional Neural Networks (ConvNets), it is proposed to represent the spatio-temporal information carried in 3D skeleton sequences as three 2D images, by encoding the joint trajectories and their dynamics into the color distributions of the images, and ConvNets are adopted to learn discriminative features for human action recognition. Third, for depth-based action recognition, three data augmentation strategies are proposed so that ConvNets can be applied to small training datasets. Fourth, to take full advantage of the 3D structural information offered by the depth modality and its insensitivity to illumination variations, three simple, compact, yet effective image-based representations are proposed, and ConvNets are adopted for feature extraction and classification. However, both of the previous two methods are sensitive to noise and cannot differentiate fine-grained actions well. Fifth, to address this issue, it is proposed to represent a depth map sequence as three pairs of structured dynamic images at the body, part, and joint levels, respectively, through bidirectional rank pooling. The structured dynamic images preserve the spatio-temporal information, enhance the structural information across body parts/joints and across temporal scales, and take advantage of ConvNets for action recognition.
Sixth, it is proposed to extract and use scene flow, computed from RGB and depth data, for action recognition. Last, to exploit the joint information in multi-modal features arising from heterogeneous sources (RGB, depth), it is proposed to cooperatively train a single ConvNet (referred to as c-ConvNet) on both RGB and depth features, and to deeply aggregate the two modalities to achieve robust action recognition.
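The skeleton-to-image encoding in the second method above can be illustrated with a simplified variant: map each joint's normalized x, y, z coordinates over time into the three color channels of an image, with rows indexing joints and columns indexing frames, so that a standard 2D ConvNet can consume the result. This sketch omits the trajectory-dynamics encoding and the three-view construction described in the abstract:

```python
import numpy as np

def skeleton_to_image(seq):
    """Encode a skeleton sequence (n_frames, n_joints, 3) as an RGB image:
    rows index joints, columns index frames, and the three color channels
    carry the per-axis min-max-normalized x, y, z coordinates."""
    seq = np.asarray(seq, dtype=float)
    lo = seq.min(axis=(0, 1), keepdims=True)      # per-axis minimum over the sequence
    hi = seq.max(axis=(0, 1), keepdims=True)      # per-axis maximum over the sequence
    norm = (seq - lo) / np.maximum(hi - lo, 1e-8) # scale each axis to [0, 1]
    img = (norm * 255).astype(np.uint8)           # (frames, joints, 3)
    return img.transpose(1, 0, 2)                 # (joints, frames, 3)
```

The appeal of such encodings is that the temporal evolution of every joint becomes a visible texture in the image, letting mature 2D ConvNet architectures be reused unchanged for sequence classification.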