
    Human Motion Analysis for Efficient Action Recognition

    Automatic understanding of human actions is at the core of several application domains, such as content-based indexing, human-computer interaction, surveillance, and sports video analysis. Recent advances in digital platforms and the exponential growth of video and image data have created an urgent need for intelligent frameworks that automatically analyze human motion and predict the corresponding action from visual data and sensor signals. This thesis presents a collection of methods targeting human action recognition using different action modalities. The first method uses the appearance modality and classifies human actions based on heterogeneous global and local features of scene and human-body appearance. The second method harnesses 2D and 3D articulated human poses and analyzes body motion using a discriminative combination of histograms of the body parts' velocities, locations, and correlations for action recognition. The third method presents an optimal scheme for combining the probabilistic predictions from different action modalities by solving a constrained quadratic optimization problem. In addition to the action classification task, we present a study that compares the utility of different pose variants in motion analysis for human action recognition; in particular, we compare recognition performance when 2D and 3D poses are used. Finally, we demonstrate the efficiency of our pose-based method by spotting and segmenting motion gestures in real time from a continuous input video stream for recognition of Italian sign-language gestures.
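The modality-fusion step described in this abstract can be sketched in a few lines. The abstract does not give the exact quadratic programme, so the coarse grid search over simplex weights below is an illustrative stand-in, restricted to two modalities; all function names and the toy data are assumptions:

```python
# Late fusion of per-modality class probabilities. The grid search over
# weights (a, 1 - a) on the simplex is a hypothetical stand-in for the
# thesis's constrained quadratic optimization.

def combine(predictions, weights):
    """Weighted sum of per-modality probability vectors for one sample."""
    n_classes = len(predictions[0])
    return [sum(w * p[c] for w, p in zip(weights, predictions))
            for c in range(n_classes)]

def best_weights(val_preds, val_labels, steps=10):
    """Search w = (a, 1 - a) for two modalities on a validation set."""
    best, best_acc = (0.5, 0.5), -1.0
    for i in range(steps + 1):
        w = (i / steps, 1 - i / steps)
        correct = 0
        for preds, label in zip(val_preds, val_labels):
            fused = combine(preds, w)
            if max(range(len(fused)), key=fused.__getitem__) == label:
                correct += 1
        acc = correct / len(val_labels)
        if acc > best_acc:
            best, best_acc = w, acc
    return best

# Toy validation set: two samples, two modalities (e.g. appearance, pose),
# three action classes.
val_preds = [
    ([0.6, 0.3, 0.1], [0.2, 0.7, 0.1]),  # modality 1 right, modality 2 wrong
    ([0.1, 0.2, 0.7], [0.2, 0.1, 0.7]),  # both right
]
val_labels = [0, 2]
w = best_weights(val_preds, val_labels)
```

On this toy set the search settles on the first weighting that classifies both samples correctly, illustrating how validation data drives the choice of modality weights.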

    Learning to recognise 3D human action from a new skeleton-based representation using deep convolutional neural networks

    Recognising human actions in untrimmed videos is an important and challenging task. An effective three-dimensional (3D) motion representation and a powerful learning model are two key factors influencing recognition performance. In this study, the authors introduce a new skeleton-based representation for 3D action recognition in videos. The key idea of the proposed representation is to transform the 3D joint coordinates of the human body carried in skeleton sequences into RGB images via a colour-encoding process. By normalising the 3D joint coordinates and dividing each skeleton frame into five parts, where the joints are concatenated according to the order of their physical connections, the colour-coded representation is able to represent the spatio-temporal evolution of complex 3D motions, independently of the length of each sequence. The authors then design and train different deep convolutional neural networks based on the residual network architecture on the obtained image-based representations to learn 3D motion features and classify them into action classes. The proposed method is evaluated on two widely used action recognition benchmarks: MSR Action3D and NTU-RGB+D, a very large-scale dataset for 3D human action recognition. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches while requiring less computation for training and prediction.
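The colour-encoding idea can be sketched minimally: rows are joints, columns are frames, and the RGB channels hold the min-max-normalised x/y/z coordinates. The paper's five-part grouping and its specific joint ordering are omitted here for brevity, and all names are assumptions:

```python
# Hypothetical sketch: encode a skeleton sequence as an image whose rows
# are joints, columns are frames, and (r, g, b) channels are the x, y, z
# coordinates min-max normalised to [0, 255] per axis over the sequence.

def encode_skeleton_sequence(frames):
    """frames: list of frames; each frame is a list of (x, y, z) joints.
    Returns a joints-by-frames grid of (r, g, b) byte triples."""
    axes = list(zip(*[joint for frame in frames for joint in frame]))
    lo = [min(a) for a in axes]
    hi = [max(a) for a in axes]

    def norm(v, k):
        rng = hi[k] - lo[k]
        return 0 if rng == 0 else int(round(255 * (v - lo[k]) / rng))

    n_joints = len(frames[0])
    return [[tuple(norm(frames[t][j][k], k) for k in range(3))
             for t in range(len(frames))]
            for j in range(n_joints)]

# Two frames, two joints each: a tiny stand-in for a skeleton sequence.
img = encode_skeleton_sequence([
    [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)],
    [(0.5, 1.0, 1.5), (1.0, 0.0, 0.0)],
])
```

Because normalisation is global over the sequence, the image width grows with sequence length while the colour scale stays comparable across frames; resizing that image to a fixed input size is what makes the representation length-independent for a CNN.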

    Qualitative Action Recognition by Wireless Radio Signals in Human–Machine Systems

    Human-machine systems require a deep understanding of human behavior. Most existing research on action recognition has focused on discriminating between different actions; however, the quality of executing an action has received little attention thus far. In this paper, we study the quality assessment of driving behaviors and present WiQ, a system to assess the quality of actions based on radio signals. The system comprises three key components: a deep-neural-network-based learning engine that extracts quality information from changes in signal strength, a gradient-based method that detects the signal boundary of an individual action, and an activity-based fusion policy that improves recognition performance in noisy environments. Using the quality information, WiQ can differentiate among three body statuses with an accuracy of 97%, whereas for identification among 15 drivers the average accuracy is 88%. Our results show that dedicated analysis of radio signals enables fine-grained action characterization, which can facilitate a large variety of applications such as smart driving assistants.
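The gradient-based boundary component can be sketched as thresholding the first difference of a signal-strength trace. The threshold and the toy dBm values below are illustrative choices, not WiQ's actual parameters:

```python
# Hypothetical sketch of gradient-based boundary detection: mark a
# boundary wherever the absolute first difference of the received signal
# strength exceeds a threshold.

def detect_boundaries(rss, threshold):
    """Return indices where the signal changes sharply between samples."""
    grads = [rss[i + 1] - rss[i] for i in range(len(rss) - 1)]
    return [i for i, g in enumerate(grads) if abs(g) > threshold]

# Quiet channel, a sharp rise as an action starts, then a sharp drop.
trace = [-50, -50, -49, -38, -37, -36, -52, -51]
edges = detect_boundaries(trace, threshold=5)
```

Here the two sharp transitions bracket the action, so the segment between the detected indices can be handed to the learning engine as one action instance.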

    Learning space-time structures for action recognition and localization

    In this thesis the problem of automatic human action recognition and localization in videos is studied. In this problem, our goal is to recognize the category of the human action that is happening in the video, and also to localize the action in space and/or time. This problem is challenging due to the complexity of human actions, large intra-class variations and the distraction of backgrounds. Human actions are inherently structured patterns of body movements. However, prior works are inadequate at learning the space-time structures of human actions and exploiting them for better recognition and localization. In this thesis new methods are proposed that exploit such space-time structures for effective human action recognition and localization in videos, including sports videos, YouTube videos, TV programs and movies. A new local space-time video representation, the hierarchical Space-Time Segments, is first proposed. Using this new video representation, ensembles of hierarchical spatio-temporal trees, discovered directly from the training videos, are constructed to model the hierarchical, spatial and temporal structures of human actions. This proposed approach achieves promising performance in action recognition and localization on challenging benchmark datasets. Moreover, the discovered trees show good cross-dataset generalizability: trees learned on one dataset can be used to recognize and localize similar actions in another dataset. To handle large-scale data, a deep model is explored that learns the temporal progression of actions using Long Short-Term Memory (LSTM), a type of Recurrent Neural Network (RNN). Two novel ranking losses are proposed to train the model to better capture the temporal structures of actions for accurate action recognition and temporal localization. This model achieves state-of-the-art performance on a large-scale video dataset.
A deep model usually employs a Convolutional Neural Network (CNN) to learn visual features from video frames. The problem of utilizing web action images for training a CNN is also studied: training a CNN typically requires a large number of training videos, but the findings of this study show that web action images can be utilized as additional training data to significantly reduce the burden of video training data collection.
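The abstract does not spell out its two ranking losses, but the underlying idea of capturing temporal progression can be sketched as a hinge penalty on consecutive time steps: as the LSTM observes more of an action, its score for the ground-truth class should not decrease. The formulation and margin below are assumptions for illustration:

```python
# Hypothetical monotonicity ranking loss: penalise each consecutive pair
# of time-step scores where the ground-truth class score drops (by more
# than a margin) as the model sees more of the action.

def ranking_loss(scores, margin=0.0):
    """scores: detection scores for the true class at successive steps."""
    return sum(max(0.0, margin + scores[t] - scores[t + 1])
               for t in range(len(scores) - 1))

monotone = ranking_loss([0.1, 0.4, 0.9])  # scores only increase
dipped = ranking_loss([0.5, 0.3, 0.8])    # penalised for the mid-sequence dip
```

Minimising such a term alongside a classification loss pushes the per-frame scores toward a monotone profile, which is what makes thresholding them usable for temporal localization.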

    Action-Gons: Action recognition with a discriminative dictionary of structured elements with varying granularity

    LNCS v. 9007 entitled: Computer Vision -- ACCV 2014: 12th Asian Conference on Computer ..., Part 5. This paper presents “Action-Gons”, a mid-level representation for action recognition in videos. Actions in videos exhibit a reasonable level of regularity seen in human behavior, as well as a large degree of variation. One key property of actions, compared with image scenes, may be the amount of interaction among body parts, although scenes also exhibit structured patterns in 2D images. Here, we study high-order statistics of the interaction among regions of interest in actions and propose a mid-level representation for action recognition, inspired by the Julesz school of n-gon statistics. We propose a systematic learning process to build an over-complete dictionary of “Action-Gons”. We first extract motion clusters, named action units, and then sequentially learn a pool of action-gons with different granularities, modeling different degrees of interaction among action units. We validate the discriminative power of the learned action-gons on three challenging video datasets and show evident advantages over existing methods. © Springer International Publishing Switzerland 2015.

    Research on Hand Action Pattern Recognition of Bionic Limb Based on Surface Electromyography

    The hand is an important part of the human body: it is not only the main tool with which people engage in productive labor but also an important means of communication. When the hand moves, the body produces a signal called surface electromyography (sEMG), an electrophysiological signal that accompanies muscle activity and carries a great deal of information about human movement intent. A bionic limb is driven by multi-degree-of-freedom control obtained by converting the recognition results, which can meet the urgent need of people with disabilities for autonomous operation. A thorough study of hand-action pattern recognition based on sEMG signals can give a bionic limb the ability to distinguish hand actions quickly and accurately. From the perspective of bionic-limb pattern recognition, this paper discusses sEMG-based recognition of human hand-action patterns. After analyzing and summarizing the current development of human hand-movement recognition, the author proposes a bionic-limb scheme based on an artificial neural network and an improved DT-SVM hand-action recognition system. According to the research results, the types and total number of recognized hand movements and gestures must be expanded to meet the objective requirement of diverse hand-action patterns in bionic-limb applications.
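The abstract does not name the features fed to its classifier. Two window features that sEMG recognition systems commonly use, mean absolute value (MAV) and root mean square (RMS), can be sketched as follows; their use here is illustrative, not taken from the paper:

```python
import math

# Two standard sEMG window features often fed to classifiers such as the
# DT-SVM mentioned above; the paper's actual feature set is not specified,
# so these are illustrative.

def mav(window):
    """Mean absolute value of a window of sEMG samples."""
    return sum(abs(s) for s in window) / len(window)

def rms(window):
    """Root mean square of a window of sEMG samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))

window = [0.3, -0.4, 0.0, 0.5]
features = (mav(window), rms(window))
```

Sliding these features over overlapping windows of each electrode channel yields the fixed-length feature vectors that a classifier can map to hand-action labels.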

    A bibliometric analysis of human action recognition

    Over the past two decades, the use of computer vision methods to enable machines to recognize human actions from sequences of images has grown as information technologies have advanced and hardware such as cameras (especially closed-circuit television) has become more widely available. From the latter part of the 1980s until recently, computer vision has been employed in human action recognition research. Given the volume of existing academic studies, it would be impractical to review all of them. This paper presents a brief analysis of the body of knowledge in Human Activity Recognition (HAR) from 1987 to 2015. Bibliometric techniques based on the Science Citation Index (SCI) databases of the Web of Science are employed, and 1,172 articles are critically analysed with respect to publication characteristics such as authorship, countries, institutions, number of citations, and keywords. The pace of publishing in this field has increased rapidly over the last 20 years. By identifying global trends in HAR research, this study benefits researchers, for example in the selection of future research topics; policy makers can likewise draw on the findings for a better understanding of how HAR has developed over time.