5,972 research outputs found

    Descriptive temporal template features for visual motion recognition

    Get PDF
    In this paper, a human action recognition system is proposed. The system is based on new, descriptive `temporal template' features in order to achieve high-speed recognition in real-time, embedded applications. The limitations of the well-known `Motion History Image' (MHI) temporal template are addressed and a new `Motion History Histogram' (MHH) feature is proposed to capture more motion information in the video. MHH not only provides rich motion information, but also remains computationally inexpensive. To further improve classification performance, we combine both MHI and MHH into a low-dimensional feature vector which is processed by a support vector machine (SVM). Experimental results show that our new representation achieves a significant improvement in human action recognition performance over existing comparable methods, which use 2D temporal template-based representations.
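
    As an illustration of the kind of templates the abstract describes, the following Python sketch computes an MHI and a simplified MHH from frame differencing. The difference threshold, the decay constant tau, and the per-pixel run-length bookkeeping are illustrative assumptions, not the authors' exact formulation; the resulting maps would typically be downsampled and flattened into the low-dimensional vector passed to the SVM.

    import numpy as np

    def motion_templates(frames, tau=20, diff_thresh=30, max_pattern=5):
        # frames: sequence of HxW uint8 grayscale frames.
        # Returns (mhi, mhh): mhi is an HxW decay map; mhh holds one HxW count
        # map per motion-run length 1..max_pattern (a simplified MHH).
        h, w = frames[0].shape
        mhi = np.zeros((h, w), dtype=np.float32)
        mhh = np.zeros((max_pattern, h, w), dtype=np.int32)
        run = np.zeros((h, w), dtype=np.int32)   # current run of consecutive "motion" frames per pixel

        for prev, curr in zip(frames[:-1], frames[1:]):
            motion = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > diff_thresh

            # MHI: stamp moving pixels with tau, decay the others by one step.
            mhi = np.where(motion, float(tau), np.maximum(mhi - 1.0, 0.0))

            # Simplified MHH: when a run of consecutive motion frames ends,
            # increment the count map for that run length at each pixel.
            ended = (~motion) & (run > 0)
            for length in range(1, max_pattern):
                mhh[length - 1][ended & (run == length)] += 1
            mhh[-1][ended & (run >= max_pattern)] += 1   # longer runs share the last bin
            run = np.where(motion, run + 1, 0)

        return mhi, mhh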

    Going Deeper into Action Recognition: A Survey

    Full text link
    Understanding human actions in visual data is tied to advances in complementary research areas including object recognition, human dynamics, domain adaptation and semantic segmentation. Over the last decade, human action analysis has evolved from earlier schemes, often limited to controlled environments, to advanced solutions that can learn from millions of videos and apply to almost all daily activities. Given the broad range of applications from video surveillance to human-computer interaction, scientific milestones in action recognition are achieved more rapidly, quickly rendering yesterday's state of the art obsolete. This motivated us to provide a comprehensive review of the notable steps taken towards recognizing human actions. To this end, we start our discussion with the pioneering methods that use handcrafted representations, and then navigate into the realm of deep-learning-based approaches. We aim to remain objective throughout this survey, touching upon encouraging improvements as well as inevitable fallbacks, in the hope of raising fresh questions and motivating new research directions for the reader.

    SpatioTemporal LBP and shape feature for human activity representation and recognition

    Get PDF
    In this paper, we propose a histogram-based feature to represent and recognize human actions in video sequences. The Motion History Image (MHI) merges a video sequence into a single image; here, instead, we use the Directional Motion History Image (DMHI) to create four directional spatiotemporal templates. We then extract Local Binary Patterns (LBP) from those templates and form spatiotemporal LBP histograms that represent the distribution of those patterns, making up the feature vector. We also use shape features taken from three selected snippets and concatenate them with the LBP histograms. We measure the performance of the proposed representation, along with some variants of it, by experimenting on the Weizmann action dataset. The higher recognition rates found in the experiments suggest that, compared to complex representations, the proposed simple and compact representation can achieve robust recognition of human activity in practical use.
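
    A minimal sketch of the kind of pipeline described above, assuming dense optical flow (OpenCV Farneback) to split motion into four directions and scikit-image's local_binary_pattern for the texture step; the flow threshold, decay constant and histogram sizes are placeholders, and the shape features from the selected snippets are omitted.

    import cv2
    import numpy as np
    from skimage.feature import local_binary_pattern

    def dmhi_lbp_feature(frames, tau=20, flow_thresh=0.5, lbp_points=8, lbp_radius=1):
        # frames: sequence of HxW uint8 grayscale frames.
        # Builds four directional MHIs (right, left, down, up) from optical flow,
        # then concatenates one LBP histogram per directional template.
        h, w = frames[0].shape
        dmhi = np.zeros((4, h, w), dtype=np.float32)

        for prev, curr in zip(frames[:-1], frames[1:]):
            flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            fx, fy = flow[..., 0], flow[..., 1]
            masks = [fx > flow_thresh, fx < -flow_thresh,
                     fy > flow_thresh, fy < -flow_thresh]
            for i, m in enumerate(masks):
                # stamp moving pixels with tau, decay the rest by one step
                dmhi[i] = np.where(m, float(tau), np.maximum(dmhi[i] - 1.0, 0.0))

        hists = []
        n_bins = 2 ** lbp_points
        for template in dmhi:
            lbp = local_binary_pattern(template, lbp_points, lbp_radius, method='default')
            hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
            hists.append(hist)
        return np.concatenate(hists)   # feature vector for the classifier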

    Early Recognition of Human Activities from First-Person Videos Using Onset Representations

    Full text link
    In this paper, we propose a methodology for early recognition of human activities from videos taken with a first-person viewpoint. Early recognition, also known as activity prediction, is the ability to infer an ongoing activity at its early stage. We present an algorithm to recognize activities targeted at the camera from streaming videos, enabling the system to predict the intended activities of the interacting person and avoid harmful events before they actually happen. We introduce the novel concept of 'onset', which efficiently summarizes pre-activity observations, and design an approach that considers event history in addition to the ongoing video observation for early first-person recognition of activities. We propose to represent onset using cascade histograms of time-series gradients, and we describe a novel algorithmic setup to take advantage of onset for early recognition of activities. The experimental results clearly illustrate that the proposed concept of onset enables better/earlier recognition of human activities from first-person videos.
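
    As a rough illustration of the onset idea, and not the authors' implementation, the sketch below builds a cascade of histograms over gradients of a 1-D pre-activity signal (e.g. per-frame motion magnitude) at several temporal scales; the scales, bin count and value range are assumptions.

    import numpy as np

    def onset_histograms(signal, scales=(1, 2, 4, 8), n_bins=16, value_range=(-1.0, 1.0)):
        # signal: 1-D array of per-frame feature values from the pre-activity window.
        # For each temporal scale, take gradients at that step size and histogram
        # them; the concatenated histograms summarize the onset.
        signal = np.asarray(signal, dtype=np.float32)
        hists = []
        for s in scales:
            if len(signal) <= s:                       # window too short for this scale
                hists.append(np.zeros(n_bins, dtype=np.float32))
                continue
            grad = signal[s:] - signal[:-s]            # temporal gradient at scale s
            hist, _ = np.histogram(grad, bins=n_bins, range=value_range, density=True)
            hists.append(hist)
        return np.concatenate(hists)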

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework that highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and, thus, the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit renders the framework particularly useful for selecting a method given an application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU; survey paper, 46 pages, 2 figures, 4 tables.

    A framework for cardio-pulmonary resuscitation (CPR) scene retrieval from medical simulation videos based on object and activity detection.

    Get PDF
    In this thesis, we propose a framework to detect and retrieve CPR activity scenes from medical simulation videos. Medical simulation is a modern training method for medical students, in which an emergency patient condition is simulated on human-like mannequins and the students act on it. These simulation sessions are recorded by the physician for later debriefing. With the increasing number of simulation videos, automatic detection and retrieval of specific scenes has become necessary. The proposed framework for CPR scene retrieval eliminates the conventional approach of using shot detection and frame segmentation techniques. Firstly, our work explores the application of Histograms of Oriented Gradients in three dimensions (HOG3D) to retrieve the scenes containing CPR activity. Secondly, we investigate the use of Local Binary Patterns on Three Orthogonal Planes (LBP-TOP), the three-dimensional extension of the popular Local Binary Patterns; this technique is a robust feature that can detect specific activities in scenes containing multiple actors and activities. Thirdly, we propose an improvement over the above-mentioned methods through a combination of HOG3D and LBP-TOP, using decision-level fusion techniques to combine the features. We show experimentally that the proposed techniques and their combination outperform the existing system for CPR scene retrieval. Finally, we devise a method to detect and retrieve the scenes containing the breathing-bag activity from the medical simulation videos. The proposed framework is tested and validated using eight medical simulation videos and the results are presented.
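
    A minimal sketch of decision-level (late) fusion in the spirit described above, assuming two independently trained scikit-learn SVMs over precomputed HOG3D and LBP-TOP descriptors and a weighted sum of their class-probability estimates; the kernel, the weight and the probability-based fusion rule are illustrative choices, not necessarily those used in the thesis.

    import numpy as np
    from sklearn.svm import SVC

    def fuse_decisions(hog3d_train, lbptop_train, y_train,
                       hog3d_test, lbptop_test, weight=0.5):
        # Train one classifier per descriptor channel.
        clf_hog = SVC(kernel='rbf', probability=True).fit(hog3d_train, y_train)
        clf_lbp = SVC(kernel='rbf', probability=True).fit(lbptop_train, y_train)

        # Decision-level fusion: combine the per-class probability estimates
        # with a weighted sum, then take the highest-scoring class.
        p_hog = clf_hog.predict_proba(hog3d_test)
        p_lbp = clf_lbp.predict_proba(lbptop_test)
        fused = weight * p_hog + (1.0 - weight) * p_lbp
        return clf_hog.classes_[np.argmax(fused, axis=1)]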

    Human action representation and recognition: An approach to histogram of spatiotemporal templates

    Get PDF
    The motion sequence of a human action has its own discriminating profile that can be represented as a spatiotemporal template such as the Motion History Image (MHI). A histogram is a popular statistic for presenting the underlying information in a template. In this paper, a histogram-oriented action recognition method is presented. In the proposed method, we use the Directional Motion History Images (DMHI), their corresponding Local Binary Pattern (LBP) images and the Motion Energy Image (MEI) as spatiotemporal templates. An intensity histogram is then extracted from each of those images, and the histograms are concatenated to form the feature vector for action representation. A linear combination of the histograms taken from the DMHIs and the LBP images is used in the experiments. We evaluated the performance of the proposed method, along with some variants of it, on the renowned KTH action dataset and found higher accuracies. The obtained results demonstrate the superiority of the proposed method compared to other action recognition approaches found in the literature.
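
    A small sketch of the histogram concatenation step described above, assuming the DMHIs, their LBP images and the MEI have already been computed and scaled to [0, 255]; the bin count and the optional per-template weights (mimicking the linear combination mentioned in the abstract) are assumptions.

    import numpy as np

    def histogram_feature(templates, n_bins=64, weights=None):
        # templates: list of 2D arrays (e.g. four DMHIs, their LBP images and
        # the MEI), all scaled to [0, 255]. Each template contributes one
        # normalized intensity histogram, optionally weighted, and the
        # histograms are concatenated into the final action descriptor.
        if weights is None:
            weights = [1.0] * len(templates)
        hists = []
        for t, w in zip(templates, weights):
            hist, _ = np.histogram(t, bins=n_bins, range=(0, 256), density=True)
            hists.append(w * hist)
        return np.concatenate(hists)   # feature vector fed to a classifier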