
    Vision-based hand gesture interaction using particle filter, principal component analysis and transition network

    Vision-based human-computer interaction is becoming increasingly important. It offers natural interaction with computers and frees users from mechanical input devices, which is especially favourable for wearable computers. This paper presents a human-computer interaction system based on a conventional webcam and hand gesture recognition. The system works in real time and enables users to control a computer cursor with hand motions and gestures instead of a mouse. Five hand gestures are designed to represent five mouse operations: moving, left click, left double-click, right click and no-action. A particle-filter-based algorithm tracks the hand position, PCA-based feature selection recognizes the hand gestures, and a transition network improves the accuracy and reliability of the interaction. The system shows good performance in recognition and interaction tests.
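    As a loose illustration of the PCA-based gesture recognition step described above, the sketch below projects cropped hand patches onto principal components and classifies them with a nearest-neighbour rule; the patch size, component count and classifier are assumptions for illustration, not the paper's exact pipeline.

```python
# Minimal sketch of PCA-based gesture classification, assuming hand regions
# have already been tracked (e.g. by a particle filter) and cropped to
# fixed-size grayscale patches. All shapes and data here are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# X_train: N flattened 32x32 hand patches; y_train: gesture labels 0..4
# (move, left click, left double-click, right click, no-action).
X_train = np.random.rand(200, 32 * 32)
y_train = np.random.randint(0, 5, size=200)

pca = PCA(n_components=20)            # project patches onto principal components
Z_train = pca.fit_transform(X_train)  # low-dimensional gesture features

clf = KNeighborsClassifier(n_neighbors=3).fit(Z_train, y_train)

def classify_gesture(patch):
    """Map one flattened hand patch to one of the five gesture classes."""
    z = pca.transform(patch.reshape(1, -1))
    return clf.predict(z)[0]

print(classify_gesture(np.random.rand(32 * 32)))
```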

    Human activity recognition from automatically labeled data in RGB-D videos

    Human Activity Recognition (HAR) is an interdisciplinary research area that has been attracting interest from several research communities specialized in machine learning, computer vision, medical and gaming research. Potential applications range from surveillance systems, human-computer interfaces, sports video analysis, digital shopping assistants, video retrieval and games to health care. Diverse approaches exist to recognize a human action, ranging from computer vision techniques and models of the relations between human motion and objects to marker-based tracking systems and RGB-D cameras. Using a Kinect sensor that provides the positions of the main skeleton joints, we extract features based solely on the motion of those joints. This paper compares the performance of several supervised classifiers trained with manually labeled data against the same classifiers trained with automatically labeled data, and proposes a framework capable of recognizing human actions using supervised classifiers trained with automatically labeled data.
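    A minimal sketch of the joint-motion idea: given Kinect skeleton sequences, derive features solely from frame-to-frame joint displacements and train a standard supervised classifier. The feature summary, classifier choice and data shapes below are illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch of skeleton-joint motion features for HAR, assuming each clip is a
# (frames, joints, 3) array of joint positions. Data here is random placeholder.
import numpy as np
from sklearn.svm import SVC

def motion_features(skeleton):
    """Summarize per-joint speeds over a clip (mean and std per joint)."""
    deltas = np.diff(skeleton, axis=0)        # (frames-1, joints, 3) displacements
    speeds = np.linalg.norm(deltas, axis=2)   # per-joint speed at each frame
    return np.concatenate([speeds.mean(axis=0), speeds.std(axis=0)])

# Hypothetical dataset: 100 clips, 30 frames, 20 joints, 6 action labels.
clips = np.random.rand(100, 30, 20, 3)
labels = np.random.randint(0, 6, size=100)

X = np.stack([motion_features(c) for c in clips])
clf = SVC(kernel="rbf").fit(X, labels)        # any supervised classifier works here
print(clf.predict(X[:5]))
```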

    Recognizing human actions from low-resolution videos by region-based mixture models

    © 2016 IEEE. Recognizing human actions from low-resolution (LR) videos is essential for many applications, including large-scale video surveillance, sports video analysis and intelligent aerial vehicles. Currently, state-of-the-art performance in action recognition is achieved by dense trajectories extracted with optical flow algorithms. However, optical flow algorithms are far from perfect in LR videos. In addition, the spatial and temporal layout of features is a powerful cue for action discrimination, yet most existing methods encode the layout by first segmenting body parts, which is not feasible in LR videos. To address these problems, we adopt the Layered Elastic Motion Tracking (LEMT) method to extract a set of long-term motion trajectories and a long-term common shape from each video sequence, where the extracted trajectories are much denser than those of sparse interest points (SIPs); we then present a hybrid feature representation that integrates both shape and motion features; and finally we propose a Region-based Mixture Model (RMM) for action classification. The RMM models the spatial layout of features without any need for body-part segmentation. Experiments are conducted on two publicly available LR human action datasets. Of these, the UT-Tower dataset is particularly challenging because the average height of a human figure is only about 20 pixels. The proposed approach attains near-perfect accuracy on both datasets.
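    The spatial-layout idea can be illustrated with a simple mixture model over trajectory positions: soft-assign each video's motion features to Gaussian "regions" and pool the memberships, with no body-part segmentation. This is a loose sketch in the spirit of an RMM, not the paper's exact formulation; all data and dimensions are placeholders.

```python
# Sketch: encode spatial layout by fitting Gaussian components over the (x, y)
# positions of trajectory features pooled from training videos.
import numpy as np
from sklearn.mixture import GaussianMixture

train_points = np.random.rand(1000, 2)   # hypothetical pooled trajectory points

# K spatial "regions" as Gaussian components, learned without segmentation.
gmm = GaussianMixture(n_components=4, random_state=0).fit(train_points)

def video_descriptor(points):
    """Soft-assign one video's trajectory points to regions and average."""
    resp = gmm.predict_proba(points)   # (num_points, K) region memberships
    return resp.mean(axis=0)           # K-dim spatial-layout descriptor

print(video_descriptor(np.random.rand(50, 2)))
```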

    Detecting events and key actors in multi-person videos

    Multi-person event recognition is a challenging task, often with many people active in the scene but only a small subset contributing to an actual event. In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event. Our model does not use explicit annotations regarding who or where those people are during training and testing. In particular, we track people in videos and use a recurrent neural network (RNN) to represent the track features. We learn time-varying attention weights to combine these features at each time instant. The attended features are then processed by another RNN for event detection/classification. Since most video datasets with multiple people are restricted to a small number of videos, we also collected a new basketball dataset comprising 257 basketball games with 14K event annotations corresponding to 11 event classes. Our model outperforms state-of-the-art methods for both event classification and detection on this new dataset. Additionally, we show that the attention mechanism is able to consistently localize the relevant players. (Accepted for publication in CVPR'16.)
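    The attention mechanism can be sketched as softmax-weighted pooling over per-person track features at a single time step; in the actual model the attention weights are learned jointly with the RNNs, whereas the scoring vector below is a random placeholder.

```python
# Sketch of attention-weighted pooling over tracked people at one time step.
# Everything here (feature dims, scoring vector) is an illustrative placeholder.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

num_people, feat_dim = 8, 16
track_feats = np.random.rand(num_people, feat_dim)  # one feature per tracked person
w = np.random.rand(feat_dim)                        # scoring vector (learned in practice)

scores = track_feats @ w        # relevance of each person to the event
alpha = softmax(scores)         # time-varying attention weights
attended = alpha @ track_feats  # weighted combination fed to the event RNN
print(alpha.round(2), attended.shape)
```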

    Combining Appearance and Motion for Human Action Classification in Videos

    We study the question of activity classification in videos and present a novel approach for recognizing human action categories in videos by combining information from the appearance and motion of human body parts. Our approach uses a tracking step which involves particle filtering and a local non-parametric clustering step. The motion information is provided by the trajectory of the cluster modes of a local set of particles, while the statistical information about the particles of that cluster over a number of frames provides the appearance information. We then use a “Bag of Words” model to build one histogram per video sequence from the set of these robust appearance and motion descriptors. These histograms provide characteristic information that helps us discriminate among various human actions and thus classify them correctly. We tested our approach on the standard KTH and Weizmann human action datasets and the results were comparable to the state of the art. Additionally, our approach is able to distinguish between activities that involve motion of the complete body and those in which only certain body parts move. In other words, our method discriminates well between activities with “gross motion” like running and jogging and those with “local motion” like waving and boxing.
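    A minimal sketch of the Bag-of-Words step: quantize local appearance/motion descriptors against a k-means codebook and build one normalized histogram per video sequence. The codebook size and descriptor dimensionality are illustrative assumptions, not the paper's settings.

```python
# Sketch of a Bag-of-Words video representation over local descriptors.
# The descriptor pool here is random placeholder data.
import numpy as np
from sklearn.cluster import KMeans

train_desc = np.random.rand(2000, 64)   # hypothetical pool of local descriptors
codebook = KMeans(n_clusters=100, n_init=10, random_state=0).fit(train_desc)

def bow_histogram(descriptors):
    """One normalized histogram of visual words per video sequence."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=100).astype(float)
    return hist / hist.sum()

print(bow_histogram(np.random.rand(300, 64)).shape)  # (100,)
```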