
    Representation and recognition of action in interactive spaces

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 1999. Includes bibliographical references (p. 246-258). This thesis presents new theory and technology for the representation and recognition of complex, context-sensitive human actions in interactive spaces. To represent action and interaction, a symbolic framework has been developed based on Roger Schank's conceptualizations, augmented by a mechanism to represent the temporal structure of the sub-actions based on Allen's interval algebra networks. To overcome the exponential nature of temporal constraint propagation in such networks, we have developed the PNF propagation algorithm, based on the projection of IA-networks onto simplified, 3-valued (past, now, future) constraint networks called PNF-networks. The PNF propagation algorithm has been applied to an action recognition vision system that handles actions composed of multiple, parallel threads of sub-actions, in situations that cannot be efficiently dealt with by commonly used temporal representation schemes such as finite-state machines and HMMs. The PNF propagation algorithm is also the basis of interval scripts, a scripting paradigm for interactive systems that represents interaction as a set of temporal constraints between the individual components of the interaction. Unlike previously proposed non-procedural scripting methods, we use a strong temporal representation (allowing, for example, mutually exclusive actions) and perform control by propagating the temporal constraints in real time. These concepts have been tested in the context of four projects involving story-driven interactive spaces. The action representation framework has been used in the Intelligent Studio project to enhance the control of automatic cameras in a TV studio.
    Interval scripts have been extensively employed in the development of "SingSong", a short interactive performance that introduced the idea of live interaction with computer graphics characters; in "It/I", a full-length computer theater play; and in "It", an interactive art installation based on the play "It/I" that realizes our concept of immersive stages, that is, interactive spaces that can be used both by performers and the public. by Claudio Santos Pinhanez. Ph.D.
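    The core of PNF propagation as described above is constraint propagation over 3-valued (past/now/future) domains. A minimal sketch, assuming a simple arc-consistency formulation and an illustrative allowed-pair encoding of one projected Allen relation (not the thesis's exact projection tables):

```python
# Sketch of propagation over a 3-valued PNF-network: each action node holds a
# subset of {past, now, future}; constraints are sets of allowed state pairs.
# The "meets" table below is an illustrative projection, not the thesis's own.

P, N, F = "past", "now", "future"

def pnf_propagate(domains, constraints):
    """Arc-consistency over PNF domains.

    domains: dict node -> set of states (mutated in place)
    constraints: list of (a, b, allowed) where allowed is a set of (sa, sb)
    Iterates to a fixpoint, discarding unsupported states.
    """
    changed = True
    while changed:
        changed = False
        for a, b, allowed in constraints:
            for sa in set(domains[a]):
                if not any((sa, sb) in allowed for sb in domains[b]):
                    domains[a].discard(sa); changed = True
            for sb in set(domains[b]):
                if not any((sa, sb) in allowed for sa in domains[a]):
                    domains[b].discard(sb); changed = True
    return domains

# "A meets B": if A is still in the future, B must be too; if B has already
# happened, so has A; B is "now" only once A is past.
meets = {(P, P), (P, N), (P, F), (N, F), (F, F)}

# Observing that B is happening now collapses A into the past:
doms = {"A": {P, N, F}, "B": {N}}
pnf_propagate(doms, [("A", "B", meets)])
print(doms["A"])  # {'past'}
```

    Because each domain has at most three values, propagation stays polynomial, which is the practical point of projecting IA-networks onto PNF-networks.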

    Histogram of oriented rectangles: A new pose descriptor for human action recognition

    Most approaches to human action recognition tend to form complex models that require extensive parameter estimation and computation time. In this study, we show that human actions can be represented simply by pose, without dealing with a complex representation of dynamics. Based on this idea, we propose a novel pose descriptor, which we name Histogram-of-Oriented-Rectangles (HOR), for representing and recognizing human actions in videos. We represent each human pose in an action sequence by oriented rectangular patches extracted over the human silhouette. We then form spatial oriented histograms to represent the distribution of these rectangular patches. We make use of several matching strategies to carry the information from the spatial domain described by the HOR descriptor to the temporal domain. These are (i) nearest neighbor classification, which recognizes the actions by matching the descriptors of each frame, (ii) global histogramming, which extends the idea of the Motion Energy Image proposed by Bobick and Davis to rectangular patches, (iii) a classifier-based approach using Support Vector Machines, and (iv) an adaptation of Dynamic Time Warping to the temporal representation of the HOR descriptor. For cases in which the pose descriptor alone is not sufficiently discriminative, such as differentiating the actions "jogging" and "running", we also incorporate a simple velocity descriptor as a prior to the pose-based classification step. We test our system with different configurations and experiment on two commonly used action datasets: the Weizmann dataset and the KTH dataset. Results show that our method is superior to other methods on the Weizmann dataset, with a perfect accuracy rate of 100%, and is comparable to the other methods on the KTH dataset, with a very high success rate close to 90%.
    These results show that with a simple and compact representation we can achieve robust recognition of human actions, compared to complex representations. (C) 2009 Elsevier B.V. All rights reserved.
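    The descriptor itself is a spatial grid of orientation histograms over rectangle candidates. A minimal sketch, assuming rectangles have already been fitted to the silhouette and using an arbitrary 3x3 grid with 4 orientation bins (the paper's actual grid and bin counts may differ):

```python
import numpy as np

# Illustrative histogram-of-oriented-rectangles pose descriptor: bin each
# fitted rectangle by spatial cell and orientation, then L1-normalise.

def hor_descriptor(rects, img_w, img_h, grid=3, n_orient=4):
    """rects: list of (cx, cy, angle_deg) rectangles over the silhouette.
    Returns a flattened grid x grid x n_orient histogram."""
    hist = np.zeros((grid, grid, n_orient))
    for cx, cy, ang in rects:
        gx = min(int(cx / img_w * grid), grid - 1)
        gy = min(int(cy / img_h * grid), grid - 1)
        # rectangles are undirected, so fold angles into [0, 180)
        b = min(int((ang % 180.0) / 180.0 * n_orient), n_orient - 1)
        hist[gy, gx, b] += 1
    flat = hist.ravel()
    return flat / max(flat.sum(), 1)  # normalise so poses are comparable

rects = [(10, 20, 0.0), (50, 80, 95.0), (55, 82, 100.0)]
d = hor_descriptor(rects, img_w=90, img_h=120)
print(d.shape)  # (36,)
```

    Per-frame descriptors like this can then be compared directly (nearest neighbor), summed over a sequence (global histogramming), or fed to SVMs or DTW, matching the four strategies listed in the abstract.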

    Automatic human activity segmentation and labeling in RGBD videos

    Human activity recognition has become one of the most active research topics in image processing and pattern recognition. Manual analysis of video is labour intensive, fatiguing, and error prone. Solving the problem of recognizing human activities from video can lead to improvements in several application fields, such as surveillance systems, human-computer interfaces, sports video analysis, digital shopping assistants, video retrieval, gaming, and health care. This paper aims to recognize an action performed in a sequence of continuous actions recorded with a Kinect sensor, based on information about the positions of the main skeleton joints. The typical approach is to use manually labeled data to perform supervised training. In this paper we propose a method to perform automatic temporal segmentation in order to separate the sequence into a set of actions. By measuring the amount of movement that occurs in each joint of the skeleton, we are able to find temporal segments that represent the individual actions. We also propose an automatic labeling method for human actions, using a clustering algorithm on a subset of the available features.
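    The segmentation idea, cutting the stream where total joint movement drops, can be sketched in a few lines. The displacement measure and threshold below are illustrative assumptions, not the paper's values:

```python
import numpy as np

# Motion-based temporal segmentation sketch: per-frame total joint
# displacement, thresholded to find high-motion (action) segments.

def segment_by_motion(joints, thresh=0.05):
    """joints: (frames, n_joints, 3) array of skeleton joint positions.
    Returns [(start, end)] frame-index pairs for high-motion segments."""
    # per-frame displacement: norm of each joint's motion, summed over joints
    disp = np.linalg.norm(np.diff(joints, axis=0), axis=2).sum(axis=1)
    active = disp > thresh
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i
        elif not a and start is not None:
            segments.append((start, i)); start = None
    if start is not None:
        segments.append((start, len(active)))
    return segments

# Toy stream: still, move, still, move
j = np.zeros((12, 2, 3))
j[3:6, 0, 0] = [0.1, 0.2, 0.3]   # first burst of movement
j[8:11, 1, 1] = [0.5, 0.6, 0.7]  # second burst
print(segment_by_motion(j))  # [(2, 6), (7, 11)]
```

    Each returned segment would then be handed to the clustering step for automatic labeling.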

    Automatic Recognition of Concurrent and Coupled Human Motion Sequences

    We developed methods and algorithms for all parts of a motion recognition system, i.e., feature extraction, motion segmentation and labeling, motion primitive and context modeling, and decoding. We collected several datasets to compare our proposed methods with the state of the art in human motion recognition. The main contributions of this thesis are a structured functional motion decomposition and a flexible, scalable motion recognition system suitable for a humanoid robot.

    Fault Tolerance for Spacecraft Attitude Management

    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/83657/1/AIAA-2010-8301-426.pd

    Human action recognition using distribution of oriented rectangular patches

    We describe a "bag-of-rectangles" method for representing and recognizing human actions in videos. In this method, each human pose in an action sequence is represented by oriented rectangular patches extracted over the whole body. Then, spatial oriented histograms are formed to represent the distribution of these rectangular patches. In order to carry the information from the spatial domain described by the bag-of-rectangles descriptor to the temporal domain for recognition of the actions, four different methods are proposed. These are: (i) frame-by-frame voting, which recognizes the actions by matching the descriptors of each frame, (ii) global histogramming, which extends the idea of the Motion Energy Image proposed by Bobick and Davis to rectangular patches, (iii) a classifier-based approach using SVMs, and (iv) an adaptation of Dynamic Time Warping to the temporal representation of the descriptor. Detailed experiments are carried out on the action dataset of Blank et al. High success rates (100%) show that with a very simple and compact representation we can achieve robust recognition of human actions, compared to complex representations. © Springer-Verlag Berlin Heidelberg 2007
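    Strategy (iv), Dynamic Time Warping over per-frame descriptor sequences, is a standard dynamic program; a minimal sketch (plain O(nm) DTW with Euclidean frame cost, no banding or step weights, so only an approximation of any tuned implementation):

```python
import numpy as np

# Minimal DTW distance between two sequences of per-frame pose descriptors.
# Aligns sequences of different lengths, tolerating changes in action speed.

def dtw_distance(seq_a, seq_b):
    """seq_a: (n, d) and seq_b: (m, d) arrays of per-frame descriptors."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

a = np.array([[0.0], [1.0], [2.0]])
b = np.array([[0.0], [0.0], [1.0], [2.0]])  # same "action", slower start
print(dtw_distance(a, b))  # 0.0
```

    A test sequence would be assigned the label of the training sequence with the smallest DTW distance, exactly as in nearest-neighbor classification but with time-warped alignment.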

    Robust recognition and segmentation of human actions using HMMs with missing observations

    This paper describes the integration of missing observation data with hidden Markov models to create a framework that is able to segment and classify individual actions from a stream of human motion, using an incomplete 3D human pose estimate. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inference. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inference, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimator that delegates the burden of handling undetected limbs to the action recognition system. Findings show that the use of missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition can be accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time.
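    With a diagonal-Gaussian emission model, "missing observations" reduces to marginalising out the unobserved dimensions, i.e., simply dropping them from the per-state likelihood. A minimal sketch with made-up parameters (the paper's actual model and features are not reproduced here):

```python
import numpy as np

# Emission likelihood for an HMM state when some observation dimensions are
# missing (e.g., occluded limbs). With diagonal Gaussians, marginalising a
# missing dimension just drops it from the product of per-dimension densities.

def emission_likelihood(x, mask, means, variances):
    """x: (d,) observation; mask: (d,) bool, True = observed.
    means, variances: (n_states, d). Returns (n_states,) likelihoods."""
    mu, var = means[:, mask], variances[:, mask]
    xo = x[mask]
    log_p = -0.5 * (np.log(2 * np.pi * var) + (xo - mu) ** 2 / var).sum(axis=1)
    return np.exp(log_p)

# Two hypothetical states with different feature means:
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))

x = np.array([0.1, np.nan])  # second feature occluded
mask = ~np.isnan(x)
lik = emission_likelihood(x, mask, means, variances)
print(lik.argmax())  # 0: state 0 still wins with half the features missing
```

    The same mechanism supports the paper's label-inference trick: treat the action-label entries of the observation vector as missing at inference time, and the model's posterior over states carries the label probabilities.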