
    Locating Human Interactions With Discriminatively Trained Deformable Pose+Motion Parts

    We model dyadic (two-person) interactions by discriminatively training a spatio-temporal deformable part model of fine-grained human interactions. All interactions involve at most two persons. Our models are capable of localizing human interactions in unsegmented videos, marking the interactions of interest in both space and time. Our contributions are as follows. First, we create a model that localizes human interactions in space and time. Second, our models use multiple pose and motion features per part. Third, we experiment with different ways of training our models discriminatively. When testing on the target class, our models achieve a mean average precision score of 0.86. Cross-dataset tests show that our models generalize well to different environments.
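    The core scoring step in a deformable part model trades each part's appearance score against a deformation penalty relative to its anchor position. The following is a toy 2D sketch of that step (the function names, quadratic penalty, and parameters are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def best_part_placement(score_map, anchor, deform_w, radius=2):
    """Toy sketch of part scoring in a deformable part model: search
    placements near the part's anchor and return the best appearance
    score minus a quadratic deformation penalty (all names and the
    penalty form are illustrative assumptions)."""
    best = -np.inf
    height, width = score_map.shape
    ay, ax = anchor
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = ay + dy, ax + dx
            if 0 <= y < height and 0 <= x < width:
                penalty = deform_w * (dy * dy + dx * dx)
                best = max(best, score_map[y, x] - penalty)
    return best
```

    A full spatio-temporal model would add a temporal offset per part and, at every candidate video location, sum such best-part scores with a root score; in practice distance transforms make this search efficient.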

    Lend Me a Hand: Auxiliary Image Data Helps Interaction Detection

    In social settings, people interact in close proximity. When analyzing such encounters from video, we are typically interested in distinguishing between a large number of different interactions. Here, we address training deformable part models (DPMs) for the detection of such interactions from video, in both space and time. When we consider a large number of interaction classes, we face two challenges. First, we need to distinguish between interactions that are more visually similar. Second, it becomes more difficult to obtain sufficient specific training examples for each interaction class. In this paper, we address both challenges and focus on the latter. Specifically, we introduce a method to train body part detectors from nonspecific images with pose information. Such resources are widely available. We introduce a training scheme and an adapted DPM formulation to allow for the inclusion of this auxiliary data. We perform cross-dataset experiments to evaluate the generalization performance of our method. We demonstrate that our method can still achieve decent performance from as few as five training examples.

    Class Feature Pyramids for Video Explanation

    Deep convolutional networks are widely used in video action recognition. 3D convolutions are one prominent approach to deal with the additional time dimension. While 3D convolutions typically lead to higher accuracies, the inner workings of the trained models are more difficult to interpret. We focus on creating human-understandable visual explanations that represent the hierarchical parts of spatio-temporal networks. We introduce Class Feature Pyramids, a method that traverses the entire network structure and incrementally discovers kernels at different network depths that are informative for a specific class. Our method does not depend on the network's architecture or the type of 3D convolutions, supporting grouped and depth-wise convolutions, convolutions in fibers, and convolutions in branches. We demonstrate the method on six state-of-the-art 3D convolutional neural networks (CNNs) on three action recognition datasets (Kinetics-400, UCF-101, and HMDB-51) and two egocentric action recognition datasets (EPIC-Kitchens and EGTEA Gaze+).

    Saliency Tubes: Visual Explanations for Spatio-Temporal Convolutions

    Deep learning approaches have been established as the main methodology for video classification and recognition. Recently, 3-dimensional convolutions have been used to achieve state-of-the-art performance on many challenging video datasets. Because these methods extend the convolution operations to an additional dimension in order to extract features from it as well, they are highly complex, and providing a visualization of the signals that the network interprets as informative is a challenging task. An effective way to understand the network's inner workings is to isolate the spatio-temporal regions of the video that the network finds most informative. We propose a method called Saliency Tubes, which demonstrates the foremost points and regions, at both the frame level and over time, that are the main focus points of the network. We demonstrate our findings on widely used datasets for third-person and egocentric action classification, and we enhance the set of methods and visualizations that improve the intelligibility of 3D Convolutional Neural Networks (CNNs).
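    A common way to build such spatio-temporal saliency maps (a minimal sketch of the general idea, not necessarily the exact Saliency Tubes procedure) is to weight the final 3D convolutional layer's activation maps by the class-specific weights of the classification layer:

```python
import numpy as np

def saliency_map_3d(activations, class_weights):
    """Illustrative sketch: combine per-channel spatio-temporal
    activation maps using class-specific classifier weights, keep
    positive evidence, and normalise to [0, 1].
    activations: (C, T, H, W) feature maps; class_weights: (C,)."""
    # Weighted sum over channels -> one (T, H, W) evidence volume.
    weighted = np.tensordot(class_weights, activations, axes=([0], [0]))
    weighted = np.maximum(weighted, 0.0)  # retain positive contributions only
    span = weighted.max() - weighted.min()
    if span == 0:
        return np.zeros_like(weighted)
    return (weighted - weighted.min()) / span
```

    Upsampling the resulting (T, H, W) volume to the input clip's resolution yields a per-pixel, per-frame heat map that can be overlaid on the video as a "tube" of salient regions.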

    Drawing Outside the Lines: Tracking-based Gesture Interaction in Mobile Augmented Entertainment

    We present a proof-of-concept study for tracking-based gesture interaction in an augmented reality setting using tablets. By tracking a pen in front of a tablet using its integrated camera, we are able to map certain motions to gestures, which in turn are used to interact with the application. A comparative user study investigates the feasibility and usefulness of our approach with a simple augmented reality board game allowing translation and drawing gestures to move and create virtual board pieces, respectively. In particular, we demonstrate that users can handle it (and to what degree) and that they enjoy it (and what they potentially dislike). The results from the 25 participants of our experiment provide both subjective and objective evidence of the potential of tracking-based gesture interaction for augmented reality gaming.

    Coping with paediatric illness: Child’s play? Exploring the effectiveness of a play- and sports-based cognitive behavioural programme for children with chronic health conditions

    Little is known about how play affects the development of children with a chronic condition. Studying play poses major methodological challenges in measuring differences in play behaviour, which results in a relative scarcity of research on this subject. This pilot study seeks to provide novel directions for research in this area. The effectiveness of a play- and sports-based cognitive behavioural programme for children (8–12 years) with a chronic condition was studied. The children and parents completed a battery of measurement tools before and after the programme. Moreover, the application of automated computer analyses of behaviour was piloted. Behaviour (Child Behavior Checklist) seemed to be positively affected by the programme. An increase in psychological well-being was observed (KIDSCREEN). Perceived competence (Self-Perception Profile for Children) and actual motor competence (Canadian Agility and Movement Skill Assessment) did not show any positive trends. These results from 13 participants suggest that stimulating play behaviour might help children learn to better cope with their illness. For the analysis of the effectiveness of programmes like this, we therefore propose to focus on measuring behaviour and quality of life. In addition, pilot measurements showed that automated analysis of play can provide important insights into the participation of children.

    Object Detection-Based Location and Activity Classification from Egocentric Videos: A Systematic Analysis

    Egocentric vision has emerged in the daily practice of application domains such as lifelogging, activity monitoring, robot navigation and the analysis of social interactions. Plenty of research focuses on location detection and activity recognition, with applications in the area of Ambient Assisted Living. The basis of this work is the idea that indoor locations and daily activities can be characterized by the presence of specific objects. Objects can be obtained either from laborious human annotations or automatically, using vision-based detectors. We perform a study regarding the use of object detections as input for location and activity classification and analyze the influence of various detection parameters. We compare our detections against manually provided object labels and show that location classification is affected by detection quality and quantity. Utilization of the temporal structure in object detections mitigates the consequences of noisy detections. Moreover, we determine that the recognition of activities is related to the presence of specific objects, and that the lack of explicit associations between certain activities and objects hurts classification performance for these activities. Finally, we discuss the outcomes of each task and our method's potential for real-world applications.
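    The idea that locations are characterized by the objects present in them can be sketched as a simple scoring scheme (an illustrative toy example, not the paper's method; the location profiles and object labels are hypothetical):

```python
def classify_location(detections, location_profiles):
    """Toy sketch of location classification from object detections:
    each candidate location is scored by the summed confidence of
    detected objects associated with it, and the best-scoring location
    is returned. Profiles and labels are hypothetical examples."""
    scores = {loc: 0.0 for loc in location_profiles}
    for label, confidence in detections:
        for loc, objects in location_profiles.items():
            if label in objects:
                scores[loc] += confidence
    return max(scores, key=scores.get)
```

    The abstract's point about temporal structure could then be realised by, for instance, majority-voting these per-frame decisions over a sliding window so that a single spurious detection does not flip the predicted location.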

    Comparison of the physical and geotechnical properties of gas-hydrate-bearing sediments from offshore India and other gas-hydrate-reservoir systems

    This paper is not subject to U.S. copyright. The definitive version was published in Marine and Petroleum Geology 58A (2014): 139-167, doi:10.1016/j.marpetgeo.2014.07.024. The sediment characteristics of hydrate-bearing reservoirs profoundly affect the formation, distribution, and morphology of gas hydrate. The presence and type of gas, porewater chemistry, fluid migration, and subbottom temperature may govern the hydrate formation process, but it is the host sediment that commonly dictates final hydrate habit, and whether hydrate may be economically developed. In this paper, the physical properties of hydrate-bearing regions offshore eastern India (Krishna-Godavari and Mahanadi Basins) and the Andaman Islands, determined from Expedition NGHP-01 cores, are compared to each other, well logs, and published results of other hydrate reservoirs. Properties from the hydrate-free Kerala-Konkan basin off the west coast of India are also presented. Coarser-grained reservoirs (permafrost-related and marine) may contain high gas-hydrate-pore saturations, while finer-grained reservoirs may contain low-saturation disseminated or more complex gas-hydrates, including nodules, layers, and high-angle planar and rotational veins. However, even in these fine-grained sediments, gas hydrate preferentially forms in coarser sediment or fractures, when present. The presence of hydrate in conjunction with other geologic processes may be responsible for sediment porosity being nearly uniform for almost 500 m off the Andaman Islands. Properties of individual NGHP-01 wells and regional trends are discussed in detail. However, comparison of marine and permafrost-related Arctic reservoirs provides insight into the inter-relationships and common traits between physical properties and the morphology of gas-hydrate reservoirs regardless of location. Extrapolation of properties from one location to another also enhances our understanding of gas-hydrate reservoir systems. Grain size and porosity effects on permeability are critical, both locally to trap gas and regionally to provide fluid flow to hydrate reservoirs. Index properties corroborate more advanced consolidation and triaxial strength test results and can be used for predicting behavior in other NGHP-01 regions. Pseudo-overconsolidation is present near the seafloor and is underlain by underconsolidation at depth at some NGHP-01 locations. This work was supported by the Coastal and Marine Geology, and Energy Programs of the U.S. Geological Survey. Partial support for this research was provided by Interagency Agreement DE-FE0002911 between the USGS Gas Hydrates Project and the U.S. Department of Energy's Methane Hydrates R&D Program.

    Automatic Analysis of Bodily Social Signals

    The human body plays an important role in face-to-face interactions (Knapp & Hall, 2010; McNeill, 1992). We use our bodies to regulate turns, to display attitudes and to signal attention (Scheflen, 1964). Unconsciously, the body also reflects our affective and mental states (Ekman & Friesen, 1969). There is a long history of research into the bodily behaviors that correlate with the social and affective state of a person, in particular in interaction with others (Argyle, 2010; Dittmann, 1987; Mehrabian, 1968). We will refer to these behaviors as bodily social signals. These social and affective cues can be detected and interpreted by observing the human body's posture and movement (Harrigan, 2008; Kleinsmith & Bianchi-Berthouze, 2013). Automatic observation and analysis has applications such as the detection of driver fatigue and deception, the analysis of interest and mood in interactions with robot companions, and the interpretation of higher-level phenomena such as mimicry and turn-taking. In this chapter, we will discuss various bodily social signals, and how to analyze and recognize them automatically. Human motion can be studied on many levels, from the physical level involving muscles and joints, to the level of interpreting a person's full-body actions and intentions (Poppe, 2007, 2010; Jiang et al., 2013). We will focus on automatically analyzing movements with a relatively short time scale, such as a gesture or posture shift. In the first section, we will discuss the different ways of measurement and coding, both from motion capture data and from images and video. The recorded data can subsequently be interpreted in terms of social signals. In the second section, we address the automatic recognition of several bodily social signals. We will conclude the chapter with a discussion of challenges and directions of future work.