
    Egocentric Activity Recognition with Multimodal Fisher Vector

    With the increasing availability of wearable devices, research on egocentric activity recognition has received much attention recently. In this paper, we build a Multimodal Egocentric Activity dataset which includes egocentric videos and sensor data of 20 fine-grained and diverse activity categories. We present a novel strategy to extract temporal trajectory-like features from sensor data. We propose to apply the Fisher Kernel framework to fuse video and temporally enhanced sensor features. Experimental results show that, with careful design of the feature extraction and fusion algorithm, sensor data can enhance information-rich video data. We make the Multimodal Egocentric Activity dataset publicly available to facilitate future research.
    Comment: 5 pages, 4 figures; accepted at ICASSP 2016
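
    As a rough illustration of the Fisher Kernel idea above, the sketch below (Python/NumPy with scikit-learn; not the authors' code) encodes a set of fused descriptors against a diagonal-covariance GMM. The feature dimensions, the number of mixture components, and the simple concatenation-based fusion are all illustrative assumptions standing in for the paper's feature pipeline.

        # Hypothetical sketch of Fisher vector encoding for fused descriptors
        # (illustrative dimensions; not the paper's implementation).
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def fisher_vector(descriptors, gmm):
            # descriptors: (N, D) local features; gmm: fitted diagonal-covariance GMM.
            N = descriptors.shape[0]
            gamma = gmm.predict_proba(descriptors)                    # (N, K) soft assignments
            w, mu, var = gmm.weights_, gmm.means_, gmm.covariances_   # (K,), (K, D), (K, D)
            diff = (descriptors[:, None, :] - mu[None, :, :]) / np.sqrt(var)
            # Gradients of the log-likelihood w.r.t. the GMM means and variances.
            g_mu = (gamma[:, :, None] * diff).sum(0) / (N * np.sqrt(w)[:, None])
            g_var = (gamma[:, :, None] * (diff ** 2 - 1)).sum(0) / (N * np.sqrt(2 * w)[:, None])
            fv = np.concatenate([g_mu.ravel(), g_var.ravel()])
            fv = np.sign(fv) * np.sqrt(np.abs(fv))                    # power normalisation
            return fv / (np.linalg.norm(fv) + 1e-12)                  # L2 normalisation

        # Toy usage: fuse video and sensor features by per-clip concatenation (assumed).
        video_feats = np.random.randn(500, 64)    # e.g. trajectory-pooled video features
        sensor_feats = np.random.randn(500, 16)   # e.g. temporal sensor trajectory features
        fused = np.hstack([video_feats, sensor_feats])
        gmm = GaussianMixture(n_components=8, covariance_type='diag').fit(fused)
        print(fisher_vector(fused, gmm).shape)    # (2 * K * D,) = (1280,)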

    FROM TRADITIONAL TO INTERDISCIPLINARY APPROACHES FOR INERTIAL BODY MOTION CAPTURE

    Inertial motion capture (mocap) is a widespread technology for capturing human motion outside the lab, e.g. for applications in sports, ergonomics, rehabilitation and personal fitness. Even though mature systems are commercially available, inertial mocap remains a subject of research due to a number of limitations: besides measurement errors and sparsity, simplified body models and calibration routines, soft tissue artefacts and varying body shapes also lead to limited precision and robustness compared to optical gold-standard systems. The goal of the research group wearHEALTH at the TU Kaiserslautern is to tackle these challenges by bringing together ideas and approaches from different disciplines, including biomechanics, sensor fusion, computer vision and (optimal control) simulation. In this talk, we will present an overview of our approaches and applications, starting from the more traditional ones.

    Spheromak formation and sustainment studies at the sustained spheromak physics experiment using high-speed imaging and magnetic diagnostics

    A high-speed imaging system with shutter speeds as fast as 2 ns and double-frame capability has been used to directly image the formation and evolution of the sustained spheromak physics experiment (SSPX) [E. B. Hooper et al., Nucl. Fusion 39, 863 (1999)]. Reproducible plasma features have been identified with this diagnostic and divided into three groups, according to the stage of the discharge at which they occur: (i) breakdown and ejection, (ii) sustainment, and (iii) decay. During the first stage, plasma descends into the flux conserver shortly after breakdown and a transient plasma column is formed. The column then rapidly bends and simultaneously becomes too dim to photograph a few microseconds after formation. It is conjectured here that this rapid bending precedes the transfer of toroidal to poloidal flux. During sustainment, a stable plasma column distinct from the transient one is observed. It has been possible to measure the column diameter and compare it with CORSICA [A. Tarditi et al., Contrib. Plasma Phys. 36, 132 (1996)], a magnetohydrodynamic equilibrium reconstruction code, which showed good agreement with the measurements. Elongation and velocity measurements were also made of cathode patterns seen during this stage, possibly caused by pressure gradients or E×B drifts. The patterns elongate in a purely toroidal direction that depends on the magnetic-field polarity. During the decay stage the column diameter expands as the current ramps down, until the column eventually dissolves into filaments. With the use of magnetic probes inserted in the gun region, an X point whose axial position depended on the current level and toroidal mode number was observed in all stages of the SSPX plasma discharge.

    Kinematic State Estimation using Multiple DGPS/MEMS-IMU Sensors

    Animals have evolved over billions of years, and understanding these complex and intertwined systems has the potential to advance technology in fields such as sports science and robotics. As such, gait analysis using Motion Capture (MOCAP) technology is the subject of a number of research and development projects aimed at obtaining quantitative measurements. Existing MOCAP technology has limited the majority of studies to the analysis of steady-state locomotion in a controlled (indoor) laboratory environment. Optical, non-optical acoustic and non-optical magnetic MOCAP systems require a predefined capture volume and controlled environmental conditions, whilst non-optical mechanical MOCAP systems impede the motion of the subject. Although non-optical inertial MOCAP systems allow capture in an outdoor environment, they suffer from measurement noise and drift and lack global trajectory information. The accuracy of these MOCAP systems is known to decrease when tracking transient locomotion. Quantifying the manoeuvrability of animals in their natural habitat to answer the question “Why are animals so manoeuvrable?” remains a challenge. This research aims to develop an outdoor MOCAP system that allows tracking of steady-state as well as transient locomotion of an animal in its natural habitat, outside controlled laboratory conditions. A number of researchers have developed novel MOCAP systems with the same aim of tracking motion outside a controlled (indoor) laboratory environment with unlimited capture volume. These novel systems are either not validated against commercial MOCAP systems or do not match their sub-millimetre accuracy. The developed DGPS/MEMS-IMU multi-receiver fusion MOCAP system was assessed to have a global trajectory accuracy of ±0.0394 m and a relative limb position accuracy of ±0.006497 m. To conclude the research, several recommendations are made to improve the developed MOCAP system and to prepare for field testing with a wild terrestrial megafauna species.
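
    For readers unfamiliar with the underlying fusion, below is a minimal, hypothetical 1-D Kalman-filter sketch of the general DGPS/IMU idea: high-rate IMU acceleration drives the prediction step, and low-rate DGPS position fixes correct it. The thesis's multi-receiver estimator is considerably more involved; all rates and noise values here are assumptions, not its tuned parameters.

        # Minimal 1-D GPS/IMU Kalman fusion sketch (all parameters assumed).
        import numpy as np

        class GpsImuFilter:
            def __init__(self, dt=0.01):                       # 100 Hz IMU (assumed)
                self.F = np.array([[1.0, dt], [0.0, 1.0]])     # [position, velocity] model
                self.B = np.array([[0.5 * dt ** 2], [dt]])     # acceleration input
                self.H = np.array([[1.0, 0.0]])                # DGPS measures position only
                self.Q = np.eye(2) * 1e-4                      # process noise (assumed)
                self.R = np.array([[0.04 ** 2]])               # ~4 cm DGPS std (assumed)
                self.x = np.zeros((2, 1))                      # state estimate [p; v]
                self.P = np.eye(2)                             # state covariance

            def predict(self, accel):                          # IMU-driven propagation
                self.x = self.F @ self.x + self.B * accel
                self.P = self.F @ self.P @ self.F.T + self.Q

            def update(self, gps_pos):                         # DGPS correction
                y = np.array([[gps_pos]]) - self.H @ self.x    # innovation
                S = self.H @ self.P @ self.H.T + self.R
                K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
                self.x = self.x + K @ y
                self.P = (np.eye(2) - K @ self.H) @ self.P

        # Usage: predict at IMU rate, correct when a DGPS fix arrives (10 Hz assumed).
        kf = GpsImuFilter()
        for step in range(1000):
            kf.predict(accel=0.1)                              # constant test acceleration
            if step % 10 == 0:
                kf.update(gps_pos=0.5 * 0.1 * (step * 0.01) ** 2)
        print(kf.x.ravel())                                    # fused position and velocity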

    Second-order Temporal Pooling for Action Recognition

    Deep learning models for video-based action recognition usually generate features for short clips (consisting of a few frames); such clip-level features are aggregated into video-level representations by computing statistics on them. Typically, zeroth-order (max) or first-order (average) statistics are used. In this paper, we explore the benefits of using second-order statistics. Specifically, we propose a novel end-to-end learnable feature aggregation scheme, dubbed temporal correlation pooling, that generates an action descriptor for a video sequence by capturing the similarities between the temporal evolution of clip-level CNN features computed across the video. Such a descriptor, while being computationally cheap, also naturally encodes the co-activations of multiple CNN features, thereby providing a richer characterization of actions than first-order counterparts. We also propose higher-order extensions of this scheme by computing correlations after embedding the CNN features in a reproducing kernel Hilbert space. We provide experiments on benchmark datasets such as HMDB-51 and UCF-101, fine-grained datasets such as MPII Cooking Activities and JHMDB, as well as the recent Kinetics-600. Our results demonstrate the advantages of higher-order pooling schemes, which, when combined with hand-crafted features (as is standard practice), achieve state-of-the-art accuracy.
    Comment: Accepted to the International Journal of Computer Vision (IJCV)
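
    The core of such second-order pooling can be sketched in a few lines of NumPy: form a normalised covariance matrix over the temporal evolution of clip-level features and keep its upper triangle as the video descriptor. This is a simplified, non-learnable stand-in for the end-to-end scheme proposed in the paper, with toy sizes throughout.

        # Simplified second-order temporal pooling (toy sizes; not the paper's learnable scheme).
        import numpy as np

        def temporal_correlation_pool(clip_feats):
            # clip_feats: (T, D) array of T clip-level CNN features for one video.
            X = clip_feats - clip_feats.mean(axis=0, keepdims=True)   # centre over time
            C = (X.T @ X) / max(clip_feats.shape[0] - 1, 1)           # (D, D) co-activations
            C = np.sign(C) * np.sqrt(np.abs(C))                       # power normalisation
            desc = C[np.triu_indices(C.shape[0])]                     # symmetric: keep upper triangle
            return desc / (np.linalg.norm(desc) + 1e-12)              # L2 normalisation

        clips = np.random.randn(32, 256)                  # 32 clips, 256-D features (toy)
        print(temporal_correlation_pool(clips).shape)     # (256 * 257 / 2,) = (32896,)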