2,859 research outputs found

    Pyramidal Fisher Motion for Multiview Gait Recognition

    The goal of this paper is to identify individuals by analyzing their gait. Instead of using binary silhouettes as input data (as done in many previous works), we propose and evaluate the use of motion descriptors based on densely sampled short-term trajectories. We take advantage of state-of-the-art people detectors to define custom spatial configurations of the descriptors around the target person, thus obtaining a pyramidal representation of the gait motion. The local motion features (described by the Divergence-Curl-Shear descriptor) extracted from the different spatial areas of the person are combined into a single high-level gait descriptor using the Fisher Vector encoding. The proposed approach, coined Pyramidal Fisher Motion, is experimentally validated on the recent 'AVA Multiview Gait' dataset. The experiments show that this new approach achieves promising results on the gait recognition problem. Comment: Submitted to International Conference on Pattern Recognition, ICPR, 201

    Automatic learning of gait signatures for people identification

    This work targets people identification in video based on the way they walk (i.e. gait). While classical methods typically derive gait signatures from sequences of binary silhouettes, in this work we explore the use of convolutional neural networks (CNN) for learning high-level descriptors from low-level motion features (i.e. optical flow components). We carry out a thorough experimental evaluation of the proposed CNN architecture on the challenging TUM-GAID dataset. The experimental results indicate that using spatio-temporal cuboids of optical flow as input data for the CNN makes it possible to obtain state-of-the-art results on the gait task with an image resolution eight times lower than that of previously reported results (i.e. 80x60 pixels). Comment: Proof of concept paper. Technical report on the use of ConvNets (CNN) for gait recognition. Data and code: http://www.uco.es/~in1majim/research/cnngaitof.htm

    Progressive search space reduction for human pose estimation

    The objective of this paper is to estimate 2D human pose as a spatial configuration of body parts in TV and movie video shots. Such video material is uncontrolled and extremely challenging. We propose an approach that progressively reduces the search space for body parts, to greatly improve the chances that pose estimation will succeed. This involves two contributions: (i) a generic detector using a weak model of pose to substantially reduce the full pose search space; and (ii) employing 'grabcut' initialized on detected regions proposed by the weak model, to further prune the search space. Moreover, we also propose (iii) an integrated spatiotemporal model covering multiple frames to refine pose estimates from individual frames, with inference using belief propagation. The method is fully automatic and self-initializing, and explains the spatio-temporal volume covered by a person moving in a shot, by soft-labeling every pixel as belonging to a particular body part or to the background. We demonstrate upper-body pose estimation by an extensive evaluation over 70000 frames from four episodes of the TV series Buffy the Vampire Slayer, and present an application to full-body action recognition on the Weizmann dataset.

    2D Articulated Human Pose Estimation and Retrieval in (Almost) Unconstrained Still Images

    We present a technique for estimating the spatial layout of humans in still images—the position of the head, torso and arms. The theme we explore is that once a person is localized using an upper body detector, the search for their body parts can be considerably simplified using weak constraints on position and appearance arising from that detection. Our approach is capable of estimating upper body pose in highly challenging uncontrolled images, without prior knowledge of background, clothing, lighting, or the location and scale of the person in the image. People are only required to be upright and seen from the front or the back (not side). We evaluate the stages of our approach experimentally using ground truth layout annotation on a variety of challenging material, such as images from the PASCAL VOC 2008 challenge and video frames from TV shows and feature films. We also propose and evaluate techniques for searching a video dataset for people in a specific pose. To this end, we develop three new pose descriptors and compare their classification and retrieval performance to two baselines built on state-of-the-art object detection model

    An effective theory of accelerated expansion

    We work out an effective theory of accelerated expansion to describe general phenomena of inflation and acceleration (dark energy) in the Universe. Our aim is to determine on theoretical grounds, in a physically motivated and model-independent way, which and how many (free) parameters are needed to broadly capture the physics of a theory describing cosmic acceleration. Our goal is to make the physical interpretation of the parameters describing the expansion as transparent as possible. We show that, at leading order, there are five independent parameters, of which one can be constrained via general relativity tests. The other four parameters need to be determined by observing and measuring the cosmic expansion rate only, H(z). Therefore we suggest that future cosmology surveys focus on obtaining as accurate a measurement of H(z) as possible to constrain the nature of accelerated expansion (dark energy and/or inflation). Comment: In press; minor changes, results unchanged

    Streaming flow by oscillating bubbles: Quantitative diagnostics via particle tracking velocimetry

    Oscillating microbubbles can be used as microscopic agents. Driven by external acoustic fields, they are able to set the surrounding fluid into motion, erode surfaces and even carry particles attached to their interfaces. Although the acoustic streaming flow that the bubble generates in its vicinity has often been observed, it has never been measured and quantitatively compared with the available theoretical models. The scarcity of quantitative data is partially due to the strong three-dimensional character of bubble-induced streaming flows, which demands advanced velocimetry techniques. In this work, we present quantitative measurements of the flow generated by single and paired acoustically excited sessile microbubbles using a three-dimensional particle tracking technique. Using this novel experimental approach we are able to obtain the bubble's resonant oscillating frequency, study the boundaries of the linear oscillation regime, give predictions on the flow strength and the shear on the surrounding surface, and study the flow and the stability of a two-bubble system. Our results show that velocimetry techniques are a suitable tool for diagnosing the dynamics of acoustically excited microbubbles.

    "Here's looking at you, kid":Detecting people looking at each other in videos


    Pose search: Retrieving people using their pose

    We describe a method for retrieving shots containing a particular 2D human pose from unconstrained movie and TV videos. The method involves first localizing the spatial layout of the head, torso and limbs in individual frames using pictorial structures, and associating these through a shot by tracking. A feature vector describing the pose is then constructed from the pictorial structure. Shots can be retrieved either by querying on a single frame with the desired pose, or through a pose classifier trained from a set of pose examples. Our main contribution is an effective system for retrieving people based on their pose, and in particular we propose and investigate several pose descriptors which are person, clothing, background and lighting independent. As a second contribution, we improve the performance over existing methods for localizing upper body layout on unconstrained video. We compare the spatial layout pose retrieval to a baseline method where poses are retrieved using a HOG descriptor. Performance is assessed on five episodes of the TV series 'Buffy the Vampire Slayer', and pose retrieval is demonstrated also on three Hollywood movies.

    Detecting People Looking at Each Other in Videos

    The objective of this work is to determine if people are interacting in TV video by detecting whether they are looking at each other or not. We determine both the temporal period of the interaction and also spatially localize the relevant people. We make the following four contributions: (i) head detection with implicit coarse pose information (front, profile, back); (ii) continuous head pose estimation in unconstrained scenarios (TV video) using Gaussian process regression; (iii) several methods, proposed and evaluated, for assessing whether and when pairs of people are looking at each other in a video shot; and (iv) new ground truth annotation for this task, extending the TV human interactions dataset (Patron-Perez et al. 2010). The performance of the methods is evaluated on this dataset, which consists of 300 video clips extracted from TV shows. Despite the variety and difficulty of this video material, our best method obtains an average precision of 87.6% in a fully automatic manner.

    Metastable Resting State Brain Dynamics

    Metastability refers to the fact that the state of a dynamical system spends a large amount of time in a restricted region of its available phase space before a transition takes place, bringing the system into another state from where it might recur into the previous one. beim Graben and Hutt (2013) suggested using the recurrence plot (RP) technique introduced by Eckmann et al. (1987) for the segmentation of a system's trajectories into metastable states using recurrence grammars. Here, we apply this recurrence structure analysis (RSA) for the first time to resting-state brain dynamics obtained from functional magnetic resonance imaging (fMRI). Brain regions are defined according to the brain hierarchical atlas (BHA) developed by Diez et al. (2015), and as a consequence, regions present high connectivity in both structure (obtained from diffusion tensor imaging) and function (from the blood-oxygen-level-dependent, BOLD, signal). Remarkably, regions observed by Diez et al. were completely time-invariant. Here, in order to compare this static picture with the metastable systems dynamics obtained from the RSA segmentation, we determine the number of metastable states as a measure of complexity for all subjects and for region numbers varying from 3 to 100. We find RSA convergence toward an optimal segmentation of 40 metastable states for normalized BOLD signals, averaged over BHA modules. Next, we build bistable dynamics at the population level by pooling 30 subjects after Hausdorff clustering. In connection with this finding, we reflect on the different modeling frameworks that can allow for such scenarios: heteroclinic dynamics, dynamics with riddled basins of attraction, and multiple-timescale dynamics.
Finally, we characterize the metastable states both functionally and structurally, using templates for resting state networks (RSNs) and the automated anatomical labeling (AAL) atlas, respectively. SR would like to acknowledge Ikerbasque (The Basque Foundation for Science); moreover, this research is supported by the Basque Government through the BERC 2018-2021 program and by the Spanish State Research Agency through BCAM Severo Ochoa excellence accreditation SEV2017-0718 and through project RTI2018-093860-B-C21 funded by (AEI/FEDER, UE) and acronym MathNEURO. JC acknowledges financial support from Ikerbasque, Ministerio Economia, Industria y Competitividad (Spain) and FEDER (grant DPI2016-79874-R) and the Department of Economical Development and Infrastructure of the Basque Country (Elkartek Program, KK-2018/00032). Finally, PG acknowledges BCAM's hospitality during a visiting fellowship in fall 2017.