
    Action Recognition by Hierarchical Mid-level Action Elements

    Realistic videos of human actions exhibit rich spatiotemporal structures at multiple levels of granularity: an action can always be decomposed into multiple finer-grained elements in both space and time. To capture this intuition, we propose to represent videos by a hierarchy of mid-level action elements (MAEs), where each MAE corresponds to an action-related spatiotemporal segment in the video. We introduce an unsupervised method to generate this representation from videos. Our method is capable of distinguishing action-related segments from background segments and representing actions at multiple spatiotemporal resolutions. Given a set of spatiotemporal segments generated from the training data, we introduce a discriminative clustering algorithm that automatically discovers MAEs at multiple levels of granularity. We develop structured models that capture a rich set of spatial, temporal and hierarchical relations among the segments, where the action label and multiple levels of MAE labels are jointly inferred. The proposed model achieves state-of-the-art performance on multiple action recognition benchmarks. Moreover, we demonstrate the effectiveness of our model in real-world applications such as action recognition in large-scale untrimmed videos and action parsing.
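    As a loose illustration of the multi-granularity grouping step only (the paper's actual algorithm is a discriminative clustering with structured joint inference over action and MAE labels, which this does not reproduce), one can cut a single agglomerative tree over segment features at several depths; the feature dimensions and level counts below are arbitrary placeholders.

```python
# Schematic only: group candidate spatiotemporal segments at several
# granularities by cutting one agglomerative hierarchy at different depths.
# Not the paper's discriminative clustering; features are random stand-ins.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

segment_features = np.random.rand(50, 16)        # one feature vector per candidate segment
tree = linkage(segment_features, method='ward')  # agglomerative hierarchy over segments
levels = {k: fcluster(tree, t=k, criterion='maxclust') for k in (2, 4, 8)}
for k, labels in levels.items():
    print(f"{k:2d} mid-level elements, sizes:", np.bincount(labels)[1:])
```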

    Motion clouds: model-based stimulus synthesis of natural-like random textures for the study of motion perception

    Choosing an appropriate set of stimuli is essential to characterize the response of a sensory system to a particular functional dimension, such as the eye movement following the motion of a visual scene. Here, we describe a framework to generate random texture movies with controlled information content, i.e., Motion Clouds. These stimuli are defined using a generative model that is based on a controlled experimental parametrization. We show that Motion Clouds correspond to a dense mixing of localized moving gratings with random positions. Their global envelope is similar to natural-like stimulation with an approximate full-field translation corresponding to a retinal slip. We describe the construction of these stimuli mathematically and propose an open-source Python-based implementation. Examples of the use of this framework are shown. We also propose extensions to other modalities such as color vision, touch, and audition.
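    The construction lends itself to a compact numerical sketch: shape white noise in 3D Fourier space with an envelope concentrated on a spatial-frequency band and on the plane of a full-field translation, randomize the phases, and invert. The parameter names below (sf_0, B_sf, V_X, B_V) echo the spirit of the stimuli but are assumptions, not the exact API of the authors' open-source implementation.

```python
# Minimal Motion-Cloud-like stimulus: random-phase noise whose spectral energy
# sits near the plane f_t = -V_X * f_x (an approximate full-field translation)
# and within a Gaussian band around spatial frequency sf_0.
import numpy as np

def motion_cloud(N=128, T=64, sf_0=0.125, B_sf=0.05, V_X=1.0, B_V=0.2, seed=0):
    rng = np.random.default_rng(seed)
    fx, fy, ft = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N),
                             np.fft.fftfreq(T), indexing='ij')
    sf = np.sqrt(fx**2 + fy**2)                       # spatial frequency radius
    env_sf = np.exp(-0.5 * ((sf - sf_0) / B_sf)**2)   # spatial-frequency band
    env_v = np.exp(-0.5 * ((ft + V_X * fx) / (B_V * (sf + 1e-6)))**2)  # speed plane
    phase = np.exp(2j * np.pi * rng.random((N, N, T)))  # random phases
    movie = np.fft.ifftn(env_sf * env_v * phase).real
    return movie / np.abs(movie).max()                # normalized luminance movie

frames = motion_cloud()                               # frames.shape == (128, 128, 64)
```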

    Recycle-GAN: Unsupervised Video Retargeting

    We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if contents of John Oliver's speech were to be transferred to Stephen Colbert, then the generated content/speech should be in Stephen Colbert's style. Our approach combines both spatial and temporal information along with adversarial losses for content translation and style preservation. In this work, we first study the advantages of using spatiotemporal constraints over spatial constraints for effective retargeting. We then demonstrate the proposed approach for problems where information in both space and time matters, such as face-to-face translation, flower-to-flower, wind and cloud synthesis, and sunrise and sunset. Comment: ECCV 2018; Please refer to the project webpage for videos - http://www.cs.cmu.edu/~aayushb/Recycle-GA
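    The spatiotemporal constraint can be sketched as a "recycle" consistency: translate a frame to the target domain, advance one step there with a temporal predictor, translate back, and compare against the true next frame. The callables G_xy, P_y and G_yx below are stand-ins for the learned generators and predictor; this is an illustrative sketch, not the authors' training code.

```python
# Hedged sketch of a recycle-style consistency term:
# L_r = sum_t || x_{t+1} - G_yx( P_y( G_xy(x_{t-1}), G_xy(x_t) ) ) ||_1
import numpy as np

def recycle_loss(frames_x, G_xy, P_y, G_yx):
    loss = 0.0
    for t in range(1, len(frames_x) - 1):
        y_prev, y_curr = G_xy(frames_x[t - 1]), G_xy(frames_x[t])
        y_next = P_y(y_prev, y_curr)              # predict the next frame in domain Y
        x_next_rec = G_yx(y_next)                 # translate the prediction back to domain X
        loss += np.abs(frames_x[t + 1] - x_next_rec).mean()
    return loss

# Toy usage: identity mappings and a linear extrapolation predictor give zero loss
# on a linearly varying sequence.
frames = [np.full((4, 4), float(t)) for t in range(5)]
print(recycle_loss(frames, G_xy=lambda x: x, P_y=lambda a, b: 2 * b - a,
                   G_yx=lambda x: x))             # 0.0
```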

    Comparison of Five Spatio-Temporal Satellite Image Fusion Models over Landscapes with Various Spatial Heterogeneity and Temporal Variation

    In recent years, many spatial and temporal satellite image fusion (STIF) methods have been developed to solve the problem of the trade-off between the spatial and temporal resolution of satellite sensors. This study, for the first time, conducted both scene-level and local-level comparisons of five state-of-the-art STIF methods from four categories over landscapes with various spatial heterogeneity and temporal variation. The five STIF methods include the spatial and temporal adaptive reflectance fusion model (STARFM) and the Fit-FC model from the weight function-based category, an unmixing-based data fusion (UBDF) method from the unmixing-based category, the one-pair learning method from the learning-based category, and the Flexible Spatiotemporal DAta Fusion (FSDAF) method from the hybrid category. The relationship between the performance of the STIF methods and the scene-level and local-level landscape heterogeneity index (LHI) and temporal variation index (TVI) was analyzed. Our results showed that (1) the FSDAF model was the most robust regardless of variations in LHI and TVI at both the scene level and the local level, while it was less computationally efficient than the other models except for one-pair learning; (2) Fit-FC had the highest computing efficiency; it was accurate in predicting reflectance but less accurate than FSDAF and one-pair learning in capturing image structures; (3) one-pair learning had advantages in predicting large-area land cover change with the capability of preserving image structures, but it was the least computationally efficient model; (4) STARFM was good at predicting phenological change, while it was not suitable for applications involving land cover type change; (5) UBDF is not recommended for cases with strong temporal changes or abrupt changes. These findings could provide guidelines for users to select the appropriate STIF method for their own applications.
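    For orientation, the temporal-change idea shared by the weight-function-based models can be caricatured in a few lines: predict the fine-resolution image at the target date as the fine base image plus the upsampled coarse-resolution change between the two dates. This is a deliberately weight-free simplification and not any of the five compared models, which add moving-window spectral, spatial and temporal weights (or unmixing/learning steps).

```python
# Weight-free caricature of STARFM-style fusion: F_pred = F_base + (C_pred - C_base),
# with coarse images nearest-neighbour upsampled to the fine grid.
import numpy as np

def naive_fusion(fine_base, coarse_base, coarse_pred, scale):
    up = lambda c: np.kron(c, np.ones((scale, scale)))   # coarse -> fine grid
    return fine_base + (up(coarse_pred) - up(coarse_base))

fine_base   = np.random.rand(8, 8)                                # fine image, base date
coarse_base = fine_base.reshape(4, 2, 4, 2).mean(axis=(1, 3))     # 2x coarser aggregate
coarse_pred = coarse_base + 0.1                                   # uniform reflectance change
print(naive_fusion(fine_base, coarse_base, coarse_pred, scale=2).shape)  # (8, 8)
```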

    Dynamical potentials for non-equilibrium quantum many-body phases

    Out-of-equilibrium phases of matter exhibiting order in individual eigenstates, such as many-body localised spin glasses and discrete time crystals, can be characterised by inherently dynamical quantities such as spatiotemporal correlation functions. In this work, we introduce dynamical potentials which act as generating functions for such correlations and capture eigenstate phases and order. These potentials show formal similarities to their equilibrium counterparts, namely thermodynamic potentials. We provide three representative examples: a disordered XXZ chain exhibiting many-body localisation, a disordered Ising chain exhibiting spin-glass order, and its periodically-driven cousin exhibiting time-crystalline order. Comment: 4+epsilon pages, 4 figures + supplementary material
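    For readers outside the field, the two orders named above are conventionally diagnosed by spatiotemporal correlators of the following standard forms (quoted only for orientation; the paper's dynamical potentials, which generate such correlators, are not reproduced here):

```latex
% Conventional diagnostics, not the paper's potentials: a long-time
% Edwards-Anderson-type autocorrelator for spin-glass order, and the
% period-doubled response of a discrete time crystal driven with period T.
\begin{align}
  q_{\mathrm{EA}} &= \lim_{t\to\infty}\frac{1}{L}\sum_{i=1}^{L}
      \overline{\langle \sigma_i^{z}(t)\,\sigma_i^{z}(0)\rangle} \;>\; 0
      && \text{(spin-glass order)},\\
  \langle \sigma_i^{z}(nT)\rangle &\simeq (-1)^{n}\,\langle \sigma_i^{z}(0)\rangle
      && \text{(time-crystalline, period-}2T\text{ response)}.
\end{align}
```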

    Non-invertible transformations and spatiotemporal randomness

    We generalize the exact solution to the Bernoulli shift map. Under certain conditions, the generalized functions can produce unpredictable dynamics. We use the properties of the generalized functions to show that certain dynamical systems can generate random dynamics. For instance, the chaotic Chua's circuit coupled to a circuit with a non-invertible I-V characteristic can generate unpredictable dynamics. In general, a nonperiodic time series with truncated exponential behavior can be converted into unpredictable dynamics using non-invertible transformations. Using a new theoretical framework for chaos and randomness, we investigate some classes of coupled map lattices. We show that, in some cases, these systems can produce completely unpredictable dynamics. In a similar fashion, we explain why some well-known spatiotemporal systems have been found to produce very complex dynamics in numerical simulations. We discuss real physical systems that can generate random dynamics. Comment: Accepted in International Journal of Bifurcation and Chaos
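    A concrete reference point for the closed-form approach: the Bernoulli shift x_{n+1} = 2 x_n mod 1 has the textbook exact solution x_n = frac(2^n x_0), and the generalizations discussed in this line of work replace the integer base with a non-integer one. The sketch below checks only the textbook identity and evaluates one generalized sequence of the form frac(theta z^n); the unpredictability claim is the paper's and is not verified here.

```python
# Bernoulli shift vs. its textbook closed-form solution, plus a generalized
# closed-form sequence with a non-integer base (illustrative assumption).
import numpy as np

def bernoulli_orbit(x0, n):
    xs = [x0]
    for _ in range(n):
        xs.append((2.0 * xs[-1]) % 1.0)           # iterate the shift map
    return np.array(xs)

def closed_form(theta, n, z=2.0):
    return np.array([(theta * z**k) % 1.0 for k in range(n + 1)])

x0 = 0.347
print(np.allclose(bernoulli_orbit(x0, 20), closed_form(x0, 20)))  # True
print(closed_form(x0, 5, z=np.pi))                # non-integer base: generalized sequence
```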