    Efficient Action Detection in Untrimmed Videos via Multi-Task Learning

    This paper studies the joint learning of action recognition and temporal localization in long, untrimmed videos. We employ a multi-task learning framework that performs the three highly related steps of action proposal, action recognition, and action localization refinement in parallel instead of the standard sequential pipeline that performs the steps in order. We develop a novel temporal actionness regression module that estimates what proportion of a clip contains action. We use it for temporal localization but it could have other applications like video retrieval, surveillance, summarization, etc. We also introduce random shear augmentation during training to simulate viewpoint change. We evaluate our framework on three popular video benchmarks. Results demonstrate that our joint model is efficient in terms of storage and computation in that we do not need to compute and cache dense trajectory features, and that it is several times faster than its sequential ConvNets counterpart. Yet, despite being more efficient, it outperforms state-of-the-art methods with respect to accuracy. Comment: WACV 2017 camera ready, minor updates about test time efficiency

    Cracking the code of oscillatory activity

    Neural oscillations are ubiquitous measurements of cognitive processes and dynamic routing and gating of information. The fundamental and so far unresolved problem for neuroscience remains to understand how oscillatory activity in the brain codes information for human cognition. In a biologically relevant cognitive task, we instructed six human observers to categorize facial expressions of emotion while we measured the observers' EEG. We combined state-of-the-art stimulus control with statistical information theory analysis to quantify how the three parameters of oscillations (i.e., power, phase, and frequency) code the visual information relevant for behavior in a cognitive task. We make three points: First, we demonstrate that phase codes considerably more information (2.4 times) relating to the cognitive task than power. Second, we show that the conjunction of power and phase coding reflects detailed visual features relevant for behavioral response, that is, features of facial expressions predicted by behavior. Third, we demonstrate, in analogy to communication technology, that oscillatory frequencies in the brain multiplex the coding of visual features, increasing coding capacity. Together, our findings about the fundamental coding properties of neural oscillations will redirect the research agenda in neuroscience by establishing the differential role of frequency, phase, and amplitude in coding behaviorally relevant information in the brain.

    Tracking Information Flow through the Environment: Simple Cases of Stigmergy

    Recent work in sensor evolution aims at studying the perception-action loop in a formalized information-theoretic manner. By treating sensors as extracting information and actuators as having the capability to "imprint" information on the environment, we can view agents as creating, maintaining and making use of various information flows. In our paper we study the perception-action loop of agents using Shannon information flows. We use information theory to track and reveal the important relationships between agents and their environment. For example, we provide an information-theoretic characterization of stigmergy and evolve finite-state automata as agent controllers to engage in stigmergic communication. Our analysis of the evolved automata and the information flow provides insight into how evolution organizes sensoric information acquisition, implicit internal and external memory, processing and action selection.