36,806 research outputs found

    Online, Supervised and Unsupervised Action Localization in Videos

    Action recognition classifies a given video among a set of action labels, whereas action localization determines the location of an action in addition to its class. The overall aim of this dissertation is action localization. Many existing action localization approaches exhaustively search (spatially and temporally) for an action in a video. However, as the search space grows with higher-resolution and longer videos, such sliding-window techniques become impractical. The first part of this dissertation presents an efficient approach for localizing actions by learning contextual relations between different video regions in training. In testing, we use the context information to estimate the probability of each supervoxel belonging to the foreground action and use a Conditional Random Field (CRF) to localize actions. In the above method and in typical approaches to this problem, localization is performed offline, with all the video frames processed together. This prevents timely localization and prediction of actions/interactions, an important consideration for many tasks including surveillance and human-machine interaction. Therefore, in the second part of this dissertation we propose an online approach to the challenging problem of localization and prediction of actions/interactions in videos. In this approach, we use human poses and superpixels in each frame to train discriminative appearance models and perform online prediction of actions/interactions with a Structural SVM. The above two approaches rely on human supervision in the form of assigning action class labels to videos and annotating actor bounding boxes in each frame of the training videos. Therefore, in the third part of this dissertation we address the problem of unsupervised action localization. Given unlabeled videos without annotations, this approach aims at: 1) discovering action classes using a discriminative clustering approach, and 2) localizing actions using a variant of the Knapsack problem.

    Egocentric Vision-based Future Vehicle Localization for Intelligent Driving Assistance Systems

    Predicting the future location of vehicles is essential for safety-critical applications such as advanced driver assistance systems (ADAS) and autonomous driving. This paper introduces a novel approach to simultaneously predict both the location and scale of target vehicles in the first-person (egocentric) view of an ego-vehicle. We present a multi-stream recurrent neural network (RNN) encoder-decoder model that separately captures object location and scale on the one hand and pixel-level observations on the other for future vehicle localization. We show that incorporating dense optical flow improves prediction results significantly since it captures information about motion as well as appearance change. We also find that explicitly modeling future motion of the ego-vehicle improves the prediction accuracy, which could be especially beneficial in intelligent and automated vehicles that have motion planning capability. To evaluate the performance of our approach, we present a new dataset of first-person videos collected from a variety of scenarios at road intersections, which are particularly challenging moments for prediction because vehicle trajectories are diverse and dynamic. Comment: To appear on ICRA 201

    RED: Reinforced Encoder-Decoder Networks for Action Anticipation

    Action anticipation aims to detect an action before it happens. Many real-world applications in robotics and surveillance are related to this predictive capability. Current methods address this problem by first anticipating visual representations of future frames and then categorizing the anticipated representations into actions. However, anticipation is based on a single past frame's representation, which ignores the historical trend. Besides, these methods can only anticipate a fixed future time. We propose a Reinforced Encoder-Decoder (RED) network for action anticipation. RED takes multiple history representations as input and learns to anticipate a sequence of future representations. One salient aspect of RED is that a reinforcement module is adopted to provide sequence-level supervision; the reward function is designed to encourage the system to make correct predictions as early as possible. We test RED on the TVSeries, THUMOS-14 and TV-Human-Interaction datasets for action anticipation and achieve state-of-the-art performance on all datasets.

    EEG-Based Quantification of Cortical Current Density and Dynamic Causal Connectivity Generalized across Subjects Performing BCI-Monitored Cognitive Tasks.

    Quantification of dynamic causal interactions among brain regions constitutes an important component of conducting research and developing applications in experimental and translational neuroscience. Furthermore, cortical networks with dynamic causal connectivity in brain-computer interface (BCI) applications offer a more comprehensive view of brain states implicated in behavior than do individual brain regions. However, models of cortical network dynamics are difficult to generalize across subjects because current electroencephalography (EEG) signal analysis techniques are limited in their ability to reliably localize sources across subjects. We propose an algorithmic and computational framework for identifying cortical networks across subjects in which dynamic causal connectivity is modeled among user-selected cortical regions of interest (ROIs). We demonstrate the strength of the proposed framework using a "reach/saccade to spatial target" cognitive task performed by 10 right-handed individuals. Modeling of causal cortical interactions was accomplished through measurement of cortical activity using EEG, application of independent component clustering to identify cortical ROIs as network nodes, estimation of cortical current density using cortically constrained low resolution electromagnetic brain tomography (cLORETA), multivariate autoregressive (MVAR) modeling of representative cortical activity signals from each ROI, and quantification of the dynamic causal interaction among the identified ROIs using the Short-time direct Directed Transfer function (SdDTF). The resulting cortical network and the computed causal dynamics among its nodes exhibited physiologically plausible behavior, consistent with past results reported in the literature. This physiological plausibility of the results strengthens the framework's applicability in reliably capturing complex brain functionality, which is required by applications such as diagnostics and BCI.

    Anderson localization of a weakly interacting one dimensional Bose gas

    We consider the phase-coherent transport of a quasi-one-dimensional beam of Bose-Einstein condensed particles through a disordered potential of length L. Among the possible different types of flow identified in [T. Paul et al., Phys. Rev. Lett. 98, 210602 (2007)], we focus here on the supersonic stationary regime where Anderson localization exists. We generalize the diffusion formalism of Dorokhov-Mello-Pereyra-Kumar to include interaction effects. It is shown that interactions modify the localization length and also introduce a length scale L* for the disordered region, above which most of the realizations of the random potential lead to time-dependent flows. A Fokker-Planck equation for the probability density of the transmission coefficient that takes this new effect into account is introduced and solved. The theoretical predictions are verified numerically for different types of disordered potentials. Experimental scenarios for observing our predictions are discussed. Comment: 20 pages, 13 figures