
    Rejection based multipath reconstruction for background estimation in video sequences with stationary objects

    This is the author's version of a work that was accepted for publication in Computer Vision and Image Understanding. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Vision and Image Understanding, vol. 147 (2016), DOI 10.1016/j.cviu.2016.03.012.
    Background estimation in video consists of extracting a foreground-free image from a set of training frames. Moving and stationary objects may affect the visibility of the background, which invalidates the assumption, made in much of the related literature, that the background is the temporally dominant data. In this paper, we present a temporal-spatial block-level approach for background estimation in video that copes with both moving and stationary objects. First, a Temporal Analysis module obtains a compact representation of the training data by motion filtering and dimensionality reduction; a threshold-free hierarchical clustering then determines a set of candidates to represent the background at each spatial location (block). Second, a Spatial Analysis module iteratively reconstructs the background using these candidates. For each spatial location, multiple reconstruction hypotheses (paths) are explored to obtain its neighboring locations by enforcing inter-block similarity and intra-block homogeneity constraints in terms of color discontinuity, color dissimilarity and variability. The experimental results show that the proposed approach outperforms the related state of the art on challenging video sequences containing moving and stationary objects.
    This work was partially supported by the Spanish Government (HAVideo, TEC2014-53176-R) and by the TEC department (Universidad Autónoma de Madrid).
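To make the "temporally dominant" assumption in the abstract above concrete, here is a minimal sketch of a per-pixel temporal median, a common baseline that relies on that assumption. It is not the block-level method the paper proposes; the function names and the toy one-pixel video are illustrative assumptions.

```python
import numpy as np

def temporal_median_background(frames):
    """Per-pixel temporal median over a stack of frames shaped (T, H, W)."""
    return np.median(np.asarray(frames, dtype=float), axis=0)

def make_frames(n_frames, n_occluded, background=10.0, obj=200.0):
    """Toy 1x1 'video' (hypothetical helper): an object covers the pixel
    during the first n_occluded frames, the background shows afterwards."""
    frames = np.full((n_frames, 1, 1), background)
    frames[:n_occluded] = obj
    return frames

# Object visible in 2 of 5 frames: the background is still the dominant value.
short_stay = temporal_median_background(make_frames(5, 2))

# Object stationary for 3 of 5 frames: the object becomes the dominant value.
long_stay = temporal_median_background(make_frames(5, 3))
```

In this toy case the median recovers the background value when the object covers the pixel for fewer than half of the frames, and returns the object value once it stays for more than half of them, which is the failure case that stationary objects create.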

    Cascaded Scene Flow Prediction using Semantic Segmentation

    Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static background, and use semantic cues to produce pixel-accurate scene flow estimates. Our cascaded classification framework accurately models 3D scenes by iteratively refining semantic segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields. We evaluate our method on the challenging KITTI autonomous driving benchmark, and show that accounting for the motion of segmented vehicles leads to state-of-the-art performance.
    Comment: International Conference on 3D Vision (3DV), 2017 (oral presentation)

    RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments

    Visual and visual-inertial odometry systems typically struggle with dynamic scenes and pure rotation. In this work, we design a novel visual-inertial odometry (VIO) system called RD-VIO to handle both problems. First, we propose an IMU-PARSAC algorithm that robustly detects and matches keypoints in a two-stage process. In the first stage, landmarks are matched with new keypoints using visual and IMU measurements. We collect statistical information from this matching and use it to guide the intra-keypoint matching in the second stage. Second, to handle pure rotation, we detect the motion type and adapt the deferred-triangulation technique during the data-association process. We treat pure-rotational frames as special subframes; when solving the visual-inertial bundle adjustment, they provide additional constraints on the pure-rotational motion. We evaluate the proposed VIO system on public datasets. Experiments show that the proposed RD-VIO has clear advantages over other methods in dynamic environments.

    Robust Bayesian target detection algorithm for depth imaging from sparse single-photon data

    This paper presents a new Bayesian model and associated algorithm for depth and intensity profiling using full waveforms from time-correlated single-photon counting (TCSPC) measurements in the limit of very low photon counts (i.e., typically fewer than 20 photons per pixel). The model represents each Lidar waveform as an unknown constant background level which, in the presence of a target, is combined with a known impulse response weighted by the target intensity, and which is finally corrupted by Poisson noise. The joint target detection and depth imaging problem is expressed as a pixel-wise model selection and estimation problem, which is solved using Bayesian inference. Prior knowledge about the problem is embedded in a hierarchical model that describes the dependence structure between the model parameters while accounting for their constraints. In particular, Markov random fields (MRFs) are used to model the joint distribution of the background levels and of the target presence labels, both of which are expected to exhibit significant spatial correlations. An adaptive Markov chain Monte Carlo algorithm including reversible-jump updates is then proposed to compute the Bayesian estimates of interest. This algorithm is equipped with a stochastic optimization adaptation mechanism that automatically adjusts the parameters of the MRFs by maximum marginal likelihood estimation. Finally, the benefits of the proposed methodology are demonstrated through a series of experiments using real data.
    Comment: arXiv admin note: text overlap with arXiv:1507.0251
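The per-pixel observation model described in the abstract above (constant background, plus an intensity-weighted impulse response when a target is present, observed through Poisson noise) can be sketched as follows. This is only an illustration of the forward model, not the paper's inference algorithm; the Gaussian impulse response, its width, and all function and parameter names are assumptions made for the example.

```python
import numpy as np

def expected_counts(n_bins, background, intensity=0.0, depth_bin=0, irf_width=2.0):
    """Expected photon count per time bin for one pixel.

    background : constant background level per bin
    intensity  : target intensity (0 means no target in this pixel)
    depth_bin  : time bin of the target return (assumed known here)
    irf_width  : width of an ASSUMED Gaussian impulse response, in bins
    """
    t = np.arange(n_bins)
    rate = np.full(n_bins, float(background))
    if intensity > 0:
        # Assumed Gaussian-shaped impulse response centred on the target depth.
        irf = np.exp(-0.5 * ((t - depth_bin) / irf_width) ** 2)
        rate += intensity * irf
    return rate

def simulate_waveform(rate, rng):
    """Draw one photon-count waveform with independent Poisson noise per bin."""
    return rng.poisson(rate)

rng = np.random.default_rng(0)
with_target = expected_counts(100, background=0.05, intensity=3.0, depth_bin=40)
without_target = expected_counts(100, background=0.05)
waveform = simulate_waveform(with_target, rng)
```

Deciding per pixel between the `without_target` and `with_target` rate models is the model selection problem the abstract refers to; the sketch only generates data and leaves that decision, and the spatial MRF priors, aside.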