Rejection based multipath reconstruction for background estimation in video sequences with stationary objects
This is the author’s version of a work that was accepted for publication in Computer Vision and Image Understanding. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Vision and Image Understanding, Vol. 147 (2016), DOI 10.1016/j.cviu.2016.03.012.

Background estimation in video consists of extracting a foreground-free image from a set of training frames. Moving and stationary objects may affect background visibility, invalidating the assumption, common in the related literature, that the background is the temporally dominant data. In this paper, we present a temporal-spatial block-level approach to background estimation in video that copes with both moving and stationary objects. First, a Temporal Analysis module obtains a compact representation of the training data by motion filtering and dimensionality reduction; a threshold-free hierarchical clustering then determines a set of candidates to represent the background at each spatial location (block). Second, a Spatial Analysis module iteratively reconstructs the background from these candidates. For each spatial location, multiple reconstruction hypotheses (paths) over its neighboring locations are explored, enforcing inter-block similarity and intra-block homogeneity constraints in terms of color discontinuity, color dissimilarity, and variability. Experimental results show that the proposed approach outperforms the related state of the art on challenging video sequences in the presence of moving and stationary objects.

This work was partially supported by the Spanish Government (HAVideo, TEC2014-53176-R) and by the TEC department (Universidad Autónoma de Madrid).
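The per-block candidate selection described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function below is a toy agglomerative clustering of per-frame block intensities (scalars here, for brevity) that merges the closest clusters and stops when the next merge distance jumps sharply relative to the previous one, a hypothetical stand-in for the paper's threshold-free stopping rule; the name `cluster_block_samples` and the `jump_factor` parameter are assumptions.

```python
def cluster_block_samples(values, jump_factor=2.0):
    """Toy agglomerative clustering of per-frame block intensities.

    Merges the closest adjacent clusters (by mean) until the next merge
    distance exceeds `jump_factor` times the previous one -- a hypothetical
    stand-in for the paper's threshold-free criterion. Returns the cluster
    means as candidate background representations for the block.
    """
    clusters = [[v] for v in sorted(values)]  # sorted: neighbors are closest
    prev_dist = None
    while len(clusters) > 1:
        means = [sum(c) / len(c) for c in clusters]
        gaps = [means[i + 1] - means[i] for i in range(len(means) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)  # closest pair
        d = gaps[i]
        if prev_dist is not None and d > jump_factor * prev_dist and d > 0:
            break  # distance jump: remaining clusters are distinct modes
        clusters[i] = clusters[i] + clusters.pop(i + 1)
        prev_dist = d if d > 0 else prev_dist
    return [sum(c) / len(c) for c in clusters]

# A block that mostly shows background (~10) but is occluded by a
# stationary object (~200) in some training frames yields two candidates:
cands = cluster_block_samples([10, 11, 9, 10, 200, 201, 199, 10, 12])
```

On this toy input the clustering keeps both the background mode and the stationary-object mode as candidates, which the spatial reconstruction stage would then disambiguate using the neighborhood constraints.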
Cascaded Scene Flow Prediction using Semantic Segmentation
Given two consecutive frames from a pair of stereo cameras, 3D scene flow
methods simultaneously estimate the 3D geometry and motion of the observed
scene. Many existing approaches use superpixels for regularization, but may
predict inconsistent shapes and motions inside rigidly moving objects. We
instead assume that scenes consist of foreground objects rigidly moving in
front of a static background, and use semantic cues to produce pixel-accurate
scene flow estimates. Our cascaded classification framework accurately models
3D scenes by iteratively refining semantic segmentation masks, stereo
correspondences, 3D rigid motion estimates, and optical flow fields. We
evaluate our method on the challenging KITTI autonomous driving benchmark, and
show that accounting for the motion of segmented vehicles leads to
state-of-the-art performance.

Comment: International Conference on 3D Vision (3DV), 2017 (oral presentation).
RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments
It is typically challenging for visual or visual-inertial odometry systems to
handle the problems of dynamic scenes and pure rotation. In this work, we
design a novel visual-inertial odometry (VIO) system called RD-VIO to handle
both of these two problems. Firstly, we propose an IMU-PARSAC algorithm which
can robustly detect and match keypoints in a two-stage process. In the first
stage, landmarks are matched with new keypoints using visual and IMU
measurements. We collect statistical information from the matching and then
guide the intra-keypoint matching in the second stage. Secondly, to handle the
problem of pure rotation, we detect the motion type and adapt the
deferred-triangulation technique during the data-association process.
Pure-rotational frames are turned into special subframes. When solving the
visual-inertial bundle adjustment, they provide additional constraints to the
pure-rotational motion. We evaluate the proposed VIO system on public datasets.
Experiments show that the proposed RD-VIO has clear advantages over other
methods in dynamic environments.
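The two-stage idea behind IMU-PARSAC can be sketched as follows. This is a toy illustration, not the authors' code: stage one matches landmarks to keypoints near their IMU-predicted image positions under a coarse gate and collects the residual statistics; stage two reuses those statistics as a tighter, data-driven gate. The function name, the `coarse_gate` parameter, and the mean-plus-two-sigma rule are all assumptions made for the sketch.

```python
import math
import statistics

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def two_stage_matching(predicted, keypoints, coarse_gate=10.0):
    """Toy two-stage matching in the spirit of IMU-PARSAC (a sketch, not
    the paper's algorithm). `predicted` maps landmark id -> IMU-predicted
    2D position; `keypoints` is a list of detected 2D keypoints."""
    # Stage 1: greedily match each predicted landmark to the nearest free
    # keypoint within a coarse gate, recording the matching residuals.
    matches, residuals = {}, []
    free = list(keypoints)
    for lid, p in predicted.items():
        cand = min(free, key=lambda k: dist(p, k), default=None)
        if cand is not None and dist(p, cand) < coarse_gate:
            matches[lid] = cand
            residuals.append(dist(p, cand))
            free.remove(cand)
    # Stage 2: the collected statistics define a tighter adaptive gate that
    # would guide matching of the remaining keypoints (stand-in for the
    # paper's statistics-guided second stage).
    adaptive_gate = statistics.mean(residuals) + 2 * statistics.pstdev(residuals)
    return matches, adaptive_gate

# Two landmarks reproject near detections; a third has no keypoint nearby:
matches, gate = two_stage_matching(
    {1: (0, 0), 2: (10, 10), 3: (50, 50)},
    [(0.5, 0), (10, 9), (80, 80)])
```

In this toy run the far-off landmark is left unmatched, and the adaptive gate derived from the two good residuals is much tighter than the coarse one.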
Robust Bayesian target detection algorithm for depth imaging from sparse single-photon data
This paper presents a new Bayesian model and associated algorithm for depth
and intensity profiling using full waveforms from time-correlated single-photon
counting (TCSPC) measurements in the limit of very low photon counts (i.e.,
typically less than 20 photons per pixel). The model represents each Lidar
waveform as an unknown constant background level which, in the presence of a
target, is combined with a known impulse response weighted by the target
intensity and finally corrupted by Poisson noise. The joint target detection
and depth imaging problem is expressed as a pixel-wise model selection and
estimation problem which is solved using Bayesian inference. Prior knowledge
about the problem is embedded in a hierarchical model that describes the
dependence structure between the model parameters while accounting for their
constraints. In particular, Markov random fields (MRFs) are used to model the
joint distribution of the background levels and of the target presence labels,
which are both expected to exhibit significant spatial correlations. An
adaptive Markov chain Monte Carlo algorithm including reversible-jump updates
is then proposed to compute the Bayesian estimates of interest. This algorithm
is equipped with a stochastic optimization adaptation mechanism that
automatically adjusts the parameters of the MRFs by maximum marginal likelihood
estimation. Finally, the benefits of the proposed methodology are demonstrated
through a series of experiments using real data.

Comment: arXiv admin note: text overlap with arXiv:1507.0251
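The observation model and the pixel-wise model selection can be made concrete with a small numerical sketch. This is not the paper's RJ-MCMC sampler: for a single pixel it compares, by Poisson log-likelihood, a background-only model against background plus a known impulse response shifted to each candidate depth bin; the function names and the fixed `background` and `intensity` values are assumptions made for illustration.

```python
import math

def log_poisson(counts, rates):
    # Poisson log-likelihood up to the count-dependent constant log(y!)
    return sum(y * math.log(r) - r for y, r in zip(counts, rates))

def detect_target(counts, impulse, background, intensity):
    """Toy pixel-wise model selection for the TCSPC model above (a sketch,
    not the paper's Bayesian algorithm). Model 0: constant background only.
    Model 1: background plus the impulse response shifted to depth bin t0
    and weighted by the target intensity. Returns (target_present, t0)."""
    n = len(counts)
    ll0 = log_poisson(counts, [background] * n)
    best_ll, best_t0 = -math.inf, None
    for t0 in range(n - len(impulse) + 1):  # scan candidate depth bins
        rates = [background] * n
        for k, h in enumerate(impulse):
            rates[t0 + k] += intensity * h
        ll = log_poisson(counts, rates)
        if ll > best_ll:
            best_ll, best_t0 = ll, t0
    return (best_ll > ll0), best_t0

# A sparse histogram (19 photons total) with a small peak over background:
present, t0 = detect_target(
    counts=[0, 0, 1, 5, 8, 4, 1, 0, 0, 0],
    impulse=[0.2, 0.5, 0.3], background=0.5, intensity=15.0)
```

The full method replaces this per-pixel likelihood comparison with hierarchical priors and MRF coupling across pixels, so that the background levels and presence labels borrow strength from their spatial neighbors.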