Optical Flow in Mostly Rigid Scenes
The optical flow of natural scenes is a combination of the motion of the
observer and the independent motion of objects. Existing algorithms typically
focus on either recovering motion and structure under the assumption of a
purely static world or optical flow for general unconstrained scenes. We
combine these approaches in an optical flow algorithm that estimates an
explicit segmentation of moving objects from appearance and physical
constraints. In static regions we take advantage of strong constraints to
jointly estimate the camera motion and the 3D structure of the scene over
multiple frames. This allows us to also regularize the structure instead of the
motion. Our formulation uses a Plane+Parallax framework, which works even under
small baselines, and reduces the motion estimation to a one-dimensional search
problem, resulting in more accurate estimation. In moving regions the flow is
treated as unconstrained, and computed with an existing optical flow method.
The resulting Mostly-Rigid Flow (MR-Flow) method achieves state-of-the-art
results on both the MPI-Sintel and KITTI-2015 benchmarks.
Comment: 15 pages, 10 figures; accepted for publication at CVPR 201
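The one-dimensional search mentioned in the abstract can be illustrated with a toy sketch. In a Plane+Parallax setting, once two frames are registered by the homography of a reference plane, the residual motion of a static point lies on the line through that point and the epipole, so only a scalar magnitude remains to be found. The function names, the sign convention, and the caller-supplied `cost` function below are assumptions made for illustration; they are not taken from the MR-Flow implementation.

```python
import math


def parallax_candidates(point, epipole, magnitudes):
    """Return candidate matches on the line joining `point` and `epipole`.

    Positive magnitudes move away from the epipole (assumed convention).
    """
    px, py = point
    ex, ey = epipole
    dx, dy = px - ex, py - ey
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("point coincides with the epipole")
    ux, uy = dx / norm, dy / norm  # unit direction of the residual parallax
    return [(px + m * ux, py + m * uy) for m in magnitudes]


def search_1d(point, epipole, cost, magnitudes):
    """Pick the magnitude whose candidate position has the lowest cost."""
    candidates = parallax_candidates(point, epipole, magnitudes)
    best = min(range(len(magnitudes)), key=lambda i: cost(candidates[i]))
    return magnitudes[best], candidates[best]


# Toy usage: the "true" match is known, so the cost is a squared distance.
# A real system would compare image patches instead (not shown here).
true_match = (13.0, 14.0)
toy_cost = lambda q: (q[0] - true_match[0]) ** 2 + (q[1] - true_match[1]) ** 2
steps = [i * 0.5 for i in range(-20, 21)]
best_magnitude, best_position = search_1d((10.0, 10.0), (4.0, 2.0), toy_cost, steps)
```

Searching over a single scalar per pixel instead of a two-dimensional displacement is what makes the constrained estimate cheaper and less ambiguous in static regions.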
Variational Disparity Estimation Framework for Plenoptic Image
This paper presents a computational framework for accurately estimating the
disparity map of plenoptic images. The proposed framework is based on the
variational principle and provides intrinsic sub-pixel precision. The
light-field motion tensor introduced in the framework allows us to combine
advanced robust data terms and provides explicit treatment of the
different color channels. A warping strategy is embedded in our framework to
tackle the large-displacement problem. We also show that applying a simple
regularization term and guided median filtering greatly improves the accuracy
of the displacement field in occluded areas. We demonstrate the excellent
performance of the proposed framework through extensive comparisons with the Lytro
software and contemporary approaches on both synthetic and real-world datasets.
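As a rough illustration of why median filtering helps in occluded areas, the sketch below replaces each occluded disparity value with the median of its non-occluded neighbours. This is a plain masked median, not the guided median filter of the paper, whose details the abstract does not give; the function name `fill_occluded`, the square window, and the boolean occlusion mask are all assumptions.

```python
from statistics import median


def fill_occluded(disparity, occluded, radius=1):
    """Replace occluded disparities with the median of valid neighbours.

    `disparity` and `occluded` are equally sized lists of rows; `occluded`
    holds booleans. Pixels with no valid neighbour are left unchanged.
    """
    height, width = len(disparity), len(disparity[0])
    result = [row[:] for row in disparity]  # copy so the input is not modified
    for y in range(height):
        for x in range(width):
            if not occluded[y][x]:
                continue
            neighbours = [
                disparity[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
                if not occluded[j][i]
            ]
            if neighbours:
                result[y][x] = median(neighbours)
    return result


# Toy usage: a flat disparity map with one unreliable, occluded centre pixel.
disparity_map = [[2.0, 2.0, 2.0], [2.0, 9.0, 2.0], [2.0, 2.0, 2.0]]
occlusion_mask = [[False, False, False], [False, True, False], [False, False, False]]
filled = fill_occluded(disparity_map, occlusion_mask)
```

The median is used rather than the mean because it ignores isolated outliers, which are common where one view cannot see the surface.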
Realtime State Estimation with Tactile and Visual sensing. Application to Planar Manipulation
Accurate and robust object state estimation enables successful object
manipulation. Visual sensing is widely used to estimate object poses. However,
in a cluttered scene or in a tight workspace, the robot's end-effector often
occludes the object from the visual sensor. The robot then loses visual
feedback and must fall back on open-loop execution.
In this paper, we integrate both tactile and visual input using a framework
for solving the SLAM problem, incremental smoothing and mapping (iSAM), to
provide a fast and flexible solution. Visual sensing provides global pose
information but is noisy in general, whereas contact sensing is local, but its
measurements are more accurate relative to the end-effector. By combining them,
we aim to exploit their advantages and overcome their limitations. We explore
the technique in the context of a pusher-slider system. We adapt iSAM's
measurement cost and motion cost to the pushing scenario, and use an
instrumented setup to evaluate the estimation quality with different object
shapes, on different surface materials, and under different contact modes.
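The benefit of combining a noisy global measurement with a more accurate local one can be shown with a minimal inverse-variance fusion of two scalar estimates. This is a deliberately simplified stand-in, not iSAM and not the paper's measurement or motion cost; it assumes both estimates are expressed in the same frame with independent Gaussian noise, and the function name `fuse` and the numbers in the usage are hypothetical.

```python
def fuse(estimates):
    """Inverse-variance weighted mean of (value, variance) pairs.

    Returns the fused value and its variance, which is never larger than
    the smallest input variance.
    """
    if any(variance <= 0.0 for _, variance in estimates):
        raise ValueError("variances must be positive")
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Toy usage with made-up numbers: object position along one axis, in metres.
visual = (0.10, 4e-4)   # global but noisy
tactile = (0.12, 1e-4)  # local but more precise
fused_value, fused_variance = fuse([visual, tactile])
```

The fused estimate is pulled toward the more precise tactile reading while its variance drops below that of either sensor alone, which is the intuition behind using both modalities.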
Lifting GIS Maps into Strong Geometric Context for Scene Understanding
Contextual information can have a substantial impact on the performance of
visual tasks such as semantic segmentation, object detection, and geometric
estimation. Data stored in Geographic Information Systems (GIS) offers a rich
source of contextual information that has been largely untapped by computer
vision. We propose to leverage such information for scene understanding by
combining GIS resources with large sets of unorganized photographs using
Structure from Motion (SfM) techniques. We present a pipeline to quickly
generate strong 3D geometric priors from 2D GIS data using SfM models aligned
with minimal user input. Given an image resectioned against this model, we
generate robust predictions of depth, surface normals, and semantic labels. We
show that the predicted geometry is substantially more accurate than that of
other single-image depth estimation methods. We then demonstrate the
utility of these contextual constraints for re-scoring pedestrian detections,
and use these GIS contextual features alongside object detection score maps to
improve a CRF-based semantic segmentation framework, boosting accuracy over
baseline models.