4,143 research outputs found
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
Unsupervised deep learning for optical flow computation has achieved
promising results. Most existing deep-net based methods rely on image
brightness consistency and local smoothness constraint to train the networks.
Their performance degrades at regions where repetitive textures or occlusions
occur. In this paper, we propose Deep Epipolar Flow, an unsupervised optical
flow method which incorporates global geometric constraints into network
learning. In particular, we investigate multiple ways of enforcing the epipolar
constraint in flow estimation. To alleviate a "chicken-and-egg" type of problem
encountered in dynamic scenes where multiple motions may be present, we propose
a low-rank constraint as well as a union-of-subspaces constraint for training.
Experimental results on various benchmarking datasets show that our method
achieves competitive performance compared with supervised methods and
outperforms state-of-the-art unsupervised deep-learning methods.
Comment: CVPR 201
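To make the epipolar constraint on flow concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes normalized image coordinates (identity intrinsics) and a single rigid motion; `make_scene`, `epipolar_residuals`, and `lifted_embedding` are hypothetical helper names introduced only for this illustration. For a pixel x1 with flow u, the matched point is x2 = x1 + u, and a geometrically valid flow satisfies x2^T E x1 = 0. Writing the same constraint as a linear equation in the nine entries of E shows why the lifted rows vec(x2 x1^T) cannot have full rank under one rigid motion, which is one way to read the low-rank constraint mentioned in the abstract.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def make_scene(n=50, seed=0):
    """Hypothetical synthetic rigid scene; returns (E, pts, flow)."""
    rng = np.random.default_rng(seed)
    # Random 3D points in front of the first camera.
    X = np.column_stack([rng.uniform(-1, 1, n),
                         rng.uniform(-1, 1, n),
                         rng.uniform(4, 8, n)])
    a = 0.05  # small rotation about the y axis (assumed value)
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([0.2, 0.0, 0.05])  # assumed translation
    X2 = X @ R.T + t                # same points in the second camera frame
    pts = X[:, :2] / X[:, 2:]       # projections in view 1
    pts2 = X2[:, :2] / X2[:, 2:]    # projections in view 2
    return skew(t) @ R, pts, pts2 - pts

def homogeneous(p):
    """Append a column of ones to an (n, 2) array of image points."""
    return np.hstack([p, np.ones((p.shape[0], 1))])

def epipolar_residuals(E, pts, flow):
    """Algebraic residual x2^T E x1 per point, with x2 = x1 + flow."""
    x1, x2 = homogeneous(pts), homogeneous(pts + flow)
    return np.einsum("ni,ij,nj->n", x2, E, x1)

def lifted_embedding(pts, flow):
    """Rows vec(x2 x1^T), so that row @ E.ravel() equals x2^T E x1."""
    x1, x2 = homogeneous(pts), homogeneous(pts + flow)
    return np.einsum("ni,nj->nij", x2, x1).reshape(-1, 9)
```

With `E, pts, flow = make_scene()`, the residuals of the true flow vanish up to rounding, a perturbed flow such as `flow + 0.05` gives clearly nonzero residuals, and the smallest singular value of `lifted_embedding(pts, flow)` is numerically zero. In a scene with several independent motions no single E fits all points, which is the situation the union-of-subspaces constraint is meant to address.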
The space of essential matrices as a Riemannian quotient manifold
The essential matrix, which encodes the epipolar constraint between points in two projective views,
is a cornerstone of modern computer vision. Previous works have proposed different characterizations
of the space of essential matrices as a Riemannian manifold. However, they either do not consider the
symmetric role played by the two views, or do not fully take into account the geometric peculiarities
of the epipolar constraint. We address these limitations with a characterization as a quotient manifold
which can be easily interpreted in terms of camera poses. While our main focus is on theoretical
aspects, we include applications to optimization problems in computer vision.
This work was supported by grants NSF-IIP-0742304, NSF-OIA-1028009, ARL MAST-CTA W911NF-08-2-0004, ARL RCTA W911NF-10-2-0016, NSF-DGE-0966142, and NSF-IIS-1317788.
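As a small illustration of the objects discussed in this abstract, the following NumPy sketch builds an essential matrix from a relative camera pose and checks two standard facts: its singular values have the form (s, s, 0), and exchanging the roles of the two views transposes the matrix. It does not implement the quotient-manifold construction of the paper. The convention X2 = R X1 + t and the function names `essential_from_pose`, `is_essential`, and `swap_views` are assumptions made for this example.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """E = [t]_x R under the assumed convention X2 = R @ X1 + t."""
    return skew(t) @ R

def is_essential(E, tol=1e-9):
    """Check the singular-value pattern (s, s, 0) with s > 0."""
    s = np.linalg.svd(E, compute_uv=False)  # sorted in descending order
    return bool(s[0] > 0
                and (s[0] - s[1]) < tol * s[0]
                and s[2] < tol * s[0])

def swap_views(R, t):
    """Pose of view 1 relative to view 2: X1 = R.T @ X2 - R.T @ t."""
    return R.T, -R.T @ t
```

For any rotation `R` and nonzero translation `t`, `is_essential(essential_from_pose(R, t))` holds, and `essential_from_pose(*swap_views(R, t))` equals the transpose of the original matrix. This is the sense in which the two views play a symmetric role in the constraint x2^T E x1 = 0.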
A Framework for SAR-Optical Stereogrammetry over Urban Areas
Currently, numerous remote sensing satellites provide a huge volume of
diverse earth observation data. As these data show different features regarding
resolution, accuracy, coverage, and spectral imaging ability, fusion techniques
are required to integrate the different properties of each sensor and produce
useful information. For example, synthetic aperture radar (SAR) data can be
fused with optical imagery to produce 3D information using stereogrammetric
methods. The main focus of this study is to investigate the possibility of
applying a stereogrammetry pipeline to very-high-resolution (VHR) SAR-optical
image pairs. For this purpose, the applicability of semi-global matching is
investigated in this unconventional multi-sensor setting. To support the image
matching by reducing the search space and accelerating the identification of
correct, reliable matches, the possibility of establishing an epipolarity
constraint for VHR SAR-optical image pairs is investigated as well. In
addition, it is shown that the absolute geolocation accuracy of VHR optical
imagery with respect to VHR SAR imagery such as provided by TerraSAR-X can be
improved by a multi-sensor block adjustment formulation based on rational
polynomial coefficients. Finally, the feasibility of generating point clouds
with a median accuracy of about 2m is demonstrated and confirms the potential
of 3D reconstruction from SAR-optical image pairs over urban areas.
Comment: This is the pre-acceptance version; to read the final version, please
go to ISPRS Journal of Photogrammetry and Remote Sensing on ScienceDirec
Motion from Fixation
We study the problem of estimating rigid motion from a sequence of monocular perspective images obtained by navigating around an object while fixating a particular feature point. The motivation comes from the mechanics of the human eye, which either smoothly pursues some fixation point in the scene or "saccades" between different fixation points. In particular, we are interested in understanding whether fixation helps the process of estimating motion, in the sense that it makes it more robust, better conditioned, or simpler to solve.
We cast the problem in the framework of "dynamic epipolar geometry", and propose an implicit dynamical model for recursively estimating motion from fixation. This allows us to compare directly the quality of the estimates of motion obtained by imposing the fixation constraint, or by assuming a general rigid motion, simply by changing the geometry of the parameter space while maintaining the same structure of the recursive estimator. We also present a closed-form static solution from two views, and a recursive estimator of the absolute attitude between the viewer and the scene.
One important issue is how the estimates degrade in the presence of disturbances in the tracking procedure. We describe a simple fixation control that converges exponentially, complemented by an image shift-registration for achieving sub-pixel accuracy, and assess how small deviations from perfect tracking affect the estimates of motion.
Motion from "X" by Compensating "Y"
This paper analyzes the geometry of the visual motion estimation problem in relation to transformations of the input (images) that stabilize particular output functions, such as the motion of a point, a line, or a plane in the image. By casting the problem within the popular "epipolar geometry", we provide a common framework for including constraints such as point, line, or plane fixation by just considering "slices" of the parameter manifold. The models we provide can be used for estimating motion from a batch using the preferred optimization techniques, or for defining dynamic filters that estimate motion from a causal sequence. We discuss methods for performing the necessary compensation by either controlling the support of the camera or pre-processing the images. The compensation algorithms may also be used for recursively fitting a plane in 3-D, either from point features or directly from brightness. Conversely, they may be used for estimating motion relative to the plane independent of its parameters.
Wide baseline stereo matching with convex bounded-distortion constraints
Finding correspondences in wide baseline setups is a challenging problem.
Existing approaches have focused largely on developing better feature
descriptors for correspondence and on accurate recovery of epipolar line
constraints. This paper focuses on the challenging problem of finding
correspondences once approximate epipolar constraints are given. We introduce a
novel method that integrates a deformation model. Specifically, we formulate
the problem as finding the largest number of corresponding points related by a
bounded distortion map that obeys the given epipolar constraints. We show that,
while the set of bounded distortion maps is not convex, the subset of maps that
obey the epipolar line constraints is convex, allowing us to introduce an
efficient algorithm for matching. We further utilize a robust cost function for
matching and employ majorization-minimization for its optimization. Our
experiments indicate that our method finds significantly more accurate maps
than existing approaches.
Joint Optical Flow and Temporally Consistent Semantic Segmentation
The importance and demands of visual scene understanding have been steadily
increasing along with the active development of autonomous systems.
Consequently, there has been a large amount of research dedicated to semantic
segmentation and dense motion estimation. In this paper, we propose a method
for jointly estimating optical flow and temporally consistent semantic
segmentation, which closely connects these two problem domains so that each
leverages the other. Semantic segmentation provides information on plausible physical
motion to its associated pixels, and accurate pixel-level temporal
correspondences enhance the accuracy of semantic segmentation in the temporal
domain. We demonstrate the benefits of our approach on the KITTI benchmark,
where we observe performance gains for flow and segmentation. We achieve
state-of-the-art optical flow results, and outperform all published algorithms
by a large margin on challenging but crucial dynamic objects.
Comment: 14 pages, Accepted for CVRSUAD workshop at ECCV 201