Optical Flow in Mostly Rigid Scenes
The optical flow of natural scenes is a combination of the motion of the
observer and the independent motion of objects. Existing algorithms typically
focus on either recovering motion and structure under the assumption of a
purely static world or optical flow for general unconstrained scenes. We
combine these approaches in an optical flow algorithm that estimates an
explicit segmentation of moving objects from appearance and physical
constraints. In static regions we take advantage of strong constraints to
jointly estimate the camera motion and the 3D structure of the scene over
multiple frames. This allows us to also regularize the structure instead of the
motion. Our formulation uses a Plane+Parallax framework, which works even under
small baselines, and reduces the motion estimation to a one-dimensional search
problem, resulting in more accurate estimation. In moving regions the flow is
treated as unconstrained, and computed with an existing optical flow method.
The resulting Mostly-Rigid Flow (MR-Flow) method achieves state-of-the-art
results on both the MPI-Sintel and KITTI-2015 benchmarks.
Comment: 15 pages, 10 figures; accepted for publication at CVPR 201
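The one-dimensional search enabled by the Plane+Parallax decomposition can be illustrated with a toy sketch: after plane alignment, the residual motion of a static pixel lies on the line joining it to the epipole, so matching reduces to scanning a scalar displacement along that direction. The function name, the SSD patch matching, and all parameters below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def search_along_epipolar_line(img1, img2, pt, epipole, patch=3, max_disp=20):
    """Toy 1D matcher: after plane alignment, a static pixel's residual
    motion lies on the line through the pixel and the epipole, so the
    match is found by scanning a scalar displacement along that direction.
    (Hypothetical sketch; patch size and search range are arbitrary.)"""
    d = np.array(epipole, float) - np.array(pt, float)
    d /= np.linalg.norm(d)                       # unit direction toward epipole
    y, x = int(pt[1]), int(pt[0])
    ref = img1[y - patch:y + patch + 1, x - patch:x + patch + 1]
    best_s, best_cost = 0.0, np.inf
    for s in np.linspace(-max_disp, max_disp, 81):   # scalar parallax magnitude
        qx = int(round(pt[0] + s * d[0]))
        qy = int(round(pt[1] + s * d[1]))
        cand = img2[qy - patch:qy + patch + 1, qx - patch:qx + patch + 1]
        if cand.shape != ref.shape:
            continue
        cost = np.sum((ref - cand) ** 2)         # SSD matching cost
        if cost < best_cost:
            best_cost, best_s = cost, s
    return best_s
```

The point of the reduction is that a 2D flow vector per pixel collapses to a single scalar per pixel, which is what makes the estimation better constrained.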
An Appearance-Based Method for Parametric Video Registration
In this paper we address the problem of multi-frame video registration by combining an appearance-based technique with a parametric model of the transformations. The technique selects one image as the reference frame and estimates the transformation that each frame in the sequence has undergone with respect to this absolute reference. Both global and local information are employed in the estimation of the registered images. Global information is applied in terms of linear appearance subspace constraints, under the subspace constancy assumption [4], which encode the variability of each frame with respect to the reference frame. Local information is used by means of a polynomial parametric model that estimates the evolution of the velocity field in each frame. The objective function to be minimized considers both issues at the same time, i.e., the appearance representation and the time evolution across the sequence. This function connects the global coordinates of the subspace representation with the time evolution and the parametric optical flow estimates. Thus, the appearance constraints take into account all the images in the sequence when estimating the transformation parameters.
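As a much-simplified sketch of registering each frame against a fixed reference, the following replaces the paper's polynomial velocity-field model with a pure-translation warp estimated by exhaustive SSD minimization; the function name and search range are hypothetical, and the subspace appearance term is omitted.

```python
import numpy as np

def register_translation(frame, ref, max_shift=8):
    """Toy parametric registration: find the integer translation that best
    aligns `frame` to the reference image by minimizing the sum of squared
    differences. A stand-in for the paper's richer parametric model."""
    best, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll is a periodic warp; real registration would crop borders
            cost = np.sum((np.roll(frame, (dy, dx), axis=(0, 1)) - ref) ** 2)
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best
```

In the paper the warp is a polynomial flow field rather than a translation, and the cost additionally penalizes the distance of the warped frame from a linear appearance subspace; the structure of the minimization is the same.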
Automatic aerial target detection and tracking system in airborne FLIR images based on efficient target trajectory filtering
Common strategies for detection and tracking of aerial moving targets in airborne Forward-Looking Infrared
(FLIR) images offer accurate results in images composed of a non-textured sky. However, when cloud and
earth regions appear in the image sequence, those strategies produce over-detections that significantly
increase the false alarm rate. Moreover, the airborne camera induces a global motion in the image sequence
that further complicates detection and tracking. In this work, an automatic detection and tracking
system with an innovative and efficient target trajectory filtering is presented. It robustly compensates the
global motion in order to accurately detect and track potential aerial targets. Their trajectories are analyzed by a
curve-fitting technique to reliably validate real targets. This strategy allows false targets with stationary or
erratic trajectories to be filtered out. The proposed system places special emphasis on low-complexity video
analysis techniques to achieve real-time operation. Experimental results using real FLIR sequences show a
dramatic reduction of the false alarm rate while maintaining the detection rate.
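The trajectory-validation idea can be sketched as follows: fit a low-order polynomial to each candidate trajectory, then reject tracks that are stationary (small spatial span) or erratic (large fit residual). The thresholds, polynomial degree, and function name below are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def is_valid_target(xs, ys, min_span=5.0, max_residual=2.0, degree=2):
    """Validate a candidate track by curve fitting: stationary tracks
    (small spatial extent) and erratic tracks (poor polynomial fit)
    are rejected as false targets. Thresholds are illustrative."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    t = np.arange(len(xs))
    span = np.hypot(xs.max() - xs.min(), ys.max() - ys.min())
    if span < min_span:
        return False                       # stationary -> false target
    # Residual of a low-order polynomial fit of position against time
    rx = xs - np.polyval(np.polyfit(t, xs, degree), t)
    ry = ys - np.polyval(np.polyfit(t, ys, degree), t)
    residual = np.sqrt(np.mean(rx ** 2 + ry ** 2))
    return residual <= max_residual        # smooth enough -> real target
```

A real aerial target traces a smooth, non-stationary curve in the image, so both tests pass; clutter-induced detections typically fail one of the two.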
Tigris: Architecture and Algorithms for 3D Perception in Point Clouds
Machine perception applications are increasingly moving toward manipulating
and processing 3D point clouds. This paper focuses on point cloud registration,
a key primitive of 3D data processing widely used in high-level tasks such as
odometry, simultaneous localization and mapping, and 3D reconstruction. As
these applications are routinely deployed in energy-constrained environments,
real-time and energy-efficient point cloud registration is critical.
We present Tigris, an algorithm-architecture co-designed system specialized
for point cloud registration. Through an extensive exploration of the
registration pipeline design space, we find that, while different design points
make vastly different trade-offs between accuracy and performance, KD-tree
search is a common performance bottleneck, and thus is an ideal candidate for
architectural specialization. While KD-tree search is inherently sequential, we
propose an acceleration-amenable data structure and search algorithm that
exposes different forms of parallelism of KD-tree search in the context of
point cloud registration. The co-designed accelerator systematically exploits
the parallelism while incorporating a set of architectural techniques that
further improve the accelerator efficiency. Overall, Tigris achieves a
77.2× speedup and a 7.4× power reduction in KD-tree search over an
RTX 2080 Ti GPU, which translates to a 41.7% registration performance
improvement and a 3.0× power reduction.
Comment: Published at MICRO-52 (52nd IEEE/ACM International Symposium on
Microarchitecture); Tiancheng Xu and Boyuan Tian are co-primary authors
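For reference, the baseline that Tigris specializes is the plain, sequential KD-tree nearest-neighbor search used inside registration's correspondence step. The textbook sketch below shows that baseline (not the paper's acceleration-amenable variant); names are illustrative.

```python
import numpy as np

def build_kdtree(points, depth=0):
    """Build a KD-tree by median split, cycling through the axes."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]
    points = points[points[:, axis].argsort()]
    mid = len(points) // 2
    return {"pt": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def nearest(node, query, best=None):
    """Sequential nearest-neighbor search: descend toward the query,
    then backtrack only into subtrees the current best ball overlaps.
    The data-dependent backtracking is what makes this hard to parallelize."""
    if node is None:
        return best
    d = np.linalg.norm(node["pt"] - query)
    if best is None or d < best[0]:
        best = (d, node["pt"])
    axis = node["axis"]
    diff = query[axis] - node["pt"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    if abs(diff) < best[0]:               # best ball crosses the split plane
        best = nearest(far, query, best)
    return best
```

Each query's traversal order depends on intermediate results, which is why the paper has to expose parallelism with a restructured data layout and search algorithm rather than simply replicating this loop.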
A hybrid approach to simultaneous localization and mapping in indoors environment
This thesis presents SLAM as treated in the current literature, and then presents the results of an investigation into a hybrid approach in which different algorithms using laser, sonar, and camera sensors were tested and compared. The contributions of this thesis are the development of a hybrid approach to SLAM that uses different sensors and takes factors such as dynamic objects into consideration, and the development of a scalable grid-map model with new sensor models for real-time updating of the map. The thesis describes the successes achieved, the difficulties faced, and the limitations of the developed algorithms, which were simulated and experimentally tested in an indoor environment.
Automatically Recovering Geometry and Texture from Large Sets of Calibrated Images
Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and, unlike automatic systems, are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000's) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long-baseline images improve the accuracy. Short-baseline images and the large number of images greatly simplify the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces.
Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.
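The remark that computing a 3D position from an established correspondence "is simply a matter of geometry" refers to triangulation. A standard linear (DLT) construction, shown below, is a textbook sketch rather than code from the thesis:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: given two 3x4 camera projection matrices
    and a correspondence (x1, x2) in normalized image coordinates, each view
    contributes two homogeneous equations x*P[2] - P[0] = 0, y*P[2] - P[1] = 0.
    The 3D point is the null vector of the stacked system."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)    # least-squares null vector of A
    X = Vt[-1]
    return X[:3] / X[3]            # dehomogenize
```

With many short-baseline views, the same stacking generalizes to two rows per image, which is why a large set of calibrated images both simplifies correspondence and stabilizes the geometry.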
Visual Odometry and Mapping in Natural Environments for Arbitrary Camera Motion Models
This is a thesis on outdoor monocular visual SLAM in natural environments. The techniques proposed herein aim at estimating camera pose and the 3D geometrical structure of the surrounding environment. This problem statement was motivated by the GPS-denied scenario for a sea-surface vehicle developed at Plymouth University named Springer. The algorithms proposed in this thesis are mainly adapted to Springer's environmental conditions, so that the vehicle can navigate using a vision-based localization system when GPS is not available; such environments include estuarine areas, forests and the occasional semi-urban territories.
The research objectives are constrained versions of the ever-abiding problems in the fields of multiple view geometry and mobile robotics. The research proposes new techniques, or improves existing ones, for problems such as scene reconstruction, relative camera pose recovery and filtering, always in the context of the aforementioned landscapes (i.e., rivers, forests, etc.). Although visual tracking is paramount for the generation of data point correspondences, this thesis focuses primarily on the geometric aspect of the problem, as well as on the probabilistic framework in which the optimization of pose and structure estimates takes place. Besides algorithms, the deliverables of this research include the respective implementations and test data for these algorithms, in the form of a software library and a dataset containing footage of estuarine regions taken from a boat, along with synchronized sensor logs.
This thesis is not the final analysis on vision-based navigation. It merely proposes various solutions for the localization problem of a vehicle navigating in natural environments, either on land or on the surface of the water. Although these solutions can be used to provide position and orientation estimates when GPS is not available, they have limitations and there is still a vast new world of ideas to be explored.
Recovery of Ego-Motion Using Region Alignment
A method for computing the 3D camera motion (the ego-motion) in a static scene is described, in which a detected 2D motion between two frames is first used to align corresponding image regions. We prove that such a 2D registration removes all effects of camera rotation, even for those image regions that remain misaligned. The resulting residual parallax displacement field between the two region-aligned images is an epipolar field centered at the FOE (Focus-of-Expansion). The 3D camera translation is recovered from the epipolar field. The 3D camera rotation is recovered from the computed 3D translation and the detected 2D motion. The decomposition of image motion into a 2D parametric motion and residual epipolar parallax displacements avoids many of the inherent ambiguities and instabilities associated with decomposing the image motion into its rotational and translational components, and hence makes the computation of ego-motion and 3D structure estimation more robust.
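The recovery of translation from the epipolar field can be sketched as a least-squares FOE estimate: each residual parallax vector lies on the line through its pixel and the epipole, so the FOE is the point consistent with all those lines. This is a standard construction, not the paper's implementation:

```python
import numpy as np

def estimate_foe(points, parallax):
    """Least-squares FOE: for pixel p with residual parallax vector v,
    the epipole e lies on the line through p with direction v, i.e.
    n . e = n . p where n is perpendicular to v. Stacking one such
    equation per pixel gives an overdetermined linear system in e."""
    points = np.asarray(points, float)
    parallax = np.asarray(parallax, float)
    n = np.stack([parallax[:, 1], -parallax[:, 0]], axis=1)  # n perp to v
    b = np.sum(n * points, axis=1)
    foe, *_ = np.linalg.lstsq(n, b, rcond=None)
    return foe
```

Given the FOE, the translation direction follows directly, and in practice each equation would be weighted by the reliability of its parallax measurement.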