11 research outputs found

    Optical Flow in Mostly Rigid Scenes

    The optical flow of natural scenes is a combination of the motion of the observer and the independent motion of objects. Existing algorithms typically focus on either recovering motion and structure under the assumption of a purely static world or optical flow for general unconstrained scenes. We combine these approaches in an optical flow algorithm that estimates an explicit segmentation of moving objects from appearance and physical constraints. In static regions we take advantage of strong constraints to jointly estimate the camera motion and the 3D structure of the scene over multiple frames. This allows us to also regularize the structure instead of the motion. Our formulation uses a Plane+Parallax framework, which works even under small baselines, and reduces the motion estimation to a one-dimensional search problem, resulting in more accurate estimation. In moving regions the flow is treated as unconstrained, and computed with an existing optical flow method. The resulting Mostly-Rigid Flow (MR-Flow) method achieves state-of-the-art results on both the MPI-Sintel and KITTI-2015 benchmarks. Comment: 15 pages, 10 figures; accepted for publication at CVPR 201

    An Appearance-Based Method for Parametric Video Registration

    In this paper we address the problem of multi-frame video registration using a combination of an appearance-based technique and a parametric model of the transformations. The technique selects one image as the reference frame and estimates the transformation of each frame in the sequence with respect to this absolute reference. Both global and local information are employed in estimating the registered images. Global information is applied in terms of linear appearance subspace constraints, under the subspace constancy assumption [4], in which the variability of each frame with respect to the reference frame is encoded. Local information is used by means of a polynomial parametric model that estimates the evolution of the velocity field in each frame. The objective function to be minimized considers both issues at the same time, i.e., the appearance representation and the time evolution across the sequence. This function links the global coordinates in the subspace representation to the time evolution and the parametric optical flow estimates. Thus, the appearance constraints take into account all the images in a sequence when estimating the transformation parameters.

    Automatic aerial target detection and tracking system in airborne FLIR images based on efficient target trajectory filtering

    Common strategies for detection and tracking of aerial moving targets in airborne Forward-Looking Infrared (FLIR) images offer accurate results in images composed of a non-textured sky. However, when cloud and earth regions appear in the image sequence, those strategies over-detect, which significantly increases the false alarm rate. In addition, the airborne camera induces a global motion in the image sequence that further complicates the detection and tracking tasks. In this work, an automatic detection and tracking system with an innovative and efficient target trajectory filtering is presented. It robustly compensates the global motion to accurately detect and track potential aerial targets. Their trajectories are analyzed by a curve fitting technique to reliably validate real targets. This strategy filters out false targets with stationary or erratic trajectories. The proposed system places special emphasis on the use of low-complexity video analysis techniques to achieve real-time operation. Experimental results using real FLIR sequences show a dramatic reduction of the false alarm rate, while maintaining the detection rate.
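The trajectory validation step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which curve is fitted or which thresholds are used, so everything here is an assumption made for illustration: a straight-line least-squares fit per coordinate, the hypothetical function names `fit_line` and `is_plausible_target`, and the hypothetical thresholds `min_travel` and `max_rms` (in pixels).

```python
import math

def fit_line(ts, vs):
    """Ordinary least-squares fit v = intercept + slope * t."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    s_tt = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / s_tt
    return mean_v - slope * mean_t, slope

def is_plausible_target(track, min_travel=5.0, max_rms=2.0):
    """Accept a track only if it moves far enough and follows the fitted line.

    track: list of (x, y) positions, one per frame, after global motion
    compensation. Thresholds are illustrative assumptions, not values from
    the paper.
    """
    if len(track) < 3:
        return False  # too short to judge
    ts = list(range(len(track)))
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    ax, bx = fit_line(ts, xs)
    ay, by = fit_line(ts, ys)
    # RMS distance between observed positions and the fitted trajectory.
    sq_err = sum((ax + bx * t - x) ** 2 + (ay + by * t - y) ** 2
                 for t, x, y in zip(ts, xs, ys))
    rms = math.sqrt(sq_err / len(ts))
    # Distance covered along the fitted line over the whole track.
    travel = math.hypot(bx, by) * (len(ts) - 1)
    return travel >= min_travel and rms <= max_rms
```

A stationary track fails the `min_travel` test and an erratic one fails the `max_rms` test, which mirrors the two kinds of false target the abstract mentions. A higher-order polynomial could replace the line fit if real targets are expected to manoeuvre.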

    Tigris: Architecture and Algorithms for 3D Perception in Point Clouds

    Machine perception applications are increasingly moving toward manipulating and processing 3D point clouds. This paper focuses on point cloud registration, a key primitive of 3D data processing widely used in high-level tasks such as odometry, simultaneous localization and mapping, and 3D reconstruction. As these applications are routinely deployed in energy-constrained environments, real-time and energy-efficient point cloud registration is critical. We present Tigris, an algorithm-architecture co-designed system specialized for point cloud registration. Through an extensive exploration of the registration pipeline design space, we find that, while different design points make vastly different trade-offs between accuracy and performance, KD-tree search is a common performance bottleneck, and thus is an ideal candidate for architectural specialization. While KD-tree search is inherently sequential, we propose an acceleration-amenable data structure and search algorithm that exposes different forms of parallelism of KD-tree search in the context of point cloud registration. The co-designed accelerator systematically exploits the parallelism while incorporating a set of architectural techniques that further improve the accelerator efficiency. Overall, Tigris achieves 77.2× speedup and 7.4× power reduction in KD-tree search over an RTX 2080 Ti GPU, which translates to a 41.7% registration performance improvement and 3.0× power reduction. Comment: Published at MICRO-52 (52nd IEEE/ACM International Symposium on Microarchitecture); Tiancheng Xu and Boyuan Tian are co-primary author

    A hybrid approach to simultaneous localization and mapping in indoors environment

    This thesis first reviews SLAM in the current literature and then presents the results of an investigation into a hybrid approach, in which different algorithms using laser, sonar, and camera sensors were tested and compared. The contribution of this thesis is the development of a hybrid approach for SLAM that uses different sensors and takes into consideration factors such as dynamic objects, and the development of a scalable grid map model with new sensor models for real-time update of the map. The thesis reports the successes, the difficulties faced, and the limitations of the algorithms developed, which were simulated and experimentally tested in an indoor environment.

    Automatically Recovering Geometry and Texture from Large Sets of Calibrated Images

    Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and, unlike automatic systems, are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000's) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplify the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces.
Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.

    Visual Odometry and Mapping in Natural Environments for Arbitrary Camera Motion Models

    This is a thesis on outdoor monocular visual SLAM in natural environments. The techniques proposed herein aim at estimating camera pose and 3D geometrical structure of the surrounding environment. This problem statement was motivated by the GPS-denied scenario for a sea-surface vehicle developed at Plymouth University named Springer. The algorithms proposed in this thesis are mainly adapted for Springer's environmental conditions, so that the vehicle can navigate on a vision-based localization system when GPS is not available; such environments include estuarine areas, forests and the occasional semi-urban territories. The research objectives are constrained versions of the ever-abiding problems in the fields of multiple view geometry and mobile robotics. The research proposes new techniques or improves existing ones for problems such as scene reconstruction, relative camera pose recovery and filtering, always in the context of the aforementioned landscapes (i.e., rivers, forests, etc.). Although visual tracking is paramount for the generation of data point correspondences, this thesis focuses primarily on the geometric aspect of the problem as well as on the probabilistic framework in which the optimization of pose and structure estimates takes place. Besides algorithms, the deliverables of this research should include the respective implementations and test data for these algorithms in the form of a software library and a dataset containing footage of estuarine regions taken from a boat, along with synchronized sensor logs. This thesis is not the final analysis on vision-based navigation. It merely proposes various solutions for the localization problem of a vehicle navigating in natural environments either on land or on the surface of the water.
Although these solutions can be used to provide position and orientation estimates when GPS is not available, they have limitations and there is still a vast new world of ideas to be explored. UTC Aerospace System

    Recovery of Ego-Motion Using Region Alignment

    A method for computing the 3D camera motion (the egomotion) in a static scene is described, where initially a detected 2D motion between two frames is used to align corresponding image regions. We prove that such a 2D registration removes all effects of camera rotation, even for those image regions that remain misaligned. The resulting residual parallax displacement field between the two region-aligned images is an epipolar field centered at the FOE (Focus-of-Expansion). The 3D camera translation is recovered from the epipolar field. The 3D camera rotation is recovered from the computed 3D translation and the detected 2D motion. The decomposition of image motion into a 2D parametric motion and residual epipolar parallax displacements avoids many of the inherent ambiguities and instabilities associated with decomposing the image motion into its rotational and translational components, and hence makes the computation of ego-motion or 3D structure estimation more robust.
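The statement that the residual parallax field is an epipolar field centered at the FOE suggests a simple way to locate the FOE, sketched below. This is not necessarily the estimator used in the paper: it is a plain least-squares formulation, assuming each residual displacement (dx, dy) at image point (x, y) is parallel to the line from the FOE to that point. The function name `estimate_foe` is hypothetical.

```python
def estimate_foe(points, displacements):
    """Least-squares FOE of a radial (epipolar) displacement field.

    Each displacement (dx, dy) at (x, y) should be parallel to
    (x - ex, y - ey), i.e. dy * (x - ex) - dx * (y - ey) = 0, which is
    linear in the unknown FOE (ex, ey):
        dy * ex - dx * ey = dy * x - dx * y
    The 2x2 normal equations are accumulated and solved in closed form.
    """
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (x, y), (dx, dy) in zip(points, displacements):
        a, b = dy, -dx            # coefficients of ex and ey
        c = dy * x - dx * y       # right-hand side
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("displacements are (nearly) parallel; FOE is undetermined")
    ex = (s_ac * s_bb - s_ab * s_bc) / det
    ey = (s_aa * s_bc - s_ab * s_ac) / det
    return ex, ey
```

For example, displacements generated as `0.5 * (p - foe)` around a chosen `foe` are recovered up to rounding error. With noisy real parallax vectors, a robust variant (for instance weighting by displacement magnitude or rejecting outliers) would likely be needed.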