236 research outputs found

    Registration of stereo images using a RANSAC-based procedure with a geometric constraint on hypothesis generation.

    An approach is proposed for registering sparse feature sets detected in two stereo image pairs taken from two different views. Like many existing image registration approaches, the method consists of initial matching of features using local descriptors followed by a RANSAC-based procedure. The proposed approach is especially suitable for cases where there is a high percentage of false initial matches. The strategy is to modify the hypothesis generation step of the basic RANSAC approach by performing a multiple-step procedure which uses geometric constraints to reduce the probability of false correspondences in generated hypotheses. The algorithm needs approximate information about the relative camera pose between the two views; however, the uncertainty of this information is allowed to be rather high. The presented technique is evaluated using both synthetic data and real data obtained by a stereo camera system.
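The multiple-step hypothesis generation described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes 3D point matches, a rough prior pose `(R0, t0)`, and two hypothetical geometric checks (agreement with the prior within `prior_tol`, and preservation of pairwise distances within `dist_tol`). All function names, parameters, and tolerances are assumptions chosen for illustration.

```python
import numpy as np


def kabsch(P, Q):
    """Least-squares rigid transform (R, t) such that Q ≈ P @ R.T + t."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp


def compatible(P, Q, i, pool, dist_tol):
    """Matches in `pool` whose distance to match i is preserved in both sets."""
    dp = np.linalg.norm(P[pool] - P[i], axis=1)
    dq = np.linalg.norm(Q[pool] - Q[i], axis=1)
    return pool[(np.abs(dp - dq) < dist_tol) & (dp > dist_tol)]


def constrained_ransac(P, Q, R0, t0, prior_tol, dist_tol, inlier_tol,
                       iters=200, seed=0):
    """RANSAC over matches P[i] <-> Q[i] with constrained hypothesis sampling."""
    rng = np.random.default_rng(seed)
    # Constraint from the approximate pose: discard matches far from the prior.
    prior_err = np.linalg.norm(P @ R0.T + t0 - Q, axis=1)
    candidates = np.flatnonzero(prior_err < prior_tol)
    best_R, best_t = None, None
    best_inliers = np.zeros(len(P), dtype=bool)
    if len(candidates) < 3:
        return best_R, best_t, best_inliers
    for _ in range(iters):
        # Step 1: draw the first match at random.
        i = rng.choice(candidates)
        # Step 2: the second match must be rigidly compatible with the first.
        pool_j = compatible(P, Q, i, candidates, dist_tol)
        if len(pool_j) == 0:
            continue
        j = rng.choice(pool_j)
        # Step 3: the third match must be compatible with both earlier picks.
        pool_k = compatible(P, Q, j, pool_j, dist_tol)
        if len(pool_k) == 0:
            continue
        k = rng.choice(pool_k)
        R, t = kabsch(P[[i, j, k]], Q[[i, j, k]])
        inliers = np.linalg.norm(P @ R.T + t - Q, axis=1) < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_R, best_t, best_inliers = R, t, inliers
    if best_inliers.sum() >= 3:
        # Refit on the consensus set of the best hypothesis.
        best_R, best_t = kabsch(P[best_inliers], Q[best_inliers])
    return best_R, best_t, best_inliers
```

Because each sampling step only draws from matches that already satisfy the constraints, a hypothesis is far less likely to contain a false match than a uniformly sampled triple, which is the effect the abstract attributes to the constrained generation step.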

    Learning and Matching Multi-View Descriptors for Registration of Point Clouds

    Critical to the registration of point clouds is the establishment of a set of accurate correspondences between points in 3D space. The correspondence problem is generally addressed by the design of discriminative 3D local descriptors on the one hand, and the development of robust matching strategies on the other. In this work, we first propose a multi-view local descriptor, learned from images of multiple views, for the description of 3D keypoints. We then develop a robust matching approach that rejects outlier matches through efficient inference via belief propagation on a defined graphical model. We demonstrate that our approaches improve registration on public scanning and multi-view stereo datasets. Their superior performance is verified by extensive comparisons against a variety of descriptors and matching methods.

    Robust convex optimisation techniques for autonomous vehicle vision-based navigation

    This thesis investigates new convex optimisation techniques for motion and pose estimation. Numerous computer vision problems can be formulated as optimisation problems. These optimisation problems are generally solved via linear techniques using the singular value decomposition or iterative methods under an L2 norm minimisation. Linear techniques have the advantage of offering a closed-form solution that is simple to implement. The quantity being minimised is, however, not geometrically or statistically meaningful. Conversely, L2 algorithms rely on iterative estimation, where a cost function is minimised using algorithms such as Levenberg-Marquardt, Gauss-Newton, gradient descent or conjugate gradient. The cost functions involved are geometrically interpretable and can statistically be optimal under an assumption of Gaussian noise. However, in addition to their sensitivity to initial conditions, these algorithms are often slow and bear a high probability of getting trapped in a local minimum or producing infeasible solutions, even for small noise levels. In light of the above, in this thesis we focus on developing new techniques for finding solutions via a convex optimisation framework that are globally optimal. Presently convex optimisation techniques in motion estimation have revealed enormous advantages. Indeed, convex optimisation ensures getting a global minimum, and the cost function is geometrically meaningful. Moreover, robust optimisation is a recent approach for optimisation under uncertain data. In recent years the need to cope with uncertain data has become especially acute, particularly where real-world applications are concerned. In such circumstances, robust optimisation aims to recover an optimal solution whose feasibility must be guaranteed for any realisation of the uncertain data. 
Although many researchers avoid uncertainty because of the added complexity of constructing a robust optimisation model and the lack of knowledge about the nature of these uncertainties, and especially their propagation, this thesis investigates robust convex optimisation for the motion estimation problem while estimating the uncertainties at every step. First, a solution using convex optimisation coupled to the recursive least squares (RLS) algorithm and the robust H∞ filter is developed for motion estimation. In another solution, uncertainties and their propagation are incorporated in a robust L∞ convex optimisation framework for monocular visual motion estimation. In this solution, robust least squares is combined with a second order cone program (SOCP). A technique to improve the accuracy and the robustness of the fundamental matrix is also investigated in this thesis. This technique uses the covariance intersection approach to fuse feature location uncertainties, which leads to more consistent motion estimates. Loop-closure detection is crucial in improving the robustness of navigation algorithms. In practice, after long navigation in an unknown environment, detecting that a vehicle is in a location it has previously visited gives the opportunity to increase the accuracy and consistency of the estimate. In this context, we have developed an efficient appearance-based method for visual loop-closure detection based on the combination of a Gaussian mixture model with the KD-tree data structure. Deploying this technique for loop-closure detection, a robust L∞ convex pose-graph optimisation solution for monocular motion estimation of unmanned aerial vehicles (UAVs) is introduced as well. In the literature, most proposed solutions formulate the pose-graph optimisation as a least-squares problem by minimising a cost function using iterative methods. 
In this work, robust convex optimisation under the L∞ norm is adopted, which efficiently corrects the UAV's pose after loop-closure detection. To round out the work in this thesis, a system for cooperative monocular visual motion estimation with multiple aerial vehicles is proposed. The cooperative motion estimation employs state-of-the-art approaches for optimisation, individual motion estimation and registration. Three-view geometry algorithms in a convex optimisation framework are deployed on board the monocular vision system for each vehicle. In addition, vehicle-to-vehicle relative pose estimation is performed with a novel robust registration solution in a global optimisation framework. In parallel, and as a complementary solution for the relative pose, a robust non-linear H∞ solution is designed to fuse measurements from the UAVs' on-board inertial sensors with the visual estimates. The suggested contributions have been exhaustively evaluated over a number of real-image data experiments in the laboratory using monocular vision systems and range imaging devices. In this thesis, we propose several solutions towards the goal of robust visual motion estimation using convex optimisation. We show that the convex optimisation framework may be extended to include uncertainty information, to achieve robust and optimal solutions. We observed that convex optimisation is a practical and very appealing alternative to linear techniques and iterative methods.
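The covariance intersection step mentioned in this abstract can be sketched with the standard textbook formulation, which fuses two estimates whose cross-correlation is unknown. This is not taken from the thesis: the function name, the grid search over the weight, and the trace criterion are assumptions made for illustration.

```python
import numpy as np


def covariance_intersection(a, A, b, B, num_weights=101):
    """Fuse estimates (a, A) and (b, B) with unknown cross-correlation.

    Standard covariance intersection:
        P^-1 = w * A^-1 + (1 - w) * B^-1
        x    = P @ (w * A^-1 @ a + (1 - w) * B^-1 @ b)
    The weight w in [0, 1] is chosen here by a simple grid search that
    minimises trace(P); this selection rule is an assumption of the sketch.
    """
    A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)
    best = None
    for w in np.linspace(0.0, 1.0, num_weights):
        P = np.linalg.inv(w * A_inv + (1.0 - w) * B_inv)
        if best is None or np.trace(P) < best[0]:
            x = P @ (w * A_inv @ a + (1.0 - w) * B_inv @ b)
            best = (np.trace(P), x, P, w)
    _, x, P, w = best
    return x, P, w
```

For example, fusing two 2D feature locations whose covariances are elongated along different axes yields a fused covariance that does not claim more certainty than the inputs justify, which is why the approach is associated with consistent estimates.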

    3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints

    This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented.

    Automatic registration of 3D models to laparoscopic video images for guidance during liver surgery

    Laparoscopic liver interventions offer significant advantages over open surgery, such as less pain and trauma, and shorter recovery time for the patient. However, they also bring challenges for surgeons, such as the lack of tactile feedback, limited field of view and occluded anatomy. Augmented reality (AR) can potentially help during laparoscopic liver interventions by displaying sub-surface structures (such as tumours or vasculature). The initial registration between the 3D model extracted from the CT scan and the laparoscopic video feed is essential for an AR system, which should be efficient, robust, intuitive to use and minimally disruptive to the surgical procedure. Challenges for registration methods in laparoscopic interventions include the deformation of the liver due to gas insufflation in the abdomen, partial visibility of the organ and lack of prominent geometrical or texture-wise landmarks. These challenges are discussed in detail and an overview of the state of the art is provided. This research project aims to provide the tools to move towards a completely automatic registration. Firstly, the importance of pre-operative planning is discussed along with the characteristics of the liver that can be used to constrain a registration method. Secondly, maximising the amount of information obtained before the surgery, a semi-automatic surface-based method is proposed to recover the initial rigid registration irrespective of the position of the shapes. Finally, a fully automatic 3D-2D rigid global registration is proposed which estimates a global alignment of the pre-operative 3D model using a single intra-operative image. Moving towards incorporating the different liver contours can help constrain the registration, especially for partial surfaces. Having a robust, efficient AR system which requires no manual interaction from the surgeon will aid in the translation of such approaches to the clinic.

    Geometric and photometric affine invariant image registration

    This thesis aims to present a solution to the correspondence problem for the registration of wide-baseline images taken from uncalibrated cameras. We propose an affine invariant descriptor that combines the geometry and photometry of the scene to find correspondences between both views. The geometric affine invariant component of the descriptor is based on the affine arc-length metric, whereas the photometry is analysed by invariant colour moments. A graph structure represents the spatial distribution of the primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs represent connectivities by extracted contours. After matching, we refine the search for correspondences by using a maximum likelihood robust algorithm. We have evaluated the system over synthetic and real data. The method is endemic to propagation of errors introduced by approximations in the system. BAE Systems; Selex Sensors and Airborne System

    Dense real-time 3D reconstruction from multiple images

    Rapid advances in computer graphics and acquisition technologies have led to the widespread use of 3D models. Techniques for 3D reconstruction from multiple views aim to recover the structure of a scene and the position and orientation (motion) of the camera using only the geometrical constraints in 2D images. This problem, known as Structure from Motion (SfM), has been the focus of a great deal of research effort in recent years; however, the automatic, dense, real-time and accurate reconstruction of a scene is still a major research challenge. This thesis presents work that targets the development of efficient algorithms to produce high quality and accurate reconstructions, introducing new computer vision techniques for camera motion calibration, dense SfM reconstruction and dense real-time 3D reconstruction. In SfM, a second challenge is to build an effective reconstruction framework that provides dense and high quality surface modelling. This thesis develops a complete, automatic and flexible system with a simple user interface that goes from raw images to a 3D surface representation. As part of the proposed image reconstruction approach, this thesis introduces an accurate and reliable region-growing algorithm to propagate the dense matching points from the sparse key points among all stereo pairs. This dense 3D reconstruction proposal addresses the deficiencies of existing SfM systems built on sparsely distributed 3D point clouds, which are insufficient for reconstructing a complete 3D model of a scene. Existing SfM reconstruction methods perform a bundle adjustment optimization of the global geometry in order to obtain an accurate model. Such an optimization is computationally very expensive and cannot be implemented in a real-time application. 
Extended Kalman Filter (EKF) Simultaneous Localization and Mapping (SLAM) considers the problem of estimating in real time the structure of the surrounding world, as perceived by moving sensors (cameras), while simultaneously localizing those sensors within it. However, standard EKF-SLAM techniques are susceptible to errors introduced during the linearization of the state prediction and measurement prediction.

    Image-based 3-D reconstruction of constrained environments

    Nuclear power plays an important role in the United Kingdom's electricity generation infrastructure, providing a reliable baseload of low-carbon electricity. The Advanced Gas-cooled Reactor (AGR) design makes up approximately 50% of the existing fleet; however, many of the operating reactors have exceeded their original design lifetimes. To ensure safe reactor operation, engineers perform periodic in-core visual inspections of reactor components to monitor the structural health of the core as it ages. However, the inspection mechanisms currently deployed provide limited structural information about the fuel channel or defects. This thesis investigates the suitability of image-based 3-D reconstruction techniques to acquire 3-D structural geometry to enable improved diagnostic and prognostic abilities for inspection engineers. The application of image-based 3-D reconstruction to in-core inspection footage highlights significant challenges, most predominantly that the image saliency proves insufficient for general reconstruction frameworks. The contribution of the thesis is threefold. Firstly, a novel semi-dense matching scheme is presented which exploits sparse and dense image correspondence in combination with a novel intra-image region strength approach to improve the stability of the correspondence between images. This results in a percentage increase of 138.53% of correct feature matches over similar state-of-the-art image matching paradigms. Secondly, a bespoke incremental Structure-from-Motion (SfM) framework called the Constrained Homogeneous SfM (CH-SfM) is introduced, which is able to derive structure from deficient feature spaces and constrained environments. Thirdly, the CH-SfM framework is applied to remote visual inspection footage gathered within AGR fuel channels, outperforming other state-of-the-art reconstruction approaches and extracting representative 3-D structural geometry of orientational scans and fully circumferential reconstructions. This is demonstrated on in-core and laboratory footage, achieving an approximate 3-D point density of 2.785 - 23.8025NX/cm² for real in-core inspection footage and high quality laboratory footage respectively. The demonstrated novelties have applicability to other constrained or feature-poor environments, with future work looking to produce fully dense, photo-realistic 3-D reconstructions.

    3D Reconstruction of Indoor Corridor Models Using Single Imagery and Video Sequences

    In recent years, 3D indoor modeling has gained more attention due to its role in the decision-making process of maintaining the status and managing the security of building indoor spaces. In this thesis, the problem of continuous indoor corridor space modeling has been tackled through two approaches. The first approach develops a modeling method based on middle-level perceptual organization. The second approach develops a visual Simultaneous Localisation and Mapping (SLAM) system with model-based loop closure. In the first approach, the image space was searched for a corridor layout that can be converted into a geometrically accurate 3D model. The Manhattan rule assumption was adopted, and indoor corridor layout hypotheses were generated through a random rule-based intersection of image physical line segments and virtual rays of orthogonal vanishing points. Volumetric reasoning, correspondences to physical edges, the orientation map and the geometric context of an image are all considered for scoring layout hypotheses. This approach provides physically plausible solutions even in the presence of objects or occlusions in a corridor scene. In the second approach, Layout SLAM is introduced. Layout SLAM performs camera localization while mapping layout corners and normal point features in 3D space. Here, a new feature matching cost function was proposed that considers both local and global context information. In addition, a rotation compensation variable makes Layout SLAM robust against the accumulation of camera orientation errors. Moreover, layout model matching of keyframes ensures accurate loop closures that prevent misassociation of newly visited landmarks with previously visited scene parts. The comparison of generated single image-based 3D models to ground truth models showed that average ratio differences in widths, heights and lengths were 1.8%, 3.7% and 19.2% respectively. 
Moreover, Layout SLAM achieved a maximum absolute trajectory error of 2.4 m in position and 8.2 degrees in orientation over a path of approximately 318 m on the RAWSEEDS data set. Loop closing performed robustly for Layout SLAM and provided 3D indoor corridor layouts with displacement errors of less than 1.05 m in length and less than 20 cm in width and height over a path of approximately 315 m on the York University data set. The proposed methods can successfully generate 3D indoor corridor models compared to their major counterparts.