
    Dense Vision in Image-guided Surgery

    Image-guided surgery needs an efficient and effective camera tracking system in order to perform augmented reality, overlaying preoperative models or labelling cancerous tissue on the 2D video images of the surgical scene. Tracking in endoscopic/laparoscopic scenes, however, is extremely difficult, primarily because of tissue deformation, instruments entering the surgical scene, and specular highlights. State-of-the-art feature-based SLAM systems such as PTAM fail in such scenes because the number of good features to track is very limited, and smoke or instrument motion causes feature-based tracking to fail immediately. This thesis provides a systematic approach to the problem using dense vision. We initially attempted to register a 3D preoperative model with multiple 2D endoscopic/laparoscopic images using a dense method, but this approach did not perform well. We subsequently proposed stereo reconstruction to directly obtain the 3D structure of the scene. Using the dense reconstructed model together with robust estimation, we demonstrate that dense stereo tracking can be remarkably robust even in extremely challenging endoscopic/laparoscopic scenes. Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm achieves state-of-the-art results on several publicly available ground-truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has been shown to be highly accurate in a synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when applied to real scenes from robot-assisted laparoscopic prostatectomy (RALP). This is an important step toward accurate image-guided laparoscopic surgery.
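
    The abstract gives no implementation detail, but the combination of a dense reconstructed model with robust estimation can be sketched as an iteratively reweighted rigid alignment between the reconstructed stereo point cloud and the model. The NumPy sketch below uses Huber weights; the function names and the choice of robust kernel are illustrative assumptions, not the thesis' actual method.

    import numpy as np

    def huber_weights(residuals, delta=1.0):
        """Huber weights: 1 inside the inlier band, delta/|r| outside."""
        r = np.maximum(np.abs(residuals), 1e-12)
        return np.where(r <= delta, 1.0, delta / r)

    def robust_rigid_fit(src, dst, delta=1.0, iters=10):
        """Iteratively reweighted rigid alignment of two Nx3 point sets,
        down-weighting outliers (e.g. specularities, instrument pixels)."""
        R, t = np.eye(3), np.zeros(3)
        for _ in range(iters):
            res = np.linalg.norm(src @ R.T + t - dst, axis=1)
            w = huber_weights(res, delta)
            ws = w / w.sum()
            mu_s, mu_d = ws @ src, ws @ dst           # weighted centroids
            # Weighted cross-covariance, then the Kabsch SVD solve.
            H = (src - mu_s).T @ ((dst - mu_d) * w[:, None])
            U, _, Vt = np.linalg.svd(H)
            D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
            R = Vt.T @ D @ U.T                        # reflection-safe rotation
            t = mu_d - R @ mu_s
        return R, t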

    Most Likely Separation of Intensity and Warping Effects in Image Registration

    This paper introduces a class of mixed-effects models for the joint modeling of spatially correlated intensity variation and warping variation in 2D images. Intensity variation and warp variation are modeled as random effects, resulting in a nonlinear mixed-effects model that enables simultaneous estimation of the template and the model parameters by optimization of the likelihood function. We propose an algorithm for fitting the model that alternates between estimation of the variance parameters and image registration. This approach avoids the potential estimation bias in the template that arises when registration is treated as a preprocessing step. We apply the model to datasets of facial images and 2D brain magnetic resonance images to illustrate the simultaneous estimation and prediction of intensity and warp effects.
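
    The abstract does not spell out the model; one plausible way to write such a nonlinear mixed-effects model (notation chosen here for illustration: theta is the template, v the warp driven by a random effect w_i, x_i the spatially correlated intensity field) is

    y_i(s) = \theta\bigl(v(s, w_i)\bigr) + x_i(s) + \varepsilon_i(s),
    \qquad w_i \sim \mathcal{N}(0, \sigma^2 C_w), \quad
    x_i \sim \mathcal{N}(0, \sigma^2 C_x), \quad
    \varepsilon_i(s) \sim \mathcal{N}(0, \sigma^2),

    where likelihood optimization alternates between updating the template and variance parameters and predicting the random warp and intensity effects, as described above.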

    Optical flow estimation using steered-L1 norm

    Motion is an essential part of understanding the visual picture of the surrounding environment. In image processing, motion analysis involves estimating the displacements of image points across an image sequence; dense optical flow estimation computes a displacement for every pixel and is therefore used widely in image processing and computer vision. Much research has been dedicated to accurate and fast motion computation in image sequences, yet despite recent advances, optical flow algorithms still suffer from several issues, such as motion discontinuities, occlusion handling, and robustness to illumination changes. This thesis investigates optical flow and its applications, addresses several issues in the computation of dense optical flow, and proposes solutions. It is divided into two main parts. In the first part, image registration using optical flow is investigated. Both local and global optical flow methods have previously been used for image registration; here, an image registration approach based on an improved version of the combined local-global method of optical flow computation is proposed, in which a bilateral filter improves edge-preserving performance. Image registration via this method is shown to give more robust results than the local and global optical flow methods previously investigated. The second part encompasses the main contribution of this research: an improved total-variation L1-norm smoothness term, used in the optical flow energy function to regularise it. The L1 norm is a plausible choice for such a term because it preserves edges; however, it is isotropic and hence decreases the penalisation near motion boundaries equally in all directions. The proposed improved term (termed here the steered-L1 norm) retains similar behaviour across motion boundaries but improves the penalisation along such boundaries.
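
    The thesis' exact formulation is not given in this abstract; the following is a minimal NumPy sketch of a steered-L1 smoothness energy, assuming the steering comes from the image gradient (edge normal and tangent) and using an attenuation weight of my own choosing.

    import numpy as np

    def steered_l1_smoothness(u, v, img, eps=1e-6):
        """L1 smoothness of the flow (u, v), steered by the local image
        gradient so penalisation differs across vs. along edges.
        A sketch of the idea only; the thesis' weighting may differ."""
        gy, gx = np.gradient(img)
        mag = np.sqrt(gx**2 + gy**2) + eps
        nx, ny = gx / mag, gy / mag        # edge normal (across the edge)
        tx, ty = -ny, nx                   # edge tangent (along the edge)
        energy = 0.0
        for f in (u, v):
            fy, fx = np.gradient(f)
            d_n = fx * nx + fy * ny        # flow variation across edges
            d_t = fx * tx + fy * ty        # flow variation along edges
            # Penalise along-edge variation fully; attenuate across-edge
            # variation where the image gradient is strong (hypothetical
            # attenuation weight).
            w_n = np.exp(-mag)
            energy += np.sum(w_n * np.abs(d_n) + np.abs(d_t))
        return energy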

    3D Motion Analysis via Energy Minimization

    This work deals with 3D motion analysis from stereo image sequences for driver assistance systems. It consists of two parts: the estimation of motion from the image data and the segmentation of moving objects in the input images. The content can be summarized by the technical term machine visual kinesthesia: the sensation, perception, and cognition of motion. The first three chapters discuss the importance of motion information for driver assistance systems, for machine vision in general, and for the estimation of ego-motion. The next two chapters address motion perception, analyzing the apparent movement of pixels in image sequences for both monocular and binocular camera setups. The obtained motion information is then used to segment moving objects in the input video, so one can clearly trace the thread from analyzing the input images to describing them in terms of stationary and moving objects. Finally, I present possibilities for future applications based on the contents of this thesis; previous work is presented in the respective chapters. Although the overarching issue of motion estimation from image sequences is practical in nature, there is nothing as practical as a good theory (Kurt Lewin). Several problems in computer vision are formulated as intricate energy minimization problems. In this thesis, motion analysis in image sequences is thoroughly investigated, showing that splitting an originally complex problem into simplified sub-problems yields improved accuracy, increased robustness, and a clear and accessible approach to state-of-the-art motion estimation techniques. In Chapter 4, optical flow is considered. Optical flow is commonly estimated by minimizing a combined energy consisting of a data term and a smoothness term. These two parts are decoupled, yielding a novel, iterative approach to optical flow. The derived Refinement Optical Flow framework is a clear and straightforward approach to computing the apparent image motion vector field and currently yields among the most accurate motion estimation results in the literature. Much as this is an engineering approach of fine-tuning precision to the last detail, it provides better insight into the problem of motion estimation, contributing to state-of-the-art research in motion analysis and facilitating the use of motion estimation in a wide range of applications. In Chapter 5, scene flow is rethought. Scene flow denotes the three-dimensional motion vector field for every image pixel, computed from a stereo image sequence. Again, decoupling the commonly coupled estimation of three-dimensional position and three-dimensional motion yields a scene flow method with more accurate results and a considerably lower computational load. It produces a dense scene flow field and enables additional applications based on the dense three-dimensional motion vector field, to be investigated in the future. One such application is the segmentation of moving objects in an image sequence; detecting moving objects within the scene is one of the most important features to extract from image sequences of a dynamic environment. This is presented in Chapter 6. Scene flow and the segmentation of independently moving objects are only first steps towards machine visual kinesthesia. Throughout this work, I point out possible future work to improve the estimation of optical flow and scene flow. Chapter 7 additionally presents an outlook on future research for driver assistance applications. But there is much more to the full understanding of the three-dimensional dynamic scene; this work is meant to inspire the reader to think outside the box and contribute to the vision of building perceiving machines.
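
    To make the decoupling idea of Chapter 5 concrete, here is a minimal sketch (not the thesis' formulation): given independently estimated disparities at two time steps and an optical flow field for a rectified stereo rig, the per-pixel 3D motion follows by triangulating and differencing. The parameters f, B, cx, cy denote focal length, baseline, and principal point.

    import numpy as np

    def scene_flow_from_decoupled(d0, d1, u, v, f, B, cx, cy):
        """3D motion field from separately estimated stereo disparities
        (d0 at time t, d1 at t+1) and optical flow (u, v), for a
        rectified stereo rig. A sketch of the decoupling idea only."""
        h, w = d0.shape
        x, y = np.meshgrid(np.arange(w), np.arange(h))

        def backproject(px, py, d):
            Z = f * B / np.maximum(d, 1e-6)
            return np.stack([(px - cx) * Z / f, (py - cy) * Z / f, Z], axis=-1)

        P0 = backproject(x, y, d0)
        # Disparity at t+1 is sampled at the flow-displaced position
        # (nearest-neighbour here for brevity).
        xi = np.clip(np.round(x + u).astype(int), 0, w - 1)
        yi = np.clip(np.round(y + v).astype(int), 0, h - 1)
        P1 = backproject(x + u, y + v, d1[yi, xi])
        return P1 - P0   # per-pixel 3D motion vectors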

    Detection of Moving Objects through Spatio-Temporal Motion Analysis (Erkennung bewegter Objekte durch raum-zeitliche Bewegungsanalyse)

    Driver assistance systems of the future, which will support the driver in complex driving situations, require a thorough understanding of the car's environment. This includes not only comprehension of the infrastructure, but also the precise detection and measurement of other moving traffic participants. In this thesis, a novel principle is presented and investigated in detail that allows the reconstruction of the 3D motion field from the image sequence obtained by a stereo camera system. Given correspondences of stereo measurements over time, this principle estimates the 3D position and the 3D motion vector of selected points using Kalman filters, resulting in a real-time estimation of the observed motion field. Since the state vector of the Kalman filter consists of six elements, the principle is called 6d-Vision. To estimate the absolute motion field, the ego-motion of the moving observer must be known precisely. Since cars are usually not equipped with high-end inertial sensors, a novel algorithm to estimate the ego-motion from the image sequence is presented. Based on a Kalman filter, it supports even complex vehicle models and takes advantage of all available data, namely the previously estimated motion field and any available inertial sensors, integrating smoothly into the 6d-Vision processing chain. As the 6d-Vision principle is not restricted to particular algorithms for obtaining the image measurements, various optical flow and stereo algorithms are evaluated with respect to accuracy and robustness. In particular, a novel dense stereo algorithm is presented that gives excellent precision and runs in real time. In addition, two novel scene flow algorithms are introduced that measure the optical flow and stereo information in a combined approach, yielding more precise and robust results than a separate analysis of the two information sources. The application of the 6d-Vision principle to real-world data is illustrated throughout the thesis; the method has been tested extensively on the road and today provides an important information basis for various applications. As practical applications usually require an understanding of objects rather than a raw 3D motion field, a simple yet efficient algorithm to detect and track moving objects is presented. This algorithm was successfully implemented in a demonstrator vehicle that performs an autonomous braking or steering manoeuvre to avoid collisions with moving pedestrians.
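
    Since the abstract fixes the state vector to six elements (3D position plus 3D velocity), the filter can be sketched as a constant-velocity Kalman filter fed with triangulated stereo measurements. The noise parameters and the simple motion model below are illustrative assumptions, not the thesis' tuning.

    import numpy as np

    def make_6d_kf(dt, q=0.1, r=0.05):
        """Constant-velocity Kalman filter matrices for a 6D state
        [X, Y, Z, Vx, Vy, Vz], in the spirit of 6d-Vision."""
        F = np.eye(6)
        F[:3, 3:] = dt * np.eye(3)                    # position integrates velocity
        H = np.hstack([np.eye(3), np.zeros((3, 3))])  # we observe 3D position only
        Q = q * np.eye(6)                             # process noise (simplified)
        R = r * np.eye(3)                             # measurement noise
        return F, H, Q, R

    def kf_step(x, P, z, F, H, Q, R):
        """One predict/update cycle with a triangulated 3D measurement z."""
        x, P = F @ x, F @ P @ F.T + Q                 # predict
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.solve(S, np.eye(3))   # Kalman gain
        x = x + K @ (z - H @ x)                       # update
        P = (np.eye(6) - K @ H) @ P
        return x, P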

    Optical flow estimation on image sequences with differently exposed frames

    Optical flow (OF) methods are used to estimate dense motion information between consecutive frames in image sequences. In addition to the specific OF estimation method itself, the quality of the input image sequence is of crucial importance to the quality of the resulting flow estimates. For instance, lack of texture in image frames caused by saturation of the camera sensor during exposure can significantly deteriorate performance. One approach to avoiding this negative effect is to use different camera settings when capturing the individual frames. We provide a framework for OF estimation on such sequences that contain differently exposed frames. Information from multiple frames is combined into a total cost functional such that the lack of an active data term for saturated image areas is avoided. Experimental results demonstrate that using alternate camera settings to capture the full dynamic range of an underlying scene can clearly improve the quality of flow estimates. When saturation of image data is significant, the proposed methods show superior performance, in terms of lower endpoint errors of the flow vectors, compared to a set of baseline methods. Furthermore, we provide some qualitative examples of how and when our method should be used.
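
    A minimal sketch of the masking idea (not the paper's actual cost functional): gate the brightness-constancy data term by per-pixel saturation confidences so that regions saturated in either frame contribute nothing. The thresholds below are illustrative, and the frames are assumed to have been photometrically aligned (exposure-compensated) beforehand.

    import numpy as np

    def saturation_weights(frame, lo=0.02, hi=0.98):
        """Per-pixel confidence for the data term: zero where the sensor
        is (nearly) saturated, one elsewhere (illustrative thresholds)."""
        f = frame.astype(np.float64)
        f = f / max(f.max(), 1e-12)
        return ((f > lo) & (f < hi)).astype(np.float64)

    def combined_data_cost(I0, I1, u, v, w0, w1):
        """Weighted brightness-constancy cost between two exposure-
        compensated frames; areas saturated in either frame are ignored,
        so another frame pair must supply the data term there."""
        h, w = I0.shape
        x, y = np.meshgrid(np.arange(w), np.arange(h))
        xi = np.clip(np.round(x + u).astype(int), 0, w - 1)
        yi = np.clip(np.round(y + v).astype(int), 0, h - 1)
        weight = w0 * w1[yi, xi]
        return np.sum(weight * np.abs(I1[yi, xi] - I0))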

    Unsupervised Training of Deep Neural Networks for Motion Estimation

    This thesis addresses the problem of motion estimation, that is, the estimation of a field that describes how pixels move from a reference frame to a target frame, using Deep Neural Networks (DNNs). In contrast to classic methods, we do not solve an optimization problem at test time: we train DNNs once and apply them in a single pass during testing, which reduces the computational complexity. The major contribution is that, in contrast to supervised methods, we train our DNNs in an unsupervised way, that is, without the need for ground-truth motion fields, which are expensive to obtain for real scenes. More specifically, we train our networks by designing cost functions inspired by classical optical flow estimation schemes and generative methods in computer vision. We first propose a straightforward CNN method that is trained to optimize the brightness constancy constraint and embed it in a classical multiscale scheme in order to predict motions that are large in magnitude (GradNet). We show that GradNet generalizes well to an unknown dataset and performed comparably with state-of-the-art unsupervised methods at that time. Second, we propose a convolutional Siamese architecture embedding a new soft warping scheme, applied in a multiscale framework and trained to optimize a higher-level feature constancy constraint (LikeNet). The architecture of LikeNet allows a trade-off between computational load and memory and is 98% smaller than other state-of-the-art methods in terms of learned parameters. We show that LikeNet performs on par with state-of-the-art approaches and best among uni-directional methods, i.e. methods that calculate the motion field in one pass. Third, we propose a novel approach to distill the slower LikeNet into a much faster regression neural network without losing much accuracy (QLikeNet). The results show that using DNNs is a promising direction for motion estimation, although further improvements are required, as classical methods still perform best.
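
    The brightness-constancy cost named here can be sketched as a photometric loss with backward warping (NumPy, for illustration only; a trainable version would use a framework's differentiable warp, and this loss form is an assumption, not the exact GradNet objective).

    import numpy as np

    def bilinear_warp(img, u, v):
        """Backward-warp img by flow (u, v) with bilinear interpolation."""
        h, w = img.shape
        x, y = np.meshgrid(np.arange(w), np.arange(h))
        xf = np.clip(x + u, 0, w - 1.001)
        yf = np.clip(y + v, 0, h - 1.001)
        x0, y0 = np.floor(xf).astype(int), np.floor(yf).astype(int)
        ax, ay = xf - x0, yf - y0
        return ((1 - ax) * (1 - ay) * img[y0, x0]
                + ax * (1 - ay) * img[y0, x0 + 1]
                + (1 - ax) * ay * img[y0 + 1, x0]
                + ax * ay * img[y0 + 1, x0 + 1])

    def photometric_loss(I0, I1, u, v):
        """Unsupervised training signal: brightness-constancy error between
        the reference frame and the warped target frame; no ground-truth
        flow is needed, and gradients w.r.t. (u, v) drive the network."""
        return np.mean(np.abs(bilinear_warp(I1, u, v) - I0))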

    Laparoscopic Image Recovery and Stereo Matching

    Laparoscopic imaging plays a significant role in minimally invasive surgical procedures. However, laparoscopic images often suffer from insufficient and irregular light sources, specular highlight surfaces, and a lack of depth information. These problems can hinder surgeons during surgery and lead to erroneous visual tracking and potential surgical risks. Thus, developing effective image-processing algorithms for laparoscopic vision recovery and stereo matching is of significant importance. Most related algorithms are effective on natural images but less effective on laparoscopic images. The first purpose of this thesis is to restore low-light laparoscopic vision, for which an effective image enhancement method is proposed by identifying different illumination regions and designing enhancement criteria for the desired image quality. This method can enhance the low-light region while reducing noise amplification during the enhancement process. In addition, this thesis also proposes a simplified Retinex optimization method for non-uniform illumination enhancement. By integrating prior information about the illumination and reflectance into the optimization process, this method can significantly enhance the dark region while preserving naturalness, texture details, and image structures. Moreover, because the total variation term is replaced with two l2-norm terms, the proposed algorithm has a significant computational advantage. Second, a global optimization method for specular highlight removal from a single laparoscopic image is proposed. This method consists of a modified dichromatic reflection model and a novel diffuse chromaticity estimation technique. By exploiting the limited color variation of the laparoscopic image, the estimated diffuse chromaticity can approximate the true diffuse chromaticity, which allows us to effectively remove the specular highlight while preserving texture detail. Third, a robust edge-preserving stereo matching method is proposed, based on sparse feature matching, left and right illumination equalization, and refined disparity optimization processes. The sparse feature matching and illumination equalization techniques provide a good disparity map initialization, so that the refined disparity optimization can quickly obtain an accurate disparity map. This approach is particularly promising on surgical tool edges, smooth soft tissues, and surfaces with strong specular highlights.
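
    The abstract does not state the energy, but the computational advantage of two l2-norm terms over a total-variation term can be illustrated with a quadratic Retinex-style illumination estimate; everything below (energy form, weights, step size) is an illustrative assumption, not the thesis' method. Quadratic terms admit fast closed-form or FFT solvers, whereas a TV term requires iterative non-smooth optimization.

    import numpy as np
    from scipy.ndimage import laplace

    def l2_retinex_illumination(img, lam=2.0, n_iter=300, step=0.05):
        """Estimate a smooth illumination layer L by minimising the
        quadratic energy ||L - I||^2 + lam * ||grad L||^2 via gradient
        descent (a stand-in for the thesis' two-l2-term optimisation)."""
        I = img.astype(np.float64)
        L = I.copy()
        for _ in range(n_iter):
            grad = (L - I) - lam * laplace(L)   # dE/dL (up to a factor 2)
            L -= step * grad
        L = np.maximum(L, 1e-6)
        R = I / L                                # reflectance = I / L
        return L, R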

    Large-Scale Light Field Capture and Reconstruction

    This thesis discusses approaches and techniques to convert Sparsely-Sampled Light Fields (SSLFs) into Densely-Sampled Light Fields (DSLFs), which can be used for visualization on 3DTV and Virtual Reality (VR) devices. As an exemplar, a movable 1D large-scale light field acquisition system for capturing SSLFs in real-world environments is evaluated. This system consists of 24 sparsely placed RGB cameras and two Kinect V2 sensors. The real-world SSLF data captured with this setup can be leveraged to reconstruct real-world DSLFs. To this end, three challenging problems need to be solved for this system: (i) how to estimate the rigid transformation from the coordinate system of a Kinect V2 to the coordinate system of an RGB camera; (ii) how to register the two Kinect V2 sensors with a large displacement; (iii) how to reconstruct a DSLF from an SSLF with moderate and large disparity ranges. To overcome these three challenges, we propose: (i) a novel self-calibration method, which takes advantage of the geometric constraints from the scene and the cameras, for estimating the rigid transformations from the camera coordinate frame of one Kinect V2 to the camera coordinate frames of the 12 nearest RGB cameras; (ii) a novel coarse-to-fine approach for recovering the rigid transformation from the coordinate system of one Kinect to the coordinate system of the other by means of local color and geometry information; (iii) several novel algorithms, which can be categorized into two groups, for reconstructing a DSLF from an input SSLF: novel view synthesis methods, inspired by state-of-the-art video frame interpolation algorithms, and Epipolar-Plane Image (EPI) inpainting methods, inspired by Shearlet Transform (ST)-based DSLF reconstruction approaches.
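
    As a small illustration of the EPI representation behind the inpainting methods (my own minimal example, not the thesis' code): stacking one image row across a 1D camera array yields an epipolar-plane image in which each scene point traces a line whose slope is proportional to its disparity, so densifying an SSLF amounts to filling in the EPI's missing rows.

    import numpy as np

    def extract_epi(views, row):
        # views: (n_cameras, height, width) stack of rectified images from
        # a 1D camera array; fixing `row` and stacking it across cameras
        # gives the EPI, an (n_cameras, width) slice.
        return np.stack([v[row] for v in views], axis=0)

    # A point at disparity d between neighbouring cameras shifts by d
    # pixels per EPI row; DSLF reconstruction inserts the missing rows
    # along these lines (by view synthesis or EPI inpainting).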