53 research outputs found

    Laparoscopic Image Recovery and Stereo Matching

    Get PDF
    Laparoscopic imaging can play a significant role in the minimally invasive surgical procedure. However, laparoscopic images often suffer from insufficient and irregular light sources, specular highlight surfaces, and a lack of depth information. These problems can negatively influence the surgeons during surgery, and lead to erroneous visual tracking and potential surgical risks. Thus, developing effective image-processing algorithms for laparoscopic vision recovery and stereo matching is of significant importance. Most related algorithms are effective on nature images, but less effective on laparoscopic images. The first purpose of this thesis is to restore low-light laparoscopic vision, where an effective image enhancement method is proposed by identifying different illumination regions and designing the enhancement criteria for desired image quality. This method can enhance the low-light region by reducing noise amplification during the enhancement process. In addition, this thesis also proposes a simplified Retinex optimization method for non-uniform illumination enhancement. By integrating the prior information of the illumination and reflectance into the optimization process, this method can significantly enhance the dark region while preserving naturalness, texture details, and image structures. Moreover, due to the replacement of the total variation term with two l2l_2-norm terms, the proposed algorithm has a significant computational advantage. Second, a global optimization method for specular highlight removal from a single laparoscopic image is proposed. This method consists of a modified dichromatic reflection model and a novel diffuse chromaticity estimation technique. Due to utilizing the limited color variation of the laparoscopic image, the estimated diffuse chromaticity can approximate the true diffuse chromaticity, which allows us to effectively remove the specular highlight with texture detail preservation. Third, a robust edge-preserving stereo matching method is proposed, based on sparse feature matching, left and right illumination equalization, and refined disparity optimization processes. The sparse feature matching and illumination equalization techniques can provide a good disparity map initialization so that our refined disparity optimization can quickly obtain an accurate disparity map. This approach is particularly promising on surgical tool edges, smooth soft tissues, and surfaces with strong specular highlight

    Estimating and understanding motion : from diagnostic to robotic surgery

    Get PDF
    Estimating and understanding motion from an image sequence is a central topic in computer vision. The high interest in this topic is because we are living in a world where many events that occur in the environment are dynamic. This makes motion estimation and understanding a natural component and a key factor in a widespread of applications including object recognition , 3D shape reconstruction, autonomous navigation and medica! diagnosis. Particularly, we focus on the medical domain in which understanding the human body for clinical purposes requires retrieving the organs' complex motion patterns, which is in general a hard problem when using only image data. In this thesis, we cope with this problem by posing the question - How to achieve a realistic motion estimation to offer a better clinical understanding? We focus this thesis on answering this question by using a variational formulation as a basis to understand one of the most complex motions in the human's body, the heart motion, through three different applications: (i) cardiac motion estimation for diagnostic, (ii) force estimation and (iii) motion prediction, both for robotic surgery. Firstly, we focus on a central topic in cardiac imaging that is the estimation of the cardiac motion. The main aim is to offer objective and understandable measures to physicians for helping them in the diagnostic of cardiovascular diseases. We employ ultrafast ultrasound data and tools for imaging motion drawn from diverse areas such as low-rank analysis and variational deformation to perform a realistic cardiac motion estimation. The significance is that by taking low-rank data with carefully chosen penalization, synergies in this complex variational problem can be created. We demonstrate how our proposed solution deals with complex deformations through careful numerical experiments using realistic and simulated data. We then move from diagnostic to robotic surgeries where surgeons perform delicate procedures remotely through robotic manipulators without directly interacting with the patients. As a result, they lack force feedback, which is an important primary sense for increasing surgeon-patient transparency and avoiding injuries and high mental workload. To solve this problem, we follow the conservation principies of continuum mechanics in which it is clear that the change in shape of an elastic object is directly proportional to the force applied. Thus, we create a variational framework to acquire the deformation that the tissues undergo due to an applied force. Then, this information is used in a learning system to find the nonlinear relationship between the given data and the applied force. We carried out experiments with in-vivo and ex-vivo data and combined statistical, graphical and perceptual analyses to demonstrate the strength of our solution. Finally, we explore robotic cardiac surgery, which allows carrying out complex procedures including Off-Pump Coronary Artery Bypass Grafting (OPCABG). This procedure avoids the associated complications of using Cardiopulmonary Bypass (CPB) since the heart is not arrested while performing the surgery on a beating heart. Thus, surgeons have to deal with a dynamic target that compromisetheir dexterity and the surgery's precision. To compensate the heart motion, we propase a solution composed of three elements: an energy function to estimate the 3D heart motion, a specular highlight detection strategy and a prediction approach for increasing the robustness of the solution. We conduct evaluation of our solution using phantom and realistic datasets. We conclude the thesis by reporting our findings on these three applications and highlight the dependency between motion estimation and motion understanding at any dynamic event, particularly in clinical scenarios.L’estimació i comprensió del moviment dins d’una seqüència d’imatges és un tema central en la visió per ordinador, el que genera un gran interès perquè vivim en un entorn ple d’esdeveniments dinàmics. Per aquest motiu és considerat com un component natural i factor clau dins d’un ampli ventall d’aplicacions, el qual inclou el reconeixement d’objectes, la reconstrucció de formes tridimensionals, la navegació autònoma i el diagnòstic de malalties. En particular, ens situem en l’àmbit mèdic en el qual la comprensió del cos humà, amb finalitats clíniques, requereix l’obtenció de patrons complexos de moviment dels òrgans. Aquesta és, en general, una tasca difícil quan s’utilitzen només dades de tipus visual. En aquesta tesi afrontem el problema plantejant-nos la pregunta - Com es pot aconseguir una estimació realista del moviment amb l’objectiu d’oferir una millor comprensió clínica? La tesi se centra en la resposta mitjançant l’ús d’una formulació variacional com a base per entendre un dels moviments més complexos del cos humà, el del cor, a través de tres aplicacions: (i) estimació del moviment cardíac per al diagnòstic, (ii) estimació de forces i (iii) predicció del moviment, orientant-se les dues últimes en cirurgia robòtica. En primer lloc, ens centrem en un tema principal en la imatge cardíaca, que és l’estimació del moviment cardíac. L’objectiu principal és oferir als metges mesures objectives i comprensibles per ajudar-los en el diagnòstic de les malalties cardiovasculars. Fem servir dades d’ultrasons ultraràpids i eines per al moviment d’imatges procedents de diverses àrees, com ara l’anàlisi de baix rang i la deformació variacional, per fer una estimació realista del moviment cardíac. La importància rau en que, en prendre les dades de baix rang amb una penalització acurada, es poden crear sinergies en aquest problema variacional complex. Mitjançant acurats experiments numèrics, amb dades realístiques i simulades, hem demostrat com les nostres propostes solucionen deformacions complexes. Després passem del diagnòstic a la cirurgia robòtica, on els cirurgians realitzen procediments delicats remotament, a través de manipuladors robòtics, sense interactuar directament amb els pacients. Com a conseqüència, no tenen la percepció de la força com a resposta, que és un sentit primari important per augmentar la transparència entre el cirurgià i el pacient, per evitar lesions i per reduir la càrrega de treball mental. Resolem aquest problema seguint els principis de conservació de la mecànica del medi continu, en els quals està clar que el canvi en la forma d’un objecte elàstic és directament proporcional a la força aplicada. Per això hem creat un marc variacional que adquireix la deformació que pateixen els teixits per l’aplicació d’una força. Aquesta informació s’utilitza en un sistema d’aprenentatge, per trobar la relació no lineal entre les dades donades i la força aplicada. Hem dut a terme experiments amb dades in-vivo i ex-vivo i hem combinat l’anàlisi estadístic, gràfic i de percepció que demostren la robustesa de la nostra solució. Finalment, explorem la cirurgia cardíaca robòtica, la qual cosa permet realitzar procediments complexos, incloent la cirurgia coronària sense bomba (off-pump coronary artery bypass grafting o OPCAB). Aquest procediment evita les complicacions associades a l’ús de circulació extracorpòria (Cardiopulmonary Bypass o CPB), ja que el cor no s’atura mentre es realitza la cirurgia. Això comporta que els cirurgians han de tractar amb un objectiu dinàmic que compromet la seva destresa i la precisió de la cirurgia. Per compensar el moviment del cor, proposem una solució composta de tres elements: un funcional d’energia per estimar el moviment tridimensional del cor, una estratègia de detecció de les reflexions especulars i una aproximació basada en mètodes de predicció, per tal d’augmentar la robustesa de la solució. L’avaluació de la nostra solució s’ha dut a terme mitjançant conjunts de dades sintètiques i realistes. La tesi conclou informant dels nostres resultats en aquestes tres aplicacions i posant de relleu la dependència entre l’estimació i la comprensió del moviment en qualsevol esdeveniment dinàmic, especialment en escenaris clínics.Postprint (published version

    Appearance Modelling and Reconstruction for Navigation in Minimally Invasive Surgery

    Get PDF
    Minimally invasive surgery is playing an increasingly important role for patient care. Whilst its direct patient benefit in terms of reduced trauma, improved recovery and shortened hospitalisation has been well established, there is a sustained need for improved training of the existing procedures and the development of new smart instruments to tackle the issue of visualisation, ergonomic control, haptic and tactile feedback. For endoscopic intervention, the small field of view in the presence of a complex anatomy can easily introduce disorientation to the operator as the tortuous access pathway is not always easy to predict and control with standard endoscopes. Effective training through simulation devices, based on either virtual reality or mixed-reality simulators, can help to improve the spatial awareness, consistency and safety of these procedures. This thesis examines the use of endoscopic videos for both simulation and navigation purposes. More specifically, it addresses the challenging problem of how to build high-fidelity subject-specific simulation environments for improved training and skills assessment. Issues related to mesh parameterisation and texture blending are investigated. With the maturity of computer vision in terms of both 3D shape reconstruction and localisation and mapping, vision-based techniques have enjoyed significant interest in recent years for surgical navigation. The thesis also tackles the problem of how to use vision-based techniques for providing a detailed 3D map and dynamically expanded field of view to improve spatial awareness and avoid operator disorientation. The key advantage of this approach is that it does not require additional hardware, and thus introduces minimal interference to the existing surgical workflow. The derived 3D map can be effectively integrated with pre-operative data, allowing both global and local 3D navigation by taking into account tissue structural and appearance changes. Both simulation and laboratory-based experiments are conducted throughout this research to assess the practical value of the method proposed

    Epälambertilaiset pinnat ja niiden haasteet konenäössä

    Get PDF
    This thesis regards non-Lambertian surfaces and their challenges, solutions and study in computer vision. The physical theory for understanding the phenomenon is built first, using the Lambertian reflectance model, which defines Lambertian surfaces as ideally diffuse surfaces, whose luminance is isotropic and the luminous intensity obeys Lambert's cosine law. From these two assumptions, non-Lambertian surfaces violate at least the cosine law and are consequently specularly reflecting surfaces, whose perceived brightness is dependent from the viewpoint. Thus non-Lambertian surfaces violate also brightness and colour constancies, which assume that the brightness and colour of same real-world points stays constant across images. These assumptions are used, for example, in tracking and feature matching and thus non-Lambertian surfaces pose complications for object reconstruction and navigation among other tasks in the field of computer vision. After formulating the theoretical foundation of necessary physics and a more general reflectance model called the bi-directional reflectance distribution function, a comprehensive literature review into significant studies regarding non-Lambertian surfaces is conducted. The primary topics of the survey include photometric stereo and navigation systems, while considering other potential fields, such as fusion methods and illumination invariance. The goal of the survey is to formulate a detailed and in-depth answer to what methods can be used to solve the challenges posed by non-Lambertian surfaces, what are these methods' strengths and weaknesses, what are the used datasets and what remains to be answered by further research. After the survey, a dataset is collected and presented, and an outline of another dataset to be published in an upcoming paper is presented. Then a general discussion about the survey and the study is undertaken and conclusions along with proposed future steps are introduced

    Visual Tracking in Robotic Minimally Invasive Surgery

    Get PDF
    Intra-operative imaging and robotics are some of the technologies driving forward better and more effective minimally invasive surgical procedures. To advance surgical practice and capabilities further, one of the key requirements for computationally enhanced interventions is to know how instruments and tissues move during the operation. While endoscopic video captures motion, the complex appearance dynamic effects of surgical scenes are challenging for computer vision algorithms to handle with robustness. Tackling both tissue and instrument motion estimation, this thesis proposes a combined non-rigid surface deformation estimation method to track tissue surfaces robustly and in conditions with poor illumination. For instrument tracking, a keypoint based 2D tracker that relies on the Generalized Hough Transform is developed to initialize a 3D tracker in order to robustly track surgical instruments through long sequences that contain complex motions. To handle appearance changes and occlusion a patch-based adaptive weighting with segmentation and scale tracking framework is developed. It takes a tracking-by-detection approach and a segmentation model is used to assigns weights to template patches in order to suppress back- ground information. The performance of the method is thoroughly evaluated showing that without any offline-training, the tracker works well even in complex environments. Finally, the thesis proposes a novel 2D articulated instrument pose estimation framework, which includes detection-regression fully convolutional network and a multiple instrument parsing component. The framework achieves compelling performance and illustrates interesting properties includ- ing transfer between different instrument types and between ex vivo and in vivo data. In summary, the thesis advances the state-of-the art in visual tracking for surgical applications for both tissue and instrument motion estimation. It contributes to developing the technological capability of full surgical scene understanding from endoscopic video

    Image Registration to Map Endoscopic Video to Computed Tomography for Head and Neck Radiotherapy Patients

    Get PDF
    The purpose of this work was to explore the feasibility of registering endoscopic video to radiotherapy treatment plans for patients with head and neck cancer without physical tracking of the endoscope during the examination. Endoscopy-CT registration would provide a clinical tool that could be used to enhance the treatment planning process and would allow for new methods to study the incidence of radiation-related toxicity. Endoscopic video frames were registered to CT by optimizing virtual endoscope placement to maximize the similarity between the frame and the virtual image. Virtual endoscopic images were rendered using a polygonal mesh created by segmenting the airways of the head and neck with a density threshold. The optical properties of the virtual endoscope were matched to a calibrated model of the real endoscope. A novel registration algorithm was developed that takes advantage of physical constraints on the endoscope to effectively search the airways of the head and neck for the desired virtual endoscope coordinates. This algorithm was tested on rigid phantoms with embedded point markers and protruding bolus material. In these tests, the median registration accuracy was 3.0 mm for point measurements and 3.5 mm for surface measurements. The algorithm was also tested on four endoscopic examinations of three patients, in which it achieved a median registration accuracy of 9.9 mm. The uncertainties caused by the non-rigid anatomy of the head and neck and differences in patient positioning between endoscopic examinations and CT scans were examined by taking repeated measurements after placing the virtual endoscope in surface meshes created from different CT scans. Non-rigid anatomy introduced errors on the order of 1-3 mm. Patient positioning had a larger impact, introducing errors on the order of 3.5-4.5 mm. Endoscopy-CT registration in the head and neck is possible, but large registration errors were found in patients. The uncertainty analyses suggest a lower limit of 3-5 mm. Further development is required to achieve an accuracy suitable for clinical use

    Translational Functional Imaging in Surgery Enabled by Deep Learning

    Get PDF
    Many clinical applications currently rely on several imaging modalities such as Positron Emission Tomography (PET), Magnetic Resonance Imaging (MRI), Computed Tomography (CT), etc. All such modalities provide valuable patient data to the clinical staff to aid clinical decision-making and patient care. Despite the undeniable success of such modalities, most of them are limited to preoperative scans and focus on morphology analysis, e.g. tumor segmentation, radiation treatment planning, anomaly detection, etc. Even though the assessment of different functional properties such as perfusion is crucial in many surgical procedures, it remains highly challenging via simple visual inspection. Functional imaging techniques such as Spectral Imaging (SI) link the unique optical properties of different tissue types with metabolism changes, blood flow, chemical composition, etc. As such, SI is capable of providing much richer information that can improve patient treatment and care. In particular, perfusion assessment with functional imaging has become more relevant due to its involvement in the treatment and development of several diseases such as cardiovascular diseases. Current clinical practice relies on Indocyanine Green (ICG) injection to assess perfusion. Unfortunately, this method can only be used once per surgery and has been shown to trigger deadly complications in some patients (e.g. anaphylactic shock). This thesis addressed common roadblocks in the path to translating optical functional imaging modalities to clinical practice. The main challenges that were tackled are related to a) the slow recording and processing speed that SI devices suffer from, b) the errors introduced in functional parameter estimations under changing illumination conditions, c) the lack of medical data, and d) the high tissue inter-patient heterogeneity that is commonly overlooked. This framework follows a natural path to translation that starts with hardware optimization. To overcome the limitation that the lack of labeled clinical data and current slow SI devices impose, a domain- and task-specific band selection component was introduced. The implementation of such component resulted in a reduction of the amount of data needed to monitor perfusion. Moreover, this method leverages large amounts of synthetic data, which paired with unlabeled in vivo data is capable of generating highly accurate simulations of a wide range of domains. This approach was validated in vivo in a head and neck rat model, and showed higher oxygenation contrast between normal and cancerous tissue, in comparison to a baseline using all available bands. The need for translation to open surgical procedures was met by the implementation of an automatic light source estimation component. This method extracts specular reflections from low exposure spectral images, and processes them to obtain an estimate of the light source spectrum that generated such reflections. The benefits of light source estimation were demonstrated in silico, in ex vivo pig liver, and in vivo human lips, where the oxygenation estimation error was reduced when utilizing the correct light source estimated with this method. These experiments also showed that the performance of the approach proposed in this thesis surpass the performance of other baseline approaches. Video-rate functional property estimation was achieved by two main components: a regression and an Out-of-Distribution (OoD) component. At the core of both components is a compact SI camera that is paired with state-of-the-art deep learning models to achieve real time functional estimations. The first of such components features a deep learning model based on a Convolutional Neural Network (CNN) architecture that was trained on highly accurate physics-based simulations of light-tissue interactions. By doing this, the challenge of lack of in vivo labeled data was overcome. This approach was validated in the task of perfusion monitoring in pig brain and in a clinical study involving human skin. It was shown that this approach is capable of monitoring subtle perfusion changes in human skin in an arm clamping experiment. Even more, this approach was capable of monitoring Spreading Depolarizations (SDs) (deoxygenation waves) in the surface of a pig brain. Even though this method is well suited for perfusion monitoring in domains that are well represented with the physics-based simulations on which it was trained, its performance cannot be guaranteed for outlier domains. To handle outlier domains, the task of ischemia monitoring was rephrased as an OoD detection task. This new functional estimation component comprises an ensemble of Invertible Neural Networks (INNs) that only requires perfused tissue data from individual patients to detect ischemic tissue as outliers. The first ever clinical study involving a video-rate capable SI camera in laparoscopic partial nephrectomy was designed to validate this approach. Such study revealed particularly high inter-patient tissue heterogeneity under the presence of pathologies (cancer). Moreover, it demonstrated that this personalized approach is now capable of monitoring ischemia at video-rate with SI during laparoscopic surgery. In conclusion, this thesis addressed challenges related to slow image recording and processing during surgery. It also proposed a method for light source estimation to facilitate translation to open surgical procedures. Moreover, the methodology proposed in this thesis was validated in a wide range of domains: in silico, rat head and neck, pig liver and brain, and human skin and kidney. In particular, the first clinical trial with spectral imaging in minimally invasive surgery demonstrated that video-rate ischemia monitoring is now possible with deep learning

    Surface analysis and visualization from multi-light image collections

    Get PDF
    Multi-Light Image Collections (MLICs) are stacks of photos of a scene acquired with a fixed viewpoint and a varying surface illumination that provides large amounts of visual and geometric information. Over the last decades, a wide variety of methods have been devised to extract information from MLICs and have shown its use in different application domains to support daily activities. In this thesis, we present methods that leverage a MLICs for surface analysis and visualization. First, we provide background information: acquisition setup, light calibration and application areas where MLICs have been successfully used for the research of daily analysis work. Following, we discuss the use of MLIC for surface visualization and analysis and available tools used to support the analysis. Here, we discuss methods that strive to support the direct exploration of the captured MLIC, methods that generate relightable models from MLIC, non-photorealistic visualization methods that rely on MLIC, methods that estimate normal map from MLIC and we point out visualization tools used to do MLIC analysis. In chapter 3 we propose novel benchmark datasets (RealRTI, SynthRTI and SynthPS) that can be used to evaluate algorithms that rely on MLIC and discusses available benchmark for validation of photometric algorithms that can be also used to validate other MLIC-based algorithms. In chapter 4, we evaluate the performance of different photometric stereo algorithms using SynthPS for cultural heritage applications. RealRTI and SynthRTI have been used to evaluate the performance of (Neural)RTI method. Then, in chapter 5, we present a neural network-based RTI method, aka NeuralRTI, a framework for pixel-based encoding and relighting of RTI data. In this method using a simple autoencoder architecture, we show that it is possible to obtain a highly compressed representation that better preserves the original information and provides increased quality of virtual images relighted from novel directions, particularly in the case of challenging glossy materials. Finally, in chapter 6, we present a method for the detection of crack on the surface of paintings from multi-light image acquisitions and that can be used as well on single images and conclude our presentation
    • …
    corecore