    Robotic instrument segmentation with image-to-image translation

    The semantic segmentation of robotic surgery video and the delineation of robotic instruments are important for enabling automation. Despite major recent progress, the majority of the latest deep learning models for instrument detection and segmentation rely on large datasets with ground-truth labels. While these models demonstrate the capability, their reliance on large labelled datasets is a problem for practical applications, because systems would need to be re-trained for domain variations such as procedure type or instrument sets. In this paper, we propose to alleviate this problem by training deep learning models on datasets synthesised using image-to-image translation techniques, and we investigate different methods to perform this process optimally. Experimentally, we demonstrate that the same deep network architecture for robotic instrument segmentation can be trained on both real data and our proposed synthetic data without affecting the quality of the resulting models' performance. We show this for several recent approaches and provide experimental support on publicly available datasets, which highlights the potential value of this approach.
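
    A minimal sketch of the training idea described above, assuming a PyTorch-style setup: an image-to-image translation generator maps simulator renders (whose segmentation labels are known exactly) into the real image domain, and the same segmentation architecture is then trained on them as if they were real labelled data. All class and function names here (TinySegNet, translate_to_real_domain, etc.) are illustrative, not the paper's code.

```python
# Illustrative sketch only (hypothetical names, not the paper's code): train a
# segmentation network on frames synthesised by an image-to-image translation
# generator, using the pixel-perfect labels that come for free from a simulator.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Stand-in for any segmentation architecture (e.g. a U-Net)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_classes, 1),
        )

    def forward(self, x):
        return self.net(x)

def translate_to_real_domain(generator, sim_images):
    """Map simulator renders to the real image domain; their segmentation
    labels are unchanged, so no manual annotation is needed."""
    with torch.no_grad():
        return generator(sim_images)

def train_step(model, optimiser, images, masks):
    optimiser.zero_grad()
    loss = nn.functional.cross_entropy(model(images), masks)
    loss.backward()
    optimiser.step()
    return loss.item()

if __name__ == "__main__":
    model = TinySegNet()
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
    generator = nn.Identity()  # placeholder for a trained image-to-image translator
    sim_images = torch.rand(4, 3, 64, 64)          # simulator renders
    sim_masks = torch.randint(0, 2, (4, 64, 64))   # exact labels from the simulator
    synthetic_batch = translate_to_real_domain(generator, sim_images)
    print("loss:", train_step(model, optimiser, synthetic_batch, sim_masks))
```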

    Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery

    Semantic tool segmentation in surgical videos is important for surgical scene understanding and computer-assisted interventions, as well as for the development of robotic automation. The problem is challenging because varying illumination conditions, bleeding, smoke and occlusions can reduce algorithm robustness. At present, labelled data for training deep learning models is still lacking for semantic surgical instrument segmentation, and in this paper we show that it may be possible to use robot kinematic data coupled with laparoscopic images to alleviate the labelling problem. We propose a new deep-learning-based model for parallel processing of laparoscopic and simulation images for robust segmentation of surgical tools. Due to the lack of laparoscopic frames annotated with both segmentation ground truth and kinematic information, a new custom dataset was generated using the da Vinci Research Kit (dVRK) and is made available.
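
    A minimal sketch, assuming a PyTorch-style setup, of the parallel-processing idea: one encoder for the laparoscopic frame, one for the simulation frame rendered from dVRK kinematics, and a fusion head that predicts a tool mask. The architecture and layer sizes are illustrative only, not the proposed model.

```python
# Illustrative sketch only (hypothetical architecture and layer sizes): a
# two-stream network encodes the laparoscopic frame and the corresponding
# simulation frame rendered from dVRK kinematics in parallel, then fuses both
# streams to predict a per-pixel tool mask.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())

class TwoStreamSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.real_enc = conv_block(3, 16)   # laparoscopic image stream
        self.sim_enc = conv_block(1, 16)    # simulation / kinematics stream
        self.fuse = nn.Sequential(conv_block(32, 16), nn.Conv2d(16, 1, 1))

    def forward(self, real_img, sim_img):
        feats = torch.cat([self.real_enc(real_img), self.sim_enc(sim_img)], dim=1)
        return torch.sigmoid(self.fuse(feats))   # per-pixel tool probability

if __name__ == "__main__":
    net = TwoStreamSegNet()
    real = torch.rand(2, 3, 96, 96)   # laparoscopic frames
    sim = torch.rand(2, 1, 96, 96)    # binary render driven by dVRK kinematics
    print(net(real, sim).shape)       # torch.Size([2, 1, 96, 96])
```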

    Instabilities at vicinal crystal surfaces - competition between the electromigration of the adatoms and the kinetic memory effect

    We studied the step dynamics during sublimation and growth in the presence of an electromigration force acting on the adatoms. In the limit of fast surface diffusion and slow kinetics of atom attachment-detachment at the steps, we formulate a model free of the quasi-static approximation in the calculation of the adatom concentration on the terraces. Numerical integration of the equations for the time evolution of the adatom concentrations and the equations of step motion reveals two different step-bunching instabilities: 1) step density waves (small bunches that do not manifest any coarsening) induced by the kinetic memory effect, and 2) step bunching with coarsening when the dynamics is dominated by electromigration. The model developed in this paper also provides very instructive illustrations of the Popkov-Krug dynamical phase transition during sublimation and growth of a vicinal crystal surface. Comment: 15 pages, 6 figures
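
    To illustrate what numerically integrating equations of step motion looks like, here is a schematic toy only, not the authors' model: it tracks step positions alone (with no adatom concentration field) and uses a destabilising bias standing in for electromigration together with a stabilising step-step repulsion. All parameter values are arbitrary.

```python
# Schematic toy model only, not the authors' equations: step positions in a
# periodic train evolve under a mean velocity, a destabilising bias standing
# in for electromigration, and a stabilising step-step repulsion.
import numpy as np
from scipy.integrate import solve_ivp

N = 20               # number of steps in a periodic train
L_TOTAL = float(N)   # train period (mean terrace width = 1)
V = 1.0              # mean step velocity (sublimation/growth)
B = -0.05            # negative bias destabilises the equidistant train
A = 0.01             # strength of step-step repulsion (~ 1/width^3)

def step_velocities(t, x):
    w_fwd = np.roll(x, -1) - x        # terrace width ahead of each step
    w_bwd = x - np.roll(x, 1)         # terrace width behind each step
    w_fwd[-1] += L_TOTAL              # close the periodic train
    w_bwd[0] += L_TOTAL
    w_fwd = np.maximum(w_fwd, 1e-3)   # guard against numerical overshoot
    w_bwd = np.maximum(w_bwd, 1e-3)
    repulsion = A * (1.0 / w_bwd**3 - 1.0 / w_fwd**3)
    return V + B * (w_fwd - w_bwd) + repulsion

rng = np.random.default_rng(0)
x0 = np.arange(N) + 0.05 * rng.standard_normal(N)   # slightly perturbed train
sol = solve_ivp(step_velocities, (0.0, 200.0), x0, rtol=1e-6)
final = sol.y[:, -1]
widths = np.diff(np.append(final, final[0] + L_TOTAL))
print("final terrace widths:", np.round(widths, 2))  # bunching -> uneven widths
```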

    Widening siamese architectures for stereo matching

    Computational stereo is one of the classical problems in computer vision. Numerous algorithms and solutions have been reported in recent years, focusing on developing methods for computing similarity, aggregating it to obtain spatial support, and finally optimizing an energy function to find the final disparity. In this paper, we focus on the feature extraction component of the stereo matching architecture, and we show that standard CNN operations can be used to improve the quality of the features used to find point correspondences. Furthermore, we use a simple spatial aggregation that greatly simplifies the correlation learning problem, allowing us to better evaluate the quality of the extracted features. Our results on benchmark data are compelling and show promising potential even without refining the solution.
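
    A minimal sketch, assuming a PyTorch-style setup, of the feature-extraction plus matching-cost split described above: a shared-weight (siamese) CNN embeds both images, and a dot-product correlation over candidate disparities gives the matching cost. The architecture is illustrative, not the paper's network.

```python
# Illustrative sketch only: siamese feature extraction with shared weights,
# followed by a correlation over a range of candidate disparities.
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

def correlation_volume(f_left, f_right, max_disp=16):
    """Dot-product similarity between left features and right features
    shifted by each candidate disparity."""
    vols = []
    for d in range(max_disp):
        shifted = torch.roll(f_right, shifts=d, dims=3)  # crude shift; borders wrap and are invalid
        vols.append((f_left * shifted).mean(dim=1, keepdim=True))
    return torch.cat(vols, dim=1)   # (N, max_disp, H, W)

if __name__ == "__main__":
    feat = FeatureNet()                              # shared ("siamese") weights
    left, right = torch.rand(1, 3, 64, 128), torch.rand(1, 3, 64, 128)
    cost = correlation_volume(feat(left), feat(right))
    disparity = cost.argmax(dim=1)                   # winner-take-all disparity map
    print(disparity.shape)                           # torch.Size([1, 64, 128])
```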

    Evaporation and growth of crystals - propagation of step density compression waves at vicinal surfaces

    We studied the step dynamics during crystal sublimation and growth in the limit of fast surface diffusion and slow kinetics of atom attachment-detachment at the steps. For this limit, we formulate a model free of the quasi-static approximation in the calculation of the adatom concentration on the terraces at the crystal surface. Such a model provides a relatively simple way to study the linear stability of a step train in the presence of step-step repulsion and the absence of destabilizing factors (such as the Schwoebel effect, surface electromigration, etc.). The central result is that a critical velocity of the steps in the train exists which separates the stability and instability regimes. When the step velocity exceeds this critical value, the step trajectories exhibit clear spatial and temporal periodicity (step density compression waves propagate on the vicinal surface). This ordered motion of the steps is preceded by a relatively short transition period of disordered step dynamics. Comment: 18 pages, 6 figures
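
    For readers unfamiliar with the terminology, a generic linear-stability setup for a step train looks roughly as follows; this is the schematic textbook form only, and the paper's specific equations and dispersion relation are not reproduced here.

```latex
% Schematic linear-stability setup for an equidistant step train (generic
% textbook form; not the paper's specific model or dispersion relation).
% Unperturbed train: steps at x_n(t) = n*l + V*t. Perturb each step position:
\[
  x_n(t) = n\ell + Vt + \varepsilon\, e^{\mathrm{i} q n + \omega(q)\, t},
  \qquad |\varepsilon| \ll \ell .
\]
% Linearising the equations of step motion in \varepsilon gives a growth rate
% \omega(q) for every perturbation wavenumber q. The train is stable if
% Re(\omega(q)) < 0 for all q; a critical step velocity V_c marks the point
% where max_q Re(\omega(q)) first crosses zero.
```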

    RCM-SLAM: Visual localisation and mapping under remote centre of motion constraints

    In robotic surgery, the motion of the instruments and the laparoscopic camera is constrained by their insertion ports, i.e. a remote centre of motion (RCM). We propose a Simultaneous Localisation and Mapping (SLAM) approach that estimates laparoscopic camera motion under RCM constraints. To achieve this, we derive a minimal solver for the absolute camera pose given two 2D-3D point correspondences (RCM-PnP), as well as a bundle adjustment optimiser that refines camera poses within an RCM-constrained parameterisation. These two methods are used together with previous work on relative pose estimation under RCM [1] to assemble a SLAM pipeline suitable for robotic surgery. Our simulations show that RCM-PnP outperforms conventional PnP for a wide noise range in the RCM position. Results with video footage from a robotic prostatectomy show that RCM constraints significantly improve camera pose estimation.
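
    A hypothetical illustration of the constraint being exploited (not the paper's RCM-PnP solver): with a known RCM, a laparoscope pose can be described by only four parameters (two pivot angles, a roll about the optical axis, and the insertion depth) instead of a full 6-DoF pose, and since each 2D-3D correspondence contributes two constraints, two correspondences can suffice for an absolute pose solver.

```python
# Hypothetical RCM-constrained pose parameterisation (illustrative only):
# the camera pivots about a fixed remote centre of motion.
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rcm_camera_pose(rcm_world, pitch, yaw, roll, depth):
    """Camera-to-world pose that pivots about rcm_world: the optical axis
    (camera z) always passes through the RCM, which sits at camera-frame
    coordinates (0, 0, -depth)."""
    R = rot_y(yaw) @ rot_x(pitch) @ rot_z(roll)            # camera orientation
    centre = rcm_world + R @ np.array([0.0, 0.0, depth])   # camera centre in world
    return R, centre

if __name__ == "__main__":
    rcm = np.array([0.0, 0.0, 0.0])   # insertion port (trocar) position
    R, C = rcm_camera_pose(rcm, pitch=0.1, yaw=-0.2, roll=0.3, depth=0.05)
    # Verify the RCM lies on the camera's optical axis, `depth` behind the camera:
    rcm_in_camera = R.T @ (rcm - C)
    print(np.allclose(rcm_in_camera, [0.0, 0.0, -0.05]))   # True
```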

    Catheter segmentation in X-ray fluoroscopy using synthetic data and transfer learning with light U-nets

    Background and objectives: Automated segmentation and tracking of surgical instruments and catheters under X-ray fluoroscopy hold the potential for enhanced image guidance in catheter-based endovascular procedures. This article presents a novel method for real-time segmentation of catheters and guidewires in 2D X-ray images. We employ Convolutional Neural Networks (CNNs) and propose a transfer learning approach, using synthetic fluoroscopic images, to develop a lightweight version of the U-Net architecture. Our strategy, requiring only a small amount of manually annotated data, streamlines the training process and results in a U-Net model that achieves performance comparable to state-of-the-art segmentation models with a reduced number of trainable parameters. Methods: The proposed transfer learning approach exploits high-fidelity synthetic images generated from real fluoroscopic backgrounds. We implement a two-stage process, initial end-to-end training and fine-tuning, to develop two versions of our model, using synthetic and phantom fluoroscopic images independently. A small number of manually annotated in-vivo images is employed to fine-tune the deepest 7 layers of the U-Net architecture, producing a network specialized for pixel-wise catheter/guidewire segmentation. The network takes as input a single grayscale image and outputs the segmentation result as a binary mask against the background. Results: Evaluation is carried out with images from in-vivo fluoroscopic video sequences from six endovascular procedures with different surgical setups. We validate the effectiveness of developing the U-Net models using synthetic data in tests where fine-tuning and testing in-vivo take place both by dividing data from all procedures into independent fine-tuning/testing subsets and by using different in-vivo sequences. Accurate catheter/guidewire segmentation (average Dice coefficients of ~0.55, ~0.26 and ~0.17) is obtained with both U-Net models. Compared to state-of-the-art CNN models, the proposed U-Net achieves comparable segmentation accuracy (within ±5% average Dice coefficient) while yielding an 84% reduction of the testing time. This adds flexibility for real-time operation and makes our network adaptable to increased input resolution. Conclusions: This work presents a new approach to developing CNN models for pixel-wise segmentation of surgical catheters in X-ray fluoroscopy, exploiting synthetic images and transfer learning. Our methodology reduces the need for manually annotating large volumes of data for training. This represents an important advantage, given that manual pixel-wise annotation is a key bottleneck in developing CNN segmentation models. Combined with a simplified U-Net model, our work yields significant advantages compared to current state-of-the-art solutions.
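
    A minimal sketch, assuming a PyTorch-style setup, of the transfer-learning step described above: pre-train a lightweight U-Net on synthetic fluoroscopy, then freeze the early encoder and fine-tune only the deepest modules on a handful of annotated in-vivo frames. The model definition and module names below are illustrative stand-ins, not the paper's network.

```python
# Illustrative sketch only (hypothetical model and module names): fine-tune
# only the deepest modules of a light U-Net on a few annotated in-vivo frames,
# keeping the early encoder frozen.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())

class LightUNet(nn.Module):
    """Heavily simplified stand-in for a lightweight U-Net (single scale)."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(1, 8), block(8, 16)
        self.bottleneck = block(16, 16)
        self.dec1, self.dec2 = block(32, 8), block(8, 8)
        self.head = nn.Conv2d(8, 1, 1)

    def forward(self, x):
        e2 = self.enc2(self.enc1(x))
        b = self.bottleneck(e2)
        d = self.dec1(torch.cat([b, e2], dim=1))   # skip connection
        return self.head(self.dec2(d))             # catheter/guidewire logits

def freeze_for_finetuning(model, trainable=("bottleneck", "dec1", "dec2", "head")):
    """Freeze everything except the deepest modules before in-vivo fine-tuning."""
    for name, param in model.named_parameters():
        param.requires_grad = name.split(".")[0] in trainable

if __name__ == "__main__":
    net = LightUNet()            # in practice: load weights pre-trained on synthetic data
    freeze_for_finetuning(net)
    params = [p for p in net.parameters() if p.requires_grad]
    optimiser = torch.optim.Adam(params, lr=1e-5)
    x = torch.rand(2, 1, 64, 64)                      # in-vivo fluoroscopy frames
    y = torch.randint(0, 2, (2, 1, 64, 64)).float()   # manual binary masks
    loss = nn.functional.binary_cross_entropy_with_logits(net(x), y)
    loss.backward()
    optimiser.step()
    print("trainable parameter tensors:", len(params))
```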

    Photometric and spectroscopic variability of the FUor star V582 Aurigae

    We carried out BVRI CCD photometric observations in the field of V582 Aur from 2009 August to 2013 February. We acquired high-, medium-, and low-resolution spectroscopy of V582 Aur during this period. To study the pre-outburst variability of the target and construct its historical light curve, we searched for archival observations in photographic plate collections. Both CCD and photographic observations were analyzed using a sequence of 14 stars in the field of V582 Aur calibrated in BVRI. The pre-outburst photographic observations of V582 Aur show low-amplitude light variations typical of T Tauri stars. Archival photographic observations indicate that the increase in brightness began in late 1984 or early 1985 and that the star reached its maximum brightness in 1986 January. The spectral type of V582 Aur can be defined as G0I, with strong P Cyg profiles of the H alpha and Na I D lines, which are typical of FU Orionis objects. Our BVRI photometric observations show large-amplitude variations of V ~ 2.8 mag during the 3.5-year period of observations. Most of the time, however, the star remains in a state close to maximum brightness. The deepest drop in brightness was observed in the spring of 2012, when the brightness of the star fell to a level close to the pre-outburst value. The multicolor photometric data show a color reversal during the minimum in brightness, which is typical of UX Ori variables. The corresponding spectral observations show strong variability in the profiles and intensities of the spectral lines (especially H alpha), which indicates significant changes in the accretion rate. On the basis of photometric monitoring performed over the past three years, the spectral properties at maximum light, and the shape of the long-term light curve, we confirm the affiliation of V582 Aur with the group of FU Orionis objects. Comment: 9 pages, 8 figures, accepted for publication in A&A

    Computer Vision in the Surgical Operating Room

    Background: Multiple types of surgical cameras are used in modern surgical practice and provide a rich visual signal that is used by surgeons to visualize the clinical site and make clinical decisions. This signal can also be used by artificial intelligence (AI) methods to provide support in identifying instruments, structures, or activities, both in real time during procedures and postoperatively for analytics and understanding of surgical processes. Summary: In this paper, we provide a succinct perspective on the use of AI, and especially computer vision, to power solutions for the surgical operating room (OR). The synergy between data availability and technical advances in computational power and AI methodology has led to rapid developments in the field and promising advances. Key Messages: With the increasing availability of surgical video sources and the convergence of technologies around video storage, processing, and understanding, we believe clinical solutions and products leveraging vision are going to become an important component of modern surgical capabilities. However, both technical and clinical challenges remain to be overcome to efficiently bring vision-based approaches into the clinic.