Robotic instrument segmentation with image-to-image translation
The semantic segmentation of robotic surgery video and the delineation of robotic instruments are important for enabling automation. Despite major recent progress, most of the latest deep learning models for instrument detection and segmentation rely on large datasets with ground-truth labels. While such models demonstrate strong capability, the reliance on large labelled datasets is a problem for practical applications, because systems would need to be re-trained for domain variations such as procedure type or instrument set. In this paper, we propose to alleviate this problem by training deep learning models on datasets synthesised using image-to-image translation techniques, and we investigate different methods to perform this process optimally. Experimentally, we demonstrate that the same deep network architecture for robotic instrument segmentation can be trained on either real data or our proposed synthetic data without affecting the quality of the resulting models. We show this for several recent approaches and provide experimental support on publicly available datasets, which highlights the potential value of this approach.
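The paper describes the training setup only at a high level; as a loose illustration of training a segmentation network on synthesised image/mask pairs, the Python sketch below uses a stock torchvision model and a placeholder dataset that stands in for frames produced by an image-to-image translation model. The dataset class, tensor sizes and the FCN-ResNet50 backbone are assumptions made here for illustration, not the architecture used in the paper (a recent torchvision API is assumed).

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torchvision.models.segmentation import fcn_resnet50

# Placeholder for synthetic frames produced by an image-to-image translation
# model; random tensors keep the sketch self-contained and runnable.
class SyntheticPairs(Dataset):
    def __init__(self, n=8, size=(128, 160)):
        self.n, self.size = n, size
    def __len__(self):
        return self.n
    def __getitem__(self, idx):
        img = torch.rand(3, *self.size)                    # "translated" synthetic frame
        mask = (torch.rand(1, *self.size) > 0.5).float()   # binary instrument mask
        return img, mask

model = fcn_resnet50(weights=None, weights_backbone=None, num_classes=1)
loader = DataLoader(SyntheticPairs(), batch_size=4, shuffle=True)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

model.train()
for epoch in range(2):                                     # toy number of epochs
    for img, mask in loader:
        logits = model(img)["out"]                         # (B, 1, H, W) logits
        loss = loss_fn(logits, mask)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")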
Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery
Semantic tool segmentation in surgical videos is important for surgical scene understanding and computer-assisted interventions, as well as for the development of robotic automation. The problem is challenging because varying illumination conditions, bleeding, smoke and occlusions can reduce algorithm robustness. Labelled data for training deep learning models is still lacking for semantic surgical instrument segmentation, and in this paper we show that it may be possible to use robot kinematic data coupled with laparoscopic images to alleviate the labelling problem. We propose a new deep-learning-based model for parallel processing of both laparoscopic and simulation images, enabling robust segmentation of surgical tools. Because no laparoscopic frames annotated with both segmentation ground truth and kinematic information were available, a new custom dataset was generated using the da Vinci Research Kit (dVRK) and is made available.
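No implementation details are given beyond the abstract; as a rough sketch of the "parallel processing of laparoscopic and simulation images" idea, the module below encodes the real frame and a kinematics-rendered simulation image with separate convolutional branches and fuses them before a per-pixel segmentation head. The branch depths, channel counts and fusion-by-concatenation are illustrative assumptions, not the authors' network.

import torch
from torch import nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )

class TwoStreamSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.real_branch = conv_block(3, 32)   # laparoscopic RGB frame
        self.sim_branch = conv_block(1, 32)    # simulation image rendered from kinematics
        self.fuse = conv_block(64, 64)
        self.head = nn.Conv2d(64, 1, 1)        # binary tool logits

    def forward(self, real_img, sim_img):
        feats = torch.cat([self.real_branch(real_img), self.sim_branch(sim_img)], dim=1)
        return self.head(self.fuse(feats))

net = TwoStreamSegNet()
real = torch.rand(2, 3, 128, 160)
sim = torch.rand(2, 1, 128, 160)
print(net(real, sim).shape)                    # torch.Size([2, 1, 128, 160])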
Instabilities at vicinal crystal surfaces - competition between the electromigration of the adatoms and the kinetic memory effect
We studied the step dynamics during sublimation and growth in the presence of an electromigration force acting on the adatoms. In the limit of fast surface diffusion and slow kinetics of atom attachment-detachment at the steps, we formulate a model free of the quasi-static approximation in the calculation of the adatom concentration on the terraces. Numerical integration of the equations for the time evolution of the adatom concentrations and the equations of step motion reveals two different step-bunching instabilities: (1) step density waves (small bunches that do not manifest any coarsening) induced by the kinetic memory effect, and (2) step bunching with coarsening when the dynamics is dominated by electromigration. The model developed in this paper also provides very instructive illustrations of the Popkov-Krug dynamical phase transition during sublimation and growth of a vicinal crystal surface.
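The abstract's model couples the full time-dependent adatom concentration on the terraces to the step equations of motion, which cannot be reproduced from the abstract alone; the toy sketch below only illustrates the generic numerical machinery of integrating coupled step-motion equations with a standard ODE solver, using a purely schematic velocity law with an asymmetry parameter standing in for the electromigration bias.

import numpy as np
from scipy.integrate import solve_ivp

N, L, v0, eps = 40, 40.0, 0.05, 0.3      # number of steps, period, rate constant, toy bias

def rhs(t, x):
    # Each step advances at a rate set by its neighbouring terrace widths,
    # weighted asymmetrically; this law is schematic, not the paper's model.
    w_front = np.roll(x, -1) - x
    w_front[-1] += L                      # wrap the last terrace over the period
    w_back = np.roll(w_front, 1)          # terrace behind step i = terrace in front of i-1
    return v0 * ((1 + eps) * w_front + (1 - eps) * w_back) / 2.0

x0 = np.arange(N) * (L / N) + 0.01 * np.random.default_rng(0).standard_normal(N)
sol = solve_ivp(rhs, (0.0, 200.0), x0, t_eval=np.linspace(0.0, 200.0, 400))
widths = np.diff(np.concatenate([sol.y[:, -1], [sol.y[0, -1] + L]]))
print("final terrace widths: min %.3f  max %.3f" % (widths.min(), widths.max()))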
Widening siamese architectures for stereo matching
Computational stereo is one of the classical problems in computer vision. Numerous algorithms and solutions have been reported in recent years, focusing on developing methods for computing similarity, aggregating it to obtain spatial support, and finally optimizing an energy function to find the final disparity. In this paper, we focus on the feature-extraction component of the stereo matching architecture and show that standard CNN operations can be used to improve the quality of the features used to find point correspondences. Furthermore, we use a simple spatial aggregation that greatly simplifies the correlation-learning problem, allowing us to better evaluate the quality of the extracted features. Our results on benchmark data are compelling and show promising potential even without refining the solution.
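As a rough illustration of the Siamese-features-plus-matching-cost pattern the abstract refers to (not the authors' architecture), the sketch below extracts shared CNN features from the left and right views, builds a dot-product cost volume over candidate disparities, and takes a winner-takes-all disparity estimate. The feature network, its width and the maximum disparity are illustrative choices.

import torch
from torch import nn
import torch.nn.functional as F

class FeatureNet(nn.Module):
    def __init__(self, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=1)    # unit-norm features per pixel

def cost_volume(f_left, f_right, max_disp=32):
    costs = []
    for d in range(max_disp):
        if d == 0:
            shifted = f_right
        else:
            shifted = torch.zeros_like(f_right)
            shifted[:, :, :, d:] = f_right[:, :, :, :-d]   # shift right features by d pixels
        costs.append((f_left * shifted).sum(dim=1))        # dot-product similarity
    return torch.stack(costs, dim=1)                       # (B, max_disp, H, W)

net = FeatureNet()
left, right = torch.rand(1, 1, 64, 96), torch.rand(1, 1, 64, 96)
vol = cost_volume(net(left), net(right))
disparity = vol.argmax(dim=1)                              # winner-takes-all estimate
print(disparity.shape)                                     # torch.Size([1, 64, 96])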
Evaporation and growth of crystals - propagation of step density compression waves at vicinal surfaces
We studied the step dynamics during crystal sublimation and growth in the limit of fast surface diffusion and slow kinetics of atom attachment-detachment at the steps. For this limit we formulate a model free of the quasi-static approximation in the calculation of the adatom concentration on the terraces at the crystal surface. Such a model provides a relatively simple way to study the linear stability of a step train in the presence of step-step repulsion and the absence of destabilising factors (such as the Schwoebel effect, surface electromigration, etc.). The central result is that there exists a critical velocity of the steps in the train which separates the stability and instability regimes. When the step velocity exceeds this critical value, the plot of the step trajectories manifests clear space and time periodicity (step density compression waves propagate on the vicinal surface). This ordered motion of the steps is preceded by a relatively short transition period of disordered step dynamics.
Catheter segmentation in X-ray fluoroscopy using synthetic data and transfer learning with light U-nets
Background and objectives: Automated segmentation and tracking of surgical instruments and catheters under X-ray fluoroscopy hold the potential for enhanced image guidance in catheter-based endovascular procedures. This article presents a novel method for real-time segmentation of catheters and guidewires in 2D X-ray images. We employ Convolutional Neural Networks (CNNs) and propose a transfer learning approach, using synthetic fluoroscopic images, to develop a lightweight version of the U-Net architecture. Our strategy, requiring only a small amount of manually annotated data, streamlines the training process and results in a U-Net model that achieves performance comparable to state-of-the-art segmentation with a decreased number of trainable parameters.
Methods: The proposed transfer learning approach exploits high-fidelity synthetic images generated from real fluoroscopic backgrounds. We implement a two-stage process, initial end-to-end training followed by fine-tuning, to develop two versions of our model, using synthetic and phantom fluoroscopic images independently. A small number of manually annotated in-vivo images is employed to fine-tune the deepest seven layers of the U-Net architecture, producing a network specialised for pixel-wise catheter/guidewire segmentation. The network takes a single grayscale image as input and outputs the segmentation result as a binary mask against the background.
Results: Evaluation is carried out with images from in-vivo fluoroscopic video sequences from six endovascular procedures with different surgical setups. We validate the effectiveness of developing the U-Net models with synthetic data in tests where in-vivo fine-tuning and testing take place both by dividing data from all procedures into independent fine-tuning/testing subsets and by using different in-vivo sequences. Accurate catheter/guidewire segmentation (average Dice coefficients of ~0.55, ~0.26 and ~0.17) is obtained with both U-Net models. Compared to state-of-the-art CNN models, the proposed U-Net achieves comparable segmentation accuracy (within ±5% average Dice coefficient) while yielding an 84% reduction in testing time. This adds flexibility for real-time operation and makes our network adaptable to increased input resolution.
Conclusions: This work presents a new approach to developing CNN models for pixel-wise segmentation of surgical catheters in X-ray fluoroscopy, exploiting synthetic images and transfer learning. Our methodology reduces the need for manually annotating large volumes of data for training. This is an important advantage, given that manual pixel-wise annotation is a key bottleneck in developing CNN segmentation models. Combined with a simplified U-Net model, our work yields significant advantages compared to current state-of-the-art solutions.
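The article does not include code; the sketch below only illustrates the general two-stage transfer-learning recipe it describes: train a light U-Net-style network end-to-end (here it would be on synthetic fluoroscopy), then freeze the earlier layers and fine-tune the deeper part on a small number of annotated in-vivo frames. The tiny network, the exact layer split and the optimiser settings are assumptions for illustration, not the published light U-Net.

import torch
from torch import nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(inplace=True))
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.Conv2d(16, 1, 1))      # binary catheter/guidewire logits
    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        return self.dec(torch.cat([self.up(e2), e1], dim=1))

def prepare_for_finetuning(model):
    # Stage 2: keep weights from synthetic pretraining fixed in the encoder,
    # adapt only the deeper (decoder-side) layers on in-vivo frames.
    for p in model.parameters():
        p.requires_grad = False
    for p in list(model.up.parameters()) + list(model.dec.parameters()):
        p.requires_grad = True
    return [p for p in model.parameters() if p.requires_grad]

model = TinyUNet()                        # stage 1 would train this on synthetic images
trainable = prepare_for_finetuning(model)
opt = torch.optim.Adam(trainable, lr=1e-5)
x = torch.rand(2, 1, 128, 128)            # grayscale fluoroscopic frames
print(model(x).shape)                     # torch.Size([2, 1, 128, 128])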
RCM-SLAM: Visual localisation and mapping under remote centre of motion constraints
In robotic surgery, the motion of instruments and the laparoscopic camera is constrained by their insertion ports, i.e. a remote centre of motion (RCM). We propose a Simultaneous Localisation and Mapping (SLAM) approach that estimates laparoscopic camera motion under RCM constraints. To achieve this we derive a minimal solver for the absolute camera pose given two 2D-3D point correspondences (RCM-PnP), and also a bundle adjustment optimiser that refines camera poses within an RCM-constrained parameterisation. These two methods are used together with previous work on relative pose estimation under RCM [1] to assemble a SLAM pipeline suitable for robotic surgery. Our simulations show that RCM-PnP outperforms conventional PnP for a wide noise range in the RCM position. Results with video footage from a robotic prostatectomy show that RCM constraints significantly improve camera pose estimation.
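The RCM-PnP solver itself is not reproducible from the abstract; the sketch below only illustrates the underlying idea of an RCM-constrained pose parameterisation, in which the camera pivots about a fixed insertion point and therefore has four degrees of freedom (pan, tilt, roll, insertion depth) rather than six. The pivot coordinates, angle convention and pinhole intrinsics are hypothetical values chosen for the example.

import numpy as np
from scipy.spatial.transform import Rotation as R

def rcm_pose(rcm_point, pan, tilt, roll, depth):
    # Camera-to-world orientation from three angles, camera centre displaced
    # from the pivot along the viewing axis by the insertion depth.
    rot = R.from_euler("zyx", [roll, pan, tilt]).as_matrix()
    cam_center = rcm_point + rot @ np.array([0.0, 0.0, depth])
    return rot.T, -rot.T @ cam_center          # world-to-camera rotation and translation

def project(points_w, rot_wc, t_wc, f=1000.0, c=(320.0, 240.0)):
    p_c = (rot_wc @ points_w.T).T + t_wc       # transform points into the camera frame
    return f * p_c[:, :2] / p_c[:, 2:3] + np.array(c)   # pinhole projection

rcm = np.array([0.0, 0.0, 0.0])                # hypothetical trocar (RCM) position
rot_wc, t_wc = rcm_pose(rcm, pan=0.1, tilt=-0.05, roll=0.3, depth=0.08)
pts = np.array([[0.02, 0.01, 0.15], [-0.01, 0.03, 0.12]])   # points in front of the camera
print(project(pts, rot_wc, t_wc))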
Computer Vision in the Surgical Operating Room
Background: Multiple types of surgical cameras are used in modern surgical practice and provide a rich visual signal that surgeons use to visualize the clinical site and make clinical decisions. This signal can also be used by artificial intelligence (AI) methods to support the identification of instruments, structures, or activities, both in real time during procedures and postoperatively for analytics and understanding of surgical processes. Summary: In this paper, we provide a succinct perspective on the use of AI, and especially computer vision, to power solutions for the surgical operating room (OR). The synergy between data availability and technical advances in computational power and AI methodology has led to rapid developments in the field and promising advances. Key Messages: With the increasing availability of surgical video sources and the convergence of technologies around video storage, processing, and understanding, we believe clinical solutions and products leveraging vision will become an important component of modern surgical capabilities. However, both technical and clinical challenges remain to be overcome before vision-based approaches can be used efficiently in the clinic.
Photometric and spectroscopic variability of the FUor star V582 Aurigae
We carried out BVRI CCD photometric observations in the field of V582 Aur from 2009 August to 2013 February, and acquired high-, medium-, and low-resolution spectroscopy of V582 Aur during this period. To study the pre-outburst variability of the target and construct its historical light curve, we searched for archival observations in photographic plate collections. Both the CCD and photographic observations were analysed using a sequence of 14 stars in the field of V582 Aur calibrated in BVRI. The pre-outburst photographic observations of V582 Aur show low-amplitude light variations typical of T Tauri stars. Archival photographic observations indicate that the increase in brightness began in late 1984 or early 1985 and that the star reached its maximum brightness in 1986 January. The spectral type of V582 Aur can be defined as G0I, with strong P Cyg profiles of the H alpha and Na I D lines, which are typical of FU Orionis objects. Our BVRI photometric observations show large-amplitude variations of ~2.8 mag in V during the 3.5-year period of observations. Most of the time, however, the star remains in a state close to maximum brightness. The deepest drop in brightness was observed in the spring of 2012, when the brightness of the star fell to a level close to the pre-outburst one. The multicolor photometric data show a color reversal during the minimum in brightness, which is typical of UX Ori variables. The corresponding spectral observations show strong variability in the profiles and intensities of the spectral lines (especially H alpha), which indicates significant changes in the accretion rate. On the basis of the photometric monitoring performed over the past three years, the spectral properties at maximum light, and the shape of the long-term light curve, we confirm the affiliation of V582 Aur with the group of FU Orionis objects.
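As a side note on the photometric calibration mentioned above (instrumental CCD magnitudes tied to a local comparison-star sequence), the short sketch below shows the generic zero-point approach with made-up numbers; it is not the authors' reduction pipeline, and the values bear no relation to the published 14-star sequence.

import numpy as np

# Instrumental magnitudes (-2.5*log10 of counts) of comparison stars in one frame,
# and their calibrated catalogue magnitudes from the comparison sequence.
instr_comp = np.array([-8.31, -7.95, -7.42, -6.88])
catalog_comp = np.array([12.10, 12.47, 13.01, 13.55])

zero_point = np.mean(catalog_comp - instr_comp)         # per-frame zero point
scatter = np.std(catalog_comp - instr_comp, ddof=1)     # rough calibration error

instr_target = -7.60                                     # instrumental magnitude of the variable
mag_target = instr_target + zero_point
print(f"V = {mag_target:.2f} +/- {scatter:.2f} (zero point {zero_point:.2f})")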