ToolNet: Holistically-Nested Real-Time Segmentation of Robotic Surgical Tools
Real-time tool segmentation from endoscopic videos is an essential part of
many computer-assisted robotic surgical systems and of critical importance in
robotic surgical data science. We propose two novel deep learning architectures
for automatic segmentation of non-rigid surgical instruments. Both methods take
advantage of automated deep-learning-based multi-scale feature extraction while
trying to maintain an accurate segmentation quality at all resolutions. The two
proposed methods encode the multi-scale constraint inside the network
architecture. The first proposed architecture enforces it by cascaded
aggregation of predictions and the second proposed network does it by means of
a holistically-nested architecture where the loss at each scale is taken into
account for the optimization process. As the proposed methods are for real-time
semantic labeling, both present a reduced number of parameters. We propose the
use of parametric rectified linear units for semantic labeling in these small
architectures to increase the regularization ability of the design and maintain
the segmentation accuracy without overfitting the training sets. We compare the
proposed architectures against state-of-the-art fully convolutional networks.
We validate our methods using existing benchmark datasets, including ex vivo
cases with phantom tissue and different robotic surgical instruments present in
the scene. Our results show a statistically significant improved Dice
Similarity Coefficient over previous instrument segmentation methods. We
analyze our design choices and discuss the key drivers for improving accuracy.
Comment: Paper accepted at IROS 201
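The Dice Similarity Coefficient mentioned in the abstract above is a standard overlap score between a predicted and a reference mask, DSC = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch, assuming binary masks flattened into lists of 0/1; the function name and the empty-mask convention are illustrative choices, not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two flat binary masks (lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled as instrument in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 6-pixel masks, for illustration only.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
print(f"Dice = {score:.3f}")
```

A score of 1 means the two masks coincide exactly and 0 means they do not overlap at all.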
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
Vision-based retargeting for endoscopic navigation
Endoscopy is a standard procedure for visualising the human gastrointestinal tract. With the advances in biophotonics, imaging techniques such as narrow band imaging, confocal laser endomicroscopy, and optical coherence tomography can be combined with normal endoscopy for assisting the early diagnosis of diseases, such as cancer. In the past decade, optical biopsy has emerged to be an effective tool for tissue analysis, allowing in vivo and in situ assessment of pathological sites with real-time feature-enhanced microscopic images. However, the non-invasive nature of optical biopsy leads to an intra-examination retargeting problem, which is associated with the difficulty of re-localising a biopsied site consistently throughout the whole examination. In addition to intra-examination retargeting, retargeting of a pathological site is even more challenging across examinations, due to tissue deformation and changing tissue morphologies and appearances. The purpose of this thesis is to address both the intra- and inter-examination retargeting problems associated with optical biopsy. We propose a novel vision-based framework for intra-examination retargeting. The proposed framework is based on combining visual tracking and detection with online learning of the appearance of the biopsied site. Furthermore, a novel cascaded detection approach based on random forests and structured support vector machines is developed to achieve efficient retargeting. To cater for reliable inter-examination retargeting, the solution provided in this thesis is achieved by solving an image retrieval problem, for which an online scene association approach is proposed to summarise an endoscopic video collected in the first examination into distinctive scenes. A hashing-based approach is then used to learn the intrinsic representations of these scenes, such that retargeting can be achieved in subsequent examinations by retrieving the relevant images using the learnt representations. 
For performance evaluation of the proposed frameworks, extensive phantom, ex vivo and in vivo experiments have been conducted, with results demonstrating the robustness and potential clinical value of the proposed methods.
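The hashing-based retrieval step described in the abstract above can be pictured as a nearest-neighbour search over binary codes. The sketch below is illustrative only: the 8-bit codes are hand-written placeholders (the thesis learns its representations), the scene names are hypothetical, and ranking by Hamming distance is a common way to query binary codes rather than a confirmed detail of the thesis.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, scene_codes, top_k=1):
    """Return the ids of the top_k stored scenes closest to the query code."""
    ranked = sorted(
        scene_codes.items(),
        # Sort by distance first, then by id so that ties are deterministic.
        key=lambda item: (hamming_distance(query_code, item[1]), item[0]),
    )
    return [scene_id for scene_id, _ in ranked[:top_k]]


# Placeholder codes standing in for scenes summarised from a first examination.
scene_codes = {
    "scene_a": 0b10110010,
    "scene_b": 0b01001101,
    "scene_c": 0b10110110,
}
query = 0b10110011  # placeholder code for a frame from a later examination
best = retrieve(query, scene_codes, top_k=2)
print(best)
```

Binary codes keep the comparison cheap, since each distance is one XOR and a bit count.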
Dense Vision in Image-guided Surgery
Image-guided surgery needs an efficient and effective camera tracking system to support augmented reality, for example overlaying preoperative models or labelling cancerous tissue on the 2D video images of the surgical scene. However, tracking in endoscopic/laparoscopic scenes is an extremely difficult task, primarily due to tissue deformation, instrument invasion into the surgical scene and the presence of specular highlights. State-of-the-art feature-based SLAM systems such as PTAM fail to track such scenes because the number of good features to track is very limited. Smoke in the scene and instrument motion cause feature-based tracking to fail immediately.
The work of this thesis provides a systematic approach to this problem using dense vision. We initially attempted to register a 3D preoperative model with multiple 2D endoscopic/laparoscopic images using a dense method but this approach did not perform well. We subsequently proposed stereo reconstruction to directly obtain the 3D structure of the scene. By using the dense reconstructed model together with robust estimation, we demonstrate that dense stereo tracking can be incredibly robust even within extremely challenging endoscopic/laparoscopic scenes.
Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm has turned out to be the state-of-the-art method on several publicly available ground truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has proved highly accurate in a synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when applied to real scenes in RALP prostatectomy surgery. This is an important step toward achieving accurate image-guided laparoscopic surgery.
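The RMSE figure quoted above is a root-mean-square error. The abstract does not say over which quantity it is computed, so the sketch below assumes Euclidean errors between estimated and reference 3D positions in millimetres; the sample coordinates are made up for illustration.

```python
import math


def rmse(estimated, reference):
    """Root-mean-square Euclidean error between paired 3D points."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty point lists of equal length")
    # Squared Euclidean distance for each estimated/reference pair.
    squared_errors = [
        sum((e - r) ** 2 for e, r in zip(est_point, ref_point))
        for est_point, ref_point in zip(estimated, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical positions in millimetres.
estimated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.1), (1.0, 0.1, 0.0)]
error_mm = rmse(estimated, reference)
print(f"RMSE = {error_mm:.3f} mm")
```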
Unsupervised Odometry and Depth Learning for Endoscopic Capsule Robots
In the last decade, many medical companies and research groups have tried to
convert passive capsule endoscopes as an emerging and minimally invasive
diagnostic technology into actively steerable endoscopic capsule robots which
will provide more intuitive disease detection, targeted drug delivery and
biopsy-like operations in the gastrointestinal (GI) tract. In this study, we
introduce a fully unsupervised, real-time odometry and depth learner for
monocular endoscopic capsule robots. We establish the supervision by warping
view sequences and assigning the re-projection minimization to the loss
function, which we adopt in multi-view pose estimation and single-view depth
estimation network. Detailed quantitative and qualitative analyses of the
proposed framework performed on non-rigidly deformable ex-vivo porcine stomach
datasets prove the effectiveness of the method in terms of motion estimation
and depth recovery.
Comment: submitted to IROS 201
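The supervision signal described in the abstract above (warping one view into another and penalising the re-projection error) can be sketched as follows. This is a simplified illustration under several assumptions that do not come from the paper: a pinhole camera with intrinsics (fx, fy, cx, cy), depth and relative pose passed in directly rather than predicted by networks, nearest-neighbour sampling instead of differentiable bilinear sampling, and a mean absolute intensity difference as the error.

```python
def reproject(u, v, depth, intrinsics, rotation, translation):
    """Map pixel (u, v) with known depth from the target view into the source view."""
    fx, fy, cx, cy = intrinsics
    # Back-project the pixel to a 3D point in the target camera frame.
    point = (depth * (u - cx) / fx, depth * (v - cy) / fy, depth)
    # Apply the relative camera motion (rotation matrix, then translation).
    moved = [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if moved[2] <= 0:
        return None  # the point falls behind the source camera
    # Project the moved point onto the source image plane.
    return fx * moved[0] / moved[2] + cx, fy * moved[1] / moved[2] + cy


def photometric_loss(target, source, depth_map, intrinsics, rotation, translation):
    """Mean absolute intensity difference between target pixels and their re-projections."""
    height, width = len(target), len(target[0])
    errors = []
    for v in range(height):
        for u in range(width):
            hit = reproject(u, v, depth_map[v][u], intrinsics, rotation, translation)
            if hit is None:
                continue
            su, sv = round(hit[0]), round(hit[1])
            # Only pixels that land inside the source image contribute.
            if 0 <= su < width and 0 <= sv < height:
                errors.append(abs(target[v][u] - source[sv][su]))
    return sum(errors) / len(errors) if errors else 0.0


# Toy 1x3 images: the source view is the target view shifted right by one pixel.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (1.0, 1.0, 0.0, 0.0)
target = [[1.0, 2.0, 3.0]]
source = [[9.0, 1.0, 2.0]]
depth_map = [[1.0, 1.0, 1.0]]

loss_correct = photometric_loss(target, source, depth_map, intrinsics, identity, (1.0, 0.0, 0.0))
loss_wrong = photometric_loss(target, source, depth_map, intrinsics, identity, (0.0, 0.0, 0.0))
print(loss_correct, loss_wrong)
```

In the toy example the correct motion yields a lower loss than the wrong one, which is the property an unsupervised learner can exploit: minimising this error pushes the depth and pose estimates toward values that explain the image pair.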
Tracking and Mapping in Medical Computer Vision: A Review
As computer vision algorithms are becoming more capable, their applications
in clinical systems will become more pervasive. These applications include
diagnostics such as colonoscopy and bronchoscopy, guiding biopsies and
minimally invasive interventions and surgery, automating instrument motion and
providing image guidance using pre-operative scans. Many of these applications
depend on the specific visual nature of medical scenes and require designing
and applying algorithms to perform in this environment.
In this review, we provide an update to the field of camera-based tracking
and scene mapping in surgery and diagnostics in medical computer vision. We
begin with describing our review process, which results in a final list of 515
papers that we cover. We then give a high-level summary of the state of the art
and provide relevant background for those who need tracking and mapping for
their clinical applications. We then review datasets provided in the field and
the clinical needs therein. Then, we delve in depth into the algorithmic side,
and summarize recent developments, which should be especially useful for
algorithm designers and to those looking to understand the capability of
off-the-shelf methods. We focus on algorithms for deformable environments while
also reviewing the essential building blocks in rigid tracking and mapping
since there is a large amount of crossover in methods. Finally, we discuss the
current state of the tracking and mapping methods along with needs for future
algorithms, needs for quantification, and the viability of clinical
applications in the field. We conclude that new methods need to be designed or
combined to support clinical applications in deformable environments, and more
focus needs to be put into collecting datasets for training and evaluation.
Comment: 31 pages, 17 figure
A comprehensive survey on recent deep learning-based methods applied to surgical data
Minimally invasive surgery is highly operator-dependent, with lengthy
procedural times that cause surgeon fatigue and expose patients to risks such as
injury to organs, infection, bleeding, and complications of anesthesia. To
mitigate such risks, real-time systems that can provide intra-operative
guidance to surgeons are needed. For example, an automated system for tool
localization, tool (or tissue) tracking, and depth estimation can enable a
clear understanding of surgical scenes preventing miscalculations during
surgical procedures. In this work, we present a systematic review of recent
machine learning-based approaches including surgical tool localization,
segmentation, tracking, and 3D scene perception. Furthermore, we provide a
detailed overview of publicly available benchmark datasets widely used for
surgical navigation tasks. While recent deep learning architectures have shown
promising results, there are still several open research problems such as a
lack of annotated datasets, the presence of artifacts in surgical scenes, and
non-textured surfaces that hinder 3D reconstruction of the anatomical
structures. Based on our comprehensive review, we present a discussion on
current gaps and needed steps to improve the adaptation of technology in
surgery.
Comment: This paper is to be submitted to International journal of computer
visio
Long Term Safety Area Tracking (LT-SAT) with online failure detection and recovery for robotic minimally invasive surgery
Despite the benefits introduced by robotic systems in abdominal Minimally Invasive Surgery (MIS), major complications such as intra-operative bleeding can still affect the outcome of the procedure. One cause is accidental damage to arteries or veins by the surgical tools, and some of the possible risk factors are related to the lack of sub-surface visibility. Assistive tools that guide the surgical gestures to prevent these kinds of injuries would represent a relevant step towards safer clinical procedures. However, it is still challenging to develop computer vision systems able to fulfill the main requirements: (i) long-term robustness, (ii) adaptation to environment/object variation and (iii) real-time processing. The purpose of this paper is to develop computer vision algorithms to robustly track soft tissue areas (Safety Area, SA), defined intra-operatively by the surgeon based on the real-time endoscopic images, or registered from a pre-operative surgical plan. We propose a framework to combine an optical flow algorithm with a tracking-by-detection approach in order to be robust against failures caused by: (i) partial occlusion, (ii) total occlusion, (iii) SA out of the field of view, (iv) deformation, (v) illumination changes, (vi) abrupt camera motion, (vii) blur and (viii) smoke. A Bayesian inference-based approach is used to detect the failure of the tracker, based on online context information. A Model Update Strategy (MUpS) is also proposed to improve the SA re-detection after failures, taking into account the changes of appearance of the SA model due to contact with instruments or image noise. The performance of the algorithm was assessed on two datasets, representing ex-vivo organs and in-vivo surgical scenarios.
Results show that the proposed framework, enhanced with MUpS, is capable of maintaining high tracking performance for extended periods of time (≃ 4 min, containing the aforementioned events) with high precision (0.7) and recall (0.8) values, and with a recovery time after a failure between 1 and 8 frames in the worst case.
Penza, Veronica; Du, Xiaofei; Stoyanov, Danail; Forgione, Antonello; Mattos, Leonardo S; De Momi, Elena
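Precision and recall, the two scores reported for the tracker above, follow from counts of true positives, false positives and false negatives. The sketch below uses the standard definitions; how the paper decides that a tracked frame counts as a true positive is not stated in the abstract, and the counts used here are hypothetical values chosen only to show the arithmetic.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard precision and recall from detection counts."""
    detected = true_pos + false_pos   # everything the tracker reported
    relevant = true_pos + false_neg   # everything it should have reported
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


# Hypothetical frame counts, not taken from the paper.
precision, recall = precision_recall(true_pos=8, false_pos=2, false_neg=4)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```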
Stereo Dense Scene Reconstruction and Accurate Localization for Learning-Based Navigation of Laparoscope in Minimally Invasive Surgery
Objective: The computation of anatomical information and laparoscope position
is a fundamental block of surgical navigation in Minimally Invasive Surgery
(MIS). Recovering a dense 3D structure of surgical scene using visual cues
remains a challenge, and the online laparoscopic tracking primarily relies on
external sensors, which increases system complexity. Methods: Here, we propose
a learning-driven framework, in which an image-guided laparoscopic localization
with 3D reconstructions of complex anatomical structures is obtained. To
reconstruct the 3D structure of the whole surgical environment, we first
fine-tune a learning-based stereoscopic depth perception method, which is
robust to the texture-less and variant soft tissues, for depth estimation.
Then, we develop a dense visual reconstruction algorithm to represent the scene
by surfels, estimate the laparoscope poses and fuse the depth maps into a
unified reference coordinate for tissue reconstruction. To estimate poses of
new laparoscope views, we devise a coarse-to-fine localization method, which
incorporates our reconstructed 3D model. Results: We evaluate the
reconstruction method and the localization module on three datasets, namely,
the stereo correspondence and reconstruction of endoscopic data (SCARED), the
ex-vivo phantom and tissue data collected with Universal Robot (UR) and Karl
Storz Laparoscope, and the in-vivo DaVinci robotic surgery dataset, where the
reconstructed 3D structures have rich details of surface texture with an
accuracy error under 1.71 mm and the localization module can accurately track
the laparoscope with only images as input. Conclusions: Experimental results
demonstrate the superior performance of the proposed method in 3D anatomy
reconstruction and laparoscopic localization. Significance: The proposed
framework can be potentially extended to the current surgical navigation
system