A comprehensive survey on recent deep learning-based methods applied to surgical data
Minimally invasive surgery is highly operator dependent, and lengthy
procedural times cause surgeon fatigue and expose patients to risks such as
injury to organs, infection, bleeding, and complications of anesthesia. To
mitigate such risks, it is desirable to develop real-time systems that can provide
intra-operative guidance to surgeons. For example, an automated system for tool
localization, tool (or tissue) tracking, and depth estimation can enable a
clear understanding of surgical scenes preventing miscalculations during
surgical procedures. In this work, we present a systematic review of recent
machine learning-based approaches including surgical tool localization,
segmentation, tracking, and 3D scene perception. Furthermore, we provide a
detailed overview of publicly available benchmark datasets widely used for
surgical navigation tasks. While recent deep learning architectures have shown
promising results, there are still several open research problems such as a
lack of annotated datasets, the presence of artifacts in surgical scenes, and
non-textured surfaces that hinder 3D reconstruction of the anatomical
structures. Based on our comprehensive review, we present a discussion on
current gaps and needed steps to improve the adaptation of technology in
surgery.
Comment: This paper is to be submitted to the International Journal of Computer Vision.
From Forks to Forceps: A New Framework for Instance Segmentation of Surgical Instruments
Minimally invasive surgeries and related applications demand surgical tool
classification and segmentation at the instance level. Surgical tools are
similar in appearance and are long, thin, and handled at an angle. The
Fine-tuning state-of-the-art (SOTA) instance segmentation models trained on
natural images struggles to discriminate instrument classes. Our research
demonstrates that while the bounding box and segmentation mask are often
accurate, the classification head misclassifies
the class label of the surgical instrument. We present a new neural network
framework that adds a classification module as a new stage to existing instance
segmentation models. This module specializes in improving the classification of
instrument masks generated by the existing model. The module comprises
multi-scale mask attention, which attends to the instrument region and masks
the distracting background features. We propose training our classifier module
using metric learning with arc loss to handle low inter-class variance of
surgical instruments. We conduct exhaustive experiments on the benchmark
datasets EndoVis2017 and EndoVis2018. We demonstrate that our method
outperforms all (more than 18) SOTA methods compared against, and improves the
SOTA performance by at least 12 points (20%) on the EndoVis2017 benchmark
challenge while generalizing effectively across the datasets.
Comment: WACV 202
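The arc loss the abstract mentions is commonly implemented as an additive angular margin applied over cosine similarities (ArcFace-style), which enlarges the angular gap between visually similar classes. The following is a minimal NumPy sketch of that idea, not the paper's implementation; all function and variable names are illustrative assumptions.

```python
import numpy as np

def arc_margin_logits(features, weights, margin=0.5, scale=30.0, labels=None):
    """ArcFace-style logits: cosine similarity between L2-normalized
    embeddings and class weights, with an additive angular margin
    applied to the target class, then rescaled."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = f @ w.T                       # (batch, classes) cosine similarities
    if labels is not None:
        theta = np.arccos(np.clip(cos, -1.0, 1.0))
        rows = np.arange(len(labels))
        theta[rows, labels] += margin   # push the target class away by the margin
        cos = np.cos(theta)
    return scale * cos

def softmax_ce(logits, labels):
    """Plain cross-entropy over the (margin-adjusted) logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[np.arange(len(labels)), labels]))

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))         # embeddings of 4 instrument masks
W = rng.normal(size=(3, 8))             # one weight vector per instrument class
y = np.array([0, 1, 2, 1])
loss_with_margin = softmax_ce(arc_margin_logits(feats, W, labels=y), y)
loss_plain = softmax_ce(arc_margin_logits(feats, W), y)
```

Because the margin lowers the target-class logit, the margin loss upper-bounds the plain cosine loss during training, forcing tighter, better-separated class clusters.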
MATIS: Masked-Attention Transformers for Surgical Instrument Segmentation
We propose Masked-Attention Transformers for Surgical Instrument Segmentation
(MATIS), a two-stage, fully transformer-based method that leverages modern
pixel-wise attention mechanisms for instrument segmentation. MATIS exploits the
instance-level nature of the task by employing a masked attention module that
generates and classifies a set of fine instrument region proposals. Our method
incorporates long-term video-level information through video transformers to
improve temporal consistency and enhance mask classification. We validate our
approach on the two standard public benchmarks, EndoVis 2017 and EndoVis 2018.
Our experiments demonstrate that MATIS' per-frame baseline outperforms previous
state-of-the-art methods and that including our temporal consistency module
boosts our model's performance further.
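Masked attention of this kind is typically realized by restricting each query's cross-attention to the pixels of its predicted foreground mask, so background features cannot distract the classifier. A minimal NumPy sketch of that mechanism (our illustration, not MATIS's actual code; names and shapes are assumptions):

```python
import numpy as np

def masked_attention(queries, keys, values, mask):
    """Cross-attention where each query may only attend to pixels inside
    its predicted foreground mask: background attention logits are
    suppressed before the softmax."""
    d = queries.shape[-1]
    logits = queries @ keys.T / np.sqrt(d)   # (num_queries, num_pixels)
    logits = np.where(mask, logits, -1e9)    # mask out background pixels
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)
    return w @ values, w

rng = np.random.default_rng(1)
Q = rng.normal(size=(2, 16))     # two instrument region queries
K = rng.normal(size=(36, 16))    # a 6x6 feature map flattened to 36 pixels
V = rng.normal(size=(36, 16))
M = np.zeros((2, 36), dtype=bool)
M[0, :18] = True                 # query 0 attends to the left half
M[1, 18:] = True                 # query 1 attends to the right half
out, attn = masked_attention(Q, K, V, M)
```

After the softmax, attention weights outside each query's mask are effectively zero, so the output is a convex combination of foreground features only.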
SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation
The Segment Anything Model (SAM) is a powerful foundation model that has
revolutionised image segmentation. To apply SAM to surgical instrument
segmentation, a common approach is to locate precise points or boxes of
instruments and then use them as prompts for SAM in a zero-shot manner.
However, we observe two problems with this naive pipeline: (1) the domain gap
between natural objects and surgical instruments leads to poor generalisation
of SAM; and (2) SAM relies on precise point or box locations for accurate
segmentation, requiring either extensive manual guidance or a well-performing
specialist detector for prompt preparation, which leads to a complex
multi-stage pipeline. To address these problems, we introduce SurgicalSAM, a
novel end-to-end efficient-tuning approach for SAM to effectively integrate
surgical-specific information with SAM's pre-trained knowledge for improved
generalisation. Specifically, we propose a lightweight prototype-based class
prompt encoder for tuning, which directly generates prompt embeddings from
class prototypes and eliminates the use of explicit prompts for improved
robustness and a simpler pipeline. In addition, to address the low inter-class
variance among surgical instrument categories, we propose contrastive prototype
learning, further enhancing the discrimination of the class prototypes for more
accurate class prompting. The results of extensive experiments on both
EndoVis2018 and EndoVis2017 datasets demonstrate that SurgicalSAM achieves
state-of-the-art performance while only requiring a small number of tunable
parameters. The source code will be released at
https://github.com/wenxi-yue/SurgicalSAM.
Comment: Technical Report. The source code will be released at
https://github.com/wenxi-yue/SurgicalSAM.
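Contrastive prototype learning of the kind described can be sketched as an InfoNCE-style objective over class prototypes: each instrument embedding should sit close to its own class prototype and far from the others. The following NumPy toy example is our illustration under that reading, not the SurgicalSAM implementation:

```python
import numpy as np

def prototype_contrastive_loss(embeddings, prototypes, labels, temperature=0.1):
    """InfoNCE-style loss over class prototypes: maximize the softmax
    probability of each embedding's own prototype relative to the rest."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = (e @ p.T) / temperature            # cosine similarities, sharpened
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

# Well-separated prototypes give a near-zero loss; collapsed (identical)
# prototypes cannot discriminate and give the uniform-baseline loss log(3).
protos = np.eye(3, 8)                 # 3 orthogonal class prototypes in 8-D
emb = protos[[0, 1, 2, 0]] + 0.01     # embeddings close to their own prototype
y = np.array([0, 1, 2, 0])
low = prototype_contrastive_loss(emb, protos, y)
high = prototype_contrastive_loss(emb, np.ones((3, 8)), y)
```

Minimizing this loss pushes the prototypes apart, which is exactly what is needed when inter-class variance among instruments is low.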
Domain Adaptive Sim-to-Real Segmentation of Oropharyngeal Organs
Video-assisted transoral tracheal intubation (TI) necessitates using an
endoscope that helps the physician insert a tracheal tube into the glottis
instead of the esophagus. The growing trend of robotic-assisted TI would
require a medical robot to distinguish anatomical features like an experienced
physician, a skill that can be imitated using supervised deep-learning
techniques. However, real datasets of oropharyngeal organs are often
inaccessible due to limited open-source data and patient privacy. In this work,
we propose a domain adaptive Sim-to-Real framework called IoU-Ranking
Blend-ArtFlow (IRB-AF) for image segmentation of oropharyngeal organs. The
framework includes an image blending strategy called IoU-Ranking Blend (IRB)
and the style-transfer method ArtFlow. Here, IRB alleviates the poor
segmentation performance caused by significant domain differences between
datasets, while ArtFlow is introduced to reduce the discrepancies between datasets
further. A virtual oropharynx image dataset generated by the SOFA framework is
used as the learning subject for semantic segmentation to deal with the limited
availability of actual endoscopic images. We applied IRB-AF to
state-of-the-art domain-adaptive segmentation models. The results demonstrate
the superior performance of our approach in further improving the segmentation
accuracy and training stability.
Comment: The manuscript is accepted by Medical & Biological Engineering &
Computing. Code and dataset:
https://github.com/gkw0010/EISOST-Sim2Real-Dataset-Releas
Context-aware learning for robot-assisted endovascular catheterization
Endovascular intervention has become a mainstream treatment for cardiovascular diseases. However, multiple challenges remain, such as unwanted radiation exposure, limited two-dimensional image guidance, and insufficient force perception and haptic cues. Fast-evolving robot-assisted platforms improve the stability and accuracy of instrument manipulation, and the master-slave design also removes radiation exposure for the operator. However, the integration of robotic systems into the current surgical workflow is still debatable, since repetitive, easy tasks have little value when executed by robotic teleoperation. Current systems offer very low autonomy; potential autonomous features could bring benefits such as reduced cognitive workload and human error, safer and more consistent instrument manipulation, and the ability to incorporate various medical imaging and sensing modalities. This research proposes frameworks for automated catheterisation based on different machine learning algorithms, including Learning-from-Demonstration, Reinforcement Learning, and Imitation Learning. These frameworks focus on integrating context into the skill-learning process, thereby achieving better adaptation to different situations and safer tool-tissue interactions. Furthermore, the autonomous features were applied to a next-generation, MR-safe robotic catheterisation platform. The results provide important insights into improving catheter navigation through autonomous task planning and self-optimization with clinically relevant factors, and motivate the design of intelligent, intuitive, and collaborative robots under non-ionizing imaging modalities.
Iterative multi-path tracking for video and volume segmentation with sparse point supervision
Recent machine learning strategies for segmentation tasks have shown great
ability when trained on large pixel-wise annotated image datasets. It remains a
major challenge, however, to aggregate such datasets, as the time and monetary
cost associated with collecting extensive annotations is extremely high. This
is particularly the case for generating precise pixel-wise annotations in video
and volumetric image data. To this end, this work presents a novel framework to
produce pixel-wise segmentations using minimal supervision. Our method relies
on 2D point supervision, whereby a single 2D location within an object of
interest is provided on each image of the data. We then estimate the object
appearance by learning object-image-specific features and using these in a
semi-supervised learning
framework. Our object model is then used in a graph-based optimization problem
that takes into account all provided locations and the image data in order to
infer the complete pixel-wise segmentation. In practice, we solve this
optimally as a tracking problem using a K-shortest path approach. Both the
object model and segmentation are then refined iteratively to further improve
the final segmentation. We show that by collecting 2D locations using a gaze
tracker, our approach can provide state-of-the-art segmentations on a range of
objects and image modalities (video and 3D volumes), and that these can then be
used to train supervised machine learning classifiers.
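The K-shortest-path tracking formulation can be illustrated with a simplified variant: greedily extracting K edge-disjoint shortest paths from a frame-layered graph (source, per-frame detections, sink). This sketch is our illustration of the general idea under that simplification, not the paper's solver; the graph and names are made up.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a dict-of-dicts graph; returns (cost, path) or None."""
    pq, seen = [(0.0, src, [src])], set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(pq, (cost + w, nxt, path + [nxt]))
    return None

def k_disjoint_paths(graph, src, dst, k):
    """Greedily extract up to k edge-disjoint shortest paths by removing
    the edges of each path found (a simplification of true K-shortest
    paths, which would allow edge re-weighting and reversal)."""
    g = {u: dict(vs) for u, vs in graph.items()}
    paths = []
    for _ in range(k):
        found = shortest_path(g, src, dst)
        if found is None:
            break
        cost, path = found
        paths.append((cost, path))
        for u, v in zip(path, path[1:]):
            del g[u][v]          # this edge may not be reused by later tracks
    return paths

# Two frames, two candidate detections per frame; edge costs encode
# appearance dissimilarity to the point-supervised object model.
graph = {
    "S": {"a1": 1.0, "a2": 2.0},
    "a1": {"b1": 1.0, "b2": 3.0},
    "a2": {"b1": 2.0, "b2": 1.0},
    "b1": {"T": 1.0},
    "b2": {"T": 1.0},
}
tracks = k_disjoint_paths(graph, "S", "T", 2)
```

Each extracted path corresponds to one track through the video; disjointness guarantees the tracks never claim the same transition twice.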
Surgical Data Science - from Concepts toward Clinical Translation
Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis and modeling of data. While an increasing number of data-driven approaches and clinical applications have been studied in the fields of radiological and clinical data science, translational success stories are still lacking in surgery. In this publication, we shed light on the underlying reasons and provide a roadmap for future advances in the field. Based on an international workshop involving leading researchers in the field of SDS, we review current practice, key achievements and initiatives as well as available standards and tools for a number of topics relevant to the field, namely (1) infrastructure for data acquisition, storage and access in the presence of regulatory constraints, (2) data annotation and sharing and (3) data analytics. We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.