982 research outputs found
Activity profiling for minimally invasive surgery
MATIS: Masked-Attention Transformers for Surgical Instrument Segmentation
We propose Masked-Attention Transformers for Surgical Instrument Segmentation
(MATIS), a two-stage, fully transformer-based method that leverages modern
pixel-wise attention mechanisms for instrument segmentation. MATIS exploits the
instance-level nature of the task by employing a masked attention module that
generates and classifies a set of fine instrument region proposals. Our method
incorporates long-term video-level information through video transformers to
improve temporal consistency and enhance mask classification. We validate our
approach on the two standard public benchmarks, EndoVis 2017 and EndoVis 2018.
Our experiments demonstrate that MATIS' per-frame baseline outperforms previous
state-of-the-art methods and that including our temporal consistency module
boosts our model's performance further.
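As a rough illustration of the masked attention idea mentioned in the abstract above, the sketch below restricts a single query's attention to positions flagged by a binary mask. It is a generic, simplified example and not the MATIS implementation: the function name `masked_attention`, the single-query setting, and the fallback to full attention when the mask is empty are all assumptions made here for illustration.

```python
import math

def masked_attention(query, keys, values, mask):
    """Scaled dot-product attention for one query, limited to masked-in positions.

    query: list of floats; keys/values: lists of equal-length float lists;
    mask: list of bools, True where the query is allowed to attend.
    """
    d = len(query)
    # Assumption for this sketch: an empty mask falls back to full attention.
    if not any(mask):
        mask = [True] * len(keys)
    # Masked-out positions get a score of -inf, so their softmax weight is 0.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) if keep else -math.inf
        for key, keep in zip(keys, mask)
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Output is the attention-weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
```

With a mask that selects a single position, the output is simply that position's value vector, which makes the effect of the mask easy to check by hand.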
A comprehensive survey on recent deep learning-based methods applied to surgical data
Minimally invasive surgery is highly operator dependent, and its lengthy
procedural times cause surgeon fatigue and expose patients to risks such as
organ injury, infection, bleeding, and complications of anesthesia. To mitigate
such risks, real-time systems that provide intra-operative guidance to
surgeons are needed. For example, an automated system for tool
localization, tool (or tissue) tracking, and depth estimation can enable a
clear understanding of surgical scenes, preventing miscalculations during
surgical procedures. In this work, we present a systematic review of recent
machine learning-based approaches including surgical tool localization,
segmentation, tracking, and 3D scene perception. Furthermore, we provide a
detailed overview of publicly available benchmark datasets widely used for
surgical navigation tasks. While recent deep learning architectures have shown
promising results, there are still several open research problems such as a
lack of annotated datasets, the presence of artifacts in surgical scenes, and
non-textured surfaces that hinder 3D reconstruction of the anatomical
structures. Based on our comprehensive review, we present a discussion on
current gaps and needed steps to improve the adaptation of technology in
surgery.
Comment: This paper is to be submitted to International journal of computer
visio
Towards real-time multiple surgical tool tracking
Surgical tool tracking is an essential building block for computer-assisted interventions (CAI) and applications like video summarisation, workflow analysis and surgical navigation. Vision-based instrument tracking in laparoscopic surgical data faces significant challenges such as fast instrument motion, multiple simultaneous instruments and re-initialisation due to out-of-view conditions or instrument occlusions. In this paper, we propose a real-time multiple object tracking framework for whole laparoscopic tools, which extends an existing single object tracker. We introduce a geometric object descriptor, which helps with overlapping bounding box disambiguation, fast motion and optimal assignment between existing trajectories and new hypotheses. We achieve 99.51% and 75.64% average accuracy on ex-vivo robotic data and in-vivo laparoscopic sequences, respectively, from the Endovis’15 Instrument Tracking Dataset. The proposed geometric descriptor increased the performance on laparoscopic data by 32%, significantly reducing identity switches, false negatives and false positives. Overall, the proposed pipeline can successfully recover trajectories over long sequences and runs in real time at approximately 25–29 fps.
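The assignment step between existing trajectories and new hypotheses described in the abstract above can be sketched roughly as follows. This is not the paper's method: bounding-box overlap (IoU) stands in for the geometric descriptor, and the optimal matching is found by brute force over permutations, which is only practical for a handful of tools. The names `iou`, `assign` and the `min_iou` threshold are hypothetical.

```python
from itertools import permutations

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign(tracks, detections, min_iou=0.1):
    """Return sorted (track_index, detection_index) pairs maximising total IoU.

    Pairs whose IoU is below min_iou are dropped, so unmatched tracks or
    detections can be handled separately (e.g. re-initialisation).
    """
    n, m = len(tracks), len(detections)
    if n == 0 or m == 0:
        return []
    # Enumerate every one-to-one matching of the smaller set into the larger.
    if n <= m:
        candidates = (list(zip(range(n), p)) for p in permutations(range(m), n))
    else:
        candidates = (list(zip(p, range(m))) for p in permutations(range(n), m))
    best = max(
        candidates,
        key=lambda pairs: sum(iou(tracks[t], detections[d]) for t, d in pairs),
    )
    return sorted((t, d) for t, d in best if iou(tracks[t], detections[d]) >= min_iou)
```

For more than a few simultaneous instruments, a polynomial-time solver such as the Hungarian algorithm would be the usual replacement for the brute-force search.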
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
SAF-IS: a Spatial Annotation Free Framework for Instance Segmentation of Surgical Tools
Instance segmentation of surgical instruments is a long-standing research
problem, crucial for the development of many applications for computer-assisted
surgery. This problem is commonly tackled via fully-supervised training of deep
learning models, which requires expensive pixel-level annotations. In this
work, we develop a framework for instance segmentation that does not rely on
spatial annotations for training. Instead, our solution only requires binary tool
masks, obtainable using recent unsupervised approaches, and binary tool
presence labels, freely obtainable in robot-assisted surgery. Based on the
binary mask information, our solution learns to extract individual tool
instances from single frames, and to encode each instance into a compact vector
representation, capturing its semantic features. Such representations guide the
automatic selection of a tiny number of instances (only 8 in our experiments),
displayed to a human operator for tool-type labelling. The gathered information
is finally used to match each training instance with a binary tool presence
label, providing an effective supervision signal to train a tool instance
classifier. We validate our framework on the EndoVis 2017 and 2018 segmentation
datasets. We provide results using binary masks obtained either by manual
annotation or as predictions of an unsupervised binary segmentation model. The
latter solution yields an instance segmentation approach completely free from
spatial annotations, outperforming several state-of-the-art fully-supervised
segmentation approaches.