Surgical Phase and Instrument Recognition: How to identify appropriate Dataset Splits
Purpose: The development of machine learning models for surgical workflow and
instrument recognition from temporal data represents a challenging task due to
the complex nature of surgical workflows. In particular, the imbalanced
distribution of data is one of the major challenges in the domain of surgical
workflow recognition. In order to obtain meaningful results, careful
partitioning of data into training, validation, and test sets, as well as the
selection of suitable evaluation metrics are crucial. Methods: In this work, we
present an openly available web-based application that enables interactive
exploration of dataset partitions. The proposed visual framework facilitates
the assessment of dataset splits for surgical workflow recognition, especially
with regard to identifying sub-optimal dataset splits. Currently, it supports
visualization of surgical phase and instrument annotations. Results: In order
to validate the dedicated interactive visualizations, we use a dataset split of
the Cholec80 dataset. This dataset split was specifically selected to reflect a
case of strong data imbalance. Using our software, we were able to identify
phases, phase transitions, and combinations of surgical instruments that were
not represented in one of the sets. Conclusion: In order to obtain meaningful
results in highly unbalanced class distributions, special care should be taken
with respect to the selection of an appropriate split. Interactive data
visualization represents a promising approach for the assessment of machine
learning datasets. The source code is available at
https://github.com/Cardio-AI/endovis-ml
Comment: Accepted at the 14th International Conference on Information
Processing in Computer-Assisted Interventions (IPCAI 2023); 9 pages, 4
figures, 1 tabl
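The kind of check described in this abstract (phases or phase transitions missing from one of the sets) can be sketched in a few lines. This is a minimal illustration only, not the code of the linked tool: the function names, the dictionary layout, and the toy phase labels are all assumptions made for the example.

```python
def phase_transitions(phases):
    """Return the set of (previous, next) pairs where the phase label changes."""
    return {(a, b) for a, b in zip(phases, phases[1:]) if a != b}


def coverage_gaps(splits):
    """Report phases and transitions that a split never sees.

    `splits` (hypothetical layout) maps a split name to a list of videos;
    each video is a list of per-frame phase labels.
    """
    phases_by_split = {
        name: {label for video in videos for label in video}
        for name, videos in splits.items()
    }
    transitions_by_split = {
        name: set().union(*(phase_transitions(video) for video in videos))
        for name, videos in splits.items()
    }
    # Everything that occurs anywhere in the dataset.
    all_phases = set().union(*phases_by_split.values())
    all_transitions = set().union(*transitions_by_split.values())
    return {
        name: {
            "phases": sorted(all_phases - phases_by_split[name]),
            "transitions": sorted(all_transitions - transitions_by_split[name]),
        }
        for name in splits
    }


# Toy labels, invented for illustration.
example_splits = {
    "train": [["prep", "dissect", "clip", "clean"]],
    "test": [["prep", "dissect", "clean"]],
}
gaps = coverage_gaps(example_splits)
```

In this toy case the test split never contains the "clip" phase, and the training split never contains the direct "dissect" to "clean" transition. The same set-difference idea would extend to combinations of instruments.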
Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features
Recognizing the phases of a laparoscopic surgery (LS) operation from its
video constitutes a fundamental step for efficient content representation,
indexing and retrieval in surgical video databases. In the literature, most
techniques focus on phase segmentation of the entire LS video using
hand-crafted visual features, instrument usage signals, and, more recently,
convolutional neural networks (CNNs). In this paper we address the problem of
phase recognition of short video shots (10s) of the operation, without
utilizing information about the preceding/forthcoming video frames, their phase
labels or the instruments used. We investigate four state-of-the-art CNN
architectures (AlexNet, VGG19, GoogLeNet, and ResNet101) for feature
extraction via transfer learning. Visual saliency was employed for selecting
the most informative region of the image as input to the CNN. Video shot
representation was based on two temporal pooling mechanisms. Most importantly,
we investigate the role of 'elapsed time' (from the beginning of the
operation), and we show that including this feature raises mean accuracy
from 69% to 75%. Finally, a long short-term memory
(LSTM) network was trained for video shot classification based on the fusion of
CNN features with 'elapsed time', increasing the accuracy to 86%. Our results
highlight the prominent role of visual saliency, long-range temporal recursion
and 'elapsed time' (a feature so far ignored) for surgical phase recognition.
Comment: 6 pages, 4 figures, 6 table
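The idea of pooling per-frame CNN features over a shot and fusing the result with 'elapsed time' can be sketched as follows. The abstract does not name the two pooling mechanisms, so the mean and max pooling used here are assumptions, as are the function name and the choice to express elapsed time in minutes.

```python
def pool_shot(frame_features, elapsed_seconds, pooling="mean"):
    """Collapse per-frame feature vectors into one shot descriptor.

    The elapsed time since the start of the operation is appended as one
    extra feature (scaled to minutes here, an arbitrary choice).
    """
    if not frame_features:
        raise ValueError("a shot needs at least one frame")
    # Transpose so that each tuple holds one feature dimension across frames.
    columns = list(zip(*frame_features))
    if pooling == "mean":
        pooled = [sum(column) / len(column) for column in columns]
    elif pooling == "max":
        pooled = [max(column) for column in columns]
    else:
        raise ValueError(f"unknown pooling: {pooling}")
    return pooled + [elapsed_seconds / 60.0]


# Two frames with two features each, taken 120 s into the operation.
descriptor = pool_shot([[1.0, 4.0], [3.0, 2.0]], elapsed_seconds=120.0)
```

The resulting vector is what a downstream classifier would consume. In the paper that role is played first by a classifier on pooled features and then by an LSTM, neither of which is reproduced here.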
DeepPhase: Surgical Phase Recognition in CATARACTS Videos
Automated surgical workflow analysis and understanding can help surgeons
standardize procedures and can enhance post-surgical assessment, indexing, and
interventional monitoring. Computer-assisted interventional (CAI)
systems based on video can perform workflow estimation through surgical
instruments' recognition while linking them to an ontology of procedural
phases. In this work, we adopt a deep learning paradigm to detect surgical
instruments in cataract surgery videos which in turn feed a surgical phase
inference recurrent network that encodes temporal aspects of phase steps within
the phase classification. Our models achieve results comparable to the state
of the art for surgical tool detection and phase recognition, with accuracies
of 99% and 78%, respectively.
Comment: 8 pages, 3 figures, 1 table, MICCAI 201
Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery
The visual-question localized-answering (VQLA) system can serve as a
knowledgeable assistant in surgical education. Besides providing text-based
answers, the VQLA system can highlight the region of interest for better
surgical scene understanding. However, deep neural networks (DNNs) suffer from
catastrophic forgetting when learning new knowledge. Specifically, when DNNs
learn on incremental classes or tasks, their performance on old tasks drops
dramatically. Furthermore, due to medical data privacy and licensing issues, it
is often difficult to access old data when updating continual learning (CL)
models. Therefore, we develop a non-exemplar continual surgical VQLA framework,
to explore and balance the rigidity-plasticity trade-off of DNNs in a
sequential learning paradigm. We revisit the distillation loss in CL tasks, and
propose rigidity-plasticity-aware distillation (RP-Dist) and self-calibrated
heterogeneous distillation (SH-Dist) to preserve the old knowledge. The weight
aligning (WA) technique is also integrated to adjust the weight bias between
old and new tasks. We further establish a CL framework on three public surgical
datasets, in surgical settings where old and new surgical VQLA tasks share
overlapping classes. With extensive experiments, we demonstrate that our
proposed method reconciles learning and forgetting in continual surgical VQLA
better than conventional CL methods. Our code is publicly accessible.
Comment: To appear in MICCAI 2023. Code availability:
https://github.com/longbai1006/CS-VQL
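For readers unfamiliar with the distillation loss that this abstract revisits, the standard temperature-softened form can be sketched as below. This is the generic knowledge-distillation loss, not the RP-Dist or SH-Dist variants proposed in the paper; the function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(p_old || p_new) on softened outputs, scaled by temperature squared.

    `old_logits` come from the frozen model trained on earlier tasks and
    `new_logits` from the model being updated, both for the same input.
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0)
    return temperature ** 2 * kl


unchanged = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
drifted = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The loss is zero when the updated model reproduces the old model's outputs and grows as the outputs drift apart, which is how it penalizes forgetting without access to old data.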
Computer Vision in the Surgical Operating Room
Background: Multiple types of surgical cameras are used in modern surgical practice and provide a rich visual signal that surgeons use to visualize the clinical site and make clinical decisions. This signal can also be used by artificial intelligence (AI) methods to support the identification of instruments, structures, or activities, both in real time during procedures and postoperatively for analytics and understanding of surgical processes. Summary: In this paper, we provide a succinct perspective on the use of AI, and especially computer vision, to power solutions for the surgical operating room (OR). The synergy between data availability and technical advances in computational power and AI methodology has led to rapid developments in the field and promising advances. Key Messages: With the increasing availability of surgical video sources and the convergence of technologies around video storage, processing, and understanding, we believe clinical solutions and products leveraging vision are going to become an important component of modern surgical capabilities. However, both technical and clinical challenges remain to be overcome before vision-based approaches can be used efficiently in the clinic.
Surgical data science, an emerging field of medicine
Computer Assisted Surgery (CAS) significantly changed the course of interventional medicine. The development of medical imaging opened up the possibility of accurate, patient-specific planning, and advanced imaging techniques provided the ground for the development of real-time navigation systems. The advancement of minimally invasive surgical techniques and tools required increasing dexterity from the surgeon, which facilitated the development of tele-robotic manipulation. These systems provide a vast amount of objective intra-operative data, so many believe that the next step could be big data analysis for creating and evaluating surgical process models. This emerging field of medicine, called Surgical Data Science, has the potential to improve interventional medicine with objective statistical analysis, and therefore to provide better patient outcomes and a reduction in healthcare costs.