CholecTrack20: A Dataset for Multi-Class Multiple Tool Tracking in Laparoscopic Surgery
Tool tracking in surgical videos is vital in computer-assisted intervention
for tasks like surgeon skill assessment, safety zone estimation, and
human-machine collaboration during minimally invasive procedures. The lack of
large-scale datasets hampers Artificial Intelligence implementation in this
domain. Current datasets exhibit overly generic tracking formalization, often
lacking surgical context: a deficiency that becomes evident when tools move out
of the camera's scope, resulting in rigid trajectories that hinder realistic
surgical representation. This paper addresses the need for a more precise and
adaptable tracking formalization tailored to the intricacies of endoscopic
procedures by introducing CholecTrack20, an extensive dataset meticulously
annotated for multi-class multi-tool tracking across three perspectives
representing the various ways of considering the temporal duration of a tool
trajectory: (1) intraoperative, (2) intracorporeal, and (3) visibility within
the camera's scope. The dataset comprises 20 laparoscopic videos with over
35,000 frames and 65,000 annotated tool instances with details on spatial
location, category, identity, operator, phase, and surgical visual conditions.
This detailed dataset caters to the evolving assistive requirements within a
procedure.
Comment: Surgical tool tracking dataset paper, 15 pages, 9 figures, 4 tables
CholecTriplet2021: A benchmark challenge for surgical action triplet recognition
Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging
real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a
coarse-grained level, such as phases, steps, or events, leaving out fine-grained interaction details about the
surgical activity; yet those details are needed for more helpful AI assistance in the operating room. Recognizing
surgical actions as triplets of 〈instrument, verb, target〉 combinations delivers more comprehensive details about
the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision
challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The
challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet
information. In this paper, we present the challenge setup and the assessment of the state-of-the-art deep
learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the
challenge organizers and 19 new deep learning algorithms from the competing teams are presented to recognize
surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from
4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches,
performs a thorough methodological comparison and in-depth result analysis, and proposes a novel
ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet
solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition,
which is of utmost importance for the development of AI in surgery.