Event Recognition in Laparoscopic Gynecology Videos with Hybrid Transformers
Analyzing laparoscopic surgery videos presents a complex and multifaceted
challenge, with applications including surgical training, intra-operative
surgical complication prediction, and post-operative surgical assessment.
Identifying crucial events within these videos is a prerequisite for most of
these applications. In this paper, we introduce a comprehensive
dataset tailored for relevant event recognition in laparoscopic gynecology
videos. Our dataset includes annotations for critical events associated with
major intra-operative challenges and post-operative complications. To validate
the precision of our annotations, we assess event recognition performance using
several CNN-RNN architectures. Furthermore, we introduce and evaluate a hybrid
transformer architecture coupled with a customized training-inference framework
to recognize four specific events in laparoscopic surgery videos. Leveraging
Transformer networks, our proposed architecture harnesses inter-frame
dependencies to counteract the adverse effects of relevant content occlusion,
motion blur, and surgical scene variation, thus significantly enhancing event
recognition accuracy. Moreover, we present a frame sampling strategy designed
to manage variations in surgical scenes and the surgeons' skill level,
resulting in event recognition with high temporal resolution. We empirically
demonstrate the superiority of our proposed methodology in event recognition
compared to conventional CNN-RNN architectures through a series of extensive
experiments.
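
The abstract does not spell out the frame sampling procedure, so the following is only an illustrative sketch of one common scheme, not the authors' method: split a clip into equal-width temporal segments and draw one random frame from each, so that the input sequence spans the whole clip while differing between training passes. The function name `sample_frame_indices`, its parameters, and the segment-based scheme itself are assumptions introduced for this example.

```python
import random


def sample_frame_indices(clip_start, clip_end, num_frames, rng=None):
    """Draw one frame index from each of `num_frames` equal-width segments.

    Hypothetical helper: the segment-based scheme is an assumption for
    illustration, not the sampling strategy described in the paper.
    The clip covers frame indices in the half-open range [clip_start, clip_end).
    """
    rng = rng or random.Random()
    length = clip_end - clip_start
    if num_frames <= 0 or length < num_frames:
        raise ValueError("clip must contain at least num_frames frames")

    indices = []
    for k in range(num_frames):
        # Integer segment boundaries; every segment holds at least one frame
        # because length >= num_frames.
        seg_start = clip_start + (k * length) // num_frames
        seg_end = clip_start + ((k + 1) * length) // num_frames
        indices.append(rng.randrange(seg_start, seg_end))
    return indices


# Example: 10 frames from a 90-frame clip starting at frame 100.
frame_ids = sample_frame_indices(100, 190, 10, random.Random(0))
```

Because the segments are disjoint and ordered, the returned indices are strictly increasing and can be fed to a sequence model in temporal order. Passing a seeded `random.Random` makes the draw reproducible.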