420 research outputs found

    A deep learning framework for quality assessment and restoration in video endoscopy

    Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. Artifacts such as motion blur, bubbles, specular reflections, floating objects and pixel saturation impede the visual interpretation and the automated analysis of endoscopy videos. Given the widespread use of endoscopy in different clinical applications, we contend that the robust and reliable identification of such artifacts and the automated restoration of corrupted video frames is a fundamental medical imaging problem. Existing state-of-the-art methods only deal with the detection and restoration of selected artifacts. However, endoscopy videos typically contain numerous artifacts, which motivates a comprehensive solution. We propose a fully automatic framework that can: 1) detect and classify six different primary artifacts, 2) provide a quality score for each frame and 3) restore mildly corrupted frames. To detect different artifacts, our framework exploits a fast multi-scale, single-stage convolutional neural network detector. We introduce a quality metric to assess frame quality and predict image restoration success. Generative adversarial networks with carefully chosen regularization are finally used to restore corrupted frames. Our detector yields the highest mean average precision (mAP at 5% threshold) of 49.0 and the lowest computational time of 88 ms, allowing for accurate real-time processing. Our restoration models for blind deblurring, saturation correction and inpainting demonstrate significant improvements over previous methods. On a set of 10 test videos we show that our approach preserves an average of 68.7% of frames, which is 25% more than are retained from the raw videos. Comment: 14 page

    Keyframe Extraction in Endoscopic Video

    In medical endoscopy, more and more surgeons archive the recorded video streams in long-term storage. One reason for this development, which is enforced by law in some countries, is to have evidence in case of lawsuits from patients. Another, more practical reason is to allow later inspection of previous procedures and to use parts of such videos for research and training. However, because of the large amount of video data recorded in a hospital every day, good preview images are essential for quickly filtering out undesired content and for browsing such a video archive. Unfortunately, common shot detection and keyframe extraction methods cannot be used for this video data, because the videos contain unedited and highly similar content, especially in terms of color and texture, and no shot boundaries at all. We propose a new keyframe extraction approach for this special video domain and show that our method is significantly better than a previously proposed approach.

    Exploiting Temporal Image Information in Minimally Invasive Surgery

    Minimally invasive procedures rely on medical imaging instead of the surgeon's direct vision. While preoperative images can be used for surgical planning and navigation, once the surgeon arrives at the target site real-time intraoperative imaging is needed. However, acquiring and interpreting these images can be challenging and much of the rich temporal information present in these images is not visible. The goal of this thesis is to improve image guidance for minimally invasive surgery in two main areas. First, by showing how high-quality ultrasound video can be obtained by integrating an ultrasound transducer directly into delivery devices for beating heart valve surgery. Second, by extracting hidden temporal information through video processing methods to help the surgeon localize important anatomical structures. Prototypes of delivery tools with integrated ultrasound imaging were developed for both transcatheter aortic valve implantation and mitral valve repair. These tools provided an on-site view that shows the tool-tissue interactions during valve repair. Additionally, augmented reality environments were used to add more anatomical context that aids in navigation and in interpreting the on-site video. Other procedures can be improved by extracting hidden temporal information from the intraoperative video. In ultrasound-guided epidural injections, dural pulsation provides a cue in finding a clear trajectory to the epidural space. By processing the video using extended Kalman filtering, subtle pulsations were automatically detected and visualized in real time. A statistical framework for analyzing periodicity was developed based on dynamic linear modelling. In addition to detecting dural pulsation in lumbar spine ultrasound, this approach was used to image tissue perfusion in natural video and generate ventilation maps from free-breathing magnetic resonance imaging.
    A second statistical method, based on spectral analysis of pixel intensity values, allowed blood flow to be detected directly from high-frequency B-mode ultrasound video. Finally, pulsatile cues in endoscopic video were enhanced through Eulerian video magnification to help localize critical vasculature. This approach shows particular promise in identifying the basilar artery in endoscopic third ventriculostomy and the prostatic artery in nerve-sparing prostatectomy. A real-time implementation was developed which processed full-resolution stereoscopic video on the da Vinci Surgical System.
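As a rough illustration of what "spectral analysis of pixel intensity values" can mean, the sketch below takes the intensity of one pixel over time and reports its strongest periodic component. It is a generic discrete Fourier transform in plain Python, not the statistical method developed in the thesis; the function name `dominant_frequency`, the frame rate and the synthetic signal are all assumptions made for the example.

```python
import cmath
import math


def dominant_frequency(samples, fps):
    """Return (frequency_hz, power_fraction) of the strongest non-DC component.

    samples: intensity of a single pixel, one value per video frame.
    fps:     frame rate of the video (assumed known).
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant (DC) part

    # Power of each DFT bin k = 1 .. n/2 (bin k corresponds to k * fps / n Hz).
    powers = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        powers.append(abs(coeff) ** 2)

    total = sum(powers)
    if total == 0:
        return 0.0, 0.0  # perfectly flat signal: no periodicity at all

    best = max(range(len(powers)), key=powers.__getitem__)
    return (best + 1) * fps / n, powers[best] / total


# Synthetic pixel: 3 seconds at 30 fps, pulsating once per second.
pulse = [100 + 5 * math.sin(2 * math.pi * 1.0 * t / 30) for t in range(90)]
freq_hz, fraction = dominant_frequency(pulse, fps=30)
```

A pixel whose `fraction` is high at a physiologically plausible `freq_hz` (for example near the heart rate) would be a candidate for pulsatile flow; any such threshold is an assumption that would need tuning on real data.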

    A Study on Deep Learning Techniques for Improving Clinical Skills: Application to Colonoscopy Diagnosis and Robotic Surgery Skill Assessment

    Doctoral thesis -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program, Biomedical Engineering major, August 2020. Kim Hee Chan. This thesis presents deep learning-based methods for improving the performance of clinicians. Novel methods were applied to the following two clinical cases and the results were evaluated. In the first study, a deep learning-based polyp classification algorithm was developed to improve the clinical performance of endoscopists during colonoscopy diagnosis. Colonoscopy is the main method for diagnosing adenomatous polyps, which can develop into colorectal cancer, and hyperplastic polyps. The classification algorithm was developed using a convolutional neural network (CNN) trained with colorectal polyp images taken by narrow-band imaging colonoscopy. The proposed method is built around automatic machine learning (AutoML), which searches for the optimal CNN architecture for colorectal polyp image classification and trains the weights of that architecture. In addition, the gradient-weighted class activation mapping technique was used to overlay the probabilistic basis of the prediction result on the polyp location to aid the endoscopists visually. To verify the improvement in diagnostic performance, the efficacy of endoscopists with varying proficiency levels was compared with and without the aid of the proposed polyp classification algorithm. The results confirmed that, on average, diagnostic accuracy improved and diagnosis time shortened significantly in all proficiency groups. In the second study, a surgical instrument tracking algorithm for robotic surgery video was developed, and a model was proposed for quantitatively evaluating the surgeon's surgical skill based on the acquired motion information of the surgical instruments. The movement of surgical instruments is the main component in the evaluation of surgical skill.
    Therefore, the focus of this study was to develop an automatic surgical instrument tracking algorithm and to overcome the limitations of previous methods. An instance segmentation framework was developed to solve the instrument occlusion issue, and a tracking framework composed of a tracker and a re-identification algorithm was developed to maintain the type of each surgical instrument being tracked in the video. In addition, algorithms for detecting the tip position of instruments and the arm-indicator were developed to acquire the movement of devices specific to robotic surgery video. The performance of the proposed method was evaluated by measuring the difference between the predicted tip position and the ground truth position of the instruments using root mean square error, area under the curve, and Pearson's correlation analysis. Furthermore, motion metrics were calculated from the movement of surgical instruments, and a machine learning-based robotic surgical skill evaluation model was developed based on these metrics. These models were used to evaluate clinicians, and the results of the developed evaluation models were similar to those of the Objective Structured Assessment of Technical Skill (OSATS) and the Global Evaluative Assessment of Robotic Surgery (GEARS) evaluation methods. In this study, deep learning technology was applied to colorectal polyp images for polyp classification, and to robotic surgery videos for surgical instrument tracking. The improvement in clinical performance with the aid of these methods was evaluated and verified, and the proposed methods are expected to become an alternative to the diagnostic and evaluation methods currently used in clinical practice.

    Contents:
    Chapter 1 General Introduction
        1.1 Deep Learning for Medical Image Analysis
        1.2 Deep Learning for Colonoscopic Diagnosis
        1.3 Deep Learning for Robotic Surgical Skill Assessment
        1.4 Thesis Objectives
    Chapter 2 Optical Diagnosis of Colorectal Polyps using Deep Learning with Visual Explanations
        2.1 Introduction
            2.1.1 Background
            2.1.2 Needs
            2.1.3 Related Work
        2.2 Methods
            2.2.1 Study Design
            2.2.2 Dataset
            2.2.3 Preprocessing
            2.2.4 Convolutional Neural Networks (CNN)
                2.2.4.1 Standard CNN
                2.2.4.2 Search for CNN Architecture
                2.2.4.3 Searched CNN Training
                2.2.4.4 Visual Explanation
            2.2.5 Evaluation of CNN and Endoscopist Performances
        2.3 Experiments and Results
            2.3.1 CNN Performance
            2.3.2 Results of Visual Explanation
            2.3.3 Endoscopist with CNN Performance
        2.4 Discussion
            2.4.1 Research Significance
            2.4.2 Limitations
        2.5 Conclusion
    Chapter 3 Surgical Skill Assessment during Robotic Surgery by Deep Learning-based Surgical Instrument Tracking
        3.1 Introduction
            3.1.1 Background
            3.1.2 Needs
            3.1.3 Related Work
        3.2 Methods
            3.2.1 Study Design
            3.2.2 Dataset
            3.2.3 Instance Segmentation Framework
            3.2.4 Tracking Framework
                3.2.4.1 Tracker
                3.2.4.2 Re-identification
            3.2.5 Surgical Instrument Tip Detection
            3.2.6 Arm-Indicator Recognition
            3.2.7 Surgical Skill Prediction Model
        3.3 Experiments and Results
            3.3.1 Performance of Instance Segmentation Framework
            3.3.2 Performance of Tracking Framework
            3.3.3 Evaluation of Surgical Instruments Trajectory
            3.3.4 Evaluation of Surgical Skill Prediction Model
        3.4 Discussion
            3.4.1 Research Significance
            3.4.2 Limitations
        3.5 Conclusion
    Chapter 4 Summary and Future Works
        4.1 Thesis Summary
        4.2 Limitations and Future Works
    Bibliography
    Abstract in Korean
    Acknowledgement
    Docto
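Two of the evaluation measures named in this abstract, root mean square error between predicted and ground-truth tip positions and Pearson's correlation, can be sketched in a few lines of plain Python. The function names, the 2D pixel-coordinate representation and the sample values are assumptions for illustration only; the thesis does not specify them here.

```python
import math


def tip_rmse(predicted, ground_truth):
    """Root mean square Euclidean error between paired (x, y) tip positions."""
    squared = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical tip positions (pixels) for two frames.
predicted = [(0.0, 0.0), (3.0, 4.0)]
ground_truth = [(0.0, 0.0), (0.0, 0.0)]
error = tip_rmse(predicted, ground_truth)  # one frame is off by 5 px

# Hypothetical x-coordinates of a predicted and a reference trajectory.
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

RMSE summarizes how far the predicted tip is from the annotation on average, while the correlation indicates whether the predicted trajectory follows the same trend as the reference even when there is a constant offset.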

    Efficient tool segmentation for endoscopic videos in the wild

    In recent years, deep learning methods have become the most effective approach for tool segmentation in endoscopic images, achieving the state of the art on the available public benchmarks. However, these methods present some challenges that hinder their direct deployment in real-world scenarios. This work explores how to solve two of the most common challenges: real-time and memory restrictions, and false positives in frames with no tools. To cope with the first, we show how to adapt an efficient general-purpose semantic segmentation model. We then study the common issue of training only on images that contain at least one tool: when images of endoscopic procedures without tools are processed, such models produce many false positives. To solve this, we propose adding an extra classification head that performs binary frame classification to identify frames with no tools present. Finally, we present a thorough comparison of this approach with the current state of the art on different benchmarks, including real medical practice recordings, demonstrating similar accuracy with much lower computational requirements.
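The idea of an extra binary classification head can be pictured as a gate on the segmentation output: if the frame-level head says no tool is present, the mask is emptied. The sketch below shows only that inference-time gating in plain Python; the function name `gated_tool_mask`, both thresholds and the example probabilities are assumptions, and the abstract does not say how the authors combine the two outputs.

```python
def gated_tool_mask(pixel_probs, frame_tool_prob, pixel_thr=0.5, frame_thr=0.5):
    """Binarize per-pixel tool probabilities, gated by a frame-level decision.

    pixel_probs:     2D list of per-pixel tool probabilities (segmentation head).
    frame_tool_prob: probability that the frame contains any tool
                     (binary classification head).
    """
    if frame_tool_prob < frame_thr:
        # The classifier says "no tool in this frame": suppress every pixel,
        # which removes the false positives the segmentation head may produce.
        return [[0] * len(row) for row in pixel_probs]
    return [[1 if p >= pixel_thr else 0 for p in row] for row in pixel_probs]


# Hypothetical 2x3 probability map with a few spurious high responses.
probs = [[0.9, 0.2, 0.6],
         [0.1, 0.7, 0.4]]

with_tool = gated_tool_mask(probs, frame_tool_prob=0.95)
without_tool = gated_tool_mask(probs, frame_tool_prob=0.05)
```

The gate costs one comparison per frame, so it fits the real-time and memory constraints that the abstract emphasizes.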

    A comprehensive survey on recent deep learning-based methods applied to surgical data

    Minimally invasive surgery is highly operator-dependent, with lengthy procedural times that cause fatigue for the surgeon and risks to patients such as injury to organs, infection, bleeding, and complications of anesthesia. To mitigate such risks, real-time systems that can provide intra-operative guidance to surgeons are needed. For example, an automated system for tool localization, tool (or tissue) tracking, and depth estimation can enable a clear understanding of surgical scenes, preventing miscalculations during surgical procedures. In this work, we present a systematic review of recent machine learning-based approaches including surgical tool localization, segmentation, tracking, and 3D scene perception. Furthermore, we provide a detailed overview of publicly available benchmark datasets widely used for surgical navigation tasks. While recent deep learning architectures have shown promising results, there are still several open research problems such as a lack of annotated datasets, the presence of artifacts in surgical scenes, and non-textured surfaces that hinder 3D reconstruction of the anatomical structures. Based on our comprehensive review, we present a discussion on current gaps and the steps needed to improve the adoption of technology in surgery. Comment: This paper is to be submitted to International journal of computer visio

    Stereoscopic Medical Data Video Quality Issues

    Stereoscopic medical videos are recorded, e.g., in stereo endoscopy or during video recording of medical/dental operations. This paper examines quality issues in recorded stereoscopic medical videos, as insufficient quality may induce visual fatigue in doctors. No attention has been paid to stereo quality and ensuing fatigue issues in the scientific literature so far. Two of the most commonly encountered quality issues in stereoscopic data, namely stereoscopic window violations and bent windows, were searched for in stereo endoscopic medical videos. Furthermore, an additional stereo quality issue encountered in dental operation videos, namely excessive disparity, was detected and fixed. The conducted experiments prove the existence of such quality issues in stereoscopic medical data and highlight the need for their detection and correction.

    Surgical video retrieval using deep neural networks

    Although the amount of raw surgical videos, namely videos captured during surgical interventions, is growing fast, automatic retrieval and search remains a challenge. This is mainly due to the nature of the content, i.e. visually non-consistent tissue, diversity of internal organs, abrupt viewpoint changes and illumination variation. We propose a framework for retrieving surgical videos and a protocol for evaluating the results. The method is composed of temporal shot segmentation and representation based on deep features, and the protocol introduces novel criteria to the field. The experimental results prove the superiority of the proposed method and highlight the path towards a more effective protocol for evaluating surgical videos