    Polyp Localization and Segmentation in Colonoscopy Images by Means of a Model of Appearance for Polyps

    Advisors: F. Javier Sánchez, Fernando Vilariño. Date and location of PhD thesis defense: 17 December 2012, Autonomous University of Barcelona.

    Colorectal cancer is the fourth most common cause of cancer death worldwide, and its survival rate depends on the stage at which it is detected, hence the need for early colon screening. Several screening techniques exist, but colonoscopy remains the gold standard, although it has drawbacks such as its miss rate. Our contribution to the field of intelligent systems for colonoscopy is a polyp localization system and a polyp segmentation system, both based on a model of appearance for polyps that describes a polyp as a region enclosed by intensity valleys. The novelty of our contribution is that the model incorporates aspects of image formation and accounts for other elements of the endoluminal scene, such as specular highlights and blood vessels, which affect the performance of our methods. For polyp localization we accumulate valley information to generate energy maps, which are also used to guide polyp segmentation. Our methods achieve promising results in both polyp localization and segmentation. To explore their usability, we present a comparative analysis between physicians' fixations, obtained with an eye-tracking device, and the output of our polyp localization method. The results show that our method is indistinguishable from novice physicians, although it is still far from expert physicians.
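
    The idea of accumulating valley information into an energy map can be illustrated with a minimal sketch. It is not the thesis's actual method: the valley test, the ray-casting accumulation, and the names `valley_mask`, `energy_map`, `margin` and `max_radius` are all assumptions made for illustration. A pixel scores one point for each of eight directions in which a ray meets a valley pixel, so points enclosed by valleys score highest.

```python
# Illustrative sketch only; the valley test and the accumulation rule are
# assumptions, not the method described in the thesis.

# Four axes through a pixel, given as (row step, column step).
AXES = [(1, 0), (0, 1), (1, 1), (1, -1)]
# Eight ray directions: each axis and its opposite.
DIRECTIONS = AXES + [(-dr, -dc) for dr, dc in AXES]


def valley_mask(image, margin=10):
    """Mark pixels darker than both opposite neighbours along some axis."""
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = image[r][c]
            mask[r][c] = any(
                image[r + dr][c + dc] - value >= margin
                and image[r - dr][c - dc] - value >= margin
                for dr, dc in AXES
            )
    return mask


def energy_map(mask, max_radius=10):
    """Count, per pixel, the directions in which a ray meets a valley."""
    rows, cols = len(mask), len(mask[0])
    energy = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in DIRECTIONS:
                for step in range(1, max_radius + 1):
                    rr, cc = r + dr * step, c + dc * step
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        break  # ray left the image without meeting a valley
                    if mask[rr][cc]:
                        energy[r][c] += 1
                        break  # only the first valley along a ray counts
    return energy
```

    On a bright synthetic image containing a dark square outline, the centre of the square is reached by valleys from all eight directions, whereas an image corner is not.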

    Comparative Validation of Polyp Detection Methods in Video Colonoscopy: Results from the MICCAI 2015 Endoscopic Vision Challenge

    Colonoscopy is the gold standard for colon cancer screening, though some polyps are still missed, thus preventing early disease detection and treatment. Several computational systems have been proposed to assist polyp detection during colonoscopy, but so far without consistent evaluation. The lack of publicly available annotated databases has made it difficult to compare methods and to assess whether they achieve performance levels acceptable for clinical use. The Automatic Polyp Detection subchallenge, conducted as part of the Endoscopic Vision Challenge (http://endovis.grand-challenge.org) at the international conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2015, was an effort to address this need. In this paper, we report the results of this comparative evaluation of polyp detection methods and describe additional experiments that further explore the differences between methods. We define performance metrics and provide evaluation databases that allow comparison of multiple methodologies. Results show that convolutional neural networks (CNNs) are the state of the art. Nevertheless, combining different methodologies can lead to improved overall performance.
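
    The abstract does not spell out the performance metrics it defines. As a hedged illustration, the sketch below computes precision, recall and F1 from per-frame counts. The counting rule used here is an assumption for the example: a detection counts as a true positive when its point falls inside the annotated polyp mask, and each frame holds at most one polyp. The function names are hypothetical.

```python
# Illustrative sketch; the counting rule is an assumption, not the
# challenge's official protocol.

def count_frame(detections, polyp_mask):
    """Return (tp, fp, fn) for one frame.

    detections: list of (row, col) points reported by a method.
    polyp_mask: 2D list of booleans, True where the polyp is annotated.
    """
    hits = sum(1 for r, c in detections if polyp_mask[r][c])
    has_polyp = any(any(row) for row in polyp_mask)
    tp = 1 if hits else 0              # several hits on one polyp count once
    fp = len(detections) - hits        # every point outside the mask
    fn = 1 if has_polyp and not hits else 0
    return tp, fp, fn


def detection_scores(tp, fp, fn):
    """Precision, recall and F1 from aggregated counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

    Counts from all frames of a video would be summed before calling `detection_scores`, so that frames with many detections weigh more than empty ones.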

    A Benchmark for endoluminal scene segmentation of colonoscopy images

    Colorectal cancer (CRC) is the third leading cause of cancer death worldwide. Currently, the standard approach to reducing CRC-related mortality is regular screening in search of polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are the polyp miss rate and the inability to visually assess polyp malignancy. These drawbacks can be reduced by designing decision support systems (DSS) that help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. In this paper, we therefore introduce an extended benchmark for colonoscopy image segmentation, with the hope of establishing a strong new benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 classes relevant to inspecting the endoluminal scene, targeting different clinical needs. Together with the dataset, and taking advantage of advances in the semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCNs). We perform a comparative study showing that FCNs, without any further postprocessing, significantly outperform prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.
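
    Comparisons of multi-class segmentation results are commonly reported as per-class intersection over union (IoU). The abstract does not name its metric, so the sketch below is only an assumed example; the integer class labels and the function name `per_class_iou` are hypothetical.

```python
# Illustrative sketch; the choice of IoU and the integer labels are
# assumptions, not details taken from the paper.

def per_class_iou(pred, target, num_classes):
    """Per-class IoU between two label maps given as 2D lists of ints.

    Returns a list with one entry per class; the entry is None when the
    class appears in neither map.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                intersection[p] += 1
                union[p] += 1
            else:
                # The pixel belongs to the union of both classes involved.
                union[p] += 1
                union[t] += 1
    return [i / u if u else None for i, u in zip(intersection, union)]
```

    Averaging the non-None entries gives a mean IoU over the classes present.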

    Deep Learning-based Solutions to Improve Diagnosis in Wireless Capsule Endoscopy

    Deep Learning (DL) models have gained extensive attention due to their remarkable performance in a wide range of real-world applications, particularly in computer vision. This achievement, combined with the increase in available medical records, has opened up new opportunities for analyzing and interpreting healthcare data. Together they can enhance the diagnostic process by identifying abnormalities, patterns, and trends, resulting in more precise, personalized, and effective healthcare for patients. Wireless Capsule Endoscopy (WCE) is a non-invasive medical imaging technique used to visualize the entire gastrointestinal (GI) tract. At present, physicians meticulously review the captured frames to identify pathologies and diagnose patients. This manual process is time-consuming and prone to errors because WCE studies are complex to interpret; it therefore demands a high level of attention, expertise, and experience. To overcome these drawbacks, shorten the screening process, and improve diagnosis, efficient and accurate DL methods are required. This thesis proposes DL solutions to the following problems encountered in the analysis of WCE studies: pathology detection, anatomical landmark identification, and Out-of-Distribution (OOD) sample handling. These solutions aim to achieve robust systems that minimize the duration of the video analysis and reduce the number of undetected lesions. During their development, several DL drawbacks appeared, including small and imbalanced datasets. These limitations have also been addressed so that they do not hinder the generalization of the neural networks, which would otherwise lead to suboptimal performance and overfitting. To address the WCE problems above and overcome the DL challenges, the proposed systems adopt various strategies that exploit the advantages of Triplet Loss (TL) and Self-Supervised Learning (SSL) techniques. Mainly, TL has been used to improve the generalization of the models, while SSL methods have been employed to leverage unlabeled data to obtain useful representations. The presented methods achieve state-of-the-art results in the aforementioned medical problems and contribute to the ongoing research to improve the diagnosis of WCE studies.
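
    For readers unfamiliar with Triplet Loss, the standard margin-based form can be sketched in a few lines. This is the textbook formulation, not the thesis's specific training setup; the Euclidean distance, the default `margin` of 1.0 and the function names are assumptions (some formulations use squared distances instead).

```python
import math

# Illustrative sketch of the standard margin-based triplet loss; the
# distance and the default margin are assumptions for the example.

def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(
        0.0,
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin,
    )
```

    The loss is zero once the negative sample is at least `margin` farther from the anchor than the positive sample, so only triplets that violate this ordering contribute to training.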

    A Study of Deep Learning Methods for Improving Clinical Performance: Application to Colonoscopy Diagnosis and Robotic Surgical Skill Assessment

    Doctoral thesis (Ph.D.), Seoul National University Graduate School, College of Engineering, Interdisciplinary Program in Biomedical Engineering, August 2020. 김희찬.

    This thesis presents deep learning-based methods for improving the performance of clinicians. Novel methods were applied to the following two clinical cases and the results were evaluated.

    In the first study, a deep learning-based polyp classification algorithm was developed to improve the clinical performance of endoscopists during colonoscopy diagnosis. Colonoscopy is the main method for diagnosing adenomatous polyps, which can develop into colorectal cancer, and hyperplastic polyps. The classification algorithm was developed using a convolutional neural network (CNN) trained with colorectal polyp images taken by narrow-band imaging colonoscopy. The proposed method is built around automatic machine learning (AutoML), which searches for the optimal CNN architecture for colorectal polyp image classification and trains the weights of that architecture. In addition, the gradient-weighted class activation mapping technique was used to overlay the probabilistic basis of the prediction on the polyp location as a visual aid for endoscopists. To verify the improvement in diagnostic performance, the performance of endoscopists with varying proficiency levels was compared with and without the aid of the proposed polyp classification algorithm. The results confirmed that, on average, diagnostic accuracy improved and diagnosis time shortened significantly in all proficiency groups.

    In the second study, a surgical instrument tracking algorithm for robotic surgery video was developed, and a model was proposed for quantitatively evaluating a surgeon's surgical skill based on the acquired motion information of the surgical instruments. The movement of surgical instruments is the main component in the evaluation of surgical skill. The focus of this study was therefore to develop an automatic surgical instrument tracking algorithm and to overcome the limitations of previous methods. An instance segmentation framework was developed to solve the instrument occlusion issue, and a tracking framework composed of a tracker and a re-identification algorithm was developed to maintain the type of each surgical instrument being tracked in the video. In addition, algorithms for detecting the tip position of the instruments and the arm-indicator were developed to acquire instrument movement in a way suited to robotic surgery video. The performance of the proposed method was evaluated by measuring the difference between the predicted tip position and the ground-truth position of the instruments using root mean square error, area under the curve, and Pearson's correlation analysis. Furthermore, motion metrics were calculated from the movement of the surgical instruments, and a machine learning-based robotic surgical skill evaluation model was developed from these metrics. The developed evaluation models were used to evaluate clinicians, and their results were similar to those of the Objective Structured Assessment of Technical Skill (OSATS) and the Global Evaluative Assessment of Robotic Surgery (GEARS) evaluation methods.

    In this thesis, deep learning technology was applied to colorectal polyp images for polyp classification and to robotic surgery videos for surgical instrument tracking. The improvement in clinical performance with the aid of these methods was evaluated and verified, and the proposed methods are expected to become alternatives to the diagnostic and evaluation methods currently used in clinical practice.

    Table of contents:
    Chapter 1 General Introduction: 1.1 Deep Learning for Medical Image Analysis; 1.2 Deep Learning for Colonoscopic Diagnosis; 1.3 Deep Learning for Robotic Surgical Skill Assessment; 1.4 Thesis Objectives.
    Chapter 2 Optical Diagnosis of Colorectal Polyps using Deep Learning with Visual Explanations: 2.1 Introduction (2.1.1 Background; 2.1.2 Needs; 2.1.3 Related Work); 2.2 Methods (2.2.1 Study Design; 2.2.2 Dataset; 2.2.3 Preprocessing; 2.2.4 Convolutional Neural Networks (CNN): Standard CNN, Search for CNN Architecture, Searched CNN Training, Visual Explanation; 2.2.5 Evaluation of CNN and Endoscopist Performances); 2.3 Experiments and Results (2.3.1 CNN Performance; 2.3.2 Results of Visual Explanation; 2.3.3 Endoscopist with CNN Performance); 2.4 Discussion (2.4.1 Research Significance; 2.4.2 Limitations); 2.5 Conclusion.
    Chapter 3 Surgical Skill Assessment during Robotic Surgery by Deep Learning-based Surgical Instrument Tracking: 3.1 Introduction (3.1.1 Background; 3.1.2 Needs; 3.1.3 Related Work); 3.2 Methods (3.2.1 Study Design; 3.2.2 Dataset; 3.2.3 Instance Segmentation Framework; 3.2.4 Tracking Framework: Tracker, Re-identification; 3.2.5 Surgical Instrument Tip Detection; 3.2.6 Arm-Indicator Recognition; 3.2.7 Surgical Skill Prediction Model); 3.3 Experiments and Results (3.3.1 Performance of Instance Segmentation Framework; 3.3.2 Performance of Tracking Framework; 3.3.3 Evaluation of Surgical Instruments Trajectory; 3.3.4 Evaluation of Surgical Skill Prediction Model); 3.4 Discussion (3.4.1 Research Significance; 3.4.2 Limitations); 3.5 Conclusion.
    Chapter 4 Summary and Future Works: 4.1 Thesis Summary; 4.2 Limitations and Future Works.
    Bibliography. Abstract in Korean. Acknowledgement.
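
    Two of the evaluation measures named in the abstract, root mean square error and Pearson's correlation, can be sketched as follows. The data layout (lists of 2D pixel coordinates and per-frame scalar series) and the function names are assumptions for illustration, not the thesis's evaluation code.

```python
import math

# Illustrative sketch; coordinate layout and function names are assumed.

def rmse(predicted, truth):
    """Root mean square Euclidean error between two lists of (x, y) points."""
    squared = [
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(predicted, truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

    `pearson` is undefined for a constant series, because the denominator is then zero; a caller would need to handle that case separately.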

    Colonoscopy Image Pre-Processing for the Development of Computer-Aided Diagnostic Tools

    Colorectal cancer is the third most frequently diagnosed cancer worldwide. The American Cancer Society estimates that in 2016 almost 100,000 new patients will be diagnosed with colorectal cancer and that around 50,000 people will die as a consequence of it. The increase in life expectancy and in the number of diagnostic tests conducted has had a great impact on the number of cancers being detected. Among diagnostic tools, colonoscopy is the most prevalent. To help endoscopists cope with the increasing number of tests that have to be carried out, there is a need for automated tools that aid diagnosis. The characteristics of the colon make pre-processing essential to eliminate artefacts that degrade the quality of exploratory images. The goal of this chapter is to describe the most common issues in colonoscopic imagery, as well as the existing methods for their detection and correction.