53 research outputs found

    Instance segmentation of upper aerodigestive tract cancer: site-specific outcomes

    Objective. To achieve instance segmentation of upper aerodigestive tract (UADT) neoplasms using a deep learning (DL) algorithm, and to identify differences in its diagnostic performance across three sites: larynx/hypopharynx, oral cavity, and oropharynx. Methods. A total of 1034 endoscopic images from 323 patients were examined under narrow band imaging (NBI). The Mask R-CNN algorithm was used for the analysis. The dataset was split into 935 training, 48 validation, and 51 testing images. The Dice Similarity Coefficient (Dsc) was the main outcome measure. Results. Instance segmentation was effective in 76.5% of images. The overall mean Dsc was 0.90 ± 0.05. The algorithm correctly predicted 77.8%, 86.7%, and 55.5% of lesions in the larynx/hypopharynx, oral cavity, and oropharynx, respectively. The mean Dsc was 0.90 ± 0.05 for the larynx/hypopharynx, 0.60 ± 0.26 for the oral cavity, and 0.81 ± 0.30 for the oropharynx. The analysis showed inferior diagnostic results in the oral cavity compared with the larynx/hypopharynx (p < 0.001). Conclusions. The study confirms the feasibility of instance segmentation of UADT neoplasms using DL algorithms and shows inferior diagnostic results in the oral cavity compared with the other anatomic areas.
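
    For reference, the Dsc reported above measures the overlap between a predicted lesion mask and its ground-truth annotation. A minimal sketch in Python/NumPy (the function name and example masks are illustrative, not from the paper):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dsc = 2 * |P & T| / (|P| + |T|), computed on boolean masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 1.0 if denom == 0 else 2.0 * intersection / denom

# Example: two masks of 4 foreground pixels each, overlapping on 3 pixels.
pred = np.zeros((4, 4), bool); pred[0, 0:4] = True
truth = np.zeros((4, 4), bool); truth[0, 1:4] = True; truth[1, 0] = True
print(dice_coefficient(pred, truth))  # 0.75
```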

    Artificial intelligence in clinical endoscopy: Insights in the field of videomics

    Artificial intelligence is increasingly seen as a useful tool in medicine. Specifically, these technologies aim to extract insights from complex datasets that cannot easily be analyzed with conventional statistical methods. While promising results have been obtained for various -omics datasets, radiological images, and histopathologic slides, the analysis of videoendoscopic frames still represents a major challenge. In this context, videomics is a burgeoning field in which computer vision methods are systematically used to organize the unstructured data contained in frames obtained during diagnostic videoendoscopy. Recent studies have focused on five broad tasks of increasing complexity: quality assessment of endoscopic images, classification of pathologic and nonpathologic frames, detection of lesions inside frames, segmentation of pathologic lesions, and in-depth characterization of neoplastic lesions. Herein, we present a broad overview of the field, with a focus on conceptual key points and future perspectives.
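
    As an illustration of the second task listed above (classifying frames as pathologic vs. nonpathologic), a minimal sketch in Python/PyTorch; the backbone choice, two-class head, and input shape are our assumptions, not details from the reviewed studies:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=None)           # in practice, use pretrained weights
model.fc = nn.Linear(model.fc.in_features, 2)   # {nonpathologic, pathologic}
model.eval()

@torch.no_grad()
def classify_frames(frames: torch.Tensor) -> torch.Tensor:
    """frames: (N, 3, 224, 224) preprocessed frames -> class id per frame."""
    return model(frames).argmax(dim=1)

preds = classify_frames(torch.randn(8, 3, 224, 224))  # e.g. a batch of frames
```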

    Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network

    The objective investigation of the dynamic properties of vocal fold vibrations demands the recording and quantitative analysis of laryngeal high-speed video (HSV). Quantification of the vocal fold vibration patterns requires, as a first step, segmentation of the glottal area within each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation. In this work we propose, for the first time, a procedure to fully automatically segment not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. The Dice Coefficient (DC) and the precision of four anatomical landmark positions were used as performance measures. Over all test data, a mean DC of 0.85 was obtained for the glottis, and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The method proposed here requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches. It thus also allows the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be provided freely to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
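
    To make the ConvLSTM idea concrete, a minimal sketch of a single ConvLSTM cell and a per-frame segmentation loop in Python/PyTorch; the layer sizes, single-cell depth, and class labels are assumptions for illustration, not the configuration evaluated in the paper:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """LSTM cell whose gates are convolutions, so the hidden state keeps
    the spatial layout of consecutive video frames."""
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution produces all four gates at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

def segment_sequence(frames, cell, head):
    """frames: (T, 1, H, W) grayscale HSV frames -> (T, n_classes, H, W)."""
    _, _, H, W = frames.shape
    h = torch.zeros(1, cell.hid_ch, H, W)
    c = torch.zeros_like(h)
    logits = []
    for t in range(frames.shape[0]):   # carry temporal context forward
        h, c = cell(frames[t:t + 1], h, c)
        logits.append(head(h))
    return torch.cat(logits)

cell = ConvLSTMCell(in_ch=1, hid_ch=8)
# Assumed classes: background, glottis, left vocal fold, right vocal fold.
head = nn.Conv2d(8, 4, kernel_size=1)
masks = segment_sequence(torch.randn(5, 1, 64, 64), cell, head).argmax(dim=1)
```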

    Multi-modal and multi-dimensional biomedical image data analysis using deep learning

    There is a growing need for computational methods and tools for automated, objective, and quantitative analysis of biomedical signal and image data to facilitate disease and treatment monitoring, early diagnosis, and scientific discovery. Recent advances in artificial intelligence and machine learning, particularly in deep learning, have revolutionized computer vision and image analysis in many application areas. While deep learning has been very successful at processing non-biomedical signal, image, and video data, high-stakes biomedical applications present unique challenges that must be addressed, such as differing image modalities, limited training data, and the need for explainability and interpretability. In this dissertation, we developed novel, explainable, attention-based deep learning frameworks for objective, automated, and quantitative analysis of biomedical signal, image, and video data. The proposed solutions involve multi-scale signal analysis for oral diadochokinesis studies; an ensemble of deep learning cascades using global soft attention mechanisms for segmentation of meningeal vascular networks in confocal microscopy; spatial attention and spatio-temporal data fusion for detection of rare and short-term video events in laryngeal endoscopy videos; and a novel discrete Fourier transform driven class activation map for explainable AI and weakly supervised object localization and segmentation for detailed vocal fold motion analysis using laryngeal endoscopy videos. Experiments on the proposed methods showed robust and promising results towards automated, objective, and quantitative analysis of biomedical data, which is of great value for potential early diagnosis and effective monitoring of disease progression or treatment.
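
    For context, a plain class activation map (CAM) is the building block behind such weakly supervised localization; the DFT-driven variant described above is not reproduced here. A minimal sketch in Python/NumPy, with shapes and names that are illustrative assumptions:

```python
import numpy as np

def class_activation_map(feature_maps: np.ndarray,
                         fc_weights: np.ndarray,
                         class_idx: int) -> np.ndarray:
    """feature_maps: (C, H, W) from the last conv layer; fc_weights:
    (n_classes, C) of the linear classifier after global average pooling.
    CAM_c = sum_k w[c, k] * A_k, rectified and rescaled to [0, 1]."""
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0.0)
    return cam / (cam.max() + 1e-8)
```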

    μž„μƒμˆ κΈ° ν–₯상을 μœ„ν•œ λ”₯λŸ¬λ‹ 기법 연ꡬ: λŒ€μž₯λ‚΄μ‹œκ²½ 진단 및 λ‘œλ΄‡μˆ˜μˆ  술기 평가에 적용

    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Interdisciplinary Program in Bioengineering, August 2020. Advisor: Hee Chan Kim. This thesis presents deep learning-based methods for improving the performance of clinicians. Novel methods were applied to the following two clinical cases and the results were evaluated. In the first study, a deep learning-based polyp classification algorithm was developed to improve the diagnostic performance of endoscopists during colonoscopy. Colonoscopy is the main method for diagnosing adenomatous polyps, which can develop into colorectal cancer, and hyperplastic polyps. The classification algorithm was developed using a convolutional neural network (CNN) trained with colorectal polyp images taken under narrow-band imaging colonoscopy. The proposed method is built around automatic machine learning (AutoML), which searches for the optimal CNN architecture for colorectal polyp image classification and trains the weights of that architecture. In addition, the gradient-weighted class activation mapping technique was used to overlay the probabilistic basis of the prediction result on the polyp location to aid endoscopists visually. To verify the improvement in diagnostic performance, the efficacy of endoscopists with varying proficiency levels was compared with and without the aid of the proposed polyp classification algorithm. The results confirmed that, on average, diagnostic accuracy was improved and diagnosis time was shortened significantly in all proficiency groups. In the second study, a surgical instrument tracking algorithm for robotic surgery video was developed, and a model for quantitatively evaluating a surgeon's skill based on the acquired motion information of the surgical instruments was proposed. The movement of surgical instruments is the main component of surgical skill evaluation. The focus of this study was therefore to develop an automatic surgical instrument tracking algorithm that overcomes the limitations of previous methods. An instance segmentation framework was developed to solve the instrument occlusion issue, and a tracking framework composed of a tracker and a re-identification algorithm was developed to maintain the identity of the surgical instruments being tracked in the video. In addition, algorithms for detecting the tip position of instruments and the arm-indicator were developed to capture the movement of devices specific to robotic surgery video. The performance of the proposed method was evaluated by measuring the difference between the predicted and ground truth tip positions of the instruments using root mean square error, area under the curve, and Pearson's correlation analysis. Furthermore, motion metrics were calculated from the movement of the surgical instruments, and a machine learning-based robotic surgical skill evaluation model was developed based on these metrics. These models were used to evaluate clinicians, and the results were consistent with the Objective Structured Assessment of Technical Skill (OSATS) and the Global Evaluative Assessment of Robotic Surgery (GEARS) evaluation methods. In this study, deep learning technology was applied to colorectal polyp images for polyp classification and to robotic surgery videos for surgical instrument tracking.
    The improvement in clinical performance achieved with the aid of these methods was evaluated and verified, and the proposed methods are expected to serve as alternatives to the diagnostic and evaluation methods currently used in clinical practice.
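
    A minimal sketch of the gradient-weighted class activation mapping (Grad-CAM) overlay used in the first study, in Python/PyTorch; the backbone and target layer here are stand-in assumptions, not the dissertation's searched architecture:

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()  # stand-in for the searched CNN
acts, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: acts.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

def grad_cam(image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """image: (1, 3, H, W) -> heatmap in [0, 1] of shape (H, W)."""
    logits = model(image)
    model.zero_grad()
    logits[0, class_idx].backward()
    a, g = acts["a"], grads["a"]                 # both (1, C, h, w)
    weights = g.mean(dim=(2, 3), keepdim=True)   # channel importance
    cam = F.relu((weights * a).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    return (cam / (cam.max() + 1e-8))[0, 0]

heatmap = grad_cam(torch.randn(1, 3, 224, 224), class_idx=1)
```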
λ³Έ 논문은 μ˜λ£Œμ§„μ˜ μž„μƒμˆ κΈ° λŠ₯λ ₯을 ν–₯μƒμ‹œν‚€κΈ° μœ„ν•˜μ—¬ λŒ€μž₯ μš©μ’… μ˜μƒκ³Ό λ‘œλ΄‡μˆ˜μˆ  λ™μ˜μƒμ— λ”₯λŸ¬λ‹ κΈ°μˆ μ„ μ μš©ν•˜κ³  κ·Έ μœ νš¨μ„±μ„ ν™•μΈν•˜μ˜€μœΌλ©°, ν–₯후에 μ œμ•ˆν•˜λŠ” 방법이 μž„μƒμ—μ„œ μ‚¬μš©λ˜κ³  μžˆλŠ” 진단 및 평가 λ°©λ²•μ˜ λŒ€μ•ˆμ΄ 될 κ²ƒμœΌλ‘œ κΈ°λŒ€ν•œλ‹€.Chapter 1 General Introduction 1 1.1 Deep Learning for Medical Image Analysis 1 1.2 Deep Learning for Colonoscipic Diagnosis 2 1.3 Deep Learning for Robotic Surgical Skill Assessment 3 1.4 Thesis Objectives 5 Chapter 2 Optical Diagnosis of Colorectal Polyps using Deep Learning with Visual Explanations 7 2.1 Introduction 7 2.1.1 Background 7 2.1.2 Needs 8 2.1.3 Related Work 9 2.2 Methods 11 2.2.1 Study Design 11 2.2.2 Dataset 14 2.2.3 Preprocessing 17 2.2.4 Convolutional Neural Networks (CNN) 21 2.2.4.1 Standard CNN 21 2.2.4.2 Search for CNN Architecture 22 2.2.4.3 Searched CNN Training 23 2.2.4.4 Visual Explanation 24 2.2.5 Evaluation of CNN and Endoscopist Performances 25 2.3 Experiments and Results 27 2.3.1 CNN Performance 27 2.3.2 Results of Visual Explanation 31 2.3.3 Endoscopist with CNN Performance 33 2.4 Discussion 45 2.4.1 Research Significance 45 2.4.2 Limitations 47 2.5 Conclusion 49 Chapter 3 Surgical Skill Assessment during Robotic Surgery by Deep Learning-based Surgical Instrument Tracking 50 3.1 Introduction 50 3.1.1 Background 50 3.1.2 Needs 51 3.1.3 Related Work 52 3.2 Methods 56 3.2.1 Study Design 56 3.2.2 Dataset 59 3.2.3 Instance Segmentation Framework 63 3.2.4 Tracking Framework 66 3.2.4.1 Tracker 66 3.2.4.2 Re-identification 68 3.2.5 Surgical Instrument Tip Detection 69 3.2.6 Arm-Indicator Recognition 71 3.2.7 Surgical Skill Prediction Model 71 3.3 Experiments and Results 78 3.3.1 Performance of Instance Segmentation Framework 78 3.3.2 Performance of Tracking Framework 82 3.3.3 Evaluation of Surgical Instruments Trajectory 83 3.3.4 Evaluation of Surgical Skill Prediction Model 86 3.4 Discussion 90 3.4.1 Research Significance 90 3.4.2 Limitations 92 3.5 Conclusion 96 Chapter 4 Summary and Future Works 97 4.1 Thesis Summary 97 4.2 Limitations and Future Works 98 Bibliography 100 Abstract in Korean 116 Acknowledgement 119Docto

    Supervised cnn strategies for optical image segmentation and classification in interventional medicine

    The analysis of interventional images is a topic of high interest for the medical-image analysis community. Such analysis may provide interventional-medicine professionals with both decision support and context awareness, with the final goal of improving patient safety. The aim of this chapter is to give an overview of some of the most recent approaches (up to 2018) in the field, with a focus on Convolutional Neural Networks (CNNs) for both segmentation and classification tasks. For each approach, summary tables report the dataset used, the anatomical region involved, and the performance achieved. Benefits and disadvantages of each approach are highlighted and discussed. Available datasets for algorithm training and testing and commonly used performance metrics are summarized to offer a source of information for researchers approaching the field of interventional-image analysis. The advancements in deep learning for medical-image analysis increasingly involve the interventional-medicine field. However, these advancements are undeniably slower than in other fields (e.g., preoperative-image analysis), and considerable work remains to be done to provide clinicians with all possible support during interventional-medicine procedures.

    U-Net and its variants for medical image segmentation: theory and applications

    U-Net is an image segmentation technique developed primarily for medical image analysis that can precisely segment images using a scarce amount of training data. These traits make U-Net highly useful within the medical imaging community and have resulted in its extensive adoption as the primary tool for segmentation tasks in medical imaging. The success of U-Net is evident in its widespread use across all major image modalities, from CT scans and MRI to X-rays and microscopy. Furthermore, while U-Net is largely a segmentation tool, it has also been used in other applications. As the potential of U-Net is still growing, in this review we look at the various developments that have been made in the U-Net architecture and provide observations on recent trends. We examine the various innovations that have been made in deep learning and discuss how these tools facilitate U-Net. Furthermore, we look at the image modalities and application areas where U-Net has been applied. (42 pages; published in IEEE Access.)
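
    To ground the discussion, a minimal U-Net-style sketch in Python/PyTorch: an encoder-decoder whose skip connections concatenate encoder features into the decoder. The depth and channel widths are deliberately reduced assumptions, not the original configuration:

```python
import torch
import torch.nn as nn

def block(cin, cout):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.enc1, self.enc2 = block(in_ch, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.mid = block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec2 = block(64, 32)   # 32 skip channels + 32 upsampled
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)   # 16 skip channels + 16 upsampled
        self.head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        m = self.mid(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(m), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)

logits = TinyUNet()(torch.randn(1, 1, 64, 64))  # -> (1, 2, 64, 64)
```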