
    An Automatic Gastrointestinal Polyp Detection System in Video Endoscopy Using Fusion of Color Wavelet and Convolutional Neural Network Features

    Gastrointestinal polyps are considered precursors of cancer development in most cases. Therefore, early detection and removal of polyps can reduce the possibility of cancer. Video endoscopy is the most widely used diagnostic modality for gastrointestinal polyps, but because it is an operator-dependent procedure, several human factors can lead to missed polyps. Computer-aided polyp detection can reduce the polyp miss rate and assist doctors in finding the most important regions to pay attention to. In this paper, an automatic system has been proposed as a support for gastrointestinal polyp detection. The system captures the video stream from the endoscope and, in the output, shows the identified polyps. Color wavelet (CW) features and convolutional neural network (CNN) features of video frames are extracted and combined, and the fused features are used to train a linear support vector machine (SVM). Evaluations on standard public databases show that the proposed system outperforms state-of-the-art methods, achieving an accuracy of 98.65%, sensitivity of 98.79%, and specificity of 98.52%.
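    The abstract above describes a feature-fusion pipeline: hand-crafted color wavelet features and CNN features are concatenated and fed to a linear SVM. The Python sketch below only illustrates that fusion idea on placeholder features; cw_features() and cnn_features() are hypothetical stand-ins, not the authors' implementation.

        import numpy as np
        from sklearn.svm import LinearSVC
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import accuracy_score

        rng = np.random.default_rng(0)

        def cw_features(frame):
            # placeholder for color wavelet statistics of a video frame
            return rng.normal(size=64)

        def cnn_features(frame):
            # placeholder for penultimate-layer CNN activations of the same frame
            return rng.normal(size=128)

        frames = [None] * 200                     # stand-in for 200 endoscopy frames
        labels = rng.integers(0, 2, size=200)     # 1 = polyp, 0 = no polyp (synthetic)

        # fusion: concatenate the two feature vectors per frame
        X = np.stack([np.concatenate([cw_features(f), cnn_features(f)]) for f in frames])
        X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)

        clf = LinearSVC(C=1.0).fit(X_tr, y_tr)    # linear SVM on the fused features
        print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))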

    Study to integrate CNN inside a WCE to realize a screening tool

    Screening is a method to improve the early detection of colorectal cancer. Currently, screening is based on an immunochemical test that looks for blood in faecal samples, but imaging is the best modality to detect the marker of colorectal cancer: polyps. In 2003, Wireless Capsule Endoscopy (WCE) was introduced and opened a way to integrate automatic image processing to realize a screening tool. In parallel, Convolutional Neural Networks have demonstrated their high capacity to detect polyps in many scientific studies, but they fail to be integrable into the capsule. In this article we present our work to integrate a CNN, or image processing based on a CNN, inside a WCE to realize a powerful screening tool.

    Uncertainty, interpretability and dataset limitations in Deep Learning

    [eng] Deep Learning (DL) has gained traction in recent years thanks to the exponential increase in compute power. New techniques and methods are published on a daily basis, and records are being set across multiple disciplines. Undeniably, DL has brought a revolution to the machine learning field and to our lives. However, not everything has been resolved and some considerations must be taken into account. For instance, obtaining uncertainty measures and bounds is still an open problem. Models should be able to capture and express the confidence they have in their decisions, and Artificial Neural Networks (ANNs) are known to be lacking in this regard. Be it through out-of-distribution samples, adversarial attacks, or simply unrelated or nonsensical inputs, ANN models demonstrate an unfounded and incorrect tendency to still output high probabilities. Likewise, interpretability remains an unresolved question. Some fields not only need but rely on being able to provide human interpretations of the thought process of models. ANNs, and especially deep models trained with DL, are hard to reason about. Last but not least, there is a clear tendency for models to become deeper and more complex. At the same time, to cope with the increasing number of parameters, datasets are required to be of higher quality and, usually, larger. Not all research, and even fewer real-world applications, can keep up with the increasing demands.
    Therefore, taking into account the previous issues, the main aim of this thesis is to provide methods and frameworks to tackle each of them. These approaches should be applicable to any suitable field and dataset, and they are employed with real-world datasets as proof of concept. First, we propose a method that provides interpretability with respect to the results through uncertainty measures. The model in question is capable of reasoning about the uncertainty inherent in data and leverages that information to progressively refine its outputs. In particular, the method is applied to land cover segmentation, a classification task that aims to assign a type of land to each pixel in satellite images. The dataset and application serve to prove that the final uncertainty bound enables the end user to reason about the possible errors in the segmentation result. Second, Recurrent Neural Networks are used to create models that are robust to deficient datasets, both in terms of size and class balance. We apply them to two different fields, road extraction in satellite images and Wireless Capsule Endoscopy (WCE). The former demonstrates that contextual information in the temporal axis of data can be used to create models that achieve results comparable to the state of the art while being less complex. The latter, in turn, proves that contextual information for polyp detection can be crucial to obtain models that generalize better and achieve higher performance. Last, we propose two methods to leverage unlabeled data in the model creation process. Often datasets are easier to obtain than to label, which results in many wasted opportunities with traditional classification approaches. Our approaches based on self-supervised learning result in a novel contrastive loss that is capable of extracting meaningful information out of pseudo-labeled data. Applying both methods to WCE data proves that the extracted inherent knowledge creates models that perform better on extremely unbalanced datasets and with scarce data.
    To summarize, this thesis demonstrates potential solutions to obtain uncertainty bounds, provide reasonable explanations of the outputs, and combat the lack of data or unbalanced datasets. Overall, the presented methods have a positive impact on the DL field and could have a real and tangible effect on society.
    [cat] It is undeniable that Deep Learning has caused a revolution in many aspects, not only of machine learning but also of our daily lives. Even so, there are still aspects to improve. Neural networks have trouble estimating their confidence in their predictions, and often report high probabilities in cases that are unrelated to the model or that simply make no sense. Likewise, interpreting the results of a deep and complex model is an extremely complicated task. These same models, ever larger and more powerful, also require better labelled and more complete data. Taking these limitations into account, the main objective is to find methods and algorithms to solve them. First, we propose a method capable of obtaining uncertainty in satellite images and of using it to create more robust models and interpretable results. Second, Recurrent Neural Networks (RNNs) are used to combat the lack of data by obtaining contextual information from temporal data. These are applied to road extraction from satellite images and to polyp classification in images obtained with Wireless Capsule Endoscopy (WCE). Finally, two methods are proposed to deal with the lack of labelled data and class imbalance using Self-supervised Learning (SSL). Unlabelled sequences of intestinal images are incorporated into the models in a phase prior to traditional classification. This thesis demonstrates that the proposed solutions for obtaining uncertainty measures are effective at giving reasonable and interpretable explanations of the results. Likewise, it proves that the context in temporal data, obtained with RNNs, serves to obtain simpler models that can solve the problems derived from the lack of data. Lastly, it shows that SSL effectively combats the generalization problems due to unbalanced data in several WCE domains. We conclude that this thesis presents methods with a real impact on several aspects of DL while demonstrating the capacity to have a positive impact on society.
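    The abstract above mentions a self-supervised contrastive loss over pseudo-labeled data. The exact loss is specific to the thesis; as a generic reference point, a standard InfoNCE-style contrastive loss between two views of the same batch can be written as below (NumPy, illustrative shapes only).

        import numpy as np

        def info_nce(z_a, z_b, temperature=0.1):
            # z_a, z_b: (N, d) embeddings of two augmented views of the same N samples
            z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
            z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
            logits = z_a @ z_b.T / temperature           # (N, N) cosine similarities
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
            # positives sit on the diagonal: view b of sample i matches view a of sample i
            return -np.mean(np.diag(log_prob))

        rng = np.random.default_rng(0)
        z1, z2 = rng.normal(size=(8, 32)), rng.normal(size=(8, 32))
        print("contrastive loss:", info_nce(z1, z2))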

    Deep Learning-based Solutions to Improve Diagnosis in Wireless Capsule Endoscopy

    [eng] Deep Learning (DL) models have gained extensive attention due to their remarkable performance in a wide range of real-world applications, particularly in computer vision. This achievement, combined with the increase in available medical records, has opened up new opportunities for analyzing and interpreting healthcare data. This symbiotic relationship can enhance the diagnostic process by identifying abnormalities, patterns, and trends, resulting in more precise, personalized, and effective healthcare for patients. Wireless Capsule Endoscopy (WCE) is a non-invasive medical imaging technique used to visualize the entire gastrointestinal (GI) tract. To date, physicians meticulously review the captured frames to identify pathologies and diagnose patients. This manual process is time-consuming and prone to errors due to the challenges of interpreting the complex nature of WCE procedures; thus, it demands a high level of attention, expertise, and experience. To overcome these drawbacks, shorten the screening process, and improve the diagnosis, efficient and accurate DL methods are required. This thesis proposes DL solutions to the following problems encountered in the analysis of WCE studies: pathology detection, anatomical landmark identification, and Out-of-Distribution (OOD) sample handling. These solutions aim to achieve robust systems that minimize the duration of the video analysis and reduce the number of undetected lesions. Throughout their development, several DL drawbacks have appeared, including small and imbalanced datasets. These limitations have also been addressed, ensuring that they do not hinder the generalization of the neural networks, which would lead to suboptimal performance and overfitting. To address the previous WCE problems and overcome the DL challenges, the proposed systems adopt various strategies that exploit the strengths of Triplet Loss (TL) and Self-Supervised Learning (SSL) techniques. Mainly, TL has been used to improve the generalization of the models, while SSL methods have been employed to leverage the unlabeled data and obtain useful representations. The presented methods achieve state-of-the-art results in the aforementioned medical problems and contribute to the ongoing research to improve the diagnosis of WCE studies.
    [cat] Deep learning (DL) models have attracted a great deal of attention because of their performance in a wide range of real-world applications, especially in computer vision. This fact, combined with the increase in available medical records, has opened new opportunities to analyse and interpret healthcare data. This symbiotic relationship can improve the diagnostic process by identifying anomalies, patterns and trends, leading to more precise, personalized and efficient healthcare for patients. Wireless Capsule Endoscopy (WCE) is a non-invasive medical imaging technique used to visualize the whole gastrointestinal (GI) tract. Until now, physicians have meticulously reviewed the captured frames to identify pathologies and diagnose patients. This manual process is time-consuming and error-prone, and therefore demands a high level of attention, experience and expertise. To overcome these drawbacks, shorten the screening process and improve the diagnosis, efficient and accurate DL methods are required.
    This thesis proposes DL-based solutions to the following problems found in the analysis of WCE studies: pathology detection, identification of anatomical landmarks, and handling of out-of-distribution samples. These solutions aim to achieve robust systems that minimize the duration of the video analysis and reduce the number of undetected lesions. During their development, several DL-related drawbacks have arisen, such as small and imbalanced datasets. These limitations have also been addressed to ensure that they do not hinder the generalization of the neural networks, avoiding suboptimal performance. To address the above WCE problems and overcome the DL challenges, the proposed systems adopt several strategies that exploit the advantages of Triplet Loss (TL) and self-supervised learning techniques. Mainly, TL has been used to improve the generalization of the models, while self-supervised methods have been used to leverage the unlabelled data and obtain useful representations. The presented methods achieve good results in the aforementioned medical problems and contribute to the ongoing research to improve the diagnosis of WCE studies.
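    Triplet Loss (TL), named in the abstract above as the main tool for improving generalization, has a standard formulation: pull an anchor embedding towards a positive of the same class and push it away from a negative, up to a margin. A minimal NumPy sketch with illustrative shapes and margin:

        import numpy as np

        def triplet_loss(anchor, positive, negative, margin=0.2):
            # squared Euclidean distances between embedding rows
            d_pos = np.sum((anchor - positive) ** 2, axis=1)
            d_neg = np.sum((anchor - negative) ** 2, axis=1)
            # hinge: only penalize when the negative is not at least `margin` farther away
            return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

        rng = np.random.default_rng(1)
        a, p, n = (rng.normal(size=(16, 128)) for _ in range(3))
        print("triplet loss:", triplet_loss(a, p, n))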

    EndoSLAM Dataset and An Unsupervised Monocular Visual Odometry and Depth Estimation Approach for Endoscopic Videos: Endo-SfMLearner

    Deep learning techniques hold promise for developing dense topography reconstruction and pose estimation methods for endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings, as well as synthetically generated data. A Panda robotic arm, two commercially available capsule endoscopes, two conventional endoscopes with different camera properties, and two high-precision 3D scanners were employed to collect data from 8 ex-vivo porcine gastrointestinal (GI) tract organs. In total, 35 sub-datasets are provided with 6D pose ground truth for the ex-vivo part: 18 sub-datasets for the colon, 12 sub-datasets for the stomach and 5 sub-datasets for the small intestine, while four of these contain polyp-mimicking elevations carried out by an expert gastroenterologist. Synthetic capsule endoscopy frames from the GI tract with both depth and pose annotations are included to facilitate the study of simulation-to-real transfer learning algorithms. Additionally, we propose Endo-SfMLearner, an unsupervised monocular depth and pose estimation method that combines residual networks with a spatial attention module in order to direct the network's focus to distinguishable and highly textured tissue regions. The proposed approach makes use of a brightness-aware photometric loss to improve robustness under fast frame-to-frame illumination changes. To exemplify the use case of the EndoSLAM dataset, the performance of Endo-SfMLearner is extensively compared with the state of the art. The code and the link to the dataset are publicly available at https://github.com/CapsuleEndoscope/EndoSLAM. A video demonstrating the experimental setup and procedure is accessible at https://www.youtube.com/watch?v=G_LCe0aWWdQ.
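    The brightness-aware photometric loss referred to above is defined in the Endo-SfMLearner paper; the rough idea of compensating for illumination before comparing a warped frame with its target can be sketched as follows (NumPy, assumed affine brightness alignment, not the paper's exact formulation).

        import numpy as np

        def brightness_aligned_l1(warped, target, eps=1e-6):
            # warped, target: (H, W) grayscale frames in [0, 1]
            w = (warped - warped.mean()) / (warped.std() + eps)
            w = w * target.std() + target.mean()   # transfer the target's brightness statistics
            return np.mean(np.abs(w - target))

        rng = np.random.default_rng(2)
        frame = rng.random((64, 64))
        darker = np.clip(frame * 0.6 + 0.05, 0.0, 1.0)   # same scene under dimmer illumination
        print("plain L1 error:  ", np.mean(np.abs(darker - frame)))
        print("aligned L1 error:", brightness_aligned_l1(darker, frame))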

    Explainable Information Retrieval using Deep Learning for Medical images

    Image segmentation is useful for extracting valuable information for an efficient analysis of the region of interest. In most real-life situations, such as streaming video, the number of generated images is large and not well suited to traditional segmentation with machine learning algorithms. This is due to the following factors: (a) numerous image features, (b) complex distributions of shapes, colors and textures, (c) imbalanced ratios of the underlying classes, (d) movements of the camera and objects, and (e) variations in luminance at the capture site. We therefore propose an efficient deep learning model for image classification, with the proof of concept carried out as a case study on gastrointestinal images for bleeding detection. An Explainable Artificial Intelligence (XAI) module has been used to reverse-engineer the test results and assess the impact of features on a given test dataset. The architecture is generally applicable to other areas of image classification. The proposed method has been compared with the state of the art, including Logistic Regression, Support Vector Machine, Artificial Neural Network and Random Forest, and it reports an F1 score of 0.76 on the real-world streaming dataset, which is better than the traditional methods.
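    The baselines named above (Logistic Regression, SVM, ANN, Random Forest) and the F1 metric can be reproduced in outline with scikit-learn; the snippet below uses synthetic imbalanced data, so the numbers it prints are unrelated to the 0.76 F1 reported on the authors' streaming dataset.

        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.linear_model import LogisticRegression
        from sklearn.svm import SVC
        from sklearn.neural_network import MLPClassifier
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import f1_score

        # synthetic, imbalanced binary problem standing in for bleeding vs. non-bleeding frames
        X, y = make_classification(n_samples=2000, n_features=30, weights=[0.9, 0.1], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        baselines = {
            "Logistic Regression": LogisticRegression(max_iter=1000),
            "SVM": SVC(),
            "ANN (MLP)": MLPClassifier(max_iter=500),
            "Random Forest": RandomForestClassifier(),
        }
        for name, model in baselines.items():
            model.fit(X_tr, y_tr)
            print(f"{name:20s} F1 = {f1_score(y_te, model.predict(X_te)):.3f}")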

    Deep learning to find colorectal polyps in colonoscopy: A systematic literature review

    Colorectal cancer has a high incidence rate worldwide, but its early detection significantly increases the survival rate. Colonoscopy is the gold-standard procedure for the diagnosis and removal of colorectal lesions with the potential to evolve into cancer, and computer-aided detection systems can help gastroenterologists increase the adenoma detection rate, one of the main indicators of colonoscopy quality and a predictor of colorectal cancer prevention. The recent success of deep learning approaches in computer vision has also reached this field and has boosted the number of proposed methods for polyp detection, localization and segmentation. Through a systematic search, 35 works have been retrieved. The current systematic review provides an analysis of these methods, stating advantages and disadvantages for the different categories used; comments on seven publicly available datasets of colonoscopy images; analyses the metrics used for reporting; and identifies future challenges and recommendations. Convolutional neural networks are the most used architecture, together with an important presence of data augmentation strategies, mainly based on image transformations and the use of patches. End-to-end methods are preferred over hybrid methods, with a rising tendency. For detection and localization tasks, the most used metric for reporting is recall, while Intersection over Union is widely used in segmentation. One of the major concerns is the difficulty of a fair comparison and reproducibility of methods. Despite the organization of challenges, there is still a need for a common validation framework based on a large, annotated and publicly available database, which should also include the most convenient metrics for reporting results. Finally, it is also important to highlight that future efforts should focus on proving the clinical value of the deep learning based methods, by increasing the adenoma detection rate.
    This work was partially supported by the PICCOLO project. This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No. 732111. The sole responsibility of this publication lies with the author. The European Union is not responsible for any use that may be made of the information contained therein. The authors would also like to thank Dr. Federico Soria for his support on this manuscript and Dr. José Carlos Marín, from Hospital 12 de Octubre, and Dr. Ángel Calderón and Dr. Francisco Polo, from Hospital de Basurto, for the images in Fig. 4.
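    For reference, the two metrics the review highlights, recall for detection and localization and Intersection over Union (IoU) for segmentation, are computed as in the following toy example (made-up masks and counts).

        import numpy as np

        def iou(pred_mask, gt_mask):
            inter = np.logical_and(pred_mask, gt_mask).sum()
            union = np.logical_or(pred_mask, gt_mask).sum()
            return inter / union if union else 1.0

        def recall(true_positives, false_negatives):
            return true_positives / (true_positives + false_negatives)

        gt = np.zeros((10, 10), dtype=bool);   gt[2:8, 2:8] = True    # ground-truth polyp mask
        pred = np.zeros((10, 10), dtype=bool); pred[3:9, 3:9] = True  # predicted mask, shifted by one pixel
        print("IoU:   ", round(iou(pred, gt), 3))                       # 25 / 47 ~ 0.532
        print("recall:", recall(true_positives=45, false_negatives=5))  # 0.9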