Search CORE

9 research outputs found

Time-based self-supervised learning for Wireless Capsule Endoscopy

Author: García Albert
Laiz Treceño Pablo
Pascual i Guinovart Guillem
Seguí Mesquida Santi
Vitrià i Marca Jordi
Wenzek Hagen
Publication venue: 'Elsevier BV'
Publication date: 21/02/2023
Field of study

State-of-the-art machine learning models, and especially deep learning ones, are significantly data-hungry; they require vast amounts of manually labeled samples to function correctly. However, in most medical imaging fields, obtaining said data can be challenging. Not only the volume of data is a problem, but also the imbalances within its classes; it is common to have many more images of healthy patients than of those with pathology. Computer-aided diagnostic systems suffer from these issues, usually over-designing their models to perform accurately. This work proposes using self-supervised learning for wireless endoscopy videos by introducing a custom-tailored method that does not initially need labels or appropriate balance. We prove that using the inferred inherent structure learned by our method, extracted from the temporal axis, improves the detection rate on several domain-specific applications even under severe imbalance. State-of-the-art results are achieved in polyp detection, with 95.00 ± 2.09% Area Under the Curve, and 92.77 ± 1.20% accuracy in the CAD-CAP dataset

Diposit Digital de la Universitat de Barcelona

Deep Learning-based Solutions to Improve Diagnosis in Wireless Capsule Endoscopy

Author: Laiz Treceño Pablo
Publication venue: 'Edicions de la Universitat de Barcelona'
Publication date: 12/12/2023
Field of study

[eng] Deep Learning (DL) models have gained extensive attention due to their remarkable performance in a wide range of real-world applications, particularly in computer vision. This achievement, combined with the increase in available medical records, has made it possible to open up new opportunities for analyzing and interpreting healthcare data. This symbiotic relationship can enhance the diagnostic process by identifying abnormalities, patterns, and trends, resulting in more precise, personalized, and effective healthcare for patients. Wireless Capsule Endoscopy (WCE) is a non-invasive medical imaging technique used to visualize the entire Gastrointestinal (GI) tract. Up to this moment, physicians meticulously review the captured frames to identify pathologies and diagnose patients. This manual process is time- consuming and prone to errors due to the challenges of interpreting the complex nature of WCE procedures. Thus, it demands a high level of attention, expertise, and experience. To overcome these drawbacks, shorten the screening process, and improve the diagnosis, efficient and accurate DL methods are required. This thesis proposes DL solutions to the following problems encountered in the analysis of WCE studies: pathology detection, anatomical landmark identification, and Out-of-Distribution (OOD) sample handling. These solutions aim to achieve robust systems that minimize the duration of the video analysis and reduce the number of undetected lesions. Throughout their development, several DL drawbacks have appeared, including small and imbalanced datasets. These limitations have also been addressed, ensuring that they do not hinder the generalization of neural networks, leading to suboptimal performance and overfitting. To address the previous WCE problems and overcome the DL challenges, the proposed systems adopt various strategies that utilize the power advantage of Triplet Loss (TL) and Self-Supervised Learning (SSL) techniques. Mainly, TL has been used to improve the generalization of the models, while SSL methods have been employed to leverage the unlabeled data to obtain useful representations. The presented methods achieve State-of-the-art results in the aforementioned medical problems and contribute to the ongoing research to improve the diagnostic of WCE studies.[cat] Els models d’aprenentatge profund (AP) han acaparat molta atenció a causa del seu rendiment en una àmplia gamma d'aplicacions del món real, especialment en visió per ordinador. Aquest fet, combinat amb l'increment de registres mèdics disponibles, ha permès obrir noves oportunitats per analitzar i interpretar les dades sanitàries. Aquesta relació simbiòtica pot millorar el procés de diagnòstic identificant anomalies, patrons i tendències, amb la conseqüent obtenció de diagnòstics sanitaris més precisos, personalitzats i eficients per als pacients. La Capsula endoscòpica (WCE) és una tècnica d'imatge mèdica no invasiva utilitzada per visualitzar tot el tracte gastrointestinal (GI). Fins ara, els metges revisen minuciosament els fotogrames capturats per identificar patologies i diagnosticar pacients. Aquest procés manual requereix temps i és propens a errors. Per tant, exigeix un alt nivell d'atenció, experiència i especialització. Per superar aquests inconvenients, reduir la durada del procés de detecció i millorar el diagnòstic, es requereixen mètodes eficients i precisos d’AP. Aquesta tesi proposa solucions que utilitzen AP per als següents problemes trobats en l'anàlisi dels estudis de WCE: detecció de patologies, identificació de punts de referència anatòmics i gestió de mostres que pertanyen fora del domini. Aquestes solucions tenen com a objectiu aconseguir sistemes robustos que minimitzin la durada de l'anàlisi del vídeo i redueixin el nombre de lesions no detectades. Durant el seu desenvolupament, han sorgit diversos inconvenients relacionats amb l’AP, com ara conjunts de dades petits i desequilibrats. Aquestes limitacions també s'han abordat per assegurar que no obstaculitzin la generalització de les xarxes neuronals, evitant un rendiment subòptim. Per abordar els problemes anteriors de WCE i superar els reptes d’AP, els sistemes proposats adopten diverses estratègies que aprofiten l'avantatge de la Triplet Loss (TL) i les tècniques d’auto-aprenentatge. Principalment, s'ha utilitzat TL per millorar la generalització dels models, mentre que els mètodes d’autoaprenentatge s'han emprat per aprofitar les dades sense etiquetar i obtenir representacions útils. Els mètodes presentats aconsegueixen bons resultats en els problemes mèdics esmentats i contribueixen a la investigació en curs per millorar el diagnòstic dels estudis de WCE

Diposit Digital de la Universitat de Barcelona

Uncertainty, interpretability and dataset limitations in Deep Learning

Author: Pascual i Guinovart Guillem
Publication venue: 'Edicions de la Universitat de Barcelona'
Publication date: 17/02/2023
Field of study

[eng] Deep Learning (DL) has gained traction in the last years thanks to the exponential increase in compute power. New techniques and methods are published at a daily basis, and records are being set across multiple disciplines. Undeniably, DL has brought a revolution to the machine learning field and to our lives. However, not everything has been resolved and some considerations must be taken into account. For instance, obtaining uncertainty measures and bounds is still an open problem. Models should be able to capture and express the confidence they have in their decisions, and Artificial Neural Networks (ANN) are known to lack in this regard. Be it through out of distribution samples, adversarial attacks, or simply unrelated or nonsensical inputs, ANN models demonstrate an unfounded and incorrect tendency to still output high probabilities. Likewise, interpretability remains an unresolved question. Some fields not only need but rely on being able to provide human interpretations of the thought process of models. ANNs, and specially deep models trained with DL, are hard to reason about. Last but not least, there is a tendency that indicates that models are getting deeper and more complex. At the same time, to cope with the increasing number of parameters, datasets are required to be of higher quality and, usually, larger. Not all research, and even less real world applications, can keep with the increasing demands. Therefore, taking into account the previous issues, the main aim of this thesis is to provide methods and frameworks to tackle each of them. These approaches should be applicable to any suitable field and dataset, and are employed with real world datasets as proof of concept. First, we propose a method that provides interpretability with respect to the results through uncertainty measures. The model in question is capable of reasoning about the uncertainty inherent in data and leverages that information to progressively refine its outputs. In particular, the method is applied to land cover segmentation, a classification task that aims to assign a type of land to each pixel in satellite images. The dataset and application serve to prove that the final uncertainty bound enables the end-user to reason about the possible errors in the segmentation result. Second, Recurrent Neural Networks are used as a method to create robust models towards lacking datasets, both in terms of size and class balance. We apply them to two different fields, road extraction in satellite images and Wireless Capsule Endoscopy (WCE). The former demonstrates that contextual information in the temporal axis of data can be used to create models that achieve comparable results to state-of-the-art while being less complex. The latter, in turn, proves that contextual information for polyp detection can be crucial to obtain models that generalize better and obtain higher performance. Last, we propose two methods to leverage unlabeled data in the model creation process. Often datasets are easier to obtain than to label, which results in many wasted opportunities with traditional classification approaches. Our approaches based on self-supervised learning result in a novel contrastive loss that is capable of extracting meaningful information out of pseudo-labeled data. Applying both methods to WCE data proves that the extracted inherent knowledge creates models that perform better in extremely unbalanced datasets and with lack of data. To summarize, this thesis demonstrates potential solutions to obtain uncertainty bounds, provide reasonable explanations of the outputs, and to combat lack of data or unbalanced datasets. Overall, the presented methods have a positive impact on the DL field and could have a real and tangible effect for the society.[cat] És innegable que el Deep Learning ha causat una revolució en molts aspectes no solament de l’aprenentatge automàtic però també de les nostres vides diàries. Tot i així, encara queden aspectes a millorar. Les xarxes neuronals tenen problemes per estimar la seva confiança en les prediccions, i sovint reporten probabilitats altes en casos que no tenen relació amb el model o que directament no tenen sentit. De la mateixa forma, interpretar els resultats d’un model profund i complex resulta una tasca extremadament complicada. Aquests mateixos models, cada cop amb més paràmetres i més potents, requereixen també de dades més ben etiquetades i més completes. Tenint en compte aquestes limitacions, l’objectiu principal és el de buscar mètodes i algoritmes per trobar-ne solució. Primerament, es proposa la creació d’un mètode capaç d’obtenir incertesa en imatges satèl·lit i d’utilitzar-la per crear models més robustos i resultats interpretables. En segon lloc, s’utilitzen Recurrent Neural Networks (RNN) per combatre la falta de dades mitjançant l’obtenció d’informació contextual de dades temporals. Aquestes s’apliquen per l’extracció de carreteres d’imatges satèl·lit i per la classificació de pòlips en imatges obtingudes amb Wireless Capsule Endoscopy (WCE). Finalment, es plantegen dos mètodes per tractar amb la falta de dades etiquetades i desbalancejos en les classes amb l’ús de Self-supervised Learning (SSL). Seqüències no etiquetades d’imatges d’intestins s’incorporen en el models en una fase prèvia a la classificació tradicional. Aquesta tesi demostra que les solucions proposades per obtenir mesures d’incertesa són efectives per donar explicacions raonables i interpretables sobre els resultats. Igualment, es prova que el context en dades de caràcter temporal, obtingut amb RNNs, serveix per obtenir models més simples que poden arribar a solucionar els problemes derivats de la falta de dades. Per últim, es mostra que SSL serveix per combatre de forma efectiva els problemes de generalització degut a dades no balancejades en diversos dominis de WCE. Concloem que aquesta tesi presenta mètodes amb un impacte real en diversos aspectes de DL a la vegada que demostra la capacitat de tenir un impacte positiu en la societat

Diposit Digital de la Universitat de Barcelona

Self-supervised out-of-distribution detection in wireless capsule endoscopy images.

Author: Laiz Treceño Pablo
Quindós Sánchez Arnau
Seguí Mesquida Santi
Vitrià i Marca Jordi
Publication venue: Elsevier
Publication date: 19/02/2024
Field of study

While deep learning has displayed excellent performance in a broad spectrum of application areas, neural networks still struggle to recognize what they have not seen, i.e., out-of-distribution (OOD) inputs. In the medical field, building robust models that are able to detect OOD images is highly critical, as these rare images could show diseases or anomalies that should be detected. In this study, we use wireless capsule endoscopy (WCE) images to present a novel patch-based self-supervised approach comprising three stages. First, we train a triplet network to learn vector representations of WCE image patches. Second, we cluster the patch embeddings to group patches in terms of visual similarity. Third, we use the cluster assignments as pseudolabels to train a patch classifier and use the Out-of-Distribution Detector for Neural Networks (ODIN) for OOD detection. The system has been tested on the Kvasir-capsule, a publicly released WCE dataset. Empirical results show an OOD detection improvement compared to baseline methods. Our method can detect unseen pathologies and anomalies such as lymphangiectasia, foreign bodies and blood with > 0.6. This work presents an effective solution for OOD detection models without needing labeled images

Diposit Digital de la Universitat de Barcelona

Artificial intelligence to improve polyp detection and screening time in colon capsule endoscopy

Author: Gilabert Pere
Laiz Pablo
Malagelada Prats Carolina
Segui Santi
Vitrià i Marca Jordi
Watson Angus
Wenzek Hagen
Publication venue
Publication date: 01/01/2022
Field of study

Colon Capsule Endoscopy (CCE) is a minimally invasive procedure which is increasingly being used as an alternative to conventional colonoscopy. Videos recorded by the capsule cameras are long and require one or more experts' time to review and identify polyps or other potential intestinal problems that can lead to major health issues. We developed and tested a multi-platform web application, AI-Tool, which embeds a Convolution Neural Network (CNN) to help CCE reviewers. With the help of artificial intelligence, AI-Tool is able to detect images with high probability of containing a polyp and prioritize them during the reviewing process. With the collaboration of 3 experts that reviewed 18 videos, we compared the classical linear review method using RAPID Reader Software v9.0 and the new software we present. Applying the new strategy, reviewing time was reduced by a factor of 6 and polyp detection sensitivity was increased from 81.08 to 87.80%

Aberdeen University Research

PubMed Central

Diposit Digital de Documents de la UAB

Classification of Gastric Lesions Using Gabor Block Local Binary Patterns

Author: Riaz Farhan
Publication venue: Tech Science Press
Publication date: 03/04/2023
Field of study

The identification of cancer tissues in Gastroenterology imaging poses novel challenges to the computer vision community in designing generic decision support systems. This generic nature demands the image descriptors to be invariant to illumination gradients, scaling, homogeneous illumination, and rotation. In this article, we devise a novel feature extraction methodology, which explores the effectiveness of Gabor filters coupled with Block Local Binary Patterns in designing such descriptors. We effectively exploit the illumination invariance properties of Block Local Binary Patterns and the inherent capability of convolutional neural networks to construct novel rotation, scale and illumination invariant features. The invariance characteristics of the proposed Gabor Block Local Binary Patterns (GBLBP) are demonstrated using a publicly available texture dataset. We use the proposed feature extraction methodology to extract texture features from Chromoendoscopy (CH) images for the classification of cancer lesions. The proposed feature set is later used in conjuncture with convolutional neural networks to classify the CH images. The proposed convolutional neural network is a shallow network comprising of fewer parameters in contrast to other state-of-the-art networks exhibiting millions of parameters required for effective training. The obtained results reveal that the proposed GBLBP performs favorably to several other state-of-the-art methods including both hand crafted and convolutional neural networks-based features

University of Lincoln Institutional Repository

Assessing generalisability of deep learning-based polyp detection and segmentation methods through a computer vision challenge

Author: Ahmed Amr
Ali Sharib
Ballester Miguel-Ángel González
Cannizzaro Renato
Daul Christian
de Lange Thomas
East James E
Galdran Adrian
Gan Tianyuan
Ghatwary Noha
Haithami Mahmood
Halvorsen Pål
Hicks Steven
Isik-Polat Ece
Jha Debesh
Jin Ziyi
Lamarque Dominique
Lee Hyunseok
Lee Sang-Woong
Li Wuyang
Polat Gorkem
Poudel Sahadev
Realdon Stefano
Riegler Michael A
Rittscher Jens
Salem Osama E
Thambawita Vajira
Tomar Nikhil Kumar
Yan JiangPeng
Yang Chen
Yeo Doyeob
Yu ChengHui
Publication venue: Springer Nature
Publication date: 23/01/2024
Field of study

Polyps are well-known cancer precursors identified by colonoscopy. However, variability in their size, appearance, and location makes the detection of polyps challenging. Moreover, colonoscopy surveillance and removal of polyps are highly operator-dependent procedures and occur in a highly complex organ topology. There exists a high missed detection rate and incomplete removal of colonic polyps. To assist in clinical procedures and reduce missed rates, automated methods for detecting and segmenting polyps using machine learning have been achieved in past years. However, the major drawback in most of these methods is their ability to generalise to out-of-sample unseen datasets from different centres, populations, modalities, and acquisition systems. To test this hypothesis rigorously, we, together with expert gastroenterologists, curated a multi-centre and multi-population dataset acquired from six different colonoscopy systems and challenged the computational expert teams to develop robust automated detection and segmentation methods in a crowd-sourcing Endoscopic computer vision challenge. This work put forward rigorous generalisability tests and assesses the usability of devised deep learning methods in dynamic and actual clinical colonoscopy procedures. We analyse the results of four top performing teams for the detection task and five top performing teams for the segmentation task. Our analyses demonstrate that the top-ranking teams concentrated mainly on accuracy over the real-time performance required for clinical applicability. We further dissect the devised methods and provide an experiment-based hypothesis that reveals the need for improved generalisability to tackle diversity present in multi-centre datasets and routine clinical procedures

Oxford University Research Archive

Recommended from our members

Machine learning based small bowel video capsule endoscopy analysis: Challenges and opportunities

Author: Mehmood Irfan
Muhammad K.
Sangaiah A.K.
Ugail Hassan
Wahab Haroon
Publication venue
Publication date: 19/07/2023
Field of study

YesVideo capsule endoscopy (VCE) is a revolutionary technology for the early diagnosis of gastric disorders. However, owing to the high redundancy and subtle manifestation of anomalies among thousands of frames, the manual construal of VCE videos requires considerable patience, focus, and time. The automatic analysis of these videos using computational methods is a challenge as the capsule is untamed in motion and captures frames inaptly. Several machine learning (ML) methods, including recent deep convolutional neural networks approaches, have been adopted after evaluating their potential of improving the VCE analysis. However, the clinical impact of these methods is yet to be investigated. This survey aimed to highlight the gaps between existing ML-based research methodologies and clinically significant rules recently established by gastroenterologists based on VCE. A framework for interpreting raw frames into contextually relevant frame-level findings and subsequently merging these findings with meta-data to obtain a disease-level diagnosis was formulated. Frame-level findings can be more intelligible for discriminative learning when organized in a taxonomical hierarchy. The proposed taxonomical hierarchy, which is formulated based on pathological and visual similarities, may yield better classification metrics by setting inference classes at a higher level than training classes. Mapping from the frame level to the disease level was structured in the form of a graph based on clinical relevance inspired by the recent international consensus developed by domain experts. Furthermore, existing methods for VCE summarization, classification, segmentation, detection, and localization were critically evaluated and compared based on aspects deemed significant by clinicians. Numerous studies pertain to single anomaly detection instead of a pragmatic approach in a clinical setting. The challenges and opportunities associated with VCE analysis were delineated. A focus on maximizing the discriminative power of features corresponding to various subtle lesions and anomalies may help cope with the diverse and mimicking nature of different VCE frames. Large multicenter datasets must be created to cope with data sparsity, bias, and class imbalance. Explainability, reliability, traceability, and transparency are important for an ML-based diagnostics system in a VCE. Existing ethical and legal bindings narrow the scope of possibilities where ML can potentially be leveraged in healthcare. Despite these limitations, ML based video capsule endoscopy will revolutionize clinical practice, aiding clinicians in rapid and accurate diagnosis

Bradford Scholars

WCE polyp detection with triplet based embeddings

Author: Azpiroz Vidaur Fernando
Laiz Treceño Pablo
Malagelada Prats Carolina
Seguí Mesquida Santi
Vitrià i Marca Jordi
Wenzek Hagen
Publication venue: 'Elsevier BV'
Publication date: 02/10/2020
Field of study

Wireless capsule endoscopy is a medical procedure used to visualize the entire gastrointestinal tractand to diagnose intestinal conditions, such as polyps or bleeding. Current analyses are performedby manually inspecting nearly each one of the frames of the video, a tedious and error-prone task.Automatic image analysis methods can be used to reduce the time needed for physicians to evaluate acapsule endoscopy video. However these methods are still in a research phase.In this paper we focus on computer-aided polyp detection in capsule endoscopy images. This is achallenging problem because of the diversity of polyp appearance, the imbalanced dataset structureand the scarcity of data. We have developed a new polyp computer-aided decision system thatcombines a deep convolutional neural network and metric learning. The key point of the method isthe use of the Triplet Loss function with the aim of improving feature extraction from the imageswhen having small dataset. The Triplet Loss function allows to train robust detectors by forcingimages from the same category to be represented by similar embedding vectors while ensuring thatimages from different categories are represented by dissimilar vectors. Empirical results show ameaningful increase of AUC values compared to state-of-the-art methods.A good performance is not the only requirement when considering the adoption of this technologyto clinical practice. Trust and explainability of decisions are as important as performance. Withthis purpose, we also provide a method to generate visual explanations of the outcome of our polypdetector. These explanations can be used to build a physician's trust in the system and also to conveyinformation about the inner working of the method to the designer for debugging purposes

arXiv.org e-Print Archive

Diposit Digital de la Universitat de Barcelona