
    Anatomical Classification of the Gastrointestinal Tract Using Ensemble Transfer Learning

    Endoscopy is a procedure used to visualize disorders of the gastrointestinal (GI) lumen. GI disorders can occur without symptoms, which is why gastroenterologists often recommend routine examinations of the GI tract. Endoscopy allows a doctor to directly visualize the inside of the GI tract and identify the cause of symptoms, reducing the need for exploratory surgery or other invasive procedures. It can also detect the early stages of GI disorders, such as cancer, enabling prompt treatment that can improve outcomes. Endoscopic examinations generate large numbers of GI images, and because of this vast amount of image data, relying solely on human interpretation can be problematic. Artificial intelligence is gaining popularity in clinical medicine: it can assist in medical image analysis and early disease detection, support personalized treatment planning by analyzing a patient's medical history and genomic data, and guide surgical robots to improve precision and reduce invasiveness. It enables automated diagnosis, provides physicians with assistance, and may improve performance. One significant challenge is identifying the specific anatomical locations of GI tract abnormalities; once these are known, clinicians can determine appropriate treatment options, reducing the need for repeated endoscopy. Because annotated data are difficult to collect, very limited research has been conducted on localizing anatomical sites by classifying endoscopy images. In this study, we present a classification of GI tract anatomical locations based on transfer learning and ensemble learning. Our approach combines an autoencoder and the Xception model. The autoencoder was first trained on thousands of unlabeled images; its encoder was then separated out and used as a feature extractor. The Xception model served as a second feature extractor for the input images.
The extracted feature vectors were then concatenated and fed into a Convolutional Neural Network (CNN) for classification. This combination of models provides a powerful and versatile solution for image classification. Using the encoder as a feature extractor transfers the learned knowledge and lets the model focus on more relevant and useful data, which is extremely valuable when appropriately labelled data are scarce. The Xception model, in turn, provides additional feature extraction capability. A single classifier is sometimes insufficient in machine learning, depending on the problem and on the quality and quantity of available data; with ensemble learning, multiple networks can work together to form a stronger classifier. The final classification is obtained by combining the information from both models through the CNN. This approach demonstrates the potential of combining multiple models to improve the accuracy of medical image classification tasks. The HyperKvasir dataset is the main dataset used in this study. It contains 4,104 labelled and 99,417 unlabeled images taken at six different locations in the GI tract: the cecum, ileum, pylorus, rectum, stomach, and Z-line. After preprocessing, which included noise reduction and removal of near-duplicate images, 871 labelled images remained for this study. Our method was more accurate than state-of-the-art studies and achieved a higher F1 score while categorizing the input images into six anatomical locations using fewer than a thousand labelled images. According to the results, feature extraction and ensemble learning increase accuracy by 5%, and a comparison with existing methods on the same dataset indicates improved performance and reduced cross-entropy loss. The proposed method can therefore be used for the classification of endoscopy images.
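The fusion step described above can be sketched numerically. Since the trained autoencoder and Xception weights are not given in the abstract, random projections stand in for both backbones here, and all layer sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the two trained backbones: the autoencoder's
# encoder and the Xception network. Random projections keep the sketch runnable.
W_enc = rng.normal(size=(2048, 64))   # encoder -> 64-d features
W_xcp = rng.normal(size=(2048, 128))  # Xception-style -> 128-d features

def fused_features(images):
    """Concatenate the two backbones' feature vectors for a batch of inputs."""
    f_enc = np.tanh(images @ W_enc)        # autoencoder-encoder features
    f_xcp = np.maximum(images @ W_xcp, 0)  # CNN-style (ReLU) features
    return np.concatenate([f_enc, f_xcp], axis=1)

batch = rng.normal(size=(4, 2048))  # four fake flattened endoscopy inputs
fused = fused_features(batch)
print(fused.shape)  # (4, 192): 64 encoder dims + 128 Xception dims
```

The 192-dimensional fused vector is what the downstream CNN classifier would consume.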

    Learning Through Guidance: Knowledge Distillation for Endoscopic Image Classification

    Endoscopy plays a major role in identifying underlying abnormalities within the gastrointestinal (GI) tract. Several GI tract diseases are life-threatening, such as precancerous lesions and other intestinal cancers. In the usual process, a diagnosis is made by a medical expert, which can be prone to human error, and the accuracy of the examination depends entirely on the expert's level of experience. Deep learning, specifically Convolutional Neural Networks (CNNs), which perform automatic feature learning without prior feature engineering, has recently shown great benefits for GI endoscopy image analysis. Previous research has developed models that focus only on improving performance; as such, the majority of introduced models contain complex deep network architectures with a large number of parameters that require long training times. There has been little focus on developing lightweight models that can run in the low-resource environments typically encountered in medical clinics. Knowledge distillation (KD), in which a compact student network is trained under the guidance of one or more larger teacher networks, offers a way to obtain such models. We investigate three KD-based learning frameworks, response-based, feature-based, and relation-based mechanisms, and introduce a novel multi-head attention-based feature fusion mechanism to support relation-based learning. Compared to existing relation-based methods that follow simplistic aggregation of multi-teacher response/feature-based knowledge, we adopt multi-head attention to provide flexibility in localising and transferring important details from each teacher to better guide the student. We perform extensive evaluations on two widely used public datasets, KVASIR-V2 and Hyper-KVASIR, and our experimental results demonstrate the merits of our proposed relation-based framework in achieving an improved lightweight model (only 51.8k trainable parameters) that can run in a resource-limited environment.
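The response-based mechanism, the simplest of the three KD frameworks named above, can be illustrated as a temperature-softened KL divergence between teacher and student outputs. The temperature value and logits below are illustrative, not taken from the paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives softer class distributions."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def response_kd_loss(student_logits, teacher_logits, T=4.0):
    """Response-based distillation: KL(teacher || student) on softened outputs.
    The T^2 factor keeps gradient magnitudes comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float((p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * T * T)

t = np.array([[2.0, 0.5, -1.0]])          # toy teacher logits
print(response_kd_loss(t, t))             # 0.0: distributions already match
print(response_kd_loss(np.zeros_like(t), t))  # > 0: uniform student penalised
```

Feature- and relation-based variants replace the output distributions with intermediate activations and inter-teacher relations, respectively.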

    Two-Stream Deep Feature Modelling for Automated Video Endoscopy Data Analysis

    Automating the analysis of imagery of the Gastrointestinal (GI) tract captured during endoscopy procedures has substantial potential benefits for patients, as it can provide diagnostic support to medical practitioners and reduce mistakes caused by human error. To further the development of such methods, we propose a two-stream model for endoscopic image analysis. Our model fuses two streams of deep feature inputs by mapping their inherent relations through a novel relational network model, to better model symptoms and classify the image. In contrast to handcrafted feature-based models, our proposed network learns features automatically and outperforms existing state-of-the-art methods on two public datasets: KVASIR and Nerthus. Our extensive evaluations illustrate the importance of having two streams of inputs instead of a single stream and also demonstrate the merits of the proposed relational network architecture for combining those streams.
Comment: Accepted for Publication at MICCAI 202
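A relational fusion of two feature streams can be sketched as follows: every cross-stream pair of feature vectors is passed through a shared relation function and the results are pooled. The single-layer relation function, feature sizes, and random weights below are simplifying assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical relation function g: one shared ReLU layer on concatenated pairs.
W_g = rng.normal(size=(2 * 32, 16), scale=0.1)

def relational_fusion(stream_a, stream_b):
    """Pool g([a_i; b_j]) over every cross-stream feature pair, a simplified
    stand-in for the paper's relational network."""
    pairs = [np.concatenate([a, b]) for a in stream_a for b in stream_b]
    h = np.maximum(np.stack(pairs) @ W_g, 0.0)  # shared relation MLP, ReLU
    return h.mean(axis=0)                        # permutation-invariant pooling

a = rng.normal(size=(5, 32))   # stream 1: e.g. one set of deep features
b = rng.normal(size=(7, 32))   # stream 2: e.g. a second set of deep features
fused = relational_fusion(a, b)
print(fused.shape)  # (16,)
```

Because the pooling averages over all pairs, the fused vector captures relations between the streams rather than either stream alone.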

    Classification of Anomalies in Gastrointestinal Tract Using Deep Learning

    Automatic detection of diseases and anatomical landmarks in medical images is an important and challenging task that could aid medical diagnosis, reduce the cost and time of investigational procedures, and refine health-care systems all over the world. Recently, gastrointestinal (GI) tract disease diagnosis through endoscopic image classification has become an active research area in the biomedical field. Several GI tract disease classification methods based on image processing and machine learning techniques have been proposed by diverse research groups in the recent past. However, an effective and comprehensive deep-ensemble neural-network-based classification model with highly accurate results is not yet available in the literature. In this thesis, we review ways to use deep learning techniques for multi-disease computer-aided detection in the gastrointestinal tract and to classify these images. We re-trained five state-of-the-art neural network architectures, VGG16, ResNet, MobileNet, Inception-v3, and Xception, on the Kvasir dataset to classify eight categories comprising anatomical landmarks (pylorus, Z-line, cecum), diseased states (esophagitis, ulcerative colitis, polyps), and medical procedures (dyed lifted polyps, dyed resection margins) in the gastrointestinal tract. Our models showed promising accuracy, a remarkable performance with respect to state-of-the-art approaches. The accuracies achieved using VGG16, ResNet, MobileNet, Inception-v3, and Xception were 98.3%, 92.3%, 97.6%, 90%, and 98.2%, respectively. The most accurate results were achieved when retraining the VGG16 and Xception networks, with accuracy reaching 98%, owing to their strong performance when trained on the ImageNet dataset and internal structures well suited to classification problems.
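Re-training a pretrained backbone frequently reduces to fitting a new classification head on frozen features. A minimal sketch of that idea, with random stand-in features instead of real ImageNet activations, a ridge-regularised closed-form head instead of SGD, and all sizes hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical frozen-backbone features for 80 Kvasir-style images, 8 classes.
X = rng.normal(size=(80, 256))        # features from the (frozen) backbone
y = rng.integers(0, 8, size=80)       # class labels
Y = np.eye(8)[y]                      # one-hot targets

# "Re-training" as a linear probe: only the new head is fitted, in closed form
# via ridge-regularised least squares (the backbone stays untouched).
W = np.linalg.solve(X.T @ X + 1e-2 * np.eye(256), X.T @ Y)
train_acc = float(((X @ W).argmax(axis=1) == y).mean())
```

In practice one would unfreeze some top layers and fine-tune with gradient descent, but the frozen-feature probe is the cheapest form of the transfer the thesis describes.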

    Detection of various gastrointestinal tract diseases through a deep learning method with ensemble ELM and explainable AI

    The rising prevalence of gastrointestinal (GI) tract disorders worldwide highlights the urgent need for precise diagnosis, as these diseases greatly affect human life and contribute to high mortality rates. Fast identification, accurate classification, and efficient treatment approaches are essential for addressing this critical health issue. Common symptoms include abdominal pain, bloating, and discomfort, which can be chronic and debilitating; nausea and vomiting are also frequent, leading to difficulties in maintaining adequate nutrition and hydration. The current study develops a deep learning (DL)-based approach that automatically classifies GI tract diseases. For the first time, the GastroVision dataset, with 8,000 images of 27 different GI diseases, was utilized in this work to design a computer-aided diagnosis (CAD) system. This study presents a novel lightweight feature extractor with a compact size and a minimal number of layers, named the Parallel Depthwise Separable Convolutional Neural Network (PD-CNN), with the Pearson Correlation Coefficient (PCC) as the feature selector. Furthermore, a robust classifier named the Ensemble Extreme Learning Machine (EELM), which combines pseudo-inverse ELM and L1-regularized ELM (RELM), is proposed to identify diseases more precisely. A hybrid preprocessing pipeline, including scaling, normalization, and image enhancement techniques such as erosion, CLAHE, sharpening, and Gaussian filtering, is employed to enhance image representation and improve classification performance. The proposed approach consists of twenty-four layers and only 0.815 million parameters, with a 9.79 MB model size. The proposed PD-CNN-PCC-EELM extracts essential features, reduces computational overhead, and achieves excellent classification performance on multiclass GI images.
The PD-CNN-PCC-EELM achieved precision, recall, F1, accuracy, ROC-AUC, and AUC-PR values of 88.12 ± 0.332%, 87.75 ± 0.348%, 87.12 ± 0.324%, 87.75%, 98.89%, and 92%, respectively, while maintaining a minimum testing time of 0.000001 s. A comparative study uses 10-fold cross-validation, an ablation study, and various state-of-the-art (SOTA) transfer learning (TL) models as feature extractors; the PCC and EELM are then integrated with TL to generate predictions. Notably, in terms of both performance and real-time processing capability, the proposed model significantly outperforms the other models. Moreover, various explainable AI (XAI) methods, such as SHAP (Shapley Additive Explanations), heatmaps, guided heatmaps, Grad-CAM (Gradient-weighted Class Activation Mapping), guided Grad-CAM, and guided saliency mapping, have been employed to explore the interpretability and decision-making capability of the proposed model. The model therefore provides practical intelligence for increasing confidence in diagnosing GI diseases in real-world scenarios.
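The PCC feature selector and the pseudo-inverse ELM head can be sketched on toy data; the array sizes, hidden width, and regression (rather than multiclass) target below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

def pcc_select(X, y, k):
    """Keep the k features with the largest |Pearson correlation| to the target."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt(
        (Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.argsort(-np.abs(r))[:k]

# Toy data: feature 0 is the target itself, so PCC must rank it first.
y = rng.normal(size=200)
X = rng.normal(size=(200, 10))
X[:, 0] = y
idx = pcc_select(X, y, 3)
print(idx[0])  # 0

# Pseudo-inverse ELM head on the selected features: a fixed random hidden
# layer followed by closed-form output weights (hidden width 50 is arbitrary).
H = np.tanh(X[:, idx] @ rng.normal(size=(3, 50)))
beta = np.linalg.pinv(H) @ y
```

The closed-form `pinv` solve is what makes ELM training fast enough to support the sub-millisecond testing times reported above.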

    New Techniques in Gastrointestinal Endoscopy

    As a result of progress, endoscopy has become more complex, using more sophisticated devices, and has taken on a specialized form. Today, the gastroenterologist performing endoscopy must be an expert in the macroscopic appearance of lesions in the gut, skilled with standard endoscopes, experienced in ultrasound (for performing endoscopic ultrasound), and versed in pathology for confocal examination. It is essential to gain experience and to have the patience and attention needed to follow the thousands of images transmitted during capsule endoscopy, as well as the knowledge of physics necessary for autofluorescence imaging endoscopy. The concept of the endoscopist has therefore changed. The examinations mentioned require special training and a higher level of instruction, accessible to those who have already gained sufficient experience in basic diagnostic endoscopy. This is the reason these new topics in endoscopy are presented in this book, New Techniques in Gastrointestinal Endoscopy.

    Deep Learning-based Solutions to Improve Diagnosis in Wireless Capsule Endoscopy

    Deep Learning (DL) models have gained extensive attention due to their remarkable performance in a wide range of real-world applications, particularly in computer vision. This achievement, combined with the increase in available medical records, has opened up new opportunities for analyzing and interpreting healthcare data. This symbiotic relationship can enhance the diagnostic process by identifying abnormalities, patterns, and trends, resulting in more precise, personalized, and effective healthcare for patients. Wireless Capsule Endoscopy (WCE) is a non-invasive medical imaging technique used to visualize the entire Gastrointestinal (GI) tract. Up to this moment, physicians meticulously review the captured frames to identify pathologies and diagnose patients. This manual process is time-consuming and prone to errors due to the challenges of interpreting the complex nature of WCE procedures; thus, it demands a high level of attention, expertise, and experience. To overcome these drawbacks, shorten the screening process, and improve diagnosis, efficient and accurate DL methods are required. This thesis proposes DL solutions to the following problems encountered in the analysis of WCE studies: pathology detection, anatomical landmark identification, and Out-of-Distribution (OOD) sample handling. These solutions aim to achieve robust systems that minimize the duration of video analysis and reduce the number of undetected lesions. Throughout their development, several DL drawbacks appeared, including small and imbalanced datasets. These limitations have also been addressed, ensuring that they do not hinder the generalization of the neural networks or lead to suboptimal performance and overfitting. To address the above WCE problems and overcome the DL challenges, the proposed systems adopt various strategies that exploit the advantages of Triplet Loss (TL) and Self-Supervised Learning (SSL) techniques.
Mainly, TL has been used to improve the generalization of the models, while SSL methods have been employed to leverage the unlabeled data and obtain useful representations. The presented methods achieve state-of-the-art results on the aforementioned medical problems and contribute to the ongoing research to improve the diagnosis of WCE studies.
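The Triplet Loss the thesis relies on can be written in a few lines; the margin of 1.0 and the toy embeddings below are illustrative choices, not values from the thesis:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull anchors toward positives and push negatives at least `margin`
    farther away, in squared Euclidean distance."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # anchor-negative distance
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())

a = np.zeros((1, 4))          # anchor embedding
p = np.zeros((1, 4))          # same-class embedding, already close
n = np.full((1, 4), 10.0)     # different-class embedding, far away
print(triplet_loss(a, p, n))  # 0.0: this triplet already satisfies the margin
print(triplet_loss(a, n, p))  # 401.0: a violated triplet is penalised
```

Training embeddings with this objective is what improves model generalization on small, imbalanced WCE datasets, per the abstract.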

    Towards real-world clinical colonoscopy deep learning models for video-based bowel preparation and generalisable polyp segmentation

    Colorectal cancer is the most prevalent type of cancer within the digestive system. Early screening and removal of precancerous growths in the colon decrease the mortality rate. The gold-standard screening method for the colon is colonoscopy, which is conducted by a medical expert (i.e., a colonoscopist). Nevertheless, due to human biases, fatigue, and the experience level of the colonoscopist, the colorectal cancer miss rate is adversely affected. Artificial intelligence (AI) methods hold immense promise not just for automating colonoscopy tasks but also for enhancing the performance of colonoscopy screening in general. The recent development of powerful GPUs has enabled a computationally demanding AI method (i.e., deep learning) to be utilised in various medical applications. However, given the gap between clinical practice and the deep learning models proposed in the literature, the actual effectiveness of such methods is questionable. Hence, this thesis highlights the gaps that arise from the separation between the theoretical and practical aspects of deep learning methods applied to colonoscopy. The aim is to evaluate the current state of deep learning models applied to colonoscopy from a clinical angle, and accordingly to propose better evaluation strategies and deep learning models. This aim is translated into three distinct objectives. The first objective is to develop a systematic evaluation method to assess deep learning models from a clinical perspective. The second objective is to develop a novel deep learning architecture that leverages spatial information within colonoscopy videos to enhance the effectiveness of deep learning models in real clinical environments. The third objective is to enhance the generalisability of deep learning models on unseen test images by developing a novel deep learning framework. To put these objectives into practice, two critical colonoscopy tasks, namely automatic bowel preparation assessment and polyp segmentation, are addressed.
In both tasks, subtle overestimations are found in the literature; these are discussed theoretically in the thesis and demonstrated empirically. The overestimations are induced by improper validation sets that do not represent the real-world clinical environment. Arbitrarily splitting colonoscopy datasets for deep learning evaluation can produce similar train and validation distributions and hence unrealistically good results. Accordingly, these factors are considered in the thesis to avoid such subtle overestimation. For the automatic bowel preparation task, colonoscopy videos that closely resemble clinical settings are taken as input, which shapes the design of both the proposed model and the evaluation experiments. The proposed model's architecture utilises the temporal and spatial information within colonoscopy videos using a Gated Recurrent Unit (GRU) and a proposed Multiplexer unit, respectively. For the polyp segmentation task, the efficiency of current deep learning models is tested in terms of their generalisation capabilities using unseen test sets from different medical centres. The proposed framework consists of two connected models. The first model gradually transforms the textures of input images and arbitrarily changes their colours, while the second is a segmentation model that outlines polyp regions. Exposing the segmentation model to such transformed images gives it texture/colour-invariant properties, enhancing its generalisability. In this thesis, rigorous experiments are conducted to evaluate the proposed models against the state-of-the-art models. The results indicate that the proposed models outperform the state-of-the-art models under different settings.
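The texture/colour-transformation idea can be approximated by a simple random per-channel gain-and-bias augmentation. This numpy sketch is a crude stand-in for the learned transformation model described above, with an arbitrary `strength` parameter:

```python
import numpy as np

rng = np.random.default_rng(3)

def colour_texture_transform(image, strength=0.3):
    """Randomly rescale and shift each colour channel of an HxWx3 image
    in [0, 1]. A crude stand-in for a learned texture/colour transformation."""
    gain = 1.0 + strength * rng.uniform(-1.0, 1.0, size=(1, 1, 3))
    bias = strength * rng.uniform(-1.0, 1.0, size=(1, 1, 3))
    return np.clip(image * gain + bias, 0.0, 1.0)

frame = rng.uniform(size=(8, 8, 3))        # toy colonoscopy frame
augmented = colour_texture_transform(frame)
```

Training the segmentation model on many such recoloured copies of each frame is what encourages the colour invariance, and hence the cross-centre generalisability, described above.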