91 research outputs found

    Convolutional neural networks for the detection of gastric landmarks

    Gastric cancer is the fifth most incident cancer in the world and, when diagnosed at an advanced stage, its survival rate is only 5%-25%, so it is essential that the cancer is detected at an early stage. However, physicians specialized in this diagnosis have difficulty detecting early lesions during the diagnostic examination, esophagogastroduodenoscopy (EGD). Early lesions on the walls of the digestive system are nearly imperceptible and easily confused with the stomach mucosa, making them difficult to detect. In addition, physicians run the risk of not covering all areas of the stomach during diagnosis, including areas that may harbour lesions. The introduction of artificial intelligence into this diagnostic method may help to detect gastric cancer at an earlier stage. A system capable of monitoring all areas of the digestive system during EGD would help prevent gastric cancer from being diagnosed only at advanced stages. This work focuses on the monitoring of upper gastrointestinal (GI) landmarks, the anatomical areas of the digestive system most prone to lesions, whose coverage allows better control of the areas missed during the EGD exam. The use of convolutional neural networks (CNNs) for GI landmark monitoring has been widely studied by the scientific community, since such networks are well suited to extracting the features that best characterize EGD images. The aim of this work was to test new automatic algorithms, specifically CNN-based systems able to detect upper GI landmarks, in order to avoid blind spots during EGD and increase the quality of endoscopic exams. In contrast with related works in the literature, we used upper GI landmark images closer to real-world conditions: the images for each anatomical landmark class include both examples affected by pathologies and healthy tissue. We tested pre-trained architectures such as ResNet-50, DenseNet-121, and VGG-16. For each pre-trained architecture, we tested different learning approaches, including the use of class weights (CW), batch normalization and dropout layers, and data augmentation to train the network. The CW ResNet-50 achieved an accuracy of 71.79% and a Matthews Correlation Coefficient (MCC) of 65.06%. In current state-of-the-art studies, only supervised learning approaches have been used to classify EGD images; in our work, we also tested unsupervised learning to increase classification performance. In particular, we used convolutional autoencoder architectures to extract representative features from unlabeled GI images and concatenated their outputs with those of the CW ResNet-50 architecture, achieving an accuracy of 72.45% and an MCC of 65.08%.
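
    The combination described above can be pictured as a two-branch classifier: one branch is the class-weighted ResNet-50, the other is the encoder half of a convolutional autoencoder pre-trained on unlabeled GI frames, and their pooled feature vectors are concatenated before the final softmax. The following is a minimal sketch of that idea, assuming TensorFlow/Keras; the image size, number of landmark classes, encoder widths, and the example class weights are illustrative assumptions, not the thesis configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 8            # hypothetical number of upper GI landmark classes
IMG_SHAPE = (224, 224, 3)  # illustrative input size

# Encoder half of a convolutional autoencoder, assumed pre-trained on unlabeled GI frames.
encoder = models.Sequential([
    layers.Input(IMG_SHAPE),
    layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
    layers.GlobalAveragePooling2D(),
], name="autoencoder_encoder")

# ImageNet pre-trained ResNet-50 backbone with global average pooling.
resnet = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                        input_shape=IMG_SHAPE, pooling="avg")

# Concatenate the two feature vectors and classify with a softmax head.
inputs = layers.Input(IMG_SHAPE)
features = layers.Concatenate()([resnet(inputs), encoder(inputs)])
features = layers.Dropout(0.5)(features)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(features)
model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Class weights (CW) compensate for imbalanced landmark classes, e.g.:
# model.fit(train_ds, class_weight={0: 1.0, 1: 2.3, ...}, epochs=10)
```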

    Kvasir-Capsule, a video capsule endoscopy dataset

    Artificial intelligence (AI) is predicted to have profound effects on the future of video capsule endoscopy (VCE) technology. The potential lies in improving anomaly detection while reducing manual labour. Existing work demonstrates the promising benefits of AI-based computer-assisted diagnosis systems for VCE, and also shows great potential for further improvement. However, medical data are often sparse and unavailable to the research community, and qualified medical personnel rarely have time for the tedious labelling work. We present Kvasir-Capsule, a large VCE dataset collected from examinations at a Norwegian hospital. Kvasir-Capsule consists of 117 videos, which can be used to extract a total of 4,741,504 image frames. We have labelled and medically verified 47,238 frames with a bounding box around findings from 14 different classes. In addition to these labelled images, the dataset includes 4,694,266 unlabelled frames. The Kvasir-Capsule dataset can play a valuable role in developing better algorithms in order to reach the true potential of VCE technology.

    Anatomical Classification of the Gastrointestinal Tract Using Ensemble Transfer Learning

    Endoscopy is a procedure used to visualize disorders of the gastrointestinal (GI) lumen. GI disorders can occur without symptoms, which is why gastroenterologists often recommend routine examinations of the GI tract. Endoscopy allows a doctor to directly visualize the inside of the GI tract and identify the cause of symptoms, reducing the need for exploratory surgery or other invasive procedures. It can also detect the early stages of GI disorders, such as cancer, enabling prompt treatment that can improve outcomes. Endoscopic examinations generate large numbers of GI images, and relying solely on human interpretation of this vast amount of image data can be problematic. Artificial intelligence is gaining popularity in clinical medicine: it can assist in medical image analysis and early detection of diseases, help with personalized treatment planning by analyzing a patient's medical history and genomic data, and be used by surgical robots to improve precision and reduce invasiveness. It enables automated diagnosis, provides physicians with assistance, and may improve performance. One of the significant challenges is defining the specific anatomical locations of GI tract abnormalities; once these are known, clinicians can determine appropriate treatment options, reducing the need for repeated endoscopy. Due to the difficulty of collecting annotated data, very limited research has been conducted on localizing anatomical locations by classifying endoscopy images. In this study, we present a classification of GI tract anatomical locations based on transfer learning and ensemble learning. Our approach involves the use of an autoencoder and the Xception model. The autoencoder was initially trained on thousands of unlabeled images, and the encoder was then separated and used as a feature extractor. The Xception model was used as a second model to extract features from the input images. The extracted feature vectors were then concatenated and fed into a convolutional neural network (CNN) for classification. This combination of models provides a powerful and versatile solution for image classification. By using the encoder as a feature extractor that transfers the learned knowledge, the model can focus on more relevant and useful information, which is extremely valuable when there are not enough appropriately labelled data. The Xception model, in turn, provides additional feature-extraction capability. Sometimes one classifier is not enough in machine learning, depending on the problem being solved and the quality and quantity of the available data. With ensemble learning, multiple learning networks can work together to create a stronger classifier. The final classification results are obtained by combining the information from both models through the CNN model. This approach demonstrates the potential of combining multiple models to improve the accuracy of image classification tasks in the medical domain. The HyperKvasir dataset is the main dataset used in this study. It contains 4,104 labelled and 99,417 unlabelled images taken at six different locations in the GI tract: the cecum, ileum, pylorus, rectum, stomach, and Z line. After dataset preprocessing, which included noise reduction and removal of highly similar images, 871 labelled images remained for the purpose of this study.
Our method was more accurate than state-of-the-art studies and achieved a higher F1 score while categorizing the input images into six different anatomical locations with fewer than a thousand labelled images. According to the results, feature extraction and ensemble learning increase accuracy by 5%, and a comparison with existing methods using the same dataset indicates improved performance and reduced cross-entropy loss. The proposed method can therefore be used in the classification of endoscopy images.
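
    As a rough illustration of the ensemble idea above, one possible sketch (assuming TensorFlow/Keras, and simplifying the final CNN classifier to a small dense head) extracts one feature vector per image with a pre-trained Xception model and another with the autoencoder's encoder, concatenates them, and trains the head on the merged vectors. The six-class setup mirrors the HyperKvasir locations mentioned above; the encoder layers and head widths are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SHAPE = (299, 299, 3)   # Xception's default input size
NUM_CLASSES = 6             # cecum, ileum, pylorus, rectum, stomach, Z line

# Pre-trained Xception backbone used purely as a feature extractor.
xception = tf.keras.applications.Xception(include_top=False, weights="imagenet",
                                          input_shape=IMG_SHAPE, pooling="avg")

# Stand-in for the encoder of an autoencoder trained on the unlabeled frames.
encoder = models.Sequential([
    layers.Input(IMG_SHAPE),
    layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
    layers.GlobalAveragePooling2D(),
])

def extract_features(images):
    """Return concatenated Xception + encoder feature vectors for a batch of images."""
    x = tf.keras.applications.xception.preprocess_input(images.copy())
    return np.concatenate([xception.predict(x), encoder.predict(images)], axis=1)

# Small classification head trained on the merged 2048 + 64 dimensional vectors.
head = models.Sequential([
    layers.Input((2048 + 64,)),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
head.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
# head.fit(extract_features(train_images), train_labels, epochs=20)
```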

    Classification of Anomalies in Gastrointestinal Tract Using Deep Learning

    Automatic detection of diseases and anatomical landmarks in medical images is an important and challenging task that could support medical diagnosis, reduce the cost and time of investigational procedures, and improve health care systems all over the world. Recently, gastrointestinal (GI) tract disease diagnosis through endoscopic image classification has become an active research area in the biomedical field. Several GI tract disease classification methods based on image processing and machine learning techniques have been proposed by diverse research groups in the recent past. However, an effective and comprehensive deep ensemble neural network-based classification model with highly accurate results is not yet available in the literature. In this thesis, we review ways of applying deep learning techniques to multi-disease computer-aided detection in the gastrointestinal tract and classify these images. We re-trained five state-of-the-art neural network architectures, VGG16, ResNet, MobileNet, Inception-v3, and Xception, on the Kvasir dataset to classify eight categories covering anatomical landmarks (pylorus, Z-line, cecum), diseased states (esophagitis, ulcerative colitis, polyps), and medical procedures (dyed lifted polyps, dyed resection margins) in the gastrointestinal tract. Our models showed promising accuracy, a remarkable performance with respect to state-of-the-art approaches. The accuracies achieved using VGG, ResNet, MobileNet, Inception-v3, and Xception were 98.3%, 92.3%, 97.6%, 90%, and 98.2%, respectively. The most accurate results were achieved by retraining the VGG16 and Xception networks, with accuracies reaching 98%, owing to their strong performance when pre-trained on the ImageNet dataset and to internal structures that suit classification problems.
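
    A minimal sketch of this kind of retraining, assuming TensorFlow/Keras and a hypothetical folder-per-class layout for the Kvasir images; only VGG16 is shown, and the head size, learning rate, epochs, and directory path are illustrative assumptions rather than the thesis configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 8   # 3 anatomical landmarks, 3 diseased states, 2 procedure classes

# ImageNet pre-trained VGG16 backbone; the convolutional base is frozen first.
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False

model = models.Sequential([
    layers.Input((224, 224, 3)),
    layers.Lambda(tf.keras.applications.vgg16.preprocess_input),  # VGG16 preprocessing
    base,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Hypothetical folder-per-class layout for the Kvasir images.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "kvasir/train", image_size=(224, 224), batch_size=32)
# model.fit(train_ds, epochs=10)
# The top convolutional blocks can be unfrozen afterwards and fine-tuned
# with a lower learning rate.
```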

    A New Approach for Gastrointestinal Tract Findings Detection and Classification : Deep Learning-Based Hybrid Stacking Ensemble Models

    Endoscopic procedures for diagnosing gastrointestinal tract findings depend on specialist experience and are subject to inter-observer variability. This variability can cause minor lesions to be missed and prevent early diagnosis. In this study, deep learning-based hybrid stacking ensemble modeling is proposed for detecting and classifying gastrointestinal findings, aiming at early diagnosis with high accuracy and sensitivity, reducing the specialist's workload, and adding objectivity to endoscopic diagnosis. In the first level of the proposed bi-level stacking ensemble approach, predictions are obtained by applying 5-fold cross-validation to three new CNN models. A machine learning classifier selected at the second level is trained on the obtained predictions, and the final classification result is produced. The performance of the stacking models was compared with that of the deep learning models, and McNemar's statistical test was applied to support the results. According to the experimental results, the stacking ensemble models performed significantly better, with 98.42% accuracy (ACC) and 98.19% MCC on the KvasirV2 dataset and 98.53% ACC and 98.39% MCC on the HyperKvasir dataset. This study is the first to offer a new learning-oriented approach that efficiently evaluates CNN features and provides objective and reliable results supported by statistical testing, compared to state-of-the-art studies on the subject. The proposed approach improves the performance of deep learning models and outperforms the state-of-the-art studies in the literature.
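
    The bi-level stacking scheme can be sketched as follows (assuming scikit-learn for the cross-validation and the level-2 classifier): out-of-fold predictions from each base CNN on the 5-fold splits become the meta-features on which the second-level classifier is trained. The CNN builders are placeholders, and logistic regression is used here only as one possible level-2 choice; the paper's actual base models and classifier may differ.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression

def train_and_predict(cnn_builder, X_tr, y_tr, X_va):
    """Placeholder: train one base CNN and return class probabilities for X_va."""
    model = cnn_builder()
    model.fit(X_tr, y_tr)                 # e.g. a Keras model exposing .fit/.predict
    return model.predict(X_va)            # shape: (len(X_va), n_classes)

def stacking_features(cnn_builders, X, y, n_classes, n_splits=5):
    """Level 1: 5-fold out-of-fold predictions from each CNN, concatenated per sample."""
    meta = np.zeros((len(X), len(cnn_builders) * n_classes))
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    for tr_idx, va_idx in skf.split(X, y):
        for i, builder in enumerate(cnn_builders):
            probs = train_and_predict(builder, X[tr_idx], y[tr_idx], X[va_idx])
            meta[va_idx, i * n_classes:(i + 1) * n_classes] = probs
    return meta

# Level 2: the meta-classifier (logistic regression is one possible choice), e.g.:
# meta_X = stacking_features([cnn_a, cnn_b, cnn_c], X_train, y_train, n_classes=8)
# meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, y_train)
```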

    Artificial intelligence and automation in endoscopy and surgery

    Modern endoscopy relies on digital technology, from high-resolution imaging sensors and displays to electronics connecting configurable illumination and actuation systems for robotic articulation. In addition to enabling more effective diagnostic and therapeutic interventions, the digitization of the procedural toolset enables video data capture of the internal human anatomy at unprecedented levels. Interventional video data encapsulate functional and structural information about a patient’s anatomy as well as events, activity and action logs about the surgical process. This detailed but difficult-to-interpret record from endoscopic procedures can be linked to preoperative and postoperative records or patient imaging information. Rapid advances in artificial intelligence, especially in supervised deep learning, can utilize data from endoscopic procedures to develop systems for assisting procedures leading to computer-assisted interventions that can enable better navigation during procedures, automation of image interpretation and robotically assisted tool manipulation. In this Perspective, we summarize state-of-the-art artificial intelligence for computer-assisted interventions in gastroenterology and surgery

    A Robust Deep Model for Classification of Peptic Ulcer and Other Digestive Tract Disorders Using Endoscopic Images

    Accurate patient disease classification and detection through deep-learning (DL) models are increasingly contributing to the area of biomedical imaging. The most frequent gastrointestinal (GI) tract ailments are peptic ulcers and stomach cancer. Conventional endoscopy is a painful and burdensome procedure for the patient, while Wireless Capsule Endoscopy (WCE) is a useful technology for diagnosing GI problems through painless gut imaging. However, accurately and efficiently reviewing the thousands of images captured during a WCE procedure remains a challenge, because existing deep models do not achieve significant accuracy in WCE image analysis. So, to prevent emergency conditions among patients, an efficient and accurate DL model for real-time analysis is needed. In this study, we propose a reliable and efficient approach for classifying GI tract abnormalities in WCE images using a deep Convolutional Neural Network (CNN). For this purpose, we propose a custom CNN architecture named GI Disease-Detection Network (GIDD-Net), designed from scratch with relatively few parameters to detect GI tract disorders more accurately and efficiently at a low computational cost. Moreover, our model successfully distinguishes GI disorders by visualizing class activation patterns in the stomach and bowel as a heat map. Because the Kvasir-Capsule image dataset has a significant class imbalance problem, we exploited the synthetic oversampling technique Borderline-SMOTE (BL-SMOTE) to distribute the images evenly among the classes. The proposed model was evaluated against various metrics and achieved 98.9% accuracy, 99.8% AUC, 98.9% F1-score, 98.9% precision, 98.8% recall, and a loss of 0.0474. The simulation results show that the proposed model outperforms other state-of-the-art models on all evaluation metrics.
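
    As a small illustration of the class-balancing step mentioned above, the sketch below applies Borderline-SMOTE from the imbalanced-learn package to flattened image vectors, which is one straightforward way to oversample minority classes; the image size and the synthetic example data are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from imblearn.over_sampling import BorderlineSMOTE

def balance_with_bl_smote(images, labels):
    """Oversample minority classes so every class matches the majority count."""
    n, h, w, c = images.shape
    flat = images.reshape(n, -1)                 # SMOTE works on flat feature vectors
    sampler = BorderlineSMOTE(random_state=42)
    flat_res, labels_res = sampler.fit_resample(flat, labels)
    return flat_res.reshape(-1, h, w, c), labels_res

# Example with a tiny synthetic imbalanced set (2 classes, 64x64 RGB frames).
X = np.random.rand(120, 64, 64, 3).astype("float32")
y = np.array([0] * 100 + [1] * 20)
X_bal, y_bal = balance_with_bl_smote(X, y)
print(X_bal.shape, np.bincount(y_bal))           # both classes are now equally sized
```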

    Deep Learning-based Solutions to Improve Diagnosis in Wireless Capsule Endoscopy

    Deep Learning (DL) models have gained extensive attention due to their remarkable performance in a wide range of real-world applications, particularly in computer vision. This achievement, combined with the increase in available medical records, has opened up new opportunities for analyzing and interpreting healthcare data. This symbiotic relationship can enhance the diagnostic process by identifying abnormalities, patterns, and trends, resulting in more precise, personalized, and effective healthcare for patients. Wireless Capsule Endoscopy (WCE) is a non-invasive medical imaging technique used to visualize the entire Gastrointestinal (GI) tract. Up to this moment, physicians meticulously review the captured frames to identify pathologies and diagnose patients. This manual process is time-consuming and prone to errors due to the challenges of interpreting the complex nature of WCE procedures, and thus demands a high level of attention, expertise, and experience. To overcome these drawbacks, shorten the screening process, and improve the diagnosis, efficient and accurate DL methods are required. This thesis proposes DL solutions to the following problems encountered in the analysis of WCE studies: pathology detection, anatomical landmark identification, and Out-of-Distribution (OOD) sample handling. These solutions aim to achieve robust systems that minimize the duration of the video analysis and reduce the number of undetected lesions. Throughout their development, several DL drawbacks have appeared, including small and imbalanced datasets. These limitations have also been addressed, ensuring that they do not hinder the generalization of the neural networks or lead to suboptimal performance and overfitting. To address the aforementioned WCE problems and overcome the DL challenges, the proposed systems adopt various strategies that exploit the advantages of Triplet Loss (TL) and Self-Supervised Learning (SSL) techniques. Mainly, TL has been used to improve the generalization of the models, while SSL methods have been employed to leverage the unlabeled data and obtain useful representations. The presented methods achieve state-of-the-art results on the aforementioned medical problems and contribute to the ongoing research to improve the diagnosis of WCE studies.
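
    To make the Triplet Loss idea concrete, a minimal sketch (assuming TensorFlow/Keras, not the thesis implementation) is shown below: an embedding network maps WCE frames to L2-normalized vectors, and the loss pulls an anchor frame towards a positive frame of the same class while pushing it away from a negative frame of a different class. The embedding architecture, margin, and optimizer settings are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Small embedding network mapping a WCE frame to a 128-D, L2-normalized vector.
embedder = models.Sequential([
    layers.Input((224, 224, 3)),
    layers.Conv2D(32, 3, strides=2, activation="relu"),
    layers.Conv2D(64, 3, strides=2, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(128),
    layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1)),
])

def triplet_loss(anchor, positive, negative, margin=0.2):
    """L = mean(max(||a - p||^2 - ||a - n||^2 + margin, 0)) over the batch."""
    d_pos = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    d_neg = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    return tf.reduce_mean(tf.maximum(d_pos - d_neg + margin, 0.0))

optimizer = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(a_imgs, p_imgs, n_imgs):
    # One hypothetical training step over a batch of (anchor, positive, negative) frames.
    with tf.GradientTape() as tape:
        loss = triplet_loss(embedder(a_imgs), embedder(p_imgs), embedder(n_imgs))
    grads = tape.gradient(loss, embedder.trainable_variables)
    optimizer.apply_gradients(zip(grads, embedder.trainable_variables))
    return loss
```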