
    EndoSLAM Dataset and An Unsupervised Monocular Visual Odometry and Depth Estimation Approach for Endoscopic Videos: Endo-SfMLearner

    Deep learning techniques hold promise for developing dense topography reconstruction and pose estimation methods for endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings, as well as synthetically generated data. A Panda robotic arm, two commercially available capsule endoscopes, two conventional endoscopes with different camera properties, and two high-precision 3D scanners were employed to collect data from 8 ex-vivo porcine gastrointestinal (GI) tract organs. In total, 35 sub-datasets are provided with 6D pose ground truth for the ex-vivo part: 18 sub-datasets for the colon, 12 for the stomach, and 5 for the small intestine, while four of these contain polyp-mimicking elevations created by an expert gastroenterologist. Synthetic capsule endoscopy frames from the GI tract with both depth and pose annotations are included to facilitate the study of simulation-to-real transfer learning algorithms. Additionally, we propose Endo-SfMLearner, an unsupervised monocular depth and pose estimation method that combines residual networks with a spatial attention module to force the network to focus on distinguishable and highly textured tissue regions. The proposed approach uses a brightness-aware photometric loss to improve robustness under fast frame-to-frame illumination changes. To exemplify the use-case of the EndoSLAM dataset, the performance of Endo-SfMLearner is extensively compared with the state of the art. The code and a link to the dataset are publicly available at https://github.com/CapsuleEndoscope/EndoSLAM. A video demonstrating the experimental setup and procedure is accessible at https://www.youtube.com/watch?v=G_LCe0aWWdQ. Comment: 27 pages, 16 figures
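    A brightness-aware photometric loss of the kind described for Endo-SfMLearner can be sketched as an affine brightness alignment applied before a standard L1 photometric penalty. This is an illustrative approximation only; the function name and the least-squares alignment are assumptions here, not the paper's exact formulation.

```python
import numpy as np

def brightness_aware_photometric_loss(target, warped, eps=1e-6):
    """L1 photometric loss after a per-image affine brightness fit.

    `target` and `warped` are float arrays of the same shape (H, W).
    Fitting a scale and offset before comparing makes the loss
    tolerant to frame-to-frame illumination changes, in the spirit
    of the brightness-aware loss used by Endo-SfMLearner (the exact
    formulation in the paper may differ).
    """
    t = target.ravel()
    w = warped.ravel()
    # Least-squares affine fit: warped ~ a * target + b
    a = ((t - t.mean()) * (w - w.mean())).mean() / (t.var() + eps)
    b = w.mean() - a * t.mean()
    aligned = a * target + b
    return np.abs(aligned - warped).mean()
```

    A pure brightness change between frames then contributes almost nothing to the loss, so the training signal comes from geometric misalignment rather than illumination.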

    VR-Caps: A Virtual Environment for Capsule Endoscopy

    Current capsule endoscopes and next-generation robotic capsules for the diagnosis and treatment of gastrointestinal diseases are complex cyber-physical platforms that must orchestrate sophisticated software and hardware functions. The desired tasks for these systems include visual localization, depth estimation, 3D mapping, disease detection and segmentation, automated navigation, active control, path realization, and optional therapeutic modules such as targeted drug delivery and biopsy sampling. Data-driven algorithms promise to enable many advanced functionalities for capsule endoscopes, but real-world data is challenging to obtain. Physically realistic simulations providing synthetic data have emerged as a solution for the development of data-driven algorithms. In this work, we present a comprehensive simulation platform for capsule endoscopy operations and introduce VR-Caps, a virtual active capsule environment that simulates a range of normal and abnormal tissue conditions (e.g., inflated, dry, wet), varied organ types, capsule endoscope designs (e.g., mono, stereo, dual, and 360° camera), and the type, number, strength, and placement of internal and external magnetic sources that enable active locomotion. VR-Caps makes it possible to develop, optimize, and test medical imaging and analysis software, either independently or jointly, for current and next-generation endoscopic capsule systems. To validate this approach, we train state-of-the-art deep neural networks to accomplish various medical image analysis tasks using simulated data from VR-Caps and evaluate the performance of these models on real medical data. Results demonstrate the usefulness and effectiveness of the proposed virtual platform in developing algorithms that quantify fractional coverage, camera trajectory, 3D map reconstruction, and disease classification. Comment: 18 pages, 14 figures

    Enhancing endoscopic navigation and polyp detection using artificial intelligence

    Colorectal cancer (CRC) is one of the most common and deadliest forms of cancer. It has a very high mortality rate if the disease advances to late stages; however, early diagnosis and treatment can be curative, so they are essential to enhancing disease management. Colonoscopy is considered the gold standard for CRC screening and early therapeutic treatment. The effectiveness of colonoscopy is highly dependent on the operator's skill, as a high level of hand-eye coordination is required to control the endoscope and fully examine the colon wall. Because of this, detection rates can vary between gastroenterologists, and technologies have been proposed as solutions to assist disease detection and standardise detection rates. This thesis focuses on developing artificial intelligence algorithms to assist gastroenterologists during colonoscopy, with the potential to ensure a baseline standard of quality in CRC screening. To achieve such assistance, the technical contributions develop deep learning methods and architectures for automated endoscopic image analysis, addressing both the detection of lesions in the endoscopic image and the 3D mapping of the endoluminal environment. The proposed detection models can run in real time and assist visualization of different polyp types. Meanwhile, the 3D reconstruction and mapping models developed are the basis for ensuring that the entire colon has been examined appropriately and for supporting quantitative measurement of polyp sizes from the image during a procedure. Results and validation studies presented within the thesis demonstrate how the developed algorithms perform on both general scenes and clinical data. The feasibility of clinical translation is demonstrated for all of the models on endoscopic data from human participants during CRC screening examinations.

    Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy

    Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While methods in computer vision for depth estimation have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing estimation of accurate camera depths. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage surface normal prediction to improve geometric feature extraction. We also apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate an improvement of 14.17% on relative error and 10.4% on δ1 accuracy over the most accurate baseline state-of-the-art BTS approach. All experiments are conducted on the recently released C3VD dataset; thus, we provide a first benchmark of state-of-the-art methods. Comment: 19 pages
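    The relative error and δ1 figures quoted above are standard monocular-depth evaluation metrics: absolute relative error, and the fraction of pixels whose predicted-to-true depth ratio (in either direction) stays below 1.25. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Absolute relative error and delta_1 threshold accuracy.

    delta_1 is the fraction of pixels where
    max(pred/gt, gt/pred) < 1.25, as commonly reported in
    monocular depth estimation benchmarks.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, delta1
```

    A 10.4% improvement on δ1 therefore means a larger share of pixels fall within 25% of the true depth.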

    Learning-based depth and pose prediction for 3D scene reconstruction in endoscopy

    Colorectal cancer is the third most common cancer worldwide. Early detection and treatment of pre-cancerous tissue during colonoscopy are critical to improving prognosis. However, navigating within the colon and inspecting the endoluminal tissue comprehensively are challenging, and success in both varies with the endoscopist's skill and experience. Computer-assisted interventions in colonoscopy show much promise in improving navigation and inspection. For instance, 3D reconstruction of the colon during colonoscopy could promote more thorough examinations and increase adenoma detection rates, which are associated with improved survival rates. Given the stakes, this thesis seeks to advance the state of research from feature-based traditional methods closer to a data-driven 3D reconstruction pipeline for colonoscopy. More specifically, this thesis explores different methods that improve subtasks of learning-based 3D reconstruction, the main tasks being depth prediction and camera pose estimation. As training data is unavailable, the author, together with her co-authors, proposes and publishes several synthetic datasets and promotes domain adaptation models to improve applicability to real data. We show, through extensive experiments, that our depth prediction methods produce more robust results than previous work. Our pose estimation network trained on our new synthetic data outperforms self-supervised methods on real sequences. Our box embeddings allow us to interpret the geometric relationship and scale difference between two images of the same surface without the need for feature matches, which are often unobtainable in surgical scenes. Together, the methods introduced in this thesis work towards a complete, data-driven 3D reconstruction pipeline for endoscopy.

    A Robust Deep Model for Classification of Peptic Ulcer and Other Digestive Tract Disorders Using Endoscopic Images

    Accurate disease classification and detection through deep-learning (DL) models are increasingly contributing to the area of biomedical imaging. The most frequent gastrointestinal (GI) tract ailments are peptic ulcers and stomach cancer. Conventional endoscopy is a painful and hectic procedure for the patient, while Wireless Capsule Endoscopy (WCE) is a useful technology for diagnosing GI problems through painless gut imaging. However, accurately and efficiently investigating the thousands of images captured during a WCE procedure remains a challenge, because existing deep models do not achieve significant accuracy on WCE image analysis. So, to prevent emergency conditions among patients, an efficient and accurate DL model is needed for real-time analysis. In this study, we propose a reliable and efficient approach for classifying GI tract abnormalities in WCE images by applying a deep Convolutional Neural Network (CNN). For this purpose, we propose a custom CNN architecture named GI Disease-Detection Network (GIDD-Net), designed from scratch with relatively few parameters to detect GI tract disorders more accurately and efficiently at a low computational cost. Moreover, our model successfully distinguishes GI disorders by visualizing class activation patterns in the stomach bowel as a heat map. Because the Kvasir-Capsule image dataset has a significant class imbalance problem, we exploited the synthetic oversampling technique Borderline-SMOTE (BL-SMOTE) to evenly distribute the images among the classes. The proposed model is evaluated against various metrics and achieved 98.9%, 99.8%, 98.9%, 98.9%, 98.8%, and 0.0474 for accuracy, AUC, F1-score, precision, recall, and loss, respectively. The simulation results show that the proposed model outperforms other state-of-the-art models on all evaluation metrics.
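    BL-SMOTE, like plain SMOTE, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The numpy sketch below shows only that core interpolation step; Borderline-SMOTE additionally restricts the seed samples to those near the class boundary, which is omitted here, and the function name is an assumption.

```python
import numpy as np

def smote_oversample(X_minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples by SMOTE-style interpolation.

    Each synthetic sample lies on the segment between a random
    minority sample and one of its k nearest minority neighbours.
    (Borderline-SMOTE's boundary filtering is not implemented.)
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    n = len(X)
    # Pairwise distances to find each sample's k nearest neighbours.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    new = []
    for _ in range(n_new):
        i = rng.integers(n)                       # random seed sample
        j = nn[i, rng.integers(min(k, n - 1))]    # random neighbour
        lam = rng.random()                        # interpolation weight
        new.append(X[i] + lam * (X[j] - X[i]))
    return np.array(new)
```

    In practice one would apply this per minority class until all classes reach roughly equal counts, which is the balancing effect the abstract attributes to BL-SMOTE on Kvasir-Capsule.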

    Estimation of gastrointestinal polyp size in video endoscopy

    Abstract: Worldwide, colorectal cancer is one of the most common public health problems, constituting the seventh leading cause of death in 2010. This aggressive cancer is first identified during a routine endoscopic examination by characterizing the polyps that appear along the digestive tract, mainly in the colon and rectum. Polyp size is one of the most important features for determining surgical endoscopic management and can even be used to predict the level of aggressiveness. For instance, gastroenterologists only send a polyp sample for pathological examination if the polyp diameter is larger than 10 mm, a measurement typically obtained by examining the lesion with a calibrated endoscopy tool. However, measuring polyp size is very challenging because it must be performed during a procedure subject to a complex mix of noise sources, such as the distorted optical characteristics of the endoscope, exacerbated physiological conditions, and abrupt motion. The main goal of this thesis was to estimate polyp size in an endoscopy video sequence using a spatio-temporal characterization. First, the method estimates the region with the most motion, within which the polyp shape is approximated by the pixels with the largest temporal variance. On top of this, an initial manual polyp delineation in the first frame captures the main features to be followed in subsequent frames by a cross-correlation procedure. Afterwards, a Bayesian tracking strategy is used to refine the polyp segmentation. Finally, a defocus strategy uses the sharpest frame at a known depth as a reference to determine the polyp size, obtaining reliable results. In the segmentation task, the approach achieved a Dice score of 0.7 on real endoscopy video sequences when compared with an expert.
In addition, polyp size estimation obtained a root mean square error (RMSE) of 0.87 mm with spheres of known size that simulated polyps, and an RMSE of 4.7 mm on real endoscopy sequences compared with measurements obtained by a group of four experts with similar experience.
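    The Dice score used for the 0.7 segmentation result above is the standard overlap metric between a predicted binary mask and an expert's mask; a minimal sketch (function name assumed):

```python
import numpy as np

def dice_score(pred_mask, gt_mask, eps=1e-8):
    """Dice similarity coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|). Ranges from 0 (no overlap)
    to 1 (identical masks)."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)
```

    A score of 0.7 thus means the automatically segmented polyp region and the expert delineation share 70% overlap by this measure.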