65 research outputs found

    Semantic Parsing of Colonoscopy Videos with Multi-Label Temporal Networks

    Following the successful debut of polyp detection and characterization, more advanced automation tools are being developed for colonoscopy. The new automation tasks, such as quality metrics or report generation, require understanding of the procedure flow that includes activities, events, anatomical landmarks, etc. In this work we present a method for automatic semantic parsing of colonoscopy videos. The method uses a novel deep learning multi-label temporal segmentation model trained in supervised and unsupervised regimes. We evaluate the accuracy of the method on a test set of over 300 annotated colonoscopy videos, and use ablation to explore the relative importance of the method's various components.

    A Survey on Deep Learning in Medical Image Analysis

    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed. Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

    Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers

    Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features; and 2) designing an effective mechanism for fusing these features. Different from existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence and elusive properties of polyps, we introduce three novel modules, including a cascaded fusion module (CFM), a camouflage identification module (CIM), and a similarity aggregation module (SAM). Among these, the CFM is used to collect the semantic and location information of polyps from high-level features, while the CIM is applied to capture polyp information disguised in low-level features. With the help of the SAM, we extend the pixel features of the polyp area with high-level semantic position information to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT, effectively suppresses noise in the features and significantly improves their expressive capabilities. Extensive experiments on five widely adopted datasets show that the proposed model is more robust to various challenging situations (e.g., appearance changes, small objects) than existing methods, and achieves new state-of-the-art performance. The proposed model is available at https://github.com/DengPingFan/Polyp-PVT. Comment: Technical Report

    Automatic Segmentation and Inpainting of Specular Highlights for Endoscopic Imaging

    Minimally invasive medical procedures have become increasingly common in today's healthcare practice. Images taken during such procedures largely show tissues of human organs, such as the mucosa of the gastrointestinal tract. These surfaces usually have a glossy appearance showing specular highlights. For many visual analysis algorithms, these distinct and bright visual features can become a significant source of error. In this article, we propose two methods to address this problem: (a) a segmentation method based on nonlinear filtering and colour image thresholding and (b) an efficient inpainting method. The inpainting algorithm eliminates the negative effect of specular highlights on other image analysis algorithms and also gives a visually pleasing result. The methods compare favourably to the existing approaches reported for endoscopic imaging. Furthermore, in contrast to the existing approaches, the proposed segmentation method is applicable to the widely used sequential RGB image acquisition systems.

    Polyp Segmentation in Colonoscopy Images with Convolutional Neural Networks

    The thesis looks at approaches to segmentation of polyps in colonoscopy images. The aim was to investigate and develop methods that are robust, accurate and computationally efficient and which can compete with the current state-of-the-art in polyp segmentation. Colorectal cancer is one of the leading causes of cancer deaths worldwide. To decrease mortality, an assessment of polyp malignancy is performed during colonoscopy examination so polyps can be removed at an early stage. In current routine clinical practice, polyps are detected and delineated manually in colonoscopy images by highly trained clinicians. To automate these processes, machine learning and computer vision techniques have been utilised. They have been shown to improve polyp detectability and segmentation objectivity. However, polyp segmentation is a very challenging task due to inherent variability of polyp morphology and colonoscopy image appearance. This research considers a range of approaches to polyp segmentation, seeking out those that offer the best compromise between accuracy and computational complexity. Based on analysis of existing machine learning and polyp image segmentation techniques, a novel hybrid deep learning segmentation method is proposed to alleviate the impact of the above-stated challenges on polyp segmentation. The method consists of two fully convolutional networks. The first proposed network is based on a compact architecture with large receptive fields and multiple classification paths. The method performs well on most images, accurately segmenting polyps of diverse morphology and appearance. However, this network is prone to misdetection of very small polyps. To solve this problem, a second network is proposed, which primarily aims to improve sensitivity to small polyp details by emphasising low-level image features. In order to fully utilise information contained in the available training dataset, comprehensive data augmentation techniques are adopted. To further improve the performance of the proposed segmentation methods, test-time data augmentation is also implemented. A comprehensive multi-criterion analysis of the proposed methods is provided. The result demonstrates that the new methodology has better accuracy and robustness than the current state-of-the-art, as proven by the outstanding performance at the 2017 and 2018 GIANA polyp segmentation challenges.
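
The abstract above mentions test-time data augmentation without describing it. As a generic illustration only (not the thesis's implementation), the sketch below averages a segmentation model's probability maps over flipped copies of the input, mapping each prediction back to the original orientation before averaging. The `predict` callable, the choice of flips, and the nested-list image format are all assumptions made for this example.

```python
def identity(image):
    return image

def hflip(image):
    # Mirror each row (left-right flip); applying it twice restores the input.
    return [row[::-1] for row in image]

def vflip(image):
    # Reverse the row order (up-down flip); applying it twice restores the input.
    return image[::-1]

def predict_with_tta(predict, image):
    """Average predictions over flipped views of `image`.

    `predict` is a hypothetical callable that maps an H x W image
    (nested lists) to an H x W map of foreground probabilities.
    """
    # Each flip is its own inverse, so the same function undoes it.
    transforms = [(identity, identity), (hflip, hflip), (vflip, vflip)]
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    for forward, inverse in transforms:
        prob = inverse(predict(forward(image)))
        for y in range(height):
            for x in range(width):
                total[y][x] += prob[y][x]
    n = len(transforms)
    return [[value / n for value in row] for row in total]

if __name__ == "__main__":
    # Toy predictor that always favours the left column, whatever the input.
    def left_biased(img):
        return [[1.0, 0.0] for _ in img]

    print(predict_with_tta(left_biased, [[0, 255], [255, 0]]))
```

Averaging over views dampens predictions that depend on orientation rather than on image content, which is the usual motivation for the technique.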

    Identification, indexing, and retrieval of cardio-pulmonary resuscitation (CPR) video scenes of simulated medical crisis.

    Medical simulations, where uncommon clinical situations can be replicated, have proved to provide more comprehensive training. Simulations involve the use of patient simulators, which are lifelike mannequins. After each session, the physician must manually review and annotate the recordings and then debrief the trainees. This process can be tedious and retrieval of specific video segments should be automated. In this dissertation, we propose a machine learning based approach to detect and classify scenes that involve rhythmic activities such as Cardio-Pulmonary Resuscitation (CPR) from training video sessions simulating medical crises. This application requires different preprocessing techniques from other video applications. In particular, most processing steps require the integration of multiple features such as motion, color and spatial and temporal constraints. The first step of our approach consists of segmenting the video into shots. This is achieved by extracting color and motion information from each frame and identifying locations where consecutive frames have different features. We propose two different methods to identify shot boundaries. The first one is based on simple thresholding while the second one uses unsupervised learning techniques. The second step of our approach consists of selecting one key frame from each shot and segmenting it into homogeneous regions. Then a few regions of interest are identified for further processing. These regions are selected based on the type of motion of their pixels and their likelihood to be skin-like regions. The regions of interest are tracked and a sequence of observations that encode their motion throughout the shot is extracted. The next step of our approach uses an HMM classifier to discriminate between regions that involve CPR actions and other regions. We experiment with both continuous and discrete HMM. Finally, to improve the accuracy of our system, we also detect faces in each key frame, track them throughout the shot, and fuse their HMM confidence with the region's confidence. To allow the user to view and analyze the video training session much more efficiently, we have also developed a graphical user interface (GUI) for CPR video scene retrieval and analysis with several desirable features. To validate our proposed approach to detect CPR scenes, we use one video simulation session recorded by the SPARC group to train the HMM classifiers and learn the system's parameters. Then, we analyze the proposed system on other video recordings. We show that our approach can identify most CPR scenes with few false alarms.
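
The first step described above, finding shot boundaries by thresholding differences between consecutive frames, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's method: it uses only a grayscale intensity histogram (the abstract also uses motion features), and the function names, bin count, and threshold value are hypothetical.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalised intensity histogram of a frame given as a flat list of pixels."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    counts = [0] * bins
    width = max_value / bins
    for pixel in frame:
        counts[min(int(pixel / width), bins - 1)] += 1
    return [count / len(frame) for count in counts]

def shot_boundaries(frames, threshold=0.5, bins=8):
    """Return indices i where frame i starts a new shot.

    A boundary is declared when the histogram distance between frame i-1
    and frame i exceeds `threshold` (an assumed value; in practice it
    would be tuned on annotated video).
    """
    hists = [histogram(frame, bins) for frame in frames]
    boundaries = []
    for i in range(1, len(hists)):
        # Half the L1 distance between normalised histograms lies in [0, 1].
        distance = 0.5 * sum(abs(a - b) for a, b in zip(hists[i - 1], hists[i]))
        if distance > threshold:
            boundaries.append(i)
    return boundaries

if __name__ == "__main__":
    dark = [[10] * 16, [12] * 16]
    bright = [[200] * 16, [205] * 16]
    print(shot_boundaries(dark + bright))
```

A fixed threshold is simple but sensitive to lighting changes and camera motion, which is presumably one reason the abstract also proposes an unsupervised alternative.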

    MSRF-Net: A Multi-Scale Residual Fusion Network for Biomedical Image Segmentation

    Methods based on convolutional neural networks have improved the performance of biomedical image segmentation. However, most of these methods cannot efficiently segment objects of variable sizes and train on small and biased datasets, which are common for biomedical use cases. While methods exist that incorporate multi-scale fusion approaches to address the challenges arising with variable sizes, they usually use complex models that are more suitable for general semantic segmentation problems. In this paper, we propose a novel architecture called Multi-Scale Residual Fusion Network (MSRF-Net), which is specially designed for medical image segmentation. The proposed MSRF-Net is able to exchange multi-scale features of varying receptive fields using a Dual-Scale Dense Fusion (DSDF) block. Our DSDF block can exchange information rigorously across two different resolution scales, and our MSRF sub-network uses multiple DSDF blocks in sequence to perform multi-scale fusion. This allows the preservation of resolution, improved information flow and propagation of both high- and low-level features to obtain accurate segmentation maps. The proposed MSRF-Net captures object variabilities and provides improved results on different biomedical datasets. Extensive experiments on MSRF-Net demonstrate that the proposed method outperforms the cutting-edge medical image segmentation methods on four publicly available datasets. We achieve a Dice Coefficient (DSC) of 0.9217, 0.9420, 0.9224, and 0.8824 on the Kvasir-SEG, CVC-ClinicDB, 2018 Data Science Bowl, and ISIC-2018 skin lesion segmentation challenge datasets, respectively. We further conducted generalizability tests and achieved DSC of 0.7921 and 0.7575 on CVC-ClinicDB and Kvasir-SEG, respectively.
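
For readers unfamiliar with the metric reported above, the Dice coefficient of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for masks stored as nested lists of 0/1 values. The function name, the mask format, and the convention of returning 1.0 when both masks are empty are assumptions of this example, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity between two binary masks of identical shape.

    Masks are nested lists of 0/1 values (an assumed format).
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    pred_area = 0
    target_area = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            intersection += 1 if (p and t) else 0
            pred_area += 1 if p else 0
            target_area += 1 if t else 0
    if pred_area + target_area == 0:
        # Both masks empty: treated as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / (pred_area + target_area)

if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    ground_truth = [[1, 0, 0], [0, 1, 1]]
    print(dice_coefficient(predicted, ground_truth))
```

In the example, the masks share two foreground pixels and each contains three, so the score is 2·2 / (3 + 3).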