16 research outputs found

    Two layer ensemble of deep learning models for medical image segmentation.

    Get PDF
    In recent years, deep learning has rapidly become a method of choice for the segmentation of medical images. Deep Neural Network (DNN) architectures such as UNet have achieved state-of-the-art results on many medical datasets. To further improve the performance in the segmentation task, we develop an ensemble system which combines various deep learning architectures. We propose a two-layer ensemble of deep learning models for the segmentation of medical images. The prediction for each training image pixel made by each model in the first layer is used as the augmented data of the training image for the second layer of the ensemble. The prediction of the second layer is then combined by using a weights-based scheme in which each model contributes differently to the combined result. The weights are found by solving linear regression problems. Experiments conducted on two popular medical datasets namely CAMUS and Kvasir-SEG show that the proposed method achieves better results concerning two performance metrics (Dice Coefficient and Hausdorff distance) compared to some well-known benchmark algorithms

    Two layer Ensemble of Deep Learning Models for Medical Image Segmentation

    Get PDF
    In recent years, deep learning has rapidly become a method of choice for the segmentation of medical images. Deep Neural Network (DNN) architectures such as UNet have achieved state-of-the-art results on many medical datasets. To further improve the performance in the segmentation task, we develop an ensemble system which combines various deep learning architectures. We propose a two-layer ensemble of deep learning models for the segmentation of medical images. The prediction for each training image pixel made by each model in the first layer is used as the augmented data of the training image for the second layer of the ensemble. The prediction of the second layer is then combined by using a weights-based scheme in which each model contributes differently to the combined result. The weights are found by solving linear regression problems. Experiments conducted on two popular medical datasets namely CAMUS and Kvasir-SEG show that the proposed method achieves better results concerning two performance metrics (Dice Coefficient and Hausdorff distance) compared to some well-known benchmark algorithms.Comment: 8 pages, 4 figure

    A Survey on Deep Learning in Medical Image Analysis

    Full text link
    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

    Deep reinforcement learning for subpixel neural tracking

    Get PDF
    Automatically tracing elongated structures, such as axons and blood vessels, is a challenging problem in the field of biomedical imaging, but one with many downstream applications. Real, labelled data is sparse, and existing algorithms either lack robustness to different datasets, or otherwise require significant manual tuning. Here, we instead learn a tracking algorithm in a synthetic environment, and apply it to tracing axons. To do so, we formulate tracking as a reinforcement learning problem, and apply deep reinforcement learning techniques with a continuous action space to learn how to track at the subpixel level. We train our model on simple synthetic data and test it on mouse cortical two-photon microscopy images. Despite the domain gap, our model approaches the performance of a heavily engineered tracker from a standard analysis suite for neuronal microscopy. We show that fine-tuning on real data improves performance, allowing better transfer when real labelled data is available. Finally, we demonstrate that our model's uncertainty measure-a feature lacking in hand-engineered trackers-corresponds with how well it tracks the structure

    Recurrent Neural Networks for Representing, Segmenting, and Classifying Surgical Activities

    Get PDF
    Robot-assisted surgery has enabled scalable, transparent capture of high-quality data during operation, and this has in turn led to many new research opportunities. Among these opportunities are those that aim to improve the objectivity and efficiency of surgical training, which include making performance assessment and feedback more objective and consistent; providing more specific or localized assessment and feedback; delegating this responsibility to machines, which have the potential to provide feedback in any desired abundance; and having machines go even further, for example by optimizing practice routines, in the form of a virtual coach. In this thesis, we focus on a foundation that serves all of these objectives: automated surgical activity recognition, or in other words the ability to automatically determine what activities a surgeon is performing and when those activities are taking place. First, we introduce the use of recurrent neural networks (RNNs) for localizing and classifying surgical activities from motion data. Here, we show for the first time that this task is possible at the level of maneuvers, which unlike the activities considered in prior work are already a part of surgical training curricula. Second, we study the ability of RNNs to learn dependencies over extremely long time periods, which we posit are present in surgical motion data; and we introduce MIST RNNs, a new RNN architecture that is capable of capturing these extremely long-term dependencies. Third, we investigate unsupervised learning using surgical motion data: we show that predicting future motion from past motion with RNNs, using motion data alone, leads to meaningful and useful representations of surgical motion. This approach leads to the discovery of surgical activities from unannotated data, and to state-of-the-art performance for querying a database of surgical activity using motion-based queries. Finally, we depart from a common yet limiting assumption in nearly all prior work on surgical activity recognition: that annotated training data, which is difficult and expensive to acquire, is available in abundance. We demonstrate for the first time that both gesture recognition and maneuver recognition are feasible even when very few annotated sequences are available; and that future-prediction based representation learning, prior to the recognition phase, yields significant performance improvements when annotated data is scarce

    Neutro-Connectedness Theory, Algorithms and Applications

    Get PDF
    Connectedness is an important topological property and has been widely studied in digital topology. However, three main challenges exist in applying connectedness to solve real world problems: (1) the definitions of connectedness based on the classic and fuzzy logic cannot model the “hidden factors” that could influence our decision-making; (2) these definitions are too general to be applied to solve complex problem; and (4) many measurements of connectedness are heavily dependent on the shape (spatial distribution of vertices) of the graph and violate the intuitive idea of connectedness. This research focused on solving these challenges by redesigning the connectedness theory, developing fast algorithms for connectedness computation, and applying the newly proposed theory and algorithms to solve challenges in real problems. The newly proposed Neutro-Connectedness (NC) generalizes the conventional definitions of connectedness and can model uncertainty and describe the part and the whole relationship. By applying the dynamic programming strategy, a fast algorithm was proposed to calculate NC for general dataset. It is not just calculating NC map, and the output NC forest can discover a dataset’s topological structure regarding connectedness. In the first application, interactive image segmentation, two approaches were proposed to solve the two most difficult challenges: user interaction-dependence and intense interaction. The first approach, named NC-Cut, models global topologic property among image regions and reduces the dependence of segmentation performance on the appearance models generated by user interactions. It is less sensitive to the initial region of interest (ROI) than four state-of-the-art ROI-based methods. The second approach, named EISeg, provides user with visual clues to guide the interacting process based on NC. It reduces user interaction greatly by guiding user to where interacting can produce the best segmentation results. In the second application, NC was utilized to solve the challenge of weak boundary problem in breast ultrasound image segmentation. The approach can model the indeterminacy resulted from weak boundaries better than fuzzy connectedness, and achieved more accurate and robust result on our dataset with 131 breast tumor cases

    3D-3D Deformable Registration and Deep Learning Segmentation based Neck Diseases Analysis in MRI

    Full text link
    Whiplash, cervical dystonia (CD), neck pain and work-related upper limb disorder (WRULD) are the most common diseases in the cervical region. Headaches, stiffness, sensory disturbance to the legs and arms, optical problems, aching in the back and shoulder, and auditory and visual problems are common symptoms seen in patients with these diseases. CD patients may also suffer tormenting spasticity in some neck muscles, with the symptoms possibly being acute and persisting for a long time, sometimes a lifetime. Whiplash-associated disorders (WADs) may occur due to sudden forward and backward movements of the head and neck occurring during a sporting activity or vehicle or domestic accident. These diseases affect private industries, insurance companies and governments, with the socio-economic costs significantly related to work absences, long-term sick leave, early disability and disability support pensions, health care expenses, reduced productivity and insurance claims. Therefore, diagnosing and treating neck-related diseases are important issues in clinical practice. The reason for these afflictions resulting from accident is the impairment of the cervical muscles which undergo atrophy or pseudo-hypertrophy due to fat infiltrating into them. These morphological changes have to be determined by identifying and quantifying their bio-markers before applying any medical intervention. Volumetric studies of neck muscles are reliable indicators of the proper treatments to apply. Radiation therapy, chemotherapy, injection of a toxin or surgery could be possible ways of treating these diseases. However, the dosages required should be precise because the neck region contains some sensitive organs, such as nerves, blood vessels and the trachea and spinal cord. Image registration and deep learning-based segmentation can help to determine appropriate treatments by analyzing the neck muscles. However, this is a challenging task for medical images due to complexities such as many muscles crossing multiple joints and attaching to many bones. Also, their shapes and sizes vary greatly across populations whereas their cross-sectional areas (CSAs) do not change in proportion to the heights and weights of individuals, with their sizes varying more significantly between males and females than ages. Therefore, the neck's anatomical variabilities are much greater than those of other parts of the human body. Some other challenges which make analyzing neck muscles very difficult are their compactness, similar gray-level appearances, intra-muscular fat, sliding due to cardiac and respiratory motions, false boundaries created by intramuscular fat, low resolution and contrast in medical images, noise, inhomogeneity and background clutter with the same composition and intensity. Furthermore, a patient's mode, position and neck movements during the capture of an image create variability. However, very little significant research work has been conducted on analyzing neck muscles. Although previous image registration efforts form a strong basis for many medical applications, none can satisfy the requirements of all of them because of the challenges associated with their implementation and low accuracy which could be due to anatomical complexities and variabilities or the artefacts of imaging devices. In existing methods, multi-resolution- and heuristic-based methods are popular. However, the above issues cause conventional multi-resolution-based registration methods to be trapped in local minima due to their low degrees of freedom in their geometrical transforms. Although heuristic-based methods are good at handling large mismatches, they require pre-segmentation and are computationally expensive. Also, current deformable methods often face statistical instability problems and many local optima when dealing with small mismatches. On the other hand, deep learning-based methods have achieved significant success over the last few years. Although a deeper network can learn more complex features and yields better performances, its depth cannot be increased as this would cause the gradient to vanish during training and result in training difficulties. Recently, researchers have focused on attention mechanisms for deep learning but current attention models face a challenge in the case of an application with compact and similar small multiple classes, large variability, low contrast and noise. The focus of this dissertation is on the design of 3D-3D image registration approaches as well as deep learning-based semantic segmentation methods for analyzing neck muscles. In the first part of this thesis, a novel object-constrained hierarchical registration framework for aligning inter-subject neck muscles is proposed. Firstly, to handle large-scale local minima, it uses a coarse registration technique which optimizes a new edge position difference (EPD) similarity measure to align large mismatches. Also, a new transformation based on the discrete periodic spline wavelet (DPSW), affine and free-form-deformation (FFD) transformations are exploited. Secondly, to avoid the monotonous nature of using transformations in multiple stages, affine registration technique, which uses a double-pushing system by changing the edges in the EPD and switching the transformation's resolutions, is designed to align small mismatches. The EPD helps in both the coarse and fine techniques to implement object-constrained registration via controlling edges which is not possible using traditional similarity measures. Experiments are performed on clinical 3D magnetic resonance imaging (MRI) scans of the neck, with the results showing that the EPD is more effective than the mutual information (MI) and the sum of squared difference (SSD) measures in terms of the volumetric dice similarity coefficient (DSC). Also, the proposed method is compared with two state-of-the-art approaches with ablation studies of inter-subject deformable registration and achieves better accuracy, robustness and consistency. However, as this method is computationally complex and has a problem handling large-scale anatomical variabilities, another 3D-3D registration framework with two novel contributions is proposed in the second part of this thesis. Firstly, a two-stage heuristic search optimization technique for handling large mismatches,which uses a minimal user hypothesis regarding these mismatches and is computationally fast, is introduced. It brings a moving image hierarchically closer to a fixed one using MI and EPD similarity measures in the coarse and fine stages, respectively, while the images do not require pre-segmentation as is necessary in traditional heuristic optimization-based techniques. Secondly, a region of interest (ROI) EPD-based registration framework for handling small mismatches using salient anatomical information (AI), in which a convex objective function is formed through a unique shape created from the desired objects in the ROI, is proposed. It is compared with two state-of-the-art methods on a neck dataset, with the results showing that it is superior in terms of accuracy and is computationally fast. In the last part of this thesis, an evaluation study of recent U-Net-based convolutional neural networks (CNNs) is performed on a neck dataset. It comprises 6 recent models, the U-Net, U-Net with a conditional random field (CRF-Unet), attention U-Net (A-Unet), nested U-Net or U-Net++, multi-feature pyramid (MFP)-Unet and recurrent residual U-Net (R2Unet) and 4 with more comprehensive modifications, the multi-scale U-Net (MS-Unet), parallel multi-scale U-Net (PMSUnet), recurrent residual attention U-Net (R2A-Unet) and R2A-Unet++ in neck muscles segmentation, with analyses of the numerical results indicating that the R2Unet architecture achieves the best accuracy. Also, two deep learning-based semantic segmentation approaches are proposed. In the first, a new two-stage U-Net++ (TS-UNet++) uses two different types of deep CNNs (DCNNs) rather than one similar to the traditional multi-stage method, with the U-Net++ in the first stage and the U-Net in the second. More convolutional blocks are added after the input and before the output layers of the multi-stage approach to better extract the low- and high-level features. A new concatenation-based fusion structure, which is incorporated in the architecture to allow deep supervision, helps to increase the depth of the network without accelerating the gradient-vanishing problem. Then, more convolutional layers are added after each concatenation of the fusion structure to extract more representative features. The proposed network is compared with the U-Net, U-Net++ and two-stage U-Net (TS-UNet) on the neck dataset, with the results indicating that it outperforms the others. In the second approach, an explicit attention method, in which the attention is performed through a ROI evolved from ground truth via dilation, is proposed. It does not require any additional CNN, as does a cascaded approach, to localize the ROI. Attention in a CNN is sensitive with respect to the area of the ROI. This dilated ROI is more capable of capturing relevant regions and suppressing irrelevant ones than a bounding box and region-level coarse annotation, and is used during training of any CNN. Coarse annotation, which does not require any detailed pixel wise delineation that can be performed by any novice person, is used during testing. This proposed ROI-based attention method, which can handle compact and similar small multiple classes with objects with large variabilities, is compared with the automatic A-Unet and U-Net, and performs best
    corecore