550 research outputs found

    Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation

    Get PDF
    Abstract Convolutional Neural Networks (CNNs) can be shifted across 2D images or 3D videos to segment them. They have a fixed input size and typically perceive only small local contexts of the pixels to be classified as foreground or background. In contrast, Multi-Dimensional Recurrent NNs (MD-RNNs) can perceive the entire spatio-temporal context of each pixel in a few sweeps through all pixels, especially when the RNN is a Long Short-Term Memory (LSTM). Despite these theoretical advantages, however, unlike CNNs, previous MD-LSTM variants were hard to parallelise on GPUs. Here we re-arrange the traditional cuboid order of computations in MD-LSTM in pyramidal fashion. The resulting PyraMiD-LSTM is easy to parallelise, especially for 3D data such as stacks of brain slice images. PyraMiD-LSTM achieved best known pixel-wise brain image segmentation results on MRBrainS13 (and competitive results on EM-ISBI12)

    Deep Learning in Cardiology

    Full text link
    The medical field is creating large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient in solving complicated medical tasks or for creating insights using big data. Deep learning has emerged as a more accurate and effective technology in a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus, revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply in medicine in general, while proposing certain directions as the most viable for clinical use.Comment: 27 pages, 2 figures, 10 table

    A Survey on Deep Learning in Medical Image Analysis

    Full text link
    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

    Synergistic Visualization And Quantitative Analysis Of Volumetric Medical Images

    Get PDF
    The medical diagnosis process starts with an interview with the patient, and continues with the physical exam. In practice, the medical professional may require additional screenings to precisely diagnose. Medical imaging is one of the most frequently used non-invasive screening methods to acquire insight of human body. Medical imaging is not only essential for accurate diagnosis, but also it can enable early prevention. Medical data visualization refers to projecting the medical data into a human understandable format at mediums such as 2D or head-mounted displays without causing any interpretation which may lead to clinical intervention. In contrast to the medical visualization, quantification refers to extracting the information in the medical scan to enable the clinicians to make fast and accurate decisions. Despite the extraordinary process both in medical visualization and quantitative radiology, efforts to improve these two complementary fields are often performed independently and synergistic combination is under-studied. Existing image-based software platforms mostly fail to be used in routine clinics due to lack of a unified strategy that guides clinicians both visually and quan- titatively. Hence, there is an urgent need for a bridge connecting the medical visualization and automatic quantification algorithms in the same software platform. In this thesis, we aim to fill this research gap by visualizing medical images interactively from anywhere, and performing a fast, accurate and fully-automatic quantification of the medical imaging data. To end this, we propose several innovative and novel methods. Specifically, we solve the following sub-problems of the ul- timate goal: (1) direct web-based out-of-core volume rendering, (2) robust, accurate, and efficient learning based algorithms to segment highly pathological medical data, (3) automatic landmark- ing for aiding diagnosis and surgical planning and (4) novel artificial intelligence algorithms to determine the sufficient and necessary data to derive large-scale problems

    Semantic Object Parsing with Local-Global Long Short-Term Memory

    Full text link
    Semantic object parsing is a fundamental task for understanding objects in detail in computer vision community, where incorporating multi-level contextual information is critical for achieving such fine-grained pixel-level recognition. Prior methods often leverage the contextual information through post-processing predicted confidence maps. In this work, we propose a novel deep Local-Global Long Short-Term Memory (LG-LSTM) architecture to seamlessly incorporate short-distance and long-distance spatial dependencies into the feature learning over all pixel positions. In each LG-LSTM layer, local guidance from neighboring positions and global guidance from the whole image are imposed on each position to better exploit complex local and global contextual information. Individual LSTMs for distinct spatial dimensions are also utilized to intrinsically capture various spatial layouts of semantic parts in the images, yielding distinct hidden and memory cells of each position for each dimension. In our parsing approach, several LG-LSTM layers are stacked and appended to the intermediate convolutional layers to directly enhance visual features, allowing network parameters to be learned in an end-to-end way. The long chains of sequential computation by stacked LG-LSTM layers also enable each pixel to sense a much larger region for inference benefiting from the memorization of previous dependencies in all positions along all dimensions. Comprehensive evaluations on three public datasets well demonstrate the significant superiority of our LG-LSTM over other state-of-the-art methods.Comment: 10 page
    corecore