65,476 research outputs found

    3D Convolutional Neural Networks for Tumor Segmentation using Long-range 2D Context

    Full text link
    We present an efficient deep learning approach for the challenging task of tumor segmentation in multisequence MR images. In recent years, Convolutional Neural Networks (CNN) have achieved state-of-the-art performances in a large variety of recognition tasks in medical imaging. Because of the considerable computational cost of CNNs, large volumes such as MRI are typically processed by subvolumes, for instance slices (axial, coronal, sagittal) or small 3D patches. In this paper we introduce a CNN-based model which efficiently combines the advantages of the short-range 3D context and the long-range 2D context. To overcome the limitations of specific choices of neural network architectures, we also propose to merge outputs of several cascaded 2D-3D models by a voxelwise voting strategy. Furthermore, we propose a network architecture in which the different MR sequences are processed by separate subnetworks in order to be more robust to the problem of missing MR sequences. Finally, a simple and efficient algorithm for training large CNN models is introduced. We evaluate our method on the public benchmark of the BRATS 2017 challenge on the task of multiclass segmentation of malignant brain tumors. Our method achieves good performances and produces accurate segmentations with median Dice scores of 0.918 (whole tumor), 0.883 (tumor core) and 0.854 (enhancing core). Our approach can be naturally applied to various tasks involving segmentation of lesions or organs.Comment: Submitted to the journal Computerized Medical Imaging and Graphic

    Learning Audio Sequence Representations for Acoustic Event Classification

    Full text link
    Acoustic Event Classification (AEC) has become a significant task for machines to perceive the surrounding auditory scene. However, extracting effective representations that capture the underlying characteristics of the acoustic events is still challenging. Previous methods mainly focused on designing the audio features in a 'hand-crafted' manner. Interestingly, data-learnt features have been recently reported to show better performance. Up to now, these were only considered on the frame-level. In this paper, we propose an unsupervised learning framework to learn a vector representation of an audio sequence for AEC. This framework consists of a Recurrent Neural Network (RNN) encoder and a RNN decoder, which respectively transforms the variable-length audio sequence into a fixed-length vector and reconstructs the input sequence on the generated vector. After training the encoder-decoder, we feed the audio sequences to the encoder and then take the learnt vectors as the audio sequence representations. Compared with previous methods, the proposed method can not only deal with the problem of arbitrary-lengths of audio streams, but also learn the salient information of the sequence. Extensive evaluation on a large-size acoustic event database is performed, and the empirical results demonstrate that the learnt audio sequence representation yields a significant performance improvement by a large margin compared with other state-of-the-art hand-crafted sequence features for AEC

    Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks

    Full text link
    One of the challenges in modeling cognitive events from electroencephalogram (EEG) data is finding representations that are invariant to inter- and intra-subject differences, as well as to inherent noise associated with such data. Herein, we propose a novel approach for learning such representations from multi-channel EEG time-series, and demonstrate its advantages in the context of mental load classification task. First, we transform EEG activities into a sequence of topology-preserving multi-spectral images, as opposed to standard EEG analysis techniques that ignore such spatial information. Next, we train a deep recurrent-convolutional network inspired by state-of-the-art video classification to learn robust representations from the sequence of images. The proposed approach is designed to preserve the spatial, spectral, and temporal structure of EEG which leads to finding features that are less sensitive to variations and distortions within each dimension. Empirical evaluation on the cognitive load classification task demonstrated significant improvements in classification accuracy over current state-of-the-art approaches in this field.Comment: To be published as a conference paper at ICLR 201

    Going Deeper into Action Recognition: A Survey

    Full text link
    Understanding human actions in visual data is tied to advances in complementary research areas including object recognition, human dynamics, domain adaptation and semantic segmentation. Over the last decade, human action analysis evolved from earlier schemes that are often limited to controlled environments to nowadays advanced solutions that can learn from millions of videos and apply to almost all daily activities. Given the broad range of applications from video surveillance to human-computer interaction, scientific milestones in action recognition are achieved more rapidly, eventually leading to the demise of what used to be good in a short time. This motivated us to provide a comprehensive review of the notable steps taken towards recognizing human actions. To this end, we start our discussion with the pioneering methods that use handcrafted representations, and then, navigate into the realm of deep learning based approaches. We aim to remain objective throughout this survey, touching upon encouraging improvements as well as inevitable fallbacks, in the hope of raising fresh questions and motivating new research directions for the reader
    corecore