101 research outputs found

    Receptive fields optimization in deep learning for enhanced interpretability, diversity, and resource efficiency.

    Get PDF
    In both supervised and unsupervised learning settings, deep neural networks (DNNs) are known to perform hierarchical and discriminative representation of data. They are capable of automatically extracting excellent hierarchy of features from raw data without the need for manual feature engineering. Over the past few years, the general trend has been that DNNs have grown deeper and larger, amounting to huge number of final parameters and highly nonlinear cascade of features, thus improving the flexibility and accuracy of resulting models. In order to account for the scale, diversity and the difficulty of data DNNs learn from, the architectural complexity and the excessive number of weights are often deliberately built in into their design. This flexibility and performance usually come with high computational and memory demands both during training and inference. In addition, insight into the mappings DNN models perform and human ability to understand them still remain very limited. This dissertation addresses some of these limitations by balancing three conflicting objectives: computational/ memory demands, interpretability, and accuracy. This dissertation first introduces some unsupervised feature learning methods in a broader context of dictionary learning. It also sets the tone for deep autoencoder learning and constraints for data representations in light of removing some of the aforementioned bottlenecks such as the feature interpretability of deep learning models with nonnegativity constraints on receptive fields. In addition, the two main classes of solution to the drawbacks associated with overparameterization/ over-complete representation in deep learning models are also presented. Subsequently, two novel methods, one for each solution class, are presented to address the problems resulting from over-complete representation exhibited by most deep learning models. The first method is developed to achieve inference-cost-efficient models via elimination of redundant features with negligible deterioration of prediction accuracy. This is important especially for deploying deep learning models into resource-limited portable devices. The second method aims at diversifying the features of DNNs in the learning phase to improve their performance without undermining their size and capacity. Lastly, feature diversification is considered to stabilize adversarial learning and extensive experimental outcomes show that these methods have the potential of advancing the current state-of-the-art on different learning tasks and benchmark datasets

    Sparse feature learning for image analysis in segmentation, classification, and disease diagnosis.

    Get PDF
    The success of machine learning algorithms generally depends on intermediate data representation, called features that disentangle the hidden factors of variation in data. Moreover, machine learning models are required to be generalized, in order to reduce the specificity or bias toward the training dataset. Unsupervised feature learning is useful in taking advantage of large amount of unlabeled data, which is available to capture these variations. However, learned features are required to capture variational patterns in data space. In this dissertation, unsupervised feature learning with sparsity is investigated for sparse and local feature extraction with application to lung segmentation, interpretable deep models, and Alzheimer\u27s disease classification. Nonnegative Matrix Factorization, Autoencoder and 3D Convolutional Autoencoder are used as architectures or models for unsupervised feature learning. They are investigated along with nonnegativity, sparsity and part-based representation constraints for generalized and transferable feature extraction

    DAEN: Deep Autoencoder Networks for Hyperspectral Unmixing

    Get PDF

    TasNet: time-domain audio separation network for real-time, single-channel speech separation

    Full text link
    Robust speech processing in multi-talker environments requires effective speech separation. Recent deep learning systems have made significant progress toward solving this problem, yet it remains challenging particularly in real-time, short latency applications. Most methods attempt to construct a mask for each source in time-frequency representation of the mixture signal which is not necessarily an optimal representation for speech separation. In addition, time-frequency decomposition results in inherent problems such as phase/magnitude decoupling and long time window which is required to achieve sufficient frequency resolution. We propose Time-domain Audio Separation Network (TasNet) to overcome these limitations. We directly model the signal in the time-domain using an encoder-decoder framework and perform the source separation on nonnegative encoder outputs. This method removes the frequency decomposition step and reduces the separation problem to estimation of source masks on encoder outputs which is then synthesized by the decoder. Our system outperforms the current state-of-the-art causal and noncausal speech separation algorithms, reduces the computational cost of speech separation, and significantly reduces the minimum required latency of the output. This makes TasNet suitable for applications where low-power, real-time implementation is desirable such as in hearable and telecommunication devices.Comment: Camera ready version for ICASSP 2018, Calgary, Canad

    Unsupervised Deep Transfer Feature Learning for Medical Image Classification

    Full text link
    The accuracy and robustness of image classification with supervised deep learning are dependent on the availability of large-scale, annotated training data. However, there is a paucity of annotated data available due to the complexity of manual annotation. To overcome this problem, a popular approach is to use transferable knowledge across different domains by: 1) using a generic feature extractor that has been pre-trained on large-scale general images (i.e., transfer-learned) but which not suited to capture characteristics from medical images; or 2) fine-tuning generic knowledge with a relatively smaller number of annotated images. Our aim is to reduce the reliance on annotated training data by using a new hierarchical unsupervised feature extractor with a convolutional auto-encoder placed atop of a pre-trained convolutional neural network. Our approach constrains the rich and generic image features from the pre-trained domain to a sophisticated representation of the local image characteristics from the unannotated medical image domain. Our approach has a higher classification accuracy than transfer-learned approaches and is competitive with state-of-the-art supervised fine-tuned methods.Comment: 4 pages, 1 figure, 3 tables, Accepted (Oral) as IEEE International Symposium on Biomedical Imaging 201
    • …
    corecore