453,461 research outputs found

    Learning to detect dysarthria from raw speech

    Full text link
    Speech classifiers of paralinguistic traits traditionally learn from diverse hand-crafted low-level features, by selecting the relevant information for the task at hand. We explore an alternative to this selection, by learning jointly the classifier, and the feature extraction. Recent work on speech recognition has shown improved performance over speech features by learning from the waveform. We extend this approach to paralinguistic classification and propose a neural network that can learn a filterbank, a normalization factor and a compression power from the raw speech, jointly with the rest of the architecture. We apply this model to dysarthria detection from sentence-level audio recordings. Starting from a strong attention-based baseline on which mel-filterbanks outperform standard low-level descriptors, we show that learning the filters or the normalization and compression improves over fixed features by 10% absolute accuracy. We also observe a gain over OpenSmile features by learning jointly the feature extraction, the normalization, and the compression factor with the architecture. This constitutes a first attempt at learning jointly all these operations from raw audio for a speech classification task.Comment: 5 pages, 3 figures, submitted to ICASS

    NormFace: L2 Hypersphere Embedding for Face Verification

    Full text link
    Thanks to the recent developments of Convolutional Neural Networks, the performance of face verification methods has increased rapidly. In a typical face verification method, feature normalization is a critical step for boosting performance. This motivates us to introduce and study the effect of normalization during training. But we find this is non-trivial, despite normalization being differentiable. We identify and study four issues related to normalization through mathematical analysis, which yields understanding and helps with parameter settings. Based on this analysis we propose two strategies for training using normalized features. The first is a modification of softmax loss, which optimizes cosine similarity instead of inner-product. The second is a reformulation of metric learning by introducing an agent vector for each class. We show that both strategies, and small variants, consistently improve performance by between 0.2% to 0.4% on the LFW dataset based on two models. This is significant because the performance of the two models on LFW dataset is close to saturation at over 98%. Codes and models are released on https://github.com/happynear/NormFaceComment: camera-ready versio

    CrossNorm: Normalization for Off-Policy TD Reinforcement Learning

    Full text link
    Off-policy temporal difference (TD) methods are a powerful class of reinforcement learning (RL) algorithms. Intriguingly, deep off-policy TD algorithms are not commonly used in combination with feature normalization techniques, despite positive effects of normalization in other domains. We show that naive application of existing normalization techniques is indeed not effective, but that well-designed normalization improves optimization stability and removes the necessity of target networks. In particular, we introduce a normalization based on a mixture of on- and off-policy transitions, which we call cross-normalization. It can be regarded as an extension of batch normalization that re-centers data for two different distributions, as present in off-policy learning. Applied to DDPG and TD3, cross-normalization improves over the state of the art across a range of MuJoCo benchmark tasks

    Context-based Normalization of Histological Stains using Deep Convolutional Features

    Full text link
    While human observers are able to cope with variations in color and appearance of histological stains, digital pathology algorithms commonly require a well-normalized setting to achieve peak performance, especially when a limited amount of labeled data is available. This work provides a fully automated, end-to-end learning-based setup for normalizing histological stains, which considers the texture context of the tissue. We introduce Feature Aware Normalization, which extends the framework of batch normalization in combination with gating elements from Long Short-Term Memory units for normalization among different spatial regions of interest. By incorporating a pretrained deep neural network as a feature extractor steering a pixelwise processing pipeline, we achieve excellent normalization results and ensure a consistent representation of color and texture. The evaluation comprises a comparison of color histogram deviations, structural similarity and measures the color volume obtained by the different methods.Comment: In: 3rd Workshop on Deep Learning in Medical Image Analysis (DLMIA 2017
    corecore