Learning to detect dysarthria from raw speech
Speech classifiers of paralinguistic traits traditionally learn from diverse
hand-crafted low-level features, by selecting the relevant information for the
task at hand. We explore an alternative to this selection, by jointly learning
the classifier and the feature extraction. Recent work on speech recognition
has shown improved performance over speech features by learning from the
waveform. We extend this approach to paralinguistic classification and propose
a neural network that can learn a filterbank, a normalization factor and a
compression power from the raw speech, jointly with the rest of the
architecture. We apply this model to dysarthria detection from sentence-level
audio recordings. Starting from a strong attention-based baseline on which
mel-filterbanks outperform standard low-level descriptors, we show that
learning the filters or the normalization and compression improves over fixed
features by 10% absolute accuracy. We also observe a gain over OpenSmile
features by jointly learning the feature extraction, the normalization, and the
compression factor with the architecture. This constitutes a first attempt at
jointly learning all these operations from raw audio for a speech
classification task.
Comment: 5 pages, 3 figures, submitted to ICASS
NormFace: L2 Hypersphere Embedding for Face Verification
Thanks to the recent developments of Convolutional Neural Networks, the
performance of face verification methods has increased rapidly. In a typical
face verification method, feature normalization is a critical step for boosting
performance. This motivates us to introduce and study the effect of
normalization during training. But we find this is non-trivial, despite
normalization being differentiable. We identify and study four issues related
to normalization through mathematical analysis, which yields understanding and
helps with parameter settings. Based on this analysis we propose two strategies
for training using normalized features. The first is a modification of softmax
loss, which optimizes cosine similarity instead of inner-product. The second is
a reformulation of metric learning by introducing an agent vector for each
class. We show that both strategies, and small variants, consistently improve
performance by between 0.2% and 0.4% on the LFW dataset based on two models.
This is significant because the performance of the two models on the LFW dataset
is close to saturation at over 98%. Code and models are released at
https://github.com/happynear/NormFace
Comment: camera-ready versio
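The first strategy in the NormFace abstract, a softmax loss that optimizes cosine similarity instead of the inner product, can be illustrated with a minimal sketch. This is not the authors' released code: the function names and the `scale` multiplier (with its default of 20.0) are assumptions made here for illustration, and the example uses only the Python standard library.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Return v rescaled to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]

def cosine_softmax_loss(feature, class_weights, label, scale=20.0):
    """Cross-entropy over scaled cosine similarities.

    feature:       list of floats, the embedding of one sample
    class_weights: list of weight vectors, one per class
    label:         index of the correct class
    scale:         hypothetical multiplier applied to the cosines (an assumption)
    """
    f = l2_normalize(feature)
    logits = []
    for w in class_weights:
        w_hat = l2_normalize(w)
        cosine = sum(a * b for a, b in zip(f, w_hat))
        logits.append(scale * cosine)
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]
```

Because both the feature and the class vectors are normalized, rescaling the feature leaves the loss unchanged; only the angle to each class vector matters.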
CrossNorm: Normalization for Off-Policy TD Reinforcement Learning
Off-policy temporal difference (TD) methods are a powerful class of
reinforcement learning (RL) algorithms. Intriguingly, deep off-policy TD
algorithms are not commonly used in combination with feature normalization
techniques, despite positive effects of normalization in other domains. We show
that naive application of existing normalization techniques is indeed not
effective, but that well-designed normalization improves optimization stability
and removes the necessity of target networks. In particular, we introduce a
normalization based on a mixture of on- and off-policy transitions, which we
call cross-normalization. It can be regarded as an extension of batch
normalization that re-centers data for two different distributions, as present
in off-policy learning. Applied to DDPG and TD3, cross-normalization improves
over the state of the art across a range of MuJoCo benchmark tasks.
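The idea of re-centering data drawn from two distributions with shared statistics can be sketched as follows. This is a simplified illustration, not the method as specified in the CrossNorm paper: it assumes scalar features, pools the two batches with equal weight per sample, and the function name `cross_normalize` is hypothetical.

```python
import math

def cross_normalize(batch_a, batch_b, eps=1e-5):
    """Normalize two batches with mean and variance computed over their union.

    batch_a, batch_b: lists of floats, e.g. features from on-policy and
                      off-policy transitions (an assumed interpretation)
    Returns the two batches normalized with the same shared statistics.
    """
    mixed = list(batch_a) + list(batch_b)
    mean = sum(mixed) / len(mixed)
    var = sum((x - mean) ** 2 for x in mixed) / len(mixed)
    inv_std = 1.0 / math.sqrt(var + eps)

    def normalize(batch):
        return [(x - mean) * inv_std for x in batch]

    return normalize(batch_a), normalize(batch_b)
```

With shared statistics, the union of the two batches has zero mean and roughly unit variance, while each batch on its own keeps its offset relative to the other.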
Context-based Normalization of Histological Stains using Deep Convolutional Features
While human observers are able to cope with variations in color and
appearance of histological stains, digital pathology algorithms commonly
require a well-normalized setting to achieve peak performance, especially when
a limited amount of labeled data is available. This work provides a fully
automated, end-to-end learning-based setup for normalizing histological stains,
which considers the texture context of the tissue. We introduce Feature Aware
Normalization, which extends the framework of batch normalization in
combination with gating elements from Long Short-Term Memory units for
normalization among different spatial regions of interest. By incorporating a
pretrained deep neural network as a feature extractor steering a pixelwise
processing pipeline, we achieve excellent normalization results and ensure a
consistent representation of color and texture. The evaluation comprises a
comparison of color histogram deviations, structural similarity, and measures of
the color volume obtained by the different methods.
Comment: In: 3rd Workshop on Deep Learning in Medical Image Analysis (DLMIA 2017)
