2,524 research outputs found
Island Loss for Learning Discriminative Features in Facial Expression Recognition
Over the past few years, Convolutional Neural Networks (CNNs) have shown
promise on facial expression recognition. However, the performance degrades
dramatically under real-world settings due to variations introduced by subtle
facial appearance changes, head pose variations, illumination changes, and
occlusions.
In this paper, a novel island loss is proposed to enhance the discriminative
power of the deeply learned features. Specifically, the IL is designed to
reduce the intra-class variations while enlarging the inter-class differences
simultaneously. Experimental results on four benchmark expression databases
have demonstrated that the CNN with the proposed island loss (IL-CNN)
outperforms the baseline CNN models with either traditional softmax loss or the
center loss and achieves comparable or better performance compared with the
state-of-the-art methods for facial expression recognition.Comment: 8 pages, 3 figure
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System
In this paper, we explore the encoding/pooling layer and loss function in the
end-to-end speaker and language recognition system. First, a unified and
interpretable end-to-end system for both speaker and language recognition is
developed. It accepts variable-length input and produces an utterance level
result. In the end-to-end system, the encoding layer plays a role in
aggregating the variable-length input sequence into an utterance level
representation. Besides the basic temporal average pooling, we introduce a
self-attentive pooling layer and a learnable dictionary encoding layer to get
the utterance level representation. In terms of loss function for open-set
speaker verification, to get more discriminative speaker embedding, center loss
and angular softmax loss is introduced in the end-to-end system. Experimental
results on Voxceleb and NIST LRE 07 datasets show that the performance of
end-to-end learning system could be significantly improved by the proposed
encoding layer and loss function.Comment: Accepted for Speaker Odyssey 201
- …