Search CORE

11,811 research outputs found

Learning Discriminative Features with Class Encoder

Author: Lei Zhen
Li Stan Z.
Liao Shengcai
Shi Hailin
Zhu Xiangyu
Publication venue
Publication date: 09/05/2016
Field of study

Deep neural networks usually benefit from unsupervised pre-training, e.g. auto-encoders. However, the classifier further needs supervised fine-tuning methods for good discrimination. Besides, due to the limits of full-connection, the application of auto-encoders is usually limited to small, well aligned images. In this paper, we incorporate the supervised information to propose a novel formulation, namely class-encoder, whose training objective is to reconstruct a sample from another one of which the labels are identical. Class-encoder aims to minimize the intra-class variations in the feature space, and to learn a good discriminative manifolds on a class scale. We impose the class-encoder as a constraint into the softmax for better supervised training, and extend the reconstruction on feature-level to tackle the parameter size issue and translation issue. The experiments show that the class-encoder helps to improve the performance on benchmarks of classification and face recognition. This could also be a promising direction for fast training of face recognition models.Comment: Accepted by CVPR2016 Workshop of Robust Features for Computer Visio

arXiv.org e-Print Archive

Crossref

Learning Audio Sequence Representations for Acoustic Event Classification

Author: Han Jing
Liu Ding
Qian Kun
Schuller Björn W.
Zhang Zixing
Publication venue
Publication date: 27/07/2017
Field of study

Acoustic Event Classification (AEC) has become a significant task for machines to perceive the surrounding auditory scene. However, extracting effective representations that capture the underlying characteristics of the acoustic events is still challenging. Previous methods mainly focused on designing the audio features in a 'hand-crafted' manner. Interestingly, data-learnt features have been recently reported to show better performance. Up to now, these were only considered on the frame-level. In this paper, we propose an unsupervised learning framework to learn a vector representation of an audio sequence for AEC. This framework consists of a Recurrent Neural Network (RNN) encoder and a RNN decoder, which respectively transforms the variable-length audio sequence into a fixed-length vector and reconstructs the input sequence on the generated vector. After training the encoder-decoder, we feed the audio sequences to the encoder and then take the learnt vectors as the audio sequence representations. Compared with previous methods, the proposed method can not only deal with the problem of arbitrary-lengths of audio streams, but also learn the salient information of the sequence. Extensive evaluation on a large-size acoustic event database is performed, and the empirical results demonstrate that the learnt audio sequence representation yields a significant performance improvement by a large margin compared with other state-of-the-art hand-crafted sequence features for AEC

arXiv.org e-Print Archive

OPUS Augsburg

Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures

Author: Fong A.C.M.
Imran Noreen
Seet Boon-Chong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/09/2015
Field of study

Distributed video coding (DVC) is a relatively new video coding architecture originated from two fundamental theorems namely, Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs

Springer - Publisher Connector

PubMed Central

Enlighten