Relaxed 2-D Principal Component Analysis by $L_p$ Norm for Face Recognition

Author: A d’Aspremont
A Pentland
D Meng
DM Witten
H Shen
H Wang
H Zou
I Jolliffe
J Wang
J Yang
J Ye
L Sirovich
L Zhao
M Kirby
M Turk
M Zhao
N Kwak
Q Chang
R Ma
X Li
Z Jia
Z Liang
Z-G Jia
ZZ Liang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/05/2019

A relaxed two-dimensional principal component analysis (R2DPCA) approach is proposed for face recognition. Unlike 2DPCA, 2DPCA-$L_1$, and G2DPCA, R2DPCA uses the label information of the training samples (if known) to calculate a relaxation vector and assigns a weight to each subset of the training data. A new relaxed scatter matrix is defined, and the projection axes computed from it increase the accuracy of face recognition. The optimal $L_p$-norms are selected within a reasonable range. Numerical experiments on practical face databases indicate that R2DPCA generalizes well and achieves a higher recognition rate than state-of-the-art methods.

Comment: 19 pages, 11 figures
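To make the idea of a weighted scatter matrix concrete, here is a minimal sketch of classical 2DPCA with one weight per labelled subset. It is an illustration under stated assumptions, not the method of the paper: the abstract does not give the relaxation-vector formula, so `class_weights` is a hypothetical placeholder, and the sketch uses the ordinary squared $L_2$ norm rather than a tuned $L_p$ norm. All function names are invented for this example.

```python
def weighted_scatter(images, labels, class_weights):
    """Image scatter matrix G = sum_i w[label_i] * (A_i - mean)^T (A_i - mean).

    images: list of equally sized matrices (lists of rows).
    class_weights: hypothetical per-class weights standing in for the
    paper's relaxation vector, whose formula the abstract does not state.
    The usual 1/n factor is omitted; it does not change the eigenvectors.
    """
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    mean = [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]
    G = [[0.0] * cols for _ in range(cols)]
    for img, label in zip(images, labels):
        w = class_weights[label]
        for r in range(rows):
            d = [img[r][c] - mean[r][c] for c in range(cols)]
            for a in range(cols):
                for b in range(cols):
                    G[a][b] += w * d[a] * d[b]
    return G


def leading_axis(G, iters=200):
    """Dominant eigenvector of a symmetric PSD matrix by power iteration.

    The start vector (1, 2, 3, ...) is an arbitrary choice; power iteration
    fails if the start happens to be orthogonal to the dominant eigenvector.
    """
    n = len(G)
    v = [float(a + 1) for a in range(n)]
    for _ in range(iters):
        u = [sum(G[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = sum(x * x for x in u) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in u]
    return v


def project(image, axis):
    """2DPCA feature vector: each image row projected onto the axis."""
    return [sum(row[c] * axis[c] for c in range(len(axis))) for row in image]
```

Raising the weight of one class increases its contribution to `G` and therefore its influence on the projection axis. How R2DPCA actually derives those weights from the labels is specified in the paper, not here.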

arXiv.org e-Print Archive

Crossref

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

Author: Bengio Yoshua
De Mori Renato
Linarès Georges
Morchid Mohamed
Parcollet Titouan
Trabelsi Chiheb
Zhang Ying
Publication date: 20/06/2018

Recently, the connectionist temporal classification (CTC) model, coupled with recurrent (RNN) or convolutional neural networks (CNN), has made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time-frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first- and second-order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have been shown to process multidimensional inputs efficiently as entities, to encode internal dependencies, and to solve many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs.

Comment: Accepted at INTERSPEECH 201
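The quaternion algebra referred to in the abstract rests on the Hamilton product, which the following minimal sketch implements for quaternions stored as `(r, i, j, k)` tuples. The packing in `pack_frame` (zero real part, with the energy and its first and second derivatives as the three imaginary parts) is an assumption made for illustration, since the abstract does not state the layout. `quaternion_neuron` is likewise a hypothetical helper, not an API from the paper.

```python
def hamilton_product(p, q):
    """Hamilton product p * q of two quaternions given as (r, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def pack_frame(energy, delta, delta2):
    """Assumed packing of one time-frequency bin into a single quaternion."""
    return (0.0, energy, delta, delta2)


def quaternion_neuron(weights, inputs):
    """Pre-activation of one quaternion unit: the sum of w_n * x_n."""
    acc = (0.0, 0.0, 0.0, 0.0)
    for w, x in zip(weights, inputs):
        prod = hamilton_product(w, x)
        acc = tuple(a + b for a, b in zip(acc, prod))
    return acc
```

The parameter saving follows from counting weights. A real-valued linear map from four inputs to four outputs needs 16 independent weights, whereas one quaternion weight carries only 4 numbers and still mixes all four components through the Hamilton product.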

arXiv.org e-Print Archive

Crossref