1,432 research outputs found

    Relaxed 2-D Principal Component Analysis by LpL_p Norm for Face Recognition

    Full text link
    A relaxed two dimensional principal component analysis (R2DPCA) approach is proposed for face recognition. Different to the 2DPCA, 2DPCA-L1L_1 and G2DPCA, the R2DPCA utilizes the label information (if known) of training samples to calculate a relaxation vector and presents a weight to each subset of training data. A new relaxed scatter matrix is defined and the computed projection axes are able to increase the accuracy of face recognition. The optimal LpL_p-norms are selected in a reasonable range. Numerical experiments on practical face databased indicate that the R2DPCA has high generalization ability and can achieve a higher recognition rate than state-of-the-art methods.Comment: 19 pages, 11 figure

    Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

    Get PDF
    Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN), made it easier to train speech recognition systems in an end-to-end fashion. However in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency to process multidimensional inputs as entities, to encode internal dependencies, and to solve many tasks with less learning parameters than real-valued models. This paper proposes to integrate multiple feature views in quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with less learning parameters than a competing model based on real-valued CNNs.Comment: Accepted at INTERSPEECH 201

    Octonion sparse representation for color and multispectral image processing

    Get PDF
    A recent trend in color image processing combines the quaternion algebra with dictionary learning methods. This paper aims to present a generalization of the quaternion dictionary learning method by using the octonion algebra. The octonion algebra combined with dictionary learning methods is well suited for representation of multispectral images with up to 7 color channels. Opposed to the classical dictionary learning techniques that treat multispectral images by concatenating spectral bands into a large monochrome image, we treat all the spectral bands simultaneously. Our approach leads to better preservation of color fidelity in true and false color images of the reconstructed multispectral image. To show the potential of the octonion based model, experiments are conducted for image reconstruction and denoising of color images as well as of extensively used Landsat 7 images

    Hypercomplex algebras for dictionary learning

    Get PDF
    This paper presents an application of hypercomplex algebras combined with dictionary learning for sparse representation of multichannel images. Two main representatives of hypercomplex algebras, Clifford algebras and algebras generated by the Cayley-Dickson procedure are considered. Related works reported quaternion methods (for color images) and octonion methods, which are applicable to images with up to 7 channels. We show that the current constructions cannot be generalized to dimensions above eight

    Color Sparse Representations for Image Processing: Review, Models, and Prospects

    Get PDF
    International audienceSparse representations have been extended to deal with color images composed of three channels. A review of dictionary-learning-based sparse representations for color images is made here, detailing the differences between the models, and comparing their results on real data and simulated data. These models are considered in a unifying framework that is based on the degrees of freedom of the linear filtering/transformation of the color channels. Moreover, this allows it to be shown that the scalar quaternionic linear model is equivalent to constrained matrix-based color filtering, which highlights the filtering implicitly applied through this model. Based on this reformulation, the new color filtering model is introduced, using unconstrained filters. In this model, spatial morphologies of color images are encoded by atoms, and colors are encoded by color filters. Color variability is no longer captured in increasing the dictionary size, but with color filters, this gives an efficient color representation
    corecore