    Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

    Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN) made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency in processing multidimensional inputs as entities, encoding internal dependencies, and solving many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs. Comment: Accepted at INTERSPEECH 201
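
    The grouping idea in this abstract can be illustrated with a minimal sketch of the Hamilton product, the multiplication rule of quaternion algebra. The helper names `hamilton_product` and `pack_feature` are hypothetical, and the layout used in `pack_feature` (zero real part, with the energy and its two derivatives as the imaginary parts) is an assumption made for illustration, not a detail stated in the abstract.

```python
from typing import Tuple

# A quaternion a + b*i + c*j + d*k stored as (a, b, c, d).
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Multiply two quaternions with the Hamilton product (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def pack_feature(energy: float, delta: float, delta_delta: float) -> Quaternion:
    """Group one feature and its first and second derivatives into one quaternion.

    Assumed layout (hypothetical): real part 0, three views as i, j, k.
    """
    return (0.0, energy, delta, delta_delta)


if __name__ == "__main__":
    weight = (0.5, 0.1, -0.2, 0.3)          # illustrative quaternion weight
    feature = pack_feature(1.0, 0.2, -0.1)  # illustrative feature values
    print(hamilton_product(weight, feature))
```

    Because a single quaternion weight mixes all four components of the input, the three feature views are processed jointly rather than as independent real values.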

    Deep Quaternion Networks

    The field of deep learning has seen significant advancement in recent years. However, much of the existing work has been focused on real-valued numbers. Recent work has shown that a deep learning system using complex numbers can be deeper for a fixed parameter budget compared to its real-valued counterpart. In this work, we explore the benefits of generalizing one step further into the hyper-complex numbers, quaternions specifically, and provide the architecture components needed to build deep quaternion networks. We develop the theoretical basis by reviewing quaternion convolutions, developing a novel quaternion weight initialization scheme, and developing novel algorithms for quaternion batch-normalization. These pieces are tested in a classification model by end-to-end training on the CIFAR-10 and CIFAR-100 data sets and a segmentation model by end-to-end training on the KITTI Road Segmentation data set. These quaternion networks show improved convergence compared to real-valued and complex-valued networks, especially on the segmentation task, while having fewer parameters. Comment: IJCNN 2018, 8 pages, 1 figure

    Optimization of star research algorithm for esmo star tracker

    This paper explains in detail the design and development of a software star research algorithm, embedded on a star tracker, by the ISAE/SUPAERO team. This research algorithm is inspired by musical techniques. This work will be carried out as part of the ESMO (European Student Moon Orbiter) project by different teams of students and professors from ISAE/SUPAERO (Institut Supérieur de l’Aéronautique et de l’Espace). To date, the system engineering studies have been completed, and the work presented here concerns the algorithmic and embedded software development. The physical architecture of the sensor relies on the APS 750 developed by the CIMI laboratory of ISAE/SUPAERO. First, a star research algorithm based on the image acquired in lost-in-space mode (one of the star tracker operational modes) will be presented; it is inspired by musical recognition techniques that correlate a digital signature (hash) with those stored in databases. The musical recognition principle is based on fingerprinting, i.e. the extraction of points of interest in the studied signal. In the musical context, the signal spectrogram is used to identify these points. Applying this technique in the image processing domain requires a tool equivalent to the spectrogram. Those points of interest create a hash, which is used to search efficiently within the previously sorted database for comparison. The main goals of this research algorithm are to minimise the number of computation steps, in order to deliver information at a higher frequency, and to increase the robustness of the computation against the different possible disturbances.
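
    The fingerprinting principle described above (points of interest turned into hashes that are looked up in a prepared database) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the hash, so the quantized pairwise distance used here, the `bin_size` parameter, and the function names `pair_hashes`, `build_index` and `vote` are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def pair_hashes(points, bin_size=0.5):
    """Yield (hash, (i, j)) for every pair of points.

    Assumed hash (hypothetical): the pairwise distance quantized into bins,
    which is unchanged by translating or rotating the whole point set.
    """
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        yield int(round(math.dist(p, q) / bin_size)), (i, j)


def build_index(catalog, bin_size=0.5):
    """Precompute a hash -> list of catalog pairs lookup table."""
    index = defaultdict(list)
    for h, pair in pair_hashes(catalog, bin_size):
        index[h].append(pair)
    return index


def vote(index, observed, bin_size=0.5):
    """Count, per catalog entry, how many observed hashes match it."""
    votes = defaultdict(int)
    for h, _ in pair_hashes(observed, bin_size):
        for i, j in index.get(h, ()):
            votes[i] += 1
            votes[j] += 1
    return dict(votes)


if __name__ == "__main__":
    catalog = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]  # illustrative positions
    observed = [(1.0, 1.0), (1.0, 6.0)]              # shifted, rotated pair
    print(vote(build_index(catalog), observed))
```

    The lookup cost per observed hash does not depend on scanning the whole catalog, which is the property the abstract relies on to reduce the number of computation steps.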

    Complex Structures on some Stiefel Manifolds

    We discuss conditions for the integrability of an almost complex structure defined on the total space of an induced Hopf S^3-bundle over a Sasakian manifold. As an application, we obtain an uncountable family of inequivalent complex structures on the Stiefel manifolds of orthonormal 2-frames in C^{n+1}, not compatible with its standard hypercomplex structure. Similar families of complex structures are constructed on the Stiefel manifold of oriented orthonormal 4-frames in R^{n+1}, as well as on some special Stiefel manifolds related to the groups G_2 and Spin(7). Comment: LaTeX, 11 pages, to be published in Bull. Soc. Sc. Math. Roumanie, Volume in memory of G. Vrancean

    On some Moment Maps and Induced Hopf Bundles in the Quaternionic Projective Space

    We describe a diagram containing the zero sets of the moment maps associated to the diagonal U(1) and Sp(1) actions on the quaternionic projective space HP^n. These sets are related both to focal sets of submanifolds and to Sasakian-Einstein structures on induced Hopf bundles. As an application, we construct a complex structure on the Stiefel manifolds V_2 (C^{n+1}) and V_4 (R^{n+1}), the one on the former manifold not being compatible with its known hypercomplex structure. Comment: Revised version, a more complete proof of a statement and some references were added. LaTeX, 21 pages, to be published in Int. J. Mat

    Nonuniform Fuchsian codes for noisy channels

    We develop a new transmission scheme for additive white Gaussian noise (AWGN) channels based on Fuchsian groups from rational quaternion algebras. The structure of the proposed Fuchsian codes is nonlinear and nonuniform, hence conventional decoding methods based on linearity and symmetry do not apply. Previously, only brute force decoding methods with complexity that is linear in the code size existed for general nonuniform codes. However, the properly discontinuous character of the action of the Fuchsian groups on the complex upper half-plane translates into decoding complexity that is logarithmic in the code size via a recently introduced point reduction algorithm.
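
    To give a feel for what point reduction on the upper half-plane means, here is a minimal sketch of the classical reduction for the modular group PSL(2, Z), the best-known Fuchsian group. It is a swapped-in, simpler analogue and not the algorithm of the paper, whose groups come from rational quaternion algebras. The function name `reduce_point`, the generator labels, and the `max_steps` guard are assumptions made for illustration.

```python
def reduce_point(z: complex, max_steps: int = 1000):
    """Move z (Im z > 0) into the standard fundamental domain of PSL(2, Z).

    Uses the two classical generators: T^n: z -> z + n and S: z -> -1/z.
    Returns the reduced point and the list of moves applied, which together
    identify the group element that was used.
    """
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    moves = []
    for _ in range(max_steps):
        n = round(z.real)          # nearest integer translation
        if n != 0:
            z = z - n              # apply T^(-n) so that |Re z| <= 1/2
            moves.append(("T", -n))
        if abs(z) < 1.0:
            z = -1.0 / z           # apply S, which increases Im z
            moves.append(("S", 1))
        else:
            return z, moves
    raise RuntimeError("reduction did not converge within max_steps")


if __name__ == "__main__":
    point, word = reduce_point(complex(5.3, 0.2))
    print(point, word)
```

    Each inversion strictly increases the imaginary part, so the loop terminates; the sequence of moves plays the role that the recovered group element plays in a decoder.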

    Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms

    This paper proposes a computationally efficient method to estimate the time-varying relative pose between two visual-inertial sensor rigs mounted on the flexible wings of a fixed-wing unmanned aerial vehicle (UAV). The estimated relative poses are used to generate highly accurate depth maps in real-time and can be employed for obstacle avoidance in low-altitude flights or landing maneuvers. The approach is structured as follows: Initially, a wing model is identified by fitting a probability density function to measured deviations from the nominal relative baseline transformation. At run-time, the prior knowledge about the wing model is fused in an Extended Kalman filter (EKF) together with relative pose measurements obtained from solving a relative perspective N-point problem (PNP), and the linear accelerations and angular velocities measured by the two inertial measurement units (IMU) which are rigidly attached to the cameras. Results obtained from extensive synthetic experiments demonstrate that our proposed framework is able to estimate highly accurate baseline transformations and depth maps. Comment: Accepted for publication in IEEE International Conference on Robotics and Automation (ICRA), 2018, Brisbane