Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Recently, the connectionist temporal classification (CTC) model, coupled with
recurrent (RNN) or convolutional neural networks (CNN), has made it easier to train
speech recognition systems in an end-to-end fashion. However, in real-valued
models, time frame components such as mel-filter-bank energies and the cepstral
coefficients obtained from them, together with their first and second order
derivatives, are processed as individual elements, while a natural alternative
is to process such components as composed entities. We propose to group such
elements in the form of quaternions and to process these quaternions using the
established quaternion algebra. Quaternion numbers and quaternion neural
networks have proven efficient at processing multidimensional inputs as
entities, at encoding internal dependencies, and at solving many tasks with fewer
learning parameters than real-valued models. This paper proposes to integrate
multiple feature views in a quaternion-valued convolutional neural network
(QCNN), to be used for sequence-to-sequence mapping with the CTC model.
Promising results are reported using simple QCNNs in phoneme recognition
experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme
error rate (PER) with fewer learning parameters than a competing model based on
real-valued CNNs. Comment: Accepted at INTERSPEECH 201
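The quaternion algebra mentioned in the abstract above rests on the Hamilton product. The following is a minimal sketch in plain Python, not the authors' implementation. The function names and the choice of a zero real part when packing a frame's energy and its first and second derivatives into one quaternion are assumptions made for illustration.

```python
from typing import Tuple

# A quaternion r + x*i + y*j + z*k stored as (r, x, y, z).
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Multiply two quaternions with the (non-commutative) Hamilton product."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def frame_to_quaternion(energy: float, delta: float, delta2: float) -> Quaternion:
    """Hypothetical packing: one feature and its two derivatives as a pure quaternion.

    The zero real part is an assumption of this sketch, not a detail stated
    in the abstract.
    """
    return (0.0, energy, delta, delta2)


if __name__ == "__main__":
    i = (0.0, 1.0, 0.0, 0.0)
    j = (0.0, 0.0, 1.0, 0.0)
    print(hamilton_product(i, j))  # i * j
    print(hamilton_product(j, i))  # j * i, which differs in sign
```

A quaternion layer would apply this product between a quaternion weight and a quaternion input, so the three feature views of a frame are mixed as one entity rather than as three unrelated scalars.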
Deep Quaternion Networks
The field of deep learning has seen significant advancement in recent years.
However, much of the existing work has been focused on real-valued numbers.
Recent work has shown that a deep learning system using the complex numbers can
be deeper for a fixed parameter budget compared to its real-valued counterpart.
In this work, we explore the benefits of generalizing one step further into the
hyper-complex numbers, quaternions specifically, and provide the architecture
components needed to build deep quaternion networks. We develop the theoretical
basis by reviewing quaternion convolutions, developing a novel quaternion
weight initialization scheme, and developing novel algorithms for quaternion
batch-normalization. These pieces are tested in a classification model by
end-to-end training on the CIFAR-10 and CIFAR-100 data sets and a segmentation
model by end-to-end training on the KITTI Road Segmentation data set. These
quaternion networks show improved convergence compared to real-valued and
complex-valued networks, especially on the segmentation task, while having
fewer parameters. Comment: IJCNN 2018, 8 pages, 1 figur
Optimization of star research algorithm for ESMO star tracker
This paper explains in detail the design and development of a star research algorithm, implemented in software and embedded on a star tracker, by the ISAE/SUPAERO team. This research algorithm is inspired by musical techniques. The work is carried out as part of the ESMO (European Student Moon Orbiter) project by different teams of students and professors from ISAE/SUPAERO (Institut Supérieur de l'Aéronautique et de l'Espace). To date, the system engineering studies have been completed, and the work presented here concerns the algorithmic and embedded software development. The physical architecture of the sensor relies on the APS 750 developed by the CIMI laboratory of ISAE/SUPAERO. First, a star research algorithm based on the image acquired in lost-in-space mode (one of the star tracker operational modes) will be presented; it is inspired by musical recognition techniques that correlate a digital signature (hash) with those stored in databases. The musical recognition principle is based on fingerprinting, i.e. the extraction of points of interest in the studied signal. In the musical context, the signal spectrogram is used to identify these points. Applying this technique in the image processing domain requires a tool equivalent to the spectrogram. Those points of interest create a hash, which is used to search efficiently within a previously sorted database for comparison. The main goals of this research algorithm are to minimise the number of computation steps, in order to deliver information at a higher frequency, and to increase the robustness of the computation against the different possible disturbances.
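The hash-and-lookup idea described in the abstract above can be sketched as follows. This is an illustrative assumption, not the ISAE/SUPAERO algorithm: the abstract does not say how the hash is built, so this sketch hashes each pair of points of interest by its quantized distance, and the names `pair_hashes`, `build_index`, `lookup` and the `bin_size` parameter are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import hypot


def pair_hashes(points, bin_size=0.5):
    """Yield (hash, (i, j)) for every pair of 2-D points.

    The hash is the pair's Euclidean distance quantized into bins of
    width bin_size (an assumed, illustrative fingerprint).
    """
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        distance = hypot(p[0] - q[0], p[1] - q[1])
        yield int(distance // bin_size), (i, j)


def build_index(catalog, bin_size=0.5):
    """Precompute a hash -> list of catalog pairs table (the 'database')."""
    index = defaultdict(list)
    for h, pair in pair_hashes(catalog, bin_size):
        index[h].append(pair)
    return index


def lookup(index, observed, bin_size=0.5):
    """Count, for each catalog pair, how many observed pairs share its hash."""
    votes = defaultdict(int)
    for h, _ in pair_hashes(observed, bin_size):
        for pair in index.get(h, []):
            votes[pair] += 1
    return dict(votes)


if __name__ == "__main__":
    catalog = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
    observed = [(1.0, 1.0), (4.0, 5.0)]  # catalog points 0 and 1, shifted
    print(lookup(build_index(catalog), observed))
```

Because the table is built once, each observed pair costs one dictionary lookup instead of a scan over the whole catalog, which is the kind of step reduction the abstract names as a goal. A real tracker would also need a hash that tolerates rotation, noise and false detections.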
Complex Structures on some Stiefel Manifolds
We discuss conditions for the integrability of an almost complex structure
defined on the total space of an induced Hopf S^3-bundle over a Sasakian
manifold. As an application, we obtain an uncountable family of inequivalent
complex structures on the Stiefel manifolds of orthonormal 2-frames in C^{n+1},
not compatible with their standard hypercomplex structure. Similar families of
complex structures are constructed on the Stiefel manifold of oriented
orthonormal 4-frames in R^{n+1}, as well as on some special Stiefel manifolds
related to the groups G_2 and Spin(7). Comment: LaTeX, 11 pages, to be published in Bull. Soc. Sc. Math. Roumanie,
Volume in memory of G. Vrancean
On some Moment Maps and Induced Hopf Bundles in the Quaternionic Projective Space
We describe a diagram containing the zero sets of the moment maps associated
to the diagonal U(1) and Sp(1) actions on the quaternionic projective space
HP^n. These sets are related both to focal sets of submanifolds and to
Sasakian-Einstein structures on induced Hopf bundles. As an application, we
construct a complex structure on the Stiefel manifolds V_2 (C^{n+1}) and V_4
(R^{n+1}), the one on the former manifold not being compatible with its known
hypercomplex structure. Comment: Revised version, a more complete proof of a statement and some
references were added. LaTeX, 21 pages, to be published in Int. J. Mat
Nonuniform Fuchsian codes for noisy channels
We develop a new transmission scheme for additive white Gaussian noise (AWGN)
channels based on Fuchsian groups from rational quaternion algebras. The
structure of the proposed Fuchsian codes is nonlinear and nonuniform, hence
conventional decoding methods based on linearity and symmetry do not apply.
Previously, only brute-force decoding methods with complexity that is linear in
the code size existed for general nonuniform codes. However, the properly
discontinuous character of the action of the Fuchsian groups on the complex
upper half-plane translates into decoding complexity that is logarithmic in the
code size via a recently introduced point reduction algorithm.
Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms
This paper proposes a computationally efficient method to estimate the
time-varying relative pose between two visual-inertial sensor rigs mounted on
the flexible wings of a fixed-wing unmanned aerial vehicle (UAV). The estimated
relative poses are used to generate highly accurate depth maps in real-time and
can be employed for obstacle avoidance in low-altitude flights or landing
maneuvers. The approach is structured as follows: Initially, a wing model is
identified by fitting a probability density function to measured deviations
from the nominal relative baseline transformation. At run-time, the prior
knowledge about the wing model is fused in an extended Kalman filter (EKF)
together with relative pose measurements obtained from solving a relative
perspective N-point problem (PNP), and the linear accelerations and angular
velocities measured by the two inertial measurement units (IMU) which are
rigidly attached to the cameras. Results obtained from extensive synthetic
experiments demonstrate that our proposed framework is able to estimate highly
accurate baseline transformations and depth maps. Comment: Accepted for publication in IEEE International Conference on Robotics
and Automation (ICRA), 2018, Brisban