3,653 research outputs found

    Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

    Get PDF
    Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN), made it easier to train speech recognition systems in an end-to-end fashion. However in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency to process multidimensional inputs as entities, to encode internal dependencies, and to solve many tasks with less learning parameters than real-valued models. This paper proposes to integrate multiple feature views in quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with less learning parameters than a competing model based on real-valued CNNs.Comment: Accepted at INTERSPEECH 201

    A Broad Class of Discrete-Time Hypercomplex-Valued Hopfield Neural Networks

    Full text link
    In this paper, we address the stability of a broad class of discrete-time hypercomplex-valued Hopfield-type neural networks. To ensure the neural networks belonging to this class always settle down at a stationary state, we introduce novel hypercomplex number systems referred to as real-part associative hypercomplex number systems. Real-part associative hypercomplex number systems generalize the well-known Cayley-Dickson algebras and real Clifford algebras and include the systems of real numbers, complex numbers, dual numbers, hyperbolic numbers, quaternions, tessarines, and octonions as particular instances. Apart from the novel hypercomplex number systems, we introduce a family of hypercomplex-valued activation functions called B\mathcal{B}-projection functions. Broadly speaking, a B\mathcal{B}-projection function projects the activation potential onto the set of all possible states of a hypercomplex-valued neuron. Using the theory presented in this paper, we confirm the stability analysis of several discrete-time hypercomplex-valued Hopfield-type neural networks from the literature. Moreover, we introduce and provide the stability analysis of a general class of Hopfield-type neural networks on Cayley-Dickson algebras

    Surface Networks

    Full text link
    We study data-driven representations for three-dimensional triangle meshes, which are one of the prevalent objects used to represent 3D geometry. Recent works have developed models that exploit the intrinsic geometry of manifolds and graphs, namely the Graph Neural Networks (GNNs) and its spectral variants, which learn from the local metric tensor via the Laplacian operator. Despite offering excellent sample complexity and built-in invariances, intrinsic geometry alone is invariant to isometric deformations, making it unsuitable for many applications. To overcome this limitation, we propose several upgrades to GNNs to leverage extrinsic differential geometry properties of three-dimensional surfaces, increasing its modeling power. In particular, we propose to exploit the Dirac operator, whose spectrum detects principal curvature directions --- this is in stark contrast with the classical Laplace operator, which directly measures mean curvature. We coin the resulting models \emph{Surface Networks (SN)}. We prove that these models define shape representations that are stable to deformation and to discretization, and we demonstrate the efficiency and versatility of SNs on two challenging tasks: temporal prediction of mesh deformations under non-linear dynamics and generative models using a variational autoencoder framework with encoders/decoders given by SNs
    • …
    corecore