263 research outputs found

A Comparison of Quaternion Neural Network Backpropagation Algorithms

Authors: Bill Jeremiah; Champagne Lance; Cox Bruce A.
Publication venue: AFIT Scholar
Publication date: 01/06/2023

This research paper focuses on quaternion neural networks (QNNs), a type of neural network wherein the weights, biases, and input values are all represented as quaternion numbers. Previous studies have shown that QNNs outperform real-valued neural networks in basic tasks and have potential in high-dimensional problem spaces. However, research on QNNs has been fragmented, with contributions from different mathematical and engineering domains leading to unintentional overlap in QNN literature. This work aims to unify existing research by evaluating four distinct QNN backpropagation algorithms, including the novel GHR-calculus backpropagation algorithm, and providing concise, scalable implementations of each algorithm using a modern compiled programming language. Additionally, the authors apply a robust Design of Experiments (DoE) methodology to compare the accuracy and runtime of each algorithm. The experiments demonstrate that the Clifford Multilayer Perceptron (CMLP) learning algorithm results in statistically significant improvements in network test set accuracy while maintaining comparable runtime performance to the other three algorithms in four distinct regression tasks. By unifying existing research and comparing different QNN training algorithms, this work develops a state-of-the-art baseline and provides important insights into the potential of QNNs for solving high-dimensional problems.

AFIT Scholar (Air Force Institute of Technology)
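
The abstract above describes QNNs as networks whose weights, biases, and inputs are all quaternions. As a rough illustration of what that means for a single neuron, the sketch below computes a pre-activation with the Hamilton product. The names `hamilton_product` and `quaternion_neuron` are hypothetical, quaternions are assumed to be stored as (w, x, y, z) arrays, and no activation function is applied. This is not one of the four backpropagation algorithms the paper compares.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product p * q of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,  # real part
        pw * qx + px * qw + py * qz - pz * qy,  # i component
        pw * qy - px * qz + py * qw + pz * qx,  # j component
        pw * qz + px * qy - py * qx + pz * qw,  # k component
    ], dtype=float)

def quaternion_neuron(weights, inputs, bias):
    """Pre-activation of one neuron: sum_k w_k * x_k + b (assumed layout)."""
    total = np.array(bias, dtype=float)
    for w, x in zip(weights, inputs):
        # Order matters: the Hamilton product is not commutative.
        total = total + hamilton_product(w, x)
    return total
```

Because the product is noncommutative, swapping the weight and the input generally changes the result, which is one reason quaternion gradient rules differ from their real-valued counterparts.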

A Unifying Approach to Quaternion Adaptive Filtering: Addressing the Gradient and Convergence

Authors: Jahanchahi Cyrus; Mandic Danilo P.
Publication date: 17/10/2013

A novel framework for a unifying treatment of quaternion-valued adaptive filtering algorithms is introduced. This is achieved based on a rigorous account of quaternion differentiability, the proposed I-gradient, and the use of augmented quaternion statistics to account for real-world data with noncircular probability distributions. We first provide an elegant solution for the calculation of the gradient of real functions of quaternion variables (typical cost functions), an issue that has so far prevented systematic development of quaternion adaptive filters. This makes it possible to unify the class of existing and proposed quaternion least mean square (QLMS) algorithms, and to illuminate their structural similarity. Next, in order to cater for both circular and noncircular data, the class of widely linear QLMS (WL-QLMS) algorithms is introduced and the subsequent convergence analysis unifies the treatment of strictly linear and widely linear filters, for both proper and improper sources. It is also shown that the proposed class of HR gradients allows us to resolve the uncertainty owing to the noncommutativity of quaternion products, while the involution gradient (I-gradient) provides generic extensions of the corresponding real- and complex-valued adaptive algorithms, at a reduced computational cost. Simulations in both the strictly linear and widely linear setting support the approach.

arXiv.org e-Print Archive

CiteSeerX

Meta-Heuristic Optimization Methods for Quaternion-Valued Neural Networks

Author: Bill Jeremiah P.
Publication venue: AFIT Scholar
Publication date: 01/03/2021

In recent years, real-valued neural networks have demonstrated promising, and often striking, results across a broad range of domains. This has driven a surge of applications utilizing high-dimensional datasets. While many techniques exist to alleviate issues of high dimensionality, they all incur a cost in terms of network size or computational runtime. This work examines the use of quaternions, a form of hypercomplex numbers, in neural networks. The constructed networks demonstrate the ability of quaternions to encode high-dimensional data in an efficient neural network structure, showing that hypercomplex neural networks reduce the number of total trainable parameters compared to their real-valued equivalents. Finally, this work introduces a novel training algorithm using a meta-heuristic approach that bypasses the need for analytic quaternion loss or activation functions. This algorithm allows for a broader range of activation functions over current quaternion networks and presents a proof-of-concept for future work.

AFIT Scholar (Air Force Institute of Technology)
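
The abstract above states that hypercomplex networks need fewer trainable parameters than real-valued equivalents. One way to see where the saving can come from: multiplying by a quaternion weight acts on four real components like a 4x4 real matrix, yet that matrix has only four free parameters. The sketch below is an illustration under stated assumptions, not the thesis's own construction; `left_mult_matrix` and the two counting helpers are hypothetical names, biases are ignored, and feature counts are assumed divisible by 4.

```python
import numpy as np

def left_mult_matrix(q):
    """4x4 real matrix M such that M @ x equals the Hamilton product q * x."""
    w, x, y, z = q
    return np.array([
        [w, -x, -y, -z],
        [x,  w, -z,  y],
        [y,  z,  w, -x],
        [z, -y,  x,  w],
    ], dtype=float)

def real_dense_params(n_in, n_out):
    """Weight count of a real dense layer (biases ignored)."""
    return n_in * n_out

def quaternion_dense_params(n_in, n_out):
    """Real-parameter count when the same features are grouped into
    quaternions: (n_in/4) * (n_out/4) quaternion weights, 4 reals each."""
    assert n_in % 4 == 0 and n_out % 4 == 0
    return (n_in // 4) * (n_out // 4) * 4
```

Under these assumptions a layer mapping 64 real features to 32 has 2048 real weights, while the quaternion version has 512, a fourfold reduction.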

Meta-Heuristic Optimization Methods for Quaternion-Valued Neural Networks

Authors: Bihl Trevor J.; Bill Jeremiah; Champagne Lance E.; Cox Bruce
Publication venue: AFIT Scholar
Publication date: 23/04/2021

AFIT Scholar (Air Force Institute of Technology)

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

Authors: Bengio Yoshua; De Mori Renato; Linarès Georges; Morchid Mohamed; Parcollet Titouan; Trabelsi Chiheb; Zhang Ying
Publication date: 20/06/2018

Recently, the connectionist temporal classification (CTC) model, coupled with recurrent (RNN) or convolutional neural networks (CNN), made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency in processing multidimensional inputs as entities, encoding internal dependencies, and solving many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs. Comment: Accepted at INTERSPEECH 201

arXiv.org e-Print Archive

Crossref

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

Authors: Bengio Yoshua; Gidel Gauthier; Goyette Kyle; Kerg Giancarlo; Lajoie Guillaume; Touzel Maximilian Puelma; Vorontsov Eugene
Publication date: 01/01/2019

A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This ensures eigenvalues with unit norm and thus stable dynamics and training. However, this comes at the cost of reduced expressivity due to the limited variety of orthogonal transformations. We propose a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts. This allows matrices with unit-norm eigenspectra to be parametrized without orthogonality constraints on eigenbases. The resulting architecture ensures access to a larger space of spectrally constrained matrices, of which orthogonal matrices are a subset. This crucial difference retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences.

arXiv.org e-Print Archive

PolyPublie
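
The nnRNN abstract above describes splitting a Schur form into normal and non-normal parts so that the eigenvalues keep unit norm without forcing the matrix to be orthogonal. A minimal sketch of that idea follows, assuming a real Schur form with 2x2 rotation blocks on the diagonal. The function names, the masking scheme, and the even dimension are assumptions made for illustration and are not taken from the authors' implementation.

```python
import numpy as np

def rotation_blocks(thetas):
    """Block-diagonal matrix of 2x2 rotations (the normal part)."""
    n = 2 * len(thetas)
    D = np.zeros((n, n))
    for k, t in enumerate(thetas):
        c, s = np.cos(t), np.sin(t)
        D[2 * k:2 * k + 2, 2 * k:2 * k + 2] = [[c, -s], [s, c]]
    return D

def schur_parametrized(P, thetas, T):
    """Return P (D + T_upper) P^T, where P is orthogonal, D holds the
    rotation blocks, and T_upper keeps only the entries of T strictly
    above the 2x2 block diagonal (the non-normal part)."""
    D = rotation_blocks(thetas)
    n = D.shape[0]
    mask = np.triu(np.ones((n, n)), k=1)
    for k in range(len(thetas)):
        mask[2 * k:2 * k + 2, 2 * k:2 * k + 2] = 0.0  # leave blocks untouched
    return P @ (D + mask * T) @ P.T
```

Since the inner matrix is block upper triangular, its eigenvalues are those of the rotation blocks and so lie on the unit circle for any T. Setting T to zero recovers an orthogonal matrix, consistent with the abstract's remark that orthogonal matrices are a subset of this family.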

Neural Computing in Quaternion Algebra

Author: 峯本俊文
Publication venue: University of Hyogo (兵庫県立大学)
Publication date: 28/06/2017

University of Hyogo (兵庫県立大学) 201

University of Hyogo Academic Repository / 兵庫県立大学学術情報リポジトリ