298 research outputs found
Extending the Universal Approximation Theorem for a Broad Class of Hypercomplex-Valued Neural Networks
The universal approximation theorem asserts that a single hidden layer neural
network can approximate continuous functions on compact sets to any desired
precision. As an existential result, the universal approximation theorem supports
the use of neural networks for various applications, including regression and
classification tasks. The universal approximation theorem is not limited to
real-valued neural networks but also holds for complex, quaternion, tessarines,
and Clifford-valued neural networks. This paper extends the universal
approximation theorem for a broad class of hypercomplex-valued neural networks.
Precisely, we first introduce the concept of non-degenerate hypercomplex
algebra. Complex numbers, quaternions, and tessarines are examples of
non-degenerate hypercomplex algebras. Then, we state the universal
approximation theorem for hypercomplex-valued neural networks defined on a
non-degenerate hypercomplex algebra.
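As a concrete illustration of one of the algebras named above, the sketch below implements tessarine multiplication and a single tessarine-valued neuron with a split (component-wise) activation. The storage convention (a tessarine as a pair of complex numbers t = z1 + z2·j with j² = +1 and j commuting with i) and all names are our own illustrative choices, not code from the paper.

```python
import numpy as np

def tessarine_mul(t, u):
    """Multiply two tessarines, each stored as a pair of complex numbers
    t = z1 + z2*j, where j**2 = +1 and j commutes with i."""
    z1, z2 = t
    w1, w2 = u
    return (z1 * w1 + z2 * w2, z1 * w2 + z2 * w1)

def tessarine_neuron(ws, xs, b, act=np.tanh):
    """Single tessarine-valued neuron: sum of weight-input products plus a
    bias, with a split activation applied to real and imaginary parts."""
    s1, s2 = b
    for w, x in zip(ws, xs):
        p1, p2 = tessarine_mul(w, x)
        s1, s2 = s1 + p1, s2 + p2
    def split(z):
        # split activation: act applied separately to real and imaginary parts
        return act(z.real) + 1j * act(z.imag)
    return (split(s1), split(s2))
```

Because the formula treats a tessarine as a complex pair, commutativity of the product (unlike for quaternions) is immediate from the symmetry of the two terms.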
Embed Me If You Can: A Geometric Perceptron
Solving geometric tasks involving point clouds by using machine learning is a
challenging problem. Standard feed-forward neural networks combine linear or,
if the bias parameter is included, affine layers and activation functions.
Their geometric modeling is limited, which motivated the prior work introducing
the multilayer hypersphere perceptron (MLHP). Its constituent part, i.e., the
hypersphere neuron, is obtained by applying a conformal embedding of Euclidean
space. By virtue of Clifford algebra, it can be implemented as the Cartesian
dot product of inputs and weights. If the embedding is applied in a manner
consistent with the dimensionality of the input space geometry, the decision
surfaces of the model units become combinations of hyperspheres and make the
decision-making process geometrically interpretable for humans. Our extension
of the MLHP model, the multilayer geometric perceptron (MLGP), and its
respective layer units, i.e., geometric neurons, are consistent with the 3D
geometry and provide a geometric handle of the learned coefficients. In
particular, the geometric neuron activations are isometric in 3D. When
classifying the 3D Tetris shapes, we quantitatively show that our model
requires no activation function in the hidden layers other than the embedding
to outperform the vanilla multilayer perceptron. In the presence of noise in
the data, our model is also superior to the MLHP.
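A minimal sketch of the conformal embedding behind the hypersphere neuron, under one common sign convention (not necessarily the exact one used in the MLHP/MLGP work): a point and a hypersphere each become (n+2)-vectors whose dot product is, up to a factor, the signed distance test for the sphere.

```python
import numpy as np

def conformal_embed(x):
    """Embed a Euclidean point x in R^n as an (n+2)-vector
    (x, -1, -|x|^2 / 2); one common conformal-model sign convention."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, [-1.0, -0.5 * (x @ x)]])

def sphere_vector(c, r):
    """Represent the hypersphere with center c and radius r as the
    (n+2)-vector (c, (|c|^2 - r^2)/2, 1), i.e. the neuron's weights."""
    c = np.asarray(c, dtype=float)
    return np.concatenate([c, [0.5 * (c @ c - r ** 2), 1.0]])

def hypersphere_activation(x, c, r):
    """Dot product of embedded point and sphere vector. It equals
    -(|x - c|^2 - r^2)/2: zero on the sphere, positive inside,
    negative outside, so the decision surface is the hypersphere."""
    return conformal_embed(x) @ sphere_vector(c, r)
```

With this convention the nonlinearity of the decision surface comes entirely from the embedding, matching the abstract's observation that no further activation function is needed in the hidden layers.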
A Theory of Neural Computation with Clifford Algebras
The present thesis introduces Clifford algebra as a framework for neural computation. Neural computation with Clifford algebras is model-based. This principle is established by constructing Clifford algebras from quadratic spaces. Then the subspace grading inherent to any Clifford algebra is introduced. These features of Clifford algebras are then taken as motivation for introducing the Basic Clifford Neuron (BCN). As a second type of Clifford neuron, the Spinor Clifford Neuron is presented. A systematic basis for Clifford neural computation is provided by the important notions of isomorphic Clifford neurons and isomorphic representations. After the neuron level is established, the discussion continues with (Spinor) Clifford Multilayer Perceptrons. First, (Spinor) Clifford Multilayer Perceptrons with real-valued activation functions ((S)CMLPs) are studied. A generic backpropagation algorithm for CMLPs is derived. Also, universal approximation theorems for (S)CMLPs are presented. Finally, CMLPs with Clifford-valued activation functions are studied.
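To make the Basic Clifford Neuron concrete, here is an illustrative sketch in the small algebra Cl(2,0) with basis (1, e1, e2, e12), where e1² = e2² = 1 and e12² = -1. The coefficient layout and function names are our own, not the thesis's notation.

```python
def gp(a, b):
    """Geometric product in Cl(2,0); multivectors are 4-tuples of
    coefficients over the basis (1, e1, e2, e12)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 + a1*b1 + a2*b2 - a3*b3,   # scalar part
            a0*b1 + a1*b0 - a2*b3 + a3*b2,   # e1 part
            a0*b2 + a2*b0 + a1*b3 - a3*b1,   # e2 part
            a0*b3 + a3*b0 + a1*b2 - a2*b1)   # e12 (bivector) part

def basic_clifford_neuron(w, x, theta):
    """Basic Clifford Neuron sketch: weight multivector acting on the
    input by the geometric product, plus a multivector bias."""
    y = gp(w, x)
    return tuple(yi + ti for yi, ti in zip(y, theta))
```

The multiplication table encodes the subspace grading the abstract refers to: products of two vectors split into a scalar (grade 0) and a bivector (grade 2) part.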
A Comparison of Quaternion Neural Network Backpropagation Algorithms
This research paper focuses on quaternion neural networks (QNNs), a type of neural network in which the weights, biases, and input values are all represented as quaternions. Previous studies have shown that QNNs outperform real-valued neural networks on basic tasks and show promise in high-dimensional problem spaces. However, research on QNNs has been fragmented, with contributions from different mathematical and engineering domains leading to unintentional overlap in the QNN literature. This work aims to unify existing research by evaluating four distinct QNN backpropagation algorithms, including the novel GHR-calculus backpropagation algorithm, and by providing concise, scalable implementations of each algorithm in a modern compiled programming language. Additionally, the authors apply a robust Design of Experiments (DoE) methodology to compare the accuracy and runtime of each algorithm. The experiments demonstrate that the Clifford Multilayer Perceptron (CMLP) learning algorithm yields statistically significant improvements in network test-set accuracy while maintaining runtime comparable to the other three algorithms across four distinct regression tasks. By unifying existing research and comparing different QNN training algorithms, this work establishes a state-of-the-art baseline and provides important insights into the potential of QNNs for solving high-dimensional problems.
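The building block shared by all such networks is the Hamilton product. Below is a minimal sketch of it and of a single quaternion neuron with a split tanh activation, one standard choice in the QNN literature; the layout and names here are illustrative, not taken from the paper or its implementations.

```python
import numpy as np

def hamilton(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def qnn_neuron(ws, xs, b):
    """Quaternion neuron: sum of Hamilton products of weights and
    inputs plus a quaternion bias, then component-wise (split) tanh."""
    s = np.array(b, dtype=float)
    for w, x in zip(ws, xs):
        s = s + hamilton(w, x)
    return np.tanh(s)
```

Non-commutativity of the Hamilton product is exactly what makes the choice of backpropagation calculus (e.g. GHR calculus) non-trivial for QNNs.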
Geometric Algebra Attention Networks for Small Point Clouds
Much of the success of deep learning is drawn from building architectures
that properly respect underlying symmetry and structure in the data on which
they operate - a set of considerations that have been united under the banner
of geometric deep learning. Often problems in the physical sciences deal with
relatively small sets of points in two- or three-dimensional space wherein
translation, rotation, and permutation equivariance are important or even vital
for models to be useful in practice. In this work, we present rotation- and
permutation-equivariant architectures for deep learning on these small point
clouds, composed of a set of products of terms from the geometric algebra and
reductions over those products using an attention mechanism. The geometric
algebra provides valuable mathematical structure by which to combine vector,
scalar, and other types of geometric inputs in a systematic way to account for
rotation invariance or covariance, while attention yields a powerful way to
impose permutation equivariance. We demonstrate the usefulness of these
architectures by training models to solve sample problems relevant to physics,
chemistry, and biology.
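The kind of rotation invariance such products provide can be illustrated without a full attention layer: for 3D vectors, pairwise inner products are the scalar parts of geometric products and triple products are pseudoscalar parts, and both are unchanged by proper rotations. The toy sketch below is our own illustration, not the paper's architecture.

```python
import numpy as np

def rotation_invariants(points):
    """For a small 3D point cloud, two families of rotation-invariant
    scalars that geometric-algebra product layers can expose: pairwise
    inner products and scalar triple products (determinants)."""
    P = np.asarray(points, dtype=float)
    dots = P @ P.T  # <p_i p_j>_0: invariant under rotations
    n = len(P)
    triples = np.array([np.linalg.det(P[[i, j, k]])
                        for i in range(n)
                        for j in range(n)
                        for k in range(n)])  # pseudoscalar parts
    return dots, triples
```

An attention mechanism then reduces over such per-tuple quantities, which is what makes the resulting architecture permutation-equivariant as well.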
Machine Learning for Practical Quantum Error Mitigation
Quantum computers are actively competing to surpass classical supercomputers,
but quantum errors remain their chief obstacle. The key to overcoming these on
near-term devices has emerged through the field of quantum error mitigation,
enabling improved accuracy at the cost of additional runtime. In practice,
however, the success of mitigation is limited by a generally exponential
overhead. Can classical machine learning address this challenge on today's
quantum computers? Here, through both simulations and experiments on
state-of-the-art quantum computers using up to 100 qubits, we demonstrate that
machine learning for quantum error mitigation (ML-QEM) can drastically reduce
overheads, maintain or even surpass the accuracy of conventional methods, and
yield near noise-free results for quantum algorithms. We benchmark a variety of
machine learning models -- linear regression, random forests, multi-layer
perceptrons, and graph neural networks -- on diverse classes of quantum
circuits, over increasingly complex device-noise profiles, under interpolation
and extrapolation, and for small and large quantum circuits. These tests employ
the popular digital zero-noise extrapolation method as an added reference. We
further show how to scale ML-QEM to classically intractable quantum circuits by
mimicking the results of traditional mitigation methods, while significantly
reducing overhead. Our results highlight the potential of classical machine
learning for practical quantum computation.
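In its simplest instance, this setup fits a regressor mapping noisy expectation values to ideal ones and then applies it to new circuits. The sketch below uses synthetic toy data and plain least squares; it illustrates the ML-QEM framing only and is not the paper's models, noise profiles, or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for training data: ideal expectation values and their
# noisy counterparts under a crude depolarizing-like damping model.
ideal = rng.uniform(-1.0, 1.0, size=200)
noisy = 0.7 * ideal + 0.05 * rng.normal(size=200)

# Fit a linear model noisy -> ideal (least squares with intercept),
# the simplest of the regressor classes benchmarked for ML-QEM.
A = np.stack([noisy, np.ones_like(noisy)], axis=1)
coef, *_ = np.linalg.lstsq(A, ideal, rcond=None)

def mitigate(x):
    """Predict the noise-free expectation value from a noisy one."""
    return coef[0] * x + coef[1]
```

At inference time this costs one model evaluation per circuit, which is the source of the overhead reduction relative to sampling-heavy methods such as zero-noise extrapolation.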
Selected aspects of complex, hypercomplex and fuzzy neural networks
This short report reviews the current state of the research and methodology
on theoretical and practical aspects of Artificial Neural Networks (ANN). It
was prepared to gather state-of-the-art knowledge needed to construct complex,
hypercomplex and fuzzy neural networks.
The report reflects the individual interests of the authors and by no
means should be treated as a comprehensive review of the ANN discipline.
Considering the fast development of this field, it is currently impossible to
provide a detailed review within a reasonable number of pages.
The report is an outcome of the Project 'The Strategic Research Partnership
for the mathematical aspects of complex, hypercomplex and fuzzy neural
networks' meeting at the University of Warmia and Mazury in Olsztyn, Poland,
organized in September 2022.