298 research outputs found

    Extending the Universal Approximation Theorem for a Broad Class of Hypercomplex-Valued Neural Networks

    Full text link
    The universal approximation theorem asserts that a single hidden layer neural network approximates continuous functions with any desired precision on compact sets. As an existential result, the universal approximation theorem supports the use of neural networks for various applications, including regression and classification tasks. The universal approximation theorem is not limited to real-valued neural networks but also holds for complex, quaternion, tessarines, and Clifford-valued neural networks. This paper extends the universal approximation theorem for a broad class of hypercomplex-valued neural networks. Precisely, we first introduce the concept of non-degenerate hypercomplex algebra. Complex numbers, quaternions, and tessarines are examples of non-degenerate hypercomplex algebras. Then, we state the universal approximation theorem for hypercomplex-valued neural networks defined on a non-degenerate algebra

    Embed Me If You Can: A Geometric Perceptron

    Full text link
    Solving geometric tasks involving point clouds by using machine learning is a challenging problem. Standard feed-forward neural networks combine linear or, if the bias parameter is included, affine layers and activation functions. Their geometric modeling is limited, which motivated the prior work introducing the multilayer hypersphere perceptron (MLHP). Its constituent part, i.e., hypersphere neuron, is obtained by applying a conformal embedding of Euclidean space. By virtue of Clifford algebra, it can be implemented as the Cartesian dot product of inputs and weights. If the embedding is applied in a manner consistent with the dimensionality of the input space geometry, the decision surfaces of the model units become combinations of hyperspheres and make the decision-making process geometrically interpretable for humans. Our extension of the MLHP model, the multilayer geometric perceptron (MLGP), and its respective layer units, i.e., geometric neurons, are consistent with the 3D geometry and provide a geometric handle of the learned coefficients. In particular, the geometric neuron activations are isometric in 3D. When classifying the 3D Tetris shapes, we quantitatively show that our model requires no activation function in the hidden layers other than the embedding to outperform the vanilla multilayer perceptron. In the presence of noise in the data, our model is also superior to the MLHP

    A Theory of Neural Computation with Clifford Algebras

    Get PDF
    The present thesis introduces Clifford Algebra as a framework for neural computation. Neural computation with Clifford algebras is model-based. This principle is established by constructing Clifford algebras from quadratic spaces. Then the subspace grading inherent to any Clifford algebra is introduced. The above features of Clifford algebras are then taken as motivation for introducing the Basic Clifford Neuron (BCN). As a second type of Clifford neuron the Spinor Clifford Neuron is presented. A systematic basis for Clifford neural computation is provided by the important notions of isomorphic Clifford neurons and isomorphic representations. After the neuron level is established, the discussion continues with (Spinor) Clifford Multilayer Perceptrons. First, (Spinor) Clifford Multilayer Perceptrons with real-valued activation functions ((S)CMLPs) are studied. A generic Backpropagation algorithm for CMLPs is derived. Also, universal approximation theorems for (S)CMLPs are presented. Finally, CMLPs with Clifford-valued activation functions are studied

    A Comparison of Quaternion Neural Network Backpropagation Algorithms

    Get PDF
    This research paper focuses on quaternion neural networks (QNNs) - a type of neural network wherein the weights, biases, and input values are all represented as quaternion numbers. Previous studies have shown that QNNs outperform real-valued neural networks in basic tasks and have potential in high-dimensional problem spaces. However, research on QNNs has been fragmented, with contributions from different mathematical and engineering domains leading to unintentional overlap in QNN literature. This work aims to unify existing research by evaluating four distinct QNN backpropagation algorithms, including the novel GHR-calculus backpropagation algorithm, and providing concise, scalable implementations of each algorithm using a modern compiled programming language. Additionally, the authors apply a robust Design of Experiments (DoE) methodology to compare the accuracy and runtime of each algorithm. The experiments demonstrate that the Clifford Multilayer Perceptron (CMLP) learning algorithm results in statistically significant improvements in network test set accuracy while maintaining comparable runtime performance to the other three algorithms in four distinct regression tasks. By unifying existing research and comparing different QNN training algorithms, this work develops a state-of-the-art baseline and provides important insights into the potential of QNNs for solving high-dimensional problems

    Geometric Algebra Attention Networks for Small Point Clouds

    Full text link
    Much of the success of deep learning is drawn from building architectures that properly respect underlying symmetry and structure in the data on which they operate - a set of considerations that have been united under the banner of geometric deep learning. Often problems in the physical sciences deal with relatively small sets of points in two- or three-dimensional space wherein translation, rotation, and permutation equivariance are important or even vital for models to be useful in practice. In this work, we present rotation- and permutation-equivariant architectures for deep learning on these small point clouds, composed of a set of products of terms from the geometric algebra and reductions over those products using an attention mechanism. The geometric algebra provides valuable mathematical structure by which to combine vector, scalar, and other types of geometric inputs in a systematic way to account for rotation invariance or covariance, while attention yields a powerful way to impose permutation equivariance. We demonstrate the usefulness of these architectures by training models to solve sample problems relevant to physics, chemistry, and biology

    Machine Learning for Practical Quantum Error Mitigation

    Full text link
    Quantum computers are actively competing to surpass classical supercomputers, but quantum errors remain their chief obstacle. The key to overcoming these on near-term devices has emerged through the field of quantum error mitigation, enabling improved accuracy at the cost of additional runtime. In practice, however, the success of mitigation is limited by a generally exponential overhead. Can classical machine learning address this challenge on today's quantum computers? Here, through both simulations and experiments on state-of-the-art quantum computers using up to 100 qubits, we demonstrate that machine learning for quantum error mitigation (ML-QEM) can drastically reduce overheads, maintain or even surpass the accuracy of conventional methods, and yield near noise-free results for quantum algorithms. We benchmark a variety of machine learning models -- linear regression, random forests, multi-layer perceptrons, and graph neural networks -- on diverse classes of quantum circuits, over increasingly complex device-noise profiles, under interpolation and extrapolation, and for small and large quantum circuits. These tests employ the popular digital zero-noise extrapolation method as an added reference. We further show how to scale ML-QEM to classically intractable quantum circuits by mimicking the results of traditional mitigation results, while significantly reducing overhead. Our results highlight the potential of classical machine learning for practical quantum computation.Comment: 11 pages, 7 figures (main text) + 4 pages, 2 figures (appendices

    Selected aspects of complex, hypercomplex and fuzzy neural networks

    Full text link
    This short report reviews the current state of the research and methodology on theoretical and practical aspects of Artificial Neural Networks (ANN). It was prepared to gather state-of-the-art knowledge needed to construct complex, hypercomplex and fuzzy neural networks. The report reflects the individual interests of the authors and, by now means, cannot be treated as a comprehensive review of the ANN discipline. Considering the fast development of this field, it is currently impossible to do a detailed review of a considerable number of pages. The report is an outcome of the Project 'The Strategic Research Partnership for the mathematical aspects of complex, hypercomplex and fuzzy neural networks' meeting at the University of Warmia and Mazury in Olsztyn, Poland, organized in September 2022.Comment: 46 page
    • …
    corecore