Meta-Heuristic Optimization Methods for Quaternion-Valued Neural Networks
In recent years, real-valued neural networks have demonstrated promising, and often striking, results across a broad range of domains, driving a surge of applications that use high-dimensional datasets. While many techniques exist to alleviate issues of high dimensionality, they all incur a cost in network size or computational runtime. This work examines the use of quaternions, a form of hypercomplex number, in neural networks. The constructed networks demonstrate the ability of quaternions to encode high-dimensional data in an efficient neural network structure, showing that hypercomplex neural networks reduce the total number of trainable parameters compared to their real-valued equivalents. Finally, this work introduces a novel training algorithm that uses a meta-heuristic approach to bypass the need for analytic quaternion loss or activation functions. This algorithm allows a broader range of activation functions than current quaternion networks and serves as a proof of concept for future work.
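The parameter saving claimed above can be sketched with a simple count: a quaternion connection carries one quaternion (four real numbers), while the equivalent real-valued layer on the flattened 4-D inputs needs a full dense weight matrix. The layer sizes and helper names below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of two quaternions given as (r, i, j, k) components."""
    r1, i1, j1, k1 = q
    r2, i2, j2, k2 = p
    return np.array([
        r1 * r2 - i1 * i2 - j1 * j2 - k1 * k2,  # real part
        r1 * i2 + i1 * r2 + j1 * k2 - k1 * j2,  # i part
        r1 * j2 - i1 * k2 + j1 * r2 + k1 * i2,  # j part
        r1 * k2 + i1 * j2 - j1 * i2 + k1 * r2,  # k part
    ])

def param_counts(n_in, n_out):
    """Real parameters (no bias) in a quaternion vs. a real dense layer.

    Quaternion layer: one quaternion (4 reals) per in/out connection.
    Real layer on the flattened inputs: a (4*n_out) x (4*n_in) matrix.
    """
    quaternion = 4 * n_in * n_out
    real = (4 * n_in) * (4 * n_out)
    return quaternion, real

q_params, r_params = param_counts(64, 64)
print(q_params, r_params)  # the real-valued layer needs 4x as many parameters
```

The 4x factor comes from weight sharing in the Hamilton product: the same four real numbers are reused across all four output components of each connection.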
The universal approximation theorem for complex-valued neural networks
We generalize the classical universal approximation theorem for neural
networks to the case of complex-valued neural networks. Precisely, we consider
feedforward networks with a complex activation function $\sigma : \mathbb{C} \to \mathbb{C}$ in which each neuron performs the operation $z \mapsto \sigma(b + w^T z)$ with weights $w \in \mathbb{C}^n$ and
a bias $b \in \mathbb{C}$, and with $\sigma$ applied componentwise. We
completely characterize those activation functions $\sigma$ for which the
associated complex networks have the universal approximation property, meaning
that they can uniformly approximate any continuous function on any compact
subset of $\mathbb{C}^d$ arbitrarily well.
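The neuron operation above can be sketched in a few lines. The split-ReLU activation used here (ReLU applied separately to the real and imaginary parts) is one common non-holomorphic choice and is an illustrative assumption, not an activation fixed by the paper.

```python
import numpy as np

def split_relu(z):
    """Apply ReLU separately to the real and imaginary parts (componentwise)."""
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def complex_neuron(z, w, b):
    """A single complex-valued neuron: z -> sigma(b + w^T z)."""
    return split_relu(b + w @ z)

# Example with complex weights w and bias b on a 2-D complex input.
z = np.array([1 + 1j, -1 + 0j])
w = np.array([1 + 0j, 0 + 1j])
print(complex_neuron(z, w, 0 + 0j))
```

Note that split-ReLU is neither holomorphic nor antiholomorphic, which is exactly the kind of activation the deep-network part of the characterization admits.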
Unlike the classical case of real networks, the set of "good activation
functions" which give rise to networks with the universal approximation
property differs significantly depending on whether one considers deep networks
or shallow networks: For deep networks with at least two hidden layers, the
universal approximation property holds as long as $\sigma$ is neither a
polynomial, a holomorphic function, nor an antiholomorphic function. Shallow
networks, on the other hand, are universal if and only if the real part or the
imaginary part of $\sigma$ is not a polyharmonic function.