2 research outputs found
Evaluation of Complex-Valued Neural Networks on Real-Valued Classification Tasks
Complex-valued neural networks are not a new concept; however, real-valued
models have often been favoured over complex-valued models due to difficulties
in training and weaker performance. When comparing real-valued and
complex-valued neural networks, the existing literature often ignores the
number of parameters, resulting in comparisons between networks of vastly
different sizes. We find that when real and complex neural networks of similar
capacity are compared, complex models perform on par with or slightly worse
than real-valued models on a range of real-valued classification tasks. The use of
complex numbers allows neural networks to handle noise on the complex plane.
When classifying real-valued data with a complex-valued neural network, the
imaginary parts of the weights follow their real parts. This behaviour is
indicative of a task that does not require a complex-valued model. We
investigate this further in a synthetic classification task. We can transfer many
activation functions from the real to the complex domain using different
strategies. The weight initialisation of complex neural networks, however,
remains a significant problem.
Comment: preprint, 18 pages, 8 figures, 8 tables
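
As an illustration of such activation-transfer strategies (a sketch of my own,
not code from the paper), the snippet below lifts ReLU to the complex domain in
two common ways: a split activation that applies the real ReLU to the real and
imaginary parts independently, and modReLU, which rectifies the magnitude while
preserving the phase. The bias b and epsilon values are illustrative
assumptions.

import numpy as np

def split_relu(z: np.ndarray) -> np.ndarray:
    # Split strategy: apply the real-valued ReLU to real and imaginary parts separately.
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def mod_relu(z: np.ndarray, b: float = -0.5, eps: float = 1e-8) -> np.ndarray:
    # Magnitude strategy (modReLU): shift and rectify |z|, then reattach the phase z/|z|.
    mag = np.abs(z)
    return np.maximum(mag + b, 0.0) * z / (mag + eps)

z = np.array([1.0 - 2.0j, -0.3 + 0.1j, 0.8 + 0.9j])
print(split_relu(z))  # parts clipped independently; phase may change
print(mod_relu(z))    # small-magnitude entries zeroed; phase preserved

The two strategies behave differently around zero: the split activation treats
the two parts as unrelated real channels, whereas modReLU keeps the complex
number intact and only gates it by magnitude.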
Bayesian Sparsification Methods for Deep Complex-valued Networks
With continuing miniaturization, ever more applications of deep learning can be
found in embedded systems, where it is common to encounter data with a natural
complex-domain representation. To this end, we extend Sparse Variational Dropout
to complex-valued neural networks and verify the proposed Bayesian technique by
conducting a large numerical study of the performance-compression trade-off of
C-valued networks on two tasks: image recognition on MNIST-like and CIFAR10
datasets and music transcription on MusicNet. We replicate the state-of-the-art
result by Trabelsi et al. [2018] on MusicNet with a complex-valued network
compressed by 50-100x at a small performance penalty.
Comment: Findings and conclusions unchanged. Improved overall presentation and
redid the plots with larger markers and annotations. Coherent story about
compression, CVNN, BI to SGVB with local reparameterization and additive
noise. Better coverage in lit-review, clearer connections of Dropout to
Bayes, VD, and pruning
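
For context on how Sparse Variational Dropout yields compression, here is a
minimal sketch (my own assumption-laden illustration, not the authors' released
code) of the standard pruning rule of Molchanov et al. [2017] applied to
complex weight means: each weight's dropout rate log(alpha) is compared to a
threshold, and high-alpha weights are zeroed. The threshold tau = 3 and the toy
parameter values are assumptions following real-valued convention.

import numpy as np

def sparsify(mu: np.ndarray, log_sigma2: np.ndarray, tau: float = 3.0):
    # Per-weight dropout rate: log(alpha) = log(sigma^2) - log(|mu|^2).
    log_alpha = log_sigma2 - np.log(np.abs(mu) ** 2 + 1e-12)
    # Keep weights whose noise-to-signal ratio is below the threshold tau.
    mask = log_alpha <= tau
    return mu * mask, mask

rng = np.random.default_rng(0)
mu = rng.normal(size=1000) + 1j * rng.normal(size=1000)  # complex weight means (toy)
log_sigma2 = rng.normal(loc=2.0, scale=2.0, size=1000)   # learned log-variances (toy)
w, mask = sparsify(mu, log_sigma2)
print(f"kept {mask.sum()} of {mask.size} weights "
      f"({mask.size / max(mask.sum(), 1):.1f}x compression)")

The compression factors reported in the abstract correspond to this kind of
kept-to-total ratio: weights whose posterior is dominated by noise contribute
nothing and can be dropped from storage.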