Deep Complex Networks
At present, the vast majority of building blocks, techniques, and
architectures for deep learning are based on real-valued operations and
representations. However, recent work on recurrent neural networks and older
fundamental theoretical analysis suggests that complex numbers could have a
richer representational capacity and could also facilitate noise-robust memory
retrieval mechanisms. Despite their attractive properties and potential for
opening up entirely new neural architectures, complex-valued deep neural
networks have been marginalized due to the absence of the building blocks
required to design such models. In this work, we provide the key atomic
components for complex-valued deep neural networks and apply them to
convolutional feed-forward networks and convolutional LSTMs. More precisely, we
rely on complex convolutions and present algorithms for complex batch
normalization and complex weight initialization for complex-valued neural nets,
and we use them in experiments with end-to-end training schemes. We
demonstrate that such complex-valued models are
competitive with their real-valued counterparts. We test deep complex models on
several computer vision tasks, on music transcription using the MusicNet
dataset and on Speech Spectrum Prediction using the TIMIT dataset. We achieve
state-of-the-art performance on these audio-related tasks.
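As an illustration of the complex convolutions mentioned in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes the standard identity (A + iB) * (x + iy) = (A * x - B * y) + i(B * x + A * y), which lets a complex convolution be computed from four real-valued ones. The function name `complex_conv1d`, the 1D setting, and the random test data are assumptions chosen for brevity.

```python
import numpy as np


def complex_conv1d(x_re, x_im, w_re, w_im):
    """Convolve x = x_re + i*x_im with kernel w = w_re + i*w_im
    using only real-valued convolutions (hypothetical helper)."""
    # Real part: (w_re * x_re) - (w_im * x_im)
    out_re = np.convolve(x_re, w_re) - np.convolve(x_im, w_im)
    # Imaginary part: (w_im * x_re) + (w_re * x_im)
    out_im = np.convolve(x_re, w_im) + np.convolve(x_im, w_re)
    return out_re, out_im


# Illustrative random signal and kernel (not data from the paper).
rng = np.random.default_rng(0)
x = rng.standard_normal(8) + 1j * rng.standard_normal(8)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

out_re, out_im = complex_conv1d(x.real, x.imag, w.real, w.imag)

# Cross-check against NumPy's native complex convolution.
reference = np.convolve(x, w)
matches = np.allclose(out_re + 1j * out_im, reference)
```

Keeping the real and imaginary parts as separate real-valued arrays is one plausible way to build such layers on frameworks that only provide real-valued convolutions.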
Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Current learning machines have successfully solved hard application problems,
reaching high accuracy and displaying seemingly "intelligent" behavior. Here we
apply recent techniques for explaining decisions of state-of-the-art learning
machines and analyze various tasks from computer vision and arcade games. This
showcases a spectrum of problem-solving behaviors ranging from naive and
short-sighted to well-informed and strategic. We observe that standard
performance evaluation metrics can fail to distinguish these diverse
problem-solving behaviors. Furthermore, we propose our semi-automated Spectral
Relevance Analysis that provides a practically effective way of characterizing
and validating the behavior of nonlinear learning machines. This helps to
assess whether a learned model indeed delivers reliably for the problem that it
was conceived for. Furthermore, our work intends to add a voice of caution to
the ongoing excitement about machine intelligence and pledges to evaluate and
judge some of these recent successes in a more nuanced manner.
Comment: Accepted for publication in Nature Communications