Researchers have proposed that deep learning, which is driving substantial
progress in a wide range of high-complexity tasks, might inspire new insights
into learning in the brain. However, the methods used for deep learning by
artificial neural networks are biologically unrealistic and would need to be
replaced by biologically realistic counterparts. Previous biologically
plausible reinforcement learning rules, like AGREL and AuGMEnT, showed
promising results but focused on shallow networks with three layers. Will these
learning rules also generalize to networks with more layers, and can they handle
tasks of higher complexity? We demonstrate such a learning scheme on classical and
hard image-classification benchmarks, namely MNIST, CIFAR10 and CIFAR100, cast
as direct reward tasks, for fully connected, convolutional and locally
connected architectures. We show that our learning rule, Q-AGREL, performs
comparably to supervised learning via error-backpropagation, with this type of
trial-and-error reinforcement learning requiring only 1.5-2.5 times more
epochs, even when classifying 100 different classes as in CIFAR100. Our results
provide new insights into how deep learning may be implemented in the brain.
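
To make "cast as direct reward tasks" concrete, the following is a minimal
sketch of how one classification trial could be framed as trial-and-error
learning: the network commits to a single class, receives a scalar reward, and
only the selected output unit gets an error signal. The function names, the
epsilon-greedy exploration and the 0/1 reward values are illustrative
assumptions, not the implementation used for the reported results.

```python
import numpy as np


def select_action(q_values, rng, epsilon=0.05):
    """Choose one class per trial.

    Assumption: epsilon-greedy exploration (greedy with probability
    1 - epsilon, otherwise a uniformly random class).
    """
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


def direct_reward(action, label):
    """Assumed reward scheme: 1 for the correct class, 0 otherwise."""
    return 1.0 if action == label else 0.0


def output_error(q_values, action, label):
    """Reward prediction error, nonzero only at the selected output unit.

    A supervised rule would instead compare every output unit with the
    one-hot target; here the unselected units receive no feedback.
    """
    delta = np.zeros(len(q_values), dtype=float)
    delta[action] = direct_reward(action, label) - q_values[action]
    return delta


# One trial with hypothetical output values for a 3-class problem.
rng = np.random.default_rng(0)
q = np.array([0.1, 0.7, 0.2])
chosen = select_action(q, rng, epsilon=0.0)  # purely greedy for this example
error = output_error(q, chosen, label=2)     # wrong guess, so the error is negative
```

Because feedback arrives for one output unit per trial rather than for all of
them, a scheme of this kind sees less information per sample than supervised
learning, which is consistent with it needing more epochs to converge.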