1,335,051 research outputs found
Biologically plausible deep learning -- but how far can we go with shallow networks?
Training deep neural networks with the error backpropagation algorithm is
considered implausible from a biological perspective. Numerous recent
publications suggest elaborate models for biologically plausible variants of
deep learning, typically defining success as reaching around 98% test accuracy
on the MNIST data set. Here, we investigate how far we can go on digit (MNIST)
and object (CIFAR10) classification with biologically plausible, local learning
rules in a network with one hidden layer and a single readout layer. The hidden
layer weights are either fixed (random or random Gabor filters) or trained with
unsupervised methods (PCA, ICA or Sparse Coding) that can be implemented by
local learning rules. The readout layer is trained with a supervised, local
learning rule. We first implement these models with rate neurons. This
comparison reveals, first, that unsupervised learning does not lead to better
performance than fixed random projections or Gabor filters for large hidden
layers. Second, networks with localized receptive fields perform significantly
better than networks with all-to-all connectivity and can reach backpropagation
performance on MNIST. We then implement two of the networks - fixed, localized,
random & random Gabor filters in the hidden layer - with spiking leaky
integrate-and-fire neurons and spike timing dependent plasticity to train the
readout layer. These spiking models achieve > 98.2% test accuracy on MNIST,
which is close to the performance of rate networks with one hidden layer
trained with backpropagation. The performance of our shallow network models is
comparable to most current biologically plausible models of deep learning.
Furthermore, our results with a shallow spiking network provide an important
reference and suggest the use of datasets other than MNIST for testing the
performance of future models of biologically plausible deep learning.Comment: 14 pages, 4 figure
Continual Local Training for Better Initialization of Federated Models
Federated learning (FL) refers to the learning paradigm that trains machine
learning models directly in the decentralized systems consisting of smart edge
devices without transmitting the raw data, which avoids the heavy communication
costs and privacy concerns. Given the typical heterogeneous data distributions
in such situations, the popular FL algorithm \emph{Federated Averaging}
(FedAvg) suffers from weight divergence and thus cannot achieve a competitive
performance for the global model (denoted as the \emph{initial performance} in
FL) compared to centralized methods. In this paper, we propose the local
continual training strategy to address this problem. Importance weights are
evaluated on a small proxy dataset on the central server and then used to
constrain the local training. With this additional term, we alleviate the
weight divergence and continually integrate the knowledge on different local
clients into the global model, which ensures a better generalization ability.
Experiments on various FL settings demonstrate that our method significantly
improves the initial performance of federated models with few extra
communication costs.Comment: This paper has been accepted to 2020 IEEE International Conference on
Image Processing (ICIP 2020
- …