Sample-based and Feature-based Federated Learning for Unconstrained and Constrained Nonconvex Optimization via Mini-batch SSCA
Federated learning (FL) has become an active research area, as it enables the
collaborative training of machine learning models among multiple clients that
hold sensitive local data. Nevertheless, unconstrained federated optimization
has been studied mainly via stochastic gradient descent (SGD), which may
converge slowly, and constrained federated optimization, which is more
challenging, has not been investigated so far. This paper studies both
sample-based and feature-based federated optimization, and for each setting it
considers both unconstrained and constrained nonconvex problems.
First, we propose FL algorithms using stochastic successive convex
approximation (SSCA) and mini-batch techniques. These algorithms can adequately
exploit the structures of the objective and constraint functions and
incrementally utilize samples. We show that the proposed FL algorithms
converge to stationary points of the unconstrained nonconvex problems and to
Karush-Kuhn-Tucker (KKT) points of the constrained ones. Next, we
provide algorithm examples with appealing computational complexity and
communication load per communication round. We show that the proposed algorithm
examples for unconstrained federated optimization are identical to FL
algorithms via momentum SGD and provide an analytical connection between SSCA
and momentum SGD. Finally, numerical experiments demonstrate the inherent
advantages of the proposed algorithms in convergence speeds, communication and
computation costs, and model specifications.

Comment: 18 pages, 4 figures. This work is to appear in IEEE Trans. Signal Process. arXiv admin note: substantial text overlap with arXiv:2103.0950
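The abstract does not state the update rule, so the following is only a minimal single-machine sketch of a generic mini-batch SSCA recursion with a quadratic surrogate; the function name ssca_minimize, the surrogate weight tau, and the step-size schedules rho_t and gamma_t are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def ssca_minimize(grad_fn, x0, num_iters=200, tau=1.0, seed=0):
    """Minimal mini-batch SSCA sketch for an unconstrained smooth problem.

    grad_fn(x, rng) returns a stochastic (mini-batch) gradient estimate of
    the objective at x. With the quadratic surrogate
        f_t(x) = g_t . (x - x_t) + tau * ||x - x_t||^2,
    the surrogate minimizer has the closed form x_t - g_t / (2 * tau).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    g = np.zeros_like(x)                     # recursive gradient estimate
    for t in range(1, num_iters + 1):
        rho = 1.0 / t**0.6                   # assumed diminishing step sizes
        gamma = 1.0 / t**0.9
        g = (1 - rho) * g + rho * grad_fn(x, rng)  # smooth mini-batch gradient
        x_bar = x - g / (2.0 * tau)          # minimize the convex surrogate
        x = (1 - gamma) * x + gamma * x_bar  # convex-combination iterate update
    return x

# Toy usage: noisy gradients of f(x) = ||x||^2 / 2, minimized at x = 0.
if __name__ == "__main__":
    grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
    print(ssca_minimize(grad, np.ones(5)))
```

Substituting the surrogate minimizer gives x_{t+1} = x_t - (gamma_t / (2 tau)) g_t, where g_t is an exponentially weighted average of past mini-batch gradients, which is the momentum-SGD form the abstract alludes to. In the federated setting the mini-batch gradients would be computed on clients' local data and aggregated by a server, but that orchestration is omitted here.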
On the role of synaptic stochasticity in training low-precision neural networks
Stochasticity and limited precision of synaptic weights in neural network
models are key aspects of both biological and hardware modeling of learning
processes. Here we show that a neural network model with stochastic binary
weights naturally gives prominence to exponentially rare dense regions of
solutions with a number of desirable properties such as robustness and good
generalization performance, while typical solutions are isolated and hard to
find. Binary solutions of the standard perceptron problem are obtained from a
simple gradient descent procedure on a set of real values parametrizing a
probability distribution over the binary synapses. Both analytical and
numerical results are presented. An algorithmic extension aimed at training
discrete deep neural networks is also investigated.

Comment: 7 pages + 14 pages of supplementary material
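The gradient-descent procedure on real values parametrizing a distribution over binary synapses is not spelled out in the abstract; below is a rough, hypothetical rendering for the perceptron case. The tanh magnetization parametrization, the Gaussian smoothing of the pre-activation, and all names (train_binary_perceptron, h, lr) are assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np
from scipy.stats import norm

def train_binary_perceptron(X, y, epochs=500, lr=0.5, seed=0):
    """Sketch: learn binary synapses w in {-1, +1}^N by gradient descent on
    real fields h that parametrize independent marginals m_i = tanh(h_i),
    i.e. P(w_i = +1) = (1 + m_i) / 2 (an assumed parametrization).

    Under a Gaussian approximation of the pre-activation w . x, a pattern
    (x, y) is classified correctly with probability Phi(y m.x / sigma(x)),
    where sigma(x)^2 = sum_i (1 - m_i^2) x_i^2; we descend sum -log Phi.
    """
    rng = np.random.default_rng(seed)
    h = 0.1 * rng.standard_normal(X.shape[1])
    for _ in range(epochs):
        m = np.tanh(h)
        s = X @ m                                  # mean pre-activation
        sigma = np.sqrt((X**2) @ (1.0 - m**2) + 1e-9)
        z = y * s / sigma
        dz = -np.exp(norm.logpdf(z) - norm.logcdf(z))  # d(-log Phi)/dz
        # chain rule through z = y * (X @ m) / sigma and m = tanh(h)
        dzdm = (y / sigma)[:, None] * X \
             + ((y * s / sigma**3)[:, None] * X**2) * m
        h -= lr * (dz[:, None] * dzdm).sum(axis=0) * (1.0 - m**2) / len(y)
    return np.where(np.tanh(h) >= 0, 1.0, -1.0)    # final binary weights

# Toy teacher-student usage: fit labels from a random binary teacher.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    N, P = 201, 600
    teacher = rng.choice([-1.0, 1.0], size=N)
    X = rng.choice([-1.0, 1.0], size=(P, N))
    y = np.sign(X @ teacher)
    w = train_binary_perceptron(X, y)
    print("train accuracy:", np.mean(np.sign(X @ w) == y))
```

The design point the abstract hints at is that the loss is differentiable in the real fields h even though the final weights sign(tanh(h)) are binary, so plain gradient descent applies to the distribution's parameters.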