37,069 research outputs found

    Sample-based and Feature-based Federated Learning for Unconstrained and Constrained Nonconvex Optimization via Mini-batch SSCA

    Full text link
    Federated learning (FL) has become a hot research area in enabling the collaborative training of machine learning models among multiple clients that hold sensitive local data. Nevertheless, unconstrained federated optimization has been studied mainly using stochastic gradient descent (SGD), which may converge slowly, and constrained federated optimization, which is more challenging, has not been investigated so far. This paper investigates sample-based and feature-based federated optimization, respectively, and considers both unconstrained and constrained nonconvex problems for each of them. First, we propose FL algorithms using stochastic successive convex approximation (SSCA) and mini-batch techniques. These algorithms can adequately exploit the structures of the objective and constraint functions and incrementally utilize samples. We show that the proposed FL algorithms converge to stationary points and Karush-Kuhn-Tucker (KKT) points of the respective unconstrained and constrained nonconvex problems, respectively. Next, we provide algorithm examples with appealing computational complexity and communication load per communication round. We show that the proposed algorithm examples for unconstrained federated optimization are identical to FL algorithms via momentum SGD and provide an analytical connection between SSCA and momentum SGD. Finally, numerical experiments demonstrate the inherent advantages of the proposed algorithms in convergence speeds, communication and computation costs, and model specifications.Comment: 18 pages, 4 figures. This work is to appear in IEEE Trans. Signal Process. arXiv admin note: substantial text overlap with arXiv:2103.0950

    On the role of synaptic stochasticity in training low-precision neural networks

    Get PDF
    Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension aimed at training discrete deep neural networks is also investigated.Comment: 7 pages + 14 pages of supplementary materia
    • …
    corecore