1,186 research outputs found

Federated TD Learning Over Finite-Rate Erasure Channels: Linear Speedup Under Markovian Sampling

Author: Aritra Mitra
George J. Pappas
Nicolò Dal Fabbro
Publication venue: Institute of Electrical and Electronics Engineers Inc.
Publication date: 01/01/2023

Federated learning (FL) has recently gained much attention due to its effectiveness in speeding up supervised learning tasks under communication and privacy constraints. However, whether similar speedups can be established for reinforcement learning remains much less understood theoretically. Towards this direction, we study a federated policy evaluation problem where agents communicate via a central aggregator to expedite the evaluation of a common policy. To capture typical communication constraints in FL, we consider finite capacity up-link channels that can drop packets based on a Bernoulli erasure model. Given this setting, we propose and analyze QFedTD - a quantized federated temporal difference learning algorithm with linear function approximation. Our main technical contribution is to provide a finite-sample analysis of QFedTD that (i) highlights the effect of quantization and erasures on the convergence rate; and (ii) establishes a linear speedup w.r.t. the number of agents under Markovian sampling. Notably, while different quantization mechanisms and packet drop models have been extensively studied in the FL, distributed optimization, and networked control systems literature, our work is the first to provide a non-asymptotic analysis of their effects in multi-agent and federated reinforcement learning.

Archivio istituzionale della ricerca - Università di Padova
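
The setting described in the QFedTD abstract above (TD learning with linear function approximation, quantized uplinks, Bernoulli packet drops) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the uniform quantizer, and the choice to average only over packets that arrive are all assumptions made for illustration.

```python
import numpy as np

def td_direction(theta, phi_s, phi_next, reward, gamma):
    """TD(0) update direction for a linear value estimate V(s) = phi(s) @ theta."""
    td_error = reward + gamma * phi_next @ theta - phi_s @ theta
    return td_error * phi_s

def quantize(vec, bits, clip):
    """Uniform quantizer on [-clip, clip] with 2**bits levels (assumed scheme)."""
    levels = 2 ** bits - 1
    step = 2 * clip / levels
    clipped = np.clip(vec, -clip, clip)
    return np.round((clipped + clip) / step) * step - clip

def server_round(theta, directions, step_size, bits, clip, drop_prob, rng):
    """One aggregation round: each agent's quantized direction is lost
    independently with probability drop_prob (Bernoulli erasure)."""
    received = [
        quantize(d, bits, clip)
        for d in directions
        if rng.random() >= drop_prob
    ]
    if not received:
        return theta  # every packet was dropped: keep the current estimate
    # Assumption: average over the packets that arrived.
    return theta + step_size * np.mean(received, axis=0)
```

Each agent would call `td_direction` on its own sampled transition and the aggregator would call `server_round` once per step; how the actual algorithm normalizes the received updates is not stated in the abstract.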

The Effects of Weight Quantization on Online Federated Learning for the IoT: A Case Study

Author: Freitag Felix
Giménez Nil Llisterri
Lee JunKyu
Vandierendonck Hans
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 12/01/2024

Many weight quantization approaches have been explored to save communication bandwidth between the clients and the server in federated learning using high-end computing machines. However, there is a lack of weight quantization research for online federated learning using TinyML devices, which are restricted in mini-batch size, neural network size, and communication method due to their severe hardware resource constraints and power budgets. We use the name Tiny Online Federated Learning (TinyOFL) for online federated learning using TinyML devices in the Internet of Things (IoT). This paper performs a comprehensive analysis of the effects of weight quantization in TinyOFL in terms of accuracy, stability, overfitting, communication efficiency, energy consumption, and delivery time, and extracts practical guidelines on how to apply weight quantization to TinyOFL. Our analysis is supported by a TinyOFL case study with three Arduino Portenta H7 boards running federated learning clients for a keyword spotting task. Our findings include that in TinyOFL, a more aggressive weight quantization can be allowed than in online learning without FL, without affecting the accuracy, thanks to TinyOFL’s quasi-batch training property. For example, using 7-bit weights achieved accuracy equivalent to 32-bit floating point weights, while saving communication bandwidth by 4.6×. Overfitting by increasing network width rarely occurs in TinyOFL, but may occur if strong weight quantization is applied. The experiments also showed that there is a design space for TinyOFL applications by compensating for the accuracy loss due to weight quantization with an increase of the neural network size.

University of Essex Research Repository
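
As a rough illustration of the bandwidth figure quoted in the TinyOFL abstract above, the sketch below quantizes 32-bit weights to a smaller bit width and computes the payload ratio. The min-max uniform scheme and all function names are assumptions; the abstract does not say which quantizer was used or how the 4.6× saving was computed, although 32/7 ≈ 4.57 is consistent with it.

```python
import numpy as np

def quantize_uniform(weights, bits):
    """Map float weights to integer codes in [0, 2**bits - 1]
    using a min-max uniform grid (assumed scheme)."""
    if not 1 <= bits <= 8:
        raise ValueError("this sketch stores codes in uint8, so bits must be 1..8")
    levels = 2 ** bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = np.round((weights - w_min) / scale).astype(np.uint8)
    return codes, w_min, scale

def dequantize_uniform(codes, w_min, scale):
    """Recover approximate float weights from the integer codes."""
    return codes.astype(np.float32) * scale + w_min

def payload_ratio(bits, full_bits=32):
    """Ideal ratio of full-precision to quantized payload size,
    ignoring the small overhead of sending w_min and scale."""
    return full_bits / bits
```

With this scheme the reconstruction error per weight is bounded by about half a quantization step, so widening the network to offset that error, as the abstract suggests, trades memory for bandwidth.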

3DQ: Compact Quantized Neural Networks for Volumetric Whole Brain Segmentation

Author: AG Roy
DS Marcus
H Chen
M Rastegari
MA Mazurowski
MP Heinrich
O Ronneberger
R Vivanti
S Thaler
Ö Çiçek
Publication date: 01/07/2019

Model architectures have been dramatically increasing in size, improving performance at the cost of resource requirements. In this paper we propose 3DQ, a ternary quantization method, applied for the first time to 3D Fully Convolutional Neural Networks (F-CNNs), enabling 16x model compression while maintaining performance on par with full precision models. We extensively evaluate 3DQ on two datasets for the challenging task of whole brain segmentation. Additionally, we showcase our method's ability to generalize on two common 3D architectures, namely 3D U-Net and V-Net. Outperforming a variety of baselines, the proposed method is capable of compressing large 3D models to a few MBytes, alleviating the storage needs in space critical applications. Comment: Accepted to MICCAI 201

arXiv.org e-Print Archive

Crossref
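
To make the compression arithmetic in the 3DQ abstract above concrete, here is a minimal sketch of generic threshold-based ternarization with 2-bit packing. It is not the 3DQ method: the threshold rule (a fraction of the mean absolute weight), the `delta_factor` default, and the function names are assumptions. Storing each 32-bit weight as a 2-bit code gives 32/2 = 16, which matches the 16x figure, though the abstract does not say the figure was derived this way.

```python
import numpy as np

def ternarize(weights, delta_factor=0.7):
    """Map weights to codes in {-1, 0, +1} plus one scaling factor alpha.
    The threshold rule is an assumed heuristic, not taken from the abstract."""
    delta = delta_factor * np.abs(weights).mean()
    codes = np.zeros(weights.shape, dtype=np.int8)
    codes[weights > delta] = 1
    codes[weights < -delta] = -1
    mask = codes != 0
    alpha = float(np.abs(weights[mask]).mean()) if mask.any() else 0.0
    return codes, alpha

def pack_2bit(codes):
    """Pack ternary codes four to a byte, using 2 bits per weight."""
    flat = (codes.ravel() + 1).astype(np.uint8)  # {-1, 0, 1} -> {0, 1, 2}
    pad = (-flat.size) % 4                       # pad to a multiple of four
    flat = np.concatenate([flat, np.zeros(pad, dtype=np.uint8)])
    g = flat.reshape(-1, 4)
    packed = g[:, 0] | (g[:, 1] << 2) | (g[:, 2] << 4) | (g[:, 3] << 6)
    return packed.astype(np.uint8)
```

The approximate weights are recovered as `alpha * codes`; the single float `alpha` per tensor is the only full-precision value that has to be stored alongside the packed codes in this sketch.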

Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T)

Author: Becking Daniel
Marban Arturo
Samek Wojciech
Wiedemann Simon
Publication date: 25/05/2020

Deep neural networks (DNNs) have shown remarkable success in a variety of machine learning applications. The capacity of these models (i.e., number of parameters) endows them with expressive power and allows them to reach the desired performance. In recent years, there has been increasing interest in deploying DNNs to resource-constrained devices (e.g., mobile devices) with limited energy, memory, and computational budget. To address this problem, we propose Entropy-Constrained Trained Ternarization (EC2T), a general framework to create sparse and ternary neural networks which are efficient in terms of storage (e.g., at most two binary-masks and two full-precision values are required to save a weight matrix) and computation (e.g., MAC operations are reduced to a few accumulations plus two multiplications). This approach consists of two steps. First, a super-network is created by scaling the dimensions of a pre-trained model (i.e., its width and depth). Subsequently, this super-network is simultaneously pruned (using an entropy constraint) and quantized (that is, ternary values are assigned layer-wise) in a training process, resulting in a sparse and ternary network representation. We validate the proposed approach on the CIFAR-10, CIFAR-100, and ImageNet datasets, showing its effectiveness in image classification tasks. Comment: Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning in Computer Vision. Code is available at https://github.com/d-becking/efficientCNN

arXiv.org e-Print Archive

Crossref