3DQ: Compact Quantized Neural Networks for Volumetric Whole Brain Segmentation
Model architectures have been dramatically increasing in size, improving
performance at the cost of resource requirements. In this paper we propose 3DQ,
a ternary quantization method, applied for the first time to 3D Fully
Convolutional Neural Networks (F-CNNs), enabling 16x model compression while
maintaining performance on par with full-precision models. We extensively evaluate 3DQ on two datasets for the challenging task of whole brain segmentation. Additionally, we showcase our method's ability to generalize to two common 3D architectures, namely 3D U-Net and V-Net. Outperforming a variety of baselines, the proposed method compresses large 3D models to a few MBytes, alleviating the storage needs of space-critical applications.
Comment: Accepted to MICCAI 201
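As a rough illustration of what ternary quantization does to a layer's weights, the following NumPy sketch maps a full-precision tensor onto three values using a simple magnitude threshold. The threshold heuristic and the scaling choices here are assumptions made for illustration, not the exact 3DQ procedure.

```python
import numpy as np

def ternarize(weights, threshold_factor=0.05):
    # Threshold below which weights are zeroed out (assumed heuristic;
    # the exact 3DQ scheme may differ).
    delta = threshold_factor * np.abs(weights).max()
    pos_mask = weights > delta
    neg_mask = weights < -delta
    # Represent each non-zero level by the mean magnitude of the weights it replaces.
    w_pos = weights[pos_mask].mean() if pos_mask.any() else 0.0
    w_neg = -np.abs(weights[neg_mask]).mean() if neg_mask.any() else 0.0
    q = np.zeros_like(weights)
    q[pos_mask] = w_pos
    q[neg_mask] = w_neg
    return q

# Storing 2-bit ternary codes plus two floats per layer instead of 32-bit
# weights is what makes roughly 16x compression possible.
W = np.random.randn(16, 8, 3, 3, 3).astype(np.float32)  # a toy 3D convolution kernel
print(np.unique(ternarize(W)))  # at most three distinct values
```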
Ternary Compression for Communication-Efficient Federated Learning
Learning over massive data stored in different locations is essential in many
real-world applications. However, data sharing poses significant challenges due to growing privacy and security demands, driven by the widespread use of smart mobile and IoT devices. Federated learning offers a potential solution for privacy-preserving and secure machine learning by jointly training a global model without uploading the data distributed across multiple devices to a
central server. However, most existing work on federated learning adopts
machine learning models with full-precision weights, and almost all of these models contain a large number of redundant parameters that do not need to be transmitted to the server, incurring excessive communication costs. To address this issue, we propose a federated trained ternary
quantization (FTTQ) algorithm, which optimizes the quantized networks on the
clients through a self-learning quantization factor. We give a convergence proof for the quantization factor and prove the unbiasedness of FTTQ. In addition, we
propose a ternary federated averaging protocol (T-FedAvg) to reduce the
upstream and downstream communication of federated learning systems. Empirical
experiments are conducted to train widely used deep learning models on publicly
available datasets, and our results demonstrate the effectiveness of FTTQ and
T-FedAvg compared with canonical federated learning algorithms in reducing communication costs while maintaining learning performance.
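To make the communication saving concrete, here is a small, hypothetical sketch of the upload/aggregate round described above: each client sends 2-bit ternary codes plus a single scalar quantization factor, and the server de-quantizes and averages them. The threshold heuristic, the function names, and the way the factor is chosen are illustrative assumptions, not the exact FTTQ/T-FedAvg procedure.

```python
import numpy as np

def client_ternarize(weights, alpha):
    # Sparsity threshold: an assumed heuristic, not the paper's exact rule.
    delta = 0.05 * np.abs(weights).max()
    # Ternary codes in {-1, 0, +1}; only these codes plus the scalar
    # quantization factor are uploaded to the server.
    codes = np.sign(weights) * (np.abs(weights) > delta)
    return codes.astype(np.int8), float(alpha)

def server_aggregate(client_updates):
    # De-quantize each client model and average (T-FedAvg-style sketch).
    dequantized = [codes.astype(np.float32) * alpha for codes, alpha in client_updates]
    return np.mean(dequantized, axis=0)

# Four simulated clients; 'alpha' stands in for the self-learned
# quantization factor described in the paper.
updates = []
for _ in range(4):
    w = np.random.randn(256, 128).astype(np.float32)
    updates.append(client_ternarize(w, alpha=np.abs(w).mean()))
global_weights = server_aggregate(updates)
print(global_weights.shape)  # (256, 128)
```

Uploading int8 codes (or a packed 2-bit encoding) plus one float per layer, rather than 32-bit weights, is where the upstream traffic reduction comes from in this sketch.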
Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T)
Deep neural networks (DNNs) have shown remarkable success in a variety of machine learning applications. The capacity of these models (i.e., their number of parameters) endows them with expressive power and allows them to reach the desired performance. In recent years, there has been increasing interest in deploying DNNs to resource-constrained devices (e.g., mobile devices) with limited energy, memory, and computational budgets. To address this problem, we
propose Entropy-Constrained Trained Ternarization (EC2T), a general framework
to create sparse and ternary neural networks which are efficient in terms of
storage (e.g., at most two binary-masks and two full-precision values are
required to save a weight matrix) and computation (e.g., MAC operations are
reduced to a few accumulations plus two multiplications). This approach
consists of two steps. First, a super-network is created by scaling the
dimensions of a pre-trained model (i.e., its width and depth). Subsequently,
this super-network is simultaneously pruned (using an entropy constraint) and
quantized (that is, ternary values are assigned layer-wise) in a training
process, resulting in a sparse and ternary network representation. We validate
the proposed approach on the CIFAR-10, CIFAR-100, and ImageNet datasets, showing its effectiveness in image classification tasks.
Comment: Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning in Computer Vision. Code is available at https://github.com/d-becking/efficientCNN
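The storage and compute claims above can be illustrated with a short NumPy sketch (hypothetical helper names and made-up centroid values): a ternary weight matrix is held as two binary masks plus two full-precision scalars, and a matrix-vector product reduces to mask-driven accumulations followed by two multiplications per output element.

```python
import numpy as np

def encode_ternary(q, w_pos, w_neg):
    # Two binary masks plus two full-precision scalars are enough to
    # store the whole ternary weight matrix.
    return (q == w_pos), (q == w_neg), w_pos, w_neg

def ternary_matvec(pos_mask, neg_mask, w_pos, w_neg, x):
    # Logically, each binary mask just selects which inputs to accumulate;
    # the two multiplications by w_pos / w_neg happen once per output element.
    pos_sum = pos_mask.astype(np.float32) @ x
    neg_sum = neg_mask.astype(np.float32) @ x
    return w_pos * pos_sum + w_neg * neg_sum

# Toy example with made-up centroid values w_pos / w_neg.
w_pos, w_neg = 0.7, -0.6
W = np.random.choice([w_neg, 0.0, w_pos], size=(8, 16))
pm, nm, wp, wn = encode_ternary(W, w_pos, w_neg)
x = np.random.randn(16).astype(np.float32)
print(np.allclose(ternary_matvec(pm, nm, wp, wn, x), W @ x))  # True
```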