DeViT: Decomposing Vision Transformers for Collaborative Inference in Edge Devices
Recent years have witnessed the great success of vision transformer (ViT),
which has achieved state-of-the-art performance on multiple computer vision
benchmarks. However, ViT models suffer from vast amounts of parameters and high
computation cost, leading to difficult deployment on resource-constrained edge
devices. Existing solutions mostly compress ViT models to a compact model but
still cannot achieve real-time inference. To tackle this issue, we propose to
explore the divisibility of transformer structure, and decompose the large ViT
into multiple small models for collaborative inference at edge devices. Our
objective is to achieve fast and energy-efficient collaborative inference while
maintaining accuracy comparable to that of large ViTs. To this end, we first
propose a collaborative inference framework termed DeViT to facilitate edge
deployment by decomposing large ViTs. Subsequently, we design a
decomposition-and-ensemble algorithm based on knowledge distillation, termed
DEKD, which fuses multiple small decomposed models while dramatically reducing
communication overhead. DEKD handles heterogeneous models through a
feature matching module that helps the decomposed models imitate the
large ViT. Extensive experiments with three representative ViT backbones on four
widely-used datasets demonstrate our method achieves efficient collaborative
inference for ViTs and outperforms existing lightweight ViTs, striking a good
trade-off between efficiency and accuracy. For example, our DeViTs improve
end-to-end latency by 2.89× with only a 1.65% accuracy sacrifice on
CIFAR-100 compared to the large ViT, ViT-L/16, on the GPU server. DeDeiTs
surpass the recent efficient ViT, MobileViT-S, by 3.54% in accuracy on
ImageNet-1K, while running 1.72× faster and requiring 55.28% lower
energy consumption on the edge device.
Comment: Accepted by IEEE Transactions on Mobile Computin
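The abstract above describes fusing several small models and training them to imitate a large ViT through knowledge distillation. The following is a minimal sketch of that general idea only, not the DEKD algorithm itself: it assumes the small models are fused by plain logit averaging and uses the standard temperature-softened KL distillation loss. The function names and the averaging rule are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(per_model_logits):
    """Fuse the outputs of several small models by averaging their logits.

    Assumption: simple averaging stands in for whatever fusion the paper uses.
    """
    count = len(per_model_logits)
    return [sum(column) / count for column in zip(*per_model_logits)]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits: one large teacher and two small decomposed models.
teacher = [2.0, 0.5, -1.0]
small_models = [[1.5, 0.2, -0.5], [2.5, 0.6, -1.3]]
fused = ensemble_logits(small_models)
loss = distillation_loss(teacher, fused)
```

In this sketch the loss is zero only when the fused prediction matches the teacher's softened distribution, so minimizing it pushes the ensemble toward the large model's behavior.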
On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects
The Internet of Things (IoT) will be a main data generation infrastructure
for achieving better system intelligence. This paper considers the design and
implementation of a practical privacy-preserving collaborative learning scheme,
in which a curious learning coordinator trains a better machine learning model
based on the data samples contributed by a number of IoT objects, while the
confidentiality of the raw forms of the training data is protected against the
coordinator. Existing distributed machine learning and data encryption
approaches incur significant computation and communication overhead, rendering
them ill-suited for resource-constrained IoT objects. We study an approach that
applies independent Gaussian random projection at each IoT object to obfuscate
data and trains a deep neural network at the coordinator based on the projected
data from the IoT objects. This approach introduces light computation overhead
to the IoT objects and moves most of the workload to the coordinator, which can
have sufficient computing resources. Although the independent projections performed
by the IoT objects address the potential collusion between the curious
coordinator and some compromised IoT objects, they significantly increase the
complexity of the projected data. In this paper, we leverage the superior
learning capability of deep learning in capturing sophisticated patterns to
maintain good learning performance. Extensive comparative evaluation shows that
this approach outperforms other lightweight approaches that apply additive
noisification for differential privacy and/or support vector machines for
learning in applications with light data pattern complexities.
Comment: 12 pages, IOTDI 201
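The abstract above describes each IoT object applying its own independent Gaussian random projection before sending data to the coordinator. A minimal sketch of that obfuscation step follows, assuming a dense matrix with i.i.d. zero-mean Gaussian entries and a per-object seed kept private; the dimensions, seeds, scaling, and function names are hypothetical and not taken from the paper.

```python
import random


def make_projection_matrix(in_dim, out_dim, seed):
    """Draw an out_dim x in_dim matrix with i.i.d. N(0, 1/out_dim) entries."""
    rng = random.Random(seed)
    std = (1.0 / out_dim) ** 0.5
    return [[rng.gauss(0.0, std) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix, sample):
    """Multiply the private matrix by one raw data sample (a list of floats)."""
    return [sum(w * x for w, x in zip(row, sample)) for row in matrix]


# Each hypothetical IoT object holds its own matrix, drawn independently.
object_a = make_projection_matrix(in_dim=8, out_dim=4, seed=1)
object_b = make_projection_matrix(in_dim=8, out_dim=4, seed=2)

sample = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0]
projected_a = project(object_a, sample)  # what object A would transmit
projected_b = project(object_b, sample)  # same raw data, different obfuscation
```

Because the matrices differ per object, identical raw samples arrive at the coordinator in different projected forms, which is the source of the added data complexity the abstract mentions.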