The First Verification Test of Space-Ground Collaborative Intelligence via Cloud-Native Satellites
Recent advancements in satellite technologies and the declining cost of
access to space have led to the emergence of large satellite constellations in
Low Earth Orbit. However, these constellations often rely on bent-pipe
architecture, resulting in high communication costs. Existing onboard inference
architectures suffer from limitations in terms of low accuracy and
inflexibility in the deployment and management of in-orbit applications. To
address these challenges, we propose a cloud-native-based satellite design
specifically tailored for Earth Observation tasks, enabling diverse computing
paradigms. In this work, we present a case study of a satellite-ground
collaborative inference system deployed in the Tiansuan constellation,
demonstrating a remarkable 50% accuracy improvement and a substantial 90%
data reduction. Our work also sheds light on in-orbit energy consumption,
finding that in-orbit computing accounts for 17% of the total onboard energy
consumption. Our approach represents a significant advancement in cloud-native
satellites, aiming to enhance the accuracy of in-orbit computing while
simultaneously reducing communication cost.
Comment: Accepted by China Communications
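The paper does not publish its pipeline in code; as a minimal sketch of the filter-then-downlink idea behind satellite-ground collaborative inference, the snippet below uses an illustrative onboard scoring function (`tile_score` and the threshold are assumptions, not the paper's design) to show how discarding uninteresting image tiles onboard reduces the data sent over the expensive satellite-ground link:

```python
# Hedged sketch: a cheap onboard model scores image tiles, and only promising
# tiles are downlinked for full-accuracy inference on the ground.

def tile_score(tile):
    """Stand-in for a lightweight onboard classifier's confidence;
    here, mean pixel intensity plays that role for illustration."""
    return sum(tile) / len(tile)

def onboard_filter(tiles, threshold=0.5):
    """Keep only tiles the onboard model flags, shrinking downlink traffic."""
    return [t for t in tiles if tile_score(t) >= threshold]

tiles = [[0.1, 0.2], [0.8, 0.9], [0.6, 0.7]]
selected = onboard_filter(tiles)
# Only the two high-scoring tiles are downlinked; the ground model refines them.
```

In this toy run, one of three tiles is dropped onboard; the paper's reported 90% data reduction comes from the same principle applied with a real onboard model.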
Federated NLP in Few-shot Scenarios
Natural language processing (NLP) sees rich mobile applications. To support
various language understanding tasks, a foundation NLP model is often
fine-tuned in a federated, privacy-preserving setting (FL). This process
currently relies on at least hundreds of thousands of labeled training samples
from mobile clients; yet mobile users often lack willingness or knowledge to
label their data. Such an inadequacy of data labels is known as a few-shot
scenario; it becomes the key blocker for mobile NLP applications.
For the first time, this work investigates federated NLP in the few-shot
scenario (FedFSL). By retrofitting algorithmic advances of pseudo labeling and
prompt learning, we first establish a training pipeline that delivers
competitive accuracy when only 0.05% (fewer than 100) of the training samples
are labeled and the rest are unlabeled. To instantiate the workflow, we further
present a system FFNLP, addressing the high execution cost with novel designs.
(1) Curriculum pacing, which injects pseudo labels into the training workflow at
a rate commensurate with the learning progress; (2) Representational diversity, a
mechanism for selecting the most learnable data, only for which pseudo labels
will be generated; (3) Co-planning of a model's training depth and layer
capacity. Together, these designs reduce the training delay, client energy, and
network traffic by up to 46.0x, 41.2x, and 3000.0x,
respectively. Through algorithm/system co-design, FFNLP demonstrates that FL
can apply to challenging settings where most training samples are unlabeled.
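Curriculum pacing can be sketched in a few lines. The schedule below is an illustrative assumption (the paper does not specify its exact pacing function): the confidence threshold for admitting pseudo labels starts strict and relaxes as training progresses, so pseudo labels enter the workflow at a rate tied to learning progress:

```python
# Hedged sketch of curriculum-paced pseudo labeling: the admission threshold
# relaxes linearly from `base` to `floor` as training advances, so more pseudo
# labels are injected later, when the model is more trustworthy.

def select_pseudo_labels(predictions, step, total_steps, base=0.95, floor=0.70):
    """predictions: list of (example, label, confidence) from the current model.
    Returns (example, label) pairs whose confidence clears the paced threshold."""
    threshold = base - (base - floor) * (step / total_steps)
    return [(x, label) for x, label, conf in predictions if conf >= threshold]

preds = [("a", 0, 0.99), ("b", 1, 0.80), ("c", 0, 0.60)]
early = select_pseudo_labels(preds, step=0, total_steps=10)   # strict: 0.95
late = select_pseudo_labels(preds, step=10, total_steps=10)   # relaxed: 0.70
```

Early in training only the single high-confidence prediction is admitted; by the end, two of the three qualify.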
Collaborative Inference in DNN-based Satellite Systems with Dynamic Task Streams
As a driving force in the advancement of intelligent in-orbit applications,
DNN models have been gradually integrated into satellites, producing daily
latency-constrained and computation-intensive tasks. However, the substantial
computation demands of DNN models, coupled with the instability of the
satellite-ground link, pose significant challenges, hindering timely completion
of tasks. It becomes necessary to adapt to task stream changes when dealing
with tasks requiring latency guarantees, such as dynamic observation tasks on
the satellites. To this end, we consider a system model for a collaborative
inference system with latency constraints, leveraging multi-exit and model
partition techniques. Building on this model, we propose an algorithm
tailored to balance task completion against satisfactory task accuracy by
dynamically choosing early-exit and partition points. Simulation evaluations
show that our proposed algorithm significantly outperforms baseline algorithms
across task streams with strict latency constraints.
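The core decision the abstract describes, picking an (early-exit, partition) configuration that meets a latency deadline while losing as little accuracy as possible, can be sketched as a simple feasibility-then-maximize rule. The accuracy/latency numbers below are made-up illustrative values, not measurements, and the paper's actual algorithm is more sophisticated:

```python
# Hedged sketch: each (early-exit, partition) pair yields an (accuracy, latency)
# profile; pick the most accurate feasible one, or the fastest if none fits.

def choose_config(configs, deadline):
    """configs: list of (accuracy, latency_ms), one per exit/partition choice."""
    feasible = [c for c in configs if c[1] <= deadline]
    if feasible:
        return max(feasible, key=lambda c: c[0])  # best accuracy within deadline
    return min(configs, key=lambda c: c[1])       # fallback: fastest config

configs = [(0.92, 120), (0.88, 80), (0.81, 45)]   # illustrative profiles
tight = choose_config(configs, deadline=90)       # picks (0.88, 80)
desperate = choose_config(configs, deadline=30)   # no fit: picks (0.81, 45)
```

A dynamic task stream would re-run this choice as link quality and queue pressure change the latency estimates.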
Towards Practical Few-shot Federated NLP
Transformer-based pre-trained models have emerged as the predominant solution
for natural language processing (NLP). Fine-tuning such pre-trained models for
downstream tasks often requires a considerable amount of labeled private data.
In practice, private data is often distributed across heterogeneous mobile
devices and may be prohibited from being uploaded. Moreover, well-curated
labeled data is often scarce, presenting an additional challenge. To address
these challenges, we first introduce a data generator for federated few-shot
learning tasks, which encompasses the quantity and skewness of scarce labeled
data in a realistic setting. Subsequently, we propose AUG-FedPrompt, a
prompt-based federated learning system that exploits abundant unlabeled data
for data augmentation. Our experiments indicate that AUG-FedPrompt can perform
on par with full-set fine-tuning with a limited amount of labeled data.
However, such competitive performance comes at a significant system cost.
Comment: EuroSys '23 workshop
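The augmentation loop at the heart of a system like AUG-FedPrompt can be sketched as follows. The cloze template, the confidence cutoff, and the stub predictor are all assumptions for illustration; in the real system the predictor would be a prompt-tuned pre-trained language model running on each client:

```python
# Hedged sketch of prompt-based augmentation: unlabeled sentences are wrapped
# in a cloze-style prompt, and confident completions become pseudo-labeled
# training examples that augment the scarce labeled set.

TEMPLATE = "{sentence} It was [MASK]."  # illustrative sentiment prompt

def augment(unlabeled, predict, min_conf=0.9):
    """`predict` maps a prompted sentence to (label, confidence);
    keep only confident predictions as new training data."""
    out = []
    for s in unlabeled:
        label, conf = predict(TEMPLATE.format(sentence=s))
        if conf >= min_conf:
            out.append((s, label))
    return out

# Stub standing in for a prompt-tuned masked language model:
stub = lambda prompt: ("positive", 0.95) if "great" in prompt else ("negative", 0.5)
augmented = augment(["great movie", "meh"], stub)
```

Only the confidently classified sentence survives augmentation; the uncertain one is left unlabeled rather than risk polluting training.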
TFormer: A Transmission-Friendly ViT Model for IoT Devices
Deploying high-performance vision transformer (ViT) models on ubiquitous
Internet of Things (IoT) devices to provide high-quality vision services will
revolutionize the way we live, work, and interact with the world. Due to the
contradiction between the limited resources of IoT devices and
resource-intensive ViT models, the use of cloud servers to assist ViT model
training has become mainstream. However, due to the larger number of parameters
and floating-point operations (FLOPs) of the existing ViT models, the model
parameters transmitted by cloud servers are large and difficult to run on
resource-constrained IoT devices. To this end, this paper proposes a
transmission-friendly ViT model, TFormer, for deployment on
resource-constrained IoT devices with the assistance of a cloud server. The
high performance and small number of model parameters and FLOPs of TFormer are
attributed to the proposed hybrid layer and the proposed partially connected
feed-forward network (PCS-FFN). The hybrid layer consists of nonlearnable
modules and a pointwise convolution, which can obtain multitype and multiscale
features with only a few parameters and FLOPs to improve the TFormer
performance. The PCS-FFN adopts group convolution to reduce the number of
parameters. The key idea of this paper is to propose TFormer with few model
parameters and FLOPs to facilitate applications running on resource-constrained
IoT devices to benefit from the high performance of the ViT models.
Experimental results on the ImageNet-1K, MS COCO, and ADE20K datasets for image
classification, object detection, and semantic segmentation tasks demonstrate
that the proposed model outperforms other state-of-the-art models.
Specifically, TFormer-S achieves 5% higher accuracy on ImageNet-1K than
ResNet18 with 1.4x fewer parameters and FLOPs.
Comment: IEEE Transactions on Parallel and Distributed Systems
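The parameter savings of PCS-FFN's group convolution follow from simple counting: a pointwise (1x1) convolution with C_in input and C_out output channels has C_in*C_out weights, while splitting into g groups connects only C_in/g inputs to C_out/g outputs per group, cutting the total by a factor of g. A back-of-envelope check (channel sizes here are illustrative, not TFormer's actual configuration):

```python
# Hedged sketch of why group convolution shrinks a pointwise conv's weights.

def conv1x1_params(c_in, c_out, groups=1):
    """Weight count of a 1x1 convolution, ignoring bias terms.
    Each of the `groups` groups has (c_in/g) * (c_out/g) weights."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * (c_out // groups) * groups

dense = conv1x1_params(256, 1024)              # 262144 weights
grouped = conv1x1_params(256, 1024, groups=4)  # 65536 weights: a 4x reduction
```

The trade-off is that channels in different groups no longer mix within the layer, which is why such designs typically pair grouped layers with channel-mixing operations elsewhere.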
Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services
Deploying high-performance convolutional neural network (CNN) models on
low-earth orbit (LEO) satellites for rapid remote sensing image processing has
attracted significant interest from industry and academia. However, the limited
resources available on LEO satellites contrast with the demands of
resource-intensive CNN models, necessitating the adoption of ground-station
server assistance for training and updating these models. Existing approaches
often require large floating-point operations (FLOPs) and substantial model
parameter transmissions, presenting considerable challenges. To address these
issues, this paper introduces a ground-station server-assisted framework. With
the proposed framework, each layer of the CNN model contains only one learnable
feature map (called the seed feature map) from which other feature maps are
generated based on specific rules. The hyperparameters of these rules are
randomly generated instead of being trained, thus enabling the generation of
multiple feature maps from the seed feature map and significantly reducing
FLOPs. Furthermore, since the random hyperparameters can be saved using a few
random seeds, the ground station server assistance can be facilitated in
updating the CNN model deployed on the LEO satellite. Experimental results on
the ISPRS Vaihingen, ISPRS Potsdam, UAVid, and LoveDA datasets for semantic
segmentation services demonstrate that the proposed framework outperforms
existing state-of-the-art approaches. In particular, the SineFM-based model
achieves a higher mIoU than the UNetFormer on the UAVid dataset, with 3.3x
fewer parameters and 2.2x fewer FLOPs.
Comment: 11 pages
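The key property the abstract relies on, that feature maps derived from random hyperparameters are reproducible from a few saved seeds, is easy to illustrate. The sine-based generation rule below is an assumption inspired by the SineFM name, not the paper's exact rule:

```python
# Hedged sketch of the seed-feature-map idea: one learnable map per layer,
# with the remaining maps generated by cheap transforms whose random
# hyperparameters are fully determined by a saved RNG seed.
import math
import random

def generate_maps(seed_map, n_maps, rng_seed):
    """Derive n_maps feature maps from one seed map via sine transforms
    with randomly drawn (but seed-reproducible) frequency and phase."""
    rng = random.Random(rng_seed)
    maps = [seed_map]
    for _ in range(n_maps - 1):
        freq, phase = rng.uniform(0.5, 2.0), rng.uniform(0.0, math.pi)
        maps.append([math.sin(freq * v + phase) for v in seed_map])
    return maps

a = generate_maps([0.1, 0.2, 0.3], n_maps=4, rng_seed=42)
b = generate_maps([0.1, 0.2, 0.3], n_maps=4, rng_seed=42)
# Identical seeds reproduce identical maps, so the ground station only needs
# to uplink the learnable seed map plus a handful of integer seeds.
```

This is what makes ground-station-assisted model updates cheap: the bulk of each layer is regenerated onboard rather than transmitted.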
Accelerating Vertical Federated Learning
Privacy, security, and data governance constraints rule out brute-force
integration of cross-silo data, which has proliferated with the development
of the Internet of Things. Federated learning is proposed to ensure that all
parties can collaboratively complete the training task while keeping the data
local. Vertical federated learning is a specialization of federated
learning for distributed features. To preserve privacy, homomorphic encryption
is applied to enable encrypted operations without decryption. Nevertheless,
together with a robust security guarantee, homomorphic encryption brings extra
communication and computation overhead. In this paper, we analyze the current
bottlenecks of vertical federated learning under homomorphic encryption
comprehensively and numerically. We propose a straggler-resilient and
computation-efficient accelerating system that reduces the communication
overhead in heterogeneous scenarios by 65.26% at most and reduces the
computation overhead caused by homomorphic encryption by 40.66% at most. Our
system can improve the robustness and efficiency of the current vertical
federated learning framework without loss of security.
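The overhead the abstract targets comes from operating on ciphertexts of an additively homomorphic scheme such as Paillier, where multiplying two ciphertexts adds the underlying plaintexts, so an aggregator can sum parties' encrypted intermediate results without ever decrypting them. A toy Paillier instance with tiny hardcoded primes (real deployments use keys of 1024 bits or more, which is exactly where the computation and communication costs arise):

```python
# Hedged toy Paillier sketch: additively homomorphic encryption, as used to
# aggregate encrypted intermediate results in vertical federated learning.
# The primes are deliberately tiny and insecure, for illustration only.
import math
import random

p, q = 17, 19
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # precomputed decryption constant

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Homomorphic addition: multiplying ciphertexts adds the plaintexts.
c_sum = (encrypt(5) * encrypt(7)) % n2
```

Each `pow` above is a modular exponentiation over n^2; at realistic key sizes these dominate training time, which is why batching and straggler-resilient scheduling of such operations yield the overhead reductions the paper reports.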