AirNet: Neural Network Transmission over the Air
State-of-the-art performance for many emerging edge applications is achieved by deep neural networks (DNNs). Often, the employed DNNs are location- and time-dependent, and the parameters of a specific DNN must be delivered from an edge server to the edge device rapidly and efficiently to carry out time-sensitive inference tasks. This can be considered a joint source-channel coding (JSCC) problem, in which the goal is not to recover the DNN coefficients with minimal distortion, but in a manner that provides the highest accuracy in the downstream task. For this purpose, we introduce AirNet, a novel training and analog transmission method to deliver DNNs over the air. We first train the DNN with noise injection to counter the wireless channel noise. We also employ pruning to identify the most significant DNN parameters that can be delivered within the available channel bandwidth, knowledge distillation, and nonlinear bandwidth expansion to provide better error protection for the most important network parameters. We show that AirNet achieves significantly higher test accuracy than the separation-based alternative and exhibits graceful degradation with channel quality.
AirNet: Neural Network Transmission over the Air
State-of-the-art performance for many emerging edge applications is achieved
by deep neural networks (DNNs). Often, these DNNs are location- and time-sensitive,
and must be delivered from an edge server to the edge device rapidly
and efficiently to carry out time-sensitive inference tasks. In this paper, we
introduce AirNet, a novel training and transmission method that allows
efficient wireless delivery of DNNs under stringent transmit power and latency
constraints. We first train the DNN with noise injection to counter the channel
noise. We then employ pruning to reduce the network size to the available
channel bandwidth, and perform knowledge distillation from a large model to
improve the performance. We show that AirNet achieves significantly higher test
accuracy compared to digital alternatives under the same bandwidth and power
constraints. We further improve the performance of AirNet by pruning the
network below the available bandwidth, and using channel expansion to provide
better robustness against channel noise. We also benefit from unequal error
protection (UEP) by selectively expanding more important layers of the network.
Finally, we develop an ensemble training approach, which trains a whole
spectrum of DNNs, each of which can be used at a different channel condition,
resolving the impractical memory requirements.
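The two core steps of the AirNet pipeline described above, magnitude pruning down to the channel-bandwidth budget and noise injection that exposes the parameters to channel corruption during training, can be illustrated with a minimal NumPy sketch. All function names, the SNR value, and the toy weight vector are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def prune_to_bandwidth(params, budget):
    """Magnitude pruning: keep only the `budget` largest-magnitude
    parameters so the pruned network fits the channel bandwidth."""
    pruned = np.zeros_like(params)
    keep = np.argsort(np.abs(params))[-budget:]
    pruned[keep] = params[keep]
    return pruned

def transmit_over_awgn(params, snr_db, rng):
    """Simulate analog transmission of parameters over an AWGN channel;
    during training, injecting this noise makes the DNN channel-robust."""
    signal_power = np.mean(params ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=params.shape)
    return params + noise

# Toy weight vector: pruning keeps the three most significant coefficients,
# and the receiver sees a noisy analog copy of them.
w = np.array([0.01, -2.0, 0.03, 1.5, -0.02, 0.8])
w_pruned = prune_to_bandwidth(w, budget=3)
w_received = transmit_over_awgn(w_pruned, snr_db=20, rng=rng)
```

In a real training loop the noise injection would be applied to the model parameters on every forward pass, so gradient descent learns weights whose inference accuracy degrades gracefully with channel quality.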
Joint Device-Edge Digital Semantic Communication with Adaptive Network Split and Learned Non-Linear Quantization
Semantic communication, an intelligent communication paradigm that aims to
transmit useful information in the semantic domain, is facilitated by deep
learning techniques. Although robust semantic features can be learned and
transmitted in an analog fashion, analog transmission poses new challenges to
hardware, protocols, and encryption. In this paper, we propose a digital semantic
communication system, which consists of an encoding network deployed on a
resource-limited device and a decoding network deployed at the edge. To acquire
better semantic representations for digital transmission, a novel non-linear
quantization module with trainable quantization levels is proposed to
efficiently quantize semantic features. Additionally, structured pruning by a
sparse scaling vector is incorporated to reduce the dimension of the
transmitted features. We also introduce a semantic learning loss (SLL) function
to reduce semantic error. To adapt to various channel conditions and inputs
under constraints of communication and computing resources, a policy network is
designed to adaptively choose the split point and the dimension of the
transmitted semantic features. Experiments using the CIFAR-10 dataset for image
classification are employed to evaluate the proposed digital semantic
communication network, and ablation studies are conducted to assess the
proposed modules, including the quantization module, structured pruning, and the SLL.
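The non-linear quantization idea above, mapping each semantic feature to the nearest of a small set of learned, non-uniformly spaced levels, can be sketched as follows. This is a minimal NumPy illustration; the level values here are hand-picked stand-ins for levels that the paper's module would learn jointly with the network.

```python
import numpy as np

def quantize(features, levels):
    """Map each feature to its nearest quantization level.
    Levels need not be uniformly spaced (non-linear quantization)."""
    levels = np.sort(np.asarray(levels))
    idx = np.argmin(np.abs(features[:, None] - levels[None, :]), axis=1)
    return levels[idx], idx

# Hypothetical learned levels (non-uniform spacing) and a toy feature vector.
levels = np.array([-1.2, -0.3, 0.0, 0.5, 1.4])
feats = np.array([0.45, -1.0, 0.1, 1.0])
q, idx = quantize(feats, levels)
```

Only the integer indices `idx` need to be transmitted digitally; the receiver holds the same level table and reconstructs `q`. Making the levels trainable lets them concentrate where the feature distribution has mass, reducing quantization error for a fixed bit budget.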
In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks
The sixth-generation (6G) mobile networks are expected to feature the
ubiquitous deployment of machine learning and AI algorithms at the network
edge. With rapid advancements in edge AI, the time has come to realize
intelligence downloading onto edge devices (e.g., smartphones and sensors). To
materialize this vision, we propose a novel technology in this article, called
in-situ model downloading, that aims to achieve transparent and real-time
replacement of on-device AI models by downloading from an AI library in the
network. Its distinctive feature is the adaptation of downloading to
time-varying situations (e.g., application, location, and time), devices'
heterogeneous storage-and-computing capacities, and channel states. A key
component of the presented framework is a set of techniques that dynamically
compress a downloaded model at the depth-level, parameter-level, or bit-level
to support adaptive model downloading. We further propose a virtualized 6G
network architecture customized for deploying in-situ model downloading with
the key feature of a three-tier (edge, local, and central) AI library.
Furthermore, experiments are conducted to quantify 6G connectivity requirements
and research opportunities pertaining to the proposed technology are discussed.
Comment: The paper has been submitted to IEEE for possible publication.
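Of the three dynamic compression knobs named above (depth-level, parameter-level, bit-level), the bit-level one is the simplest to illustrate: the library re-quantizes a model's weights to fewer bits before downloading, trading reconstruction fidelity for download size. The sketch below is a generic uniform-quantization illustration under assumed names, not the article's specific scheme.

```python
import numpy as np

def bit_level_compress(weights, n_bits):
    """Uniformly quantize weights to n_bits per value; fewer bits means a
    smaller download at the cost of higher reconstruction error."""
    lo, hi = weights.min(), weights.max()
    scale = (hi - lo) / (2 ** n_bits - 1)
    codes = np.round((weights - lo) / scale).astype(np.uint32)
    reconstructed = codes * scale + lo
    return reconstructed, codes

# A toy weight vector compressed at two different bit depths.
w = np.linspace(-1.0, 1.0, 9)
w8, _ = bit_level_compress(w, n_bits=8)   # near-lossless
w2, _ = bit_level_compress(w, n_bits=2)   # at most 4 distinct values
```

A downloading controller would pick `n_bits` (and analogous depth- and parameter-level settings) per request, based on the device's storage-and-computing capacity and the current channel state.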
Communication-Oriented Model Fine-Tuning for Packet-Loss Resilient Distributed Inference Under Highly Lossy IoT Networks
The distributed inference (DI) framework has gained traction as a technique for real-time applications empowered by cutting-edge deep machine learning (ML) on resource-constrained Internet of Things (IoT) devices. In DI, computational tasks are offloaded from the IoT device to the edge server via lossy IoT networks. However, there is generally a communication system-level trade-off between latency and reliability; thus, to provide accurate DI results, a reliable but high-latency communication system must be adopted, which results in non-negligible end-to-end DI latency. This motivated us to improve the trade-off between communication latency and accuracy through ML techniques. Specifically, we propose communication-oriented model tuning (COMtune), which aims to achieve highly accurate DI over low-latency but unreliable communication links. The key idea of COMtune is to fine-tune the ML model by emulating the effect of unreliable communication links through the application of the dropout technique. This makes the DI system robust against unreliable communication links. Our ML experiments reveal that COMtune enables accurate predictions with low latency over lossy networks.
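The key idea of COMtune, emulating packet loss with dropout during fine-tuning, can be sketched in a few lines of NumPy. Here a random mask zeroes out "lost" feature elements and the survivors are rescaled, as in inverted dropout; the function name and loss probability are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def emulate_packet_loss(features, loss_prob, rng):
    """Dropout-style mask emulating packets lost on a lossy IoT link.
    Surviving features are rescaled by 1/(1 - loss_prob) so their
    expected magnitude is preserved (inverted dropout)."""
    mask = rng.random(features.shape) >= loss_prob
    return features * mask / (1.0 - loss_prob)

# Applying the emulated loss to a toy feature tensor: roughly 30% of
# entries are zeroed, the rest are scaled up to compensate.
x = np.ones(10000)
y = emulate_packet_loss(x, loss_prob=0.3, rng=rng)
```

During fine-tuning this corruption is applied to the intermediate features at the device/server split point on every forward pass, so the server-side model learns to predict accurately even when a fraction of the offloaded features never arrives.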
Task-Oriented Over-the-Air Computation for Multi-Device Edge AI
Departing from the classic paradigm of data-centric designs, 6G networks
supporting edge AI feature task-oriented techniques that focus on the
effective and efficient execution of AI tasks. Targeting end-to-end system
performance, such techniques are sophisticated as they aim to seamlessly
integrate sensing (data acquisition), communication (data transmission), and
computation (data processing). Aligned with the paradigm shift, a task-oriented
over-the-air computation (AirComp) scheme is proposed in this paper for
multi-device split-inference system. In the considered system, local feature
vectors, which are extracted from the real-time noisy sensory data on devices,
are aggregated over-the-air by exploiting the waveform superposition in a
multiuser channel. Then the aggregated features as received at a server are fed
into an inference model with the result used for decision making or control of
actuators. To design inference-oriented AirComp, the transmit precoders at edge
devices and receive beamforming at edge server are jointly optimized to rein in
the aggregation error and maximize the inference accuracy. The problem is made
tractable by measuring the inference accuracy using a surrogate metric called
discriminant gain, which measures the discernibility of two object classes in
the application of object/event classification. It is discovered that the
conventional AirComp beamforming design, which minimizes the mean square error of
generic AirComp with respect to the noiseless case, may not lead to the optimal
classification accuracy. The reason is that this design overlooks the fact that
feature dimensions have different sensitivities to aggregation errors and
are thus of different importance for classification. This issue is
addressed in this work via a new task-oriented AirComp scheme designed by
directly maximizing the derived discriminant gain.
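The waveform-superposition property that AirComp exploits can be illustrated with a minimal NumPy sketch: when devices transmit their analog feature vectors simultaneously, the multiple-access channel itself computes their sum, and the server receives that sum plus noise. The precoding/beamforming optimization from the paper is omitted; the names and values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def aircomp_aggregate(local_features, noise_std, rng):
    """Over-the-air computation: simultaneous analog transmissions
    superpose in the multiuser channel, so the server receives the
    element-wise sum of all local feature vectors plus channel noise."""
    superposed = np.sum(local_features, axis=0)
    return superposed + rng.normal(0.0, noise_std, size=superposed.shape)

# Three devices each contribute a 4-dimensional local feature vector;
# with noise_std = 0 the received vector is exactly their sum.
feats = np.array([[1.0, 0.0, 2.0, 1.0],
                  [0.5, 1.0, 0.0, 1.0],
                  [0.5, 1.0, 1.0, 0.0]])
agg = aircomp_aggregate(feats, noise_std=0.0, rng=rng)
```

The task-oriented design in the paper then chooses transmit precoders and receive beamformers so that aggregation error is suppressed most on the feature dimensions with the largest discriminant gain, rather than uniformly as MSE minimization would.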
Progressive feature transmission for split classification at the wireless edge
We consider the scenario of inference at the wireless edge, in which devices are connected to an edge server and ask the server to carry out remote classification, that is, to classify data samples available at the edge devices. This requires the edge devices to upload high-dimensional features of samples over resource-constrained wireless channels, which creates a communication bottleneck. The conventional feature pruning solution would require the device to have access to the inference model, which is not available in the current split inference scenario. To address this issue, we propose the progressive feature transmission (ProgressFTX) protocol, which minimizes the overhead by progressively transmitting features until a target confidence level is reached. A control policy is proposed to accelerate inference, comprising two key operations: importance-aware feature selection at the server and transmission-termination control. For the former, it is shown that selecting the most important features, characterized by the largest discriminant gains of the corresponding feature dimensions, achieves sub-optimal performance. For the latter, the proposed policy is shown to exhibit a threshold structure. Specifically, the transmission is stopped when the incremental uncertainty reduction from further feature transmission is outweighed by its communication cost. The indices of the selected features and the transmission decision are fed back to the device in each slot. The control policy is first derived for the tractable case of linear classification, and then extended to the more complex case of classification using a convolutional neural network. Both Gaussian and fading channels are considered. Experimental results are obtained for both a statistical data model and a real dataset. It is shown that ProgressFTX can substantially reduce the communication latency compared to conventional feature pruning and random feature transmission strategies.
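The two operations of the ProgressFTX control policy, importance-ordered feature selection and threshold-based termination, can be sketched together in a few lines. This is a simplified illustration: the importance scores stand in for discriminant gains, and the confidence trajectory is a hypothetical sequence rather than the uncertainty-versus-cost rule derived in the paper.

```python
import numpy as np

def progressive_transmit(importance, confidences, target):
    """Send feature dimensions in decreasing order of importance
    (a stand-in for discriminant gain); stop as soon as the server's
    classification confidence reaches the target (threshold structure)."""
    order = np.argsort(importance)[::-1]   # most important first
    sent = []
    for t, idx in enumerate(order):
        sent.append(int(idx))              # feature index fed back to device
        if confidences[t] >= target:       # termination threshold reached
            break
    return sent

# Four feature dimensions with hypothetical importance scores, and a
# hypothetical server-confidence value after each additional feature.
importance = np.array([0.1, 0.9, 0.4, 0.7])
conf_after_each = [0.55, 0.72, 0.91, 0.97]
sent = progressive_transmit(importance, conf_after_each, target=0.9)
```

Here the protocol stops after three of the four features, saving one transmission slot; under worse channels or a higher confidence target it would keep transmitting, which is exactly the latency/accuracy trade-off the threshold policy controls.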