End-to-End Delay Minimization based on Joint Optimization of DNN Partitioning and Resource Allocation for Cooperative Edge Inference
Cooperative inference in Mobile Edge Computing (MEC), achieved by deploying
partitioned Deep Neural Network (DNN) models between resource-constrained user
equipments (UEs) and edge servers (ESs), has emerged as a promising paradigm.
Firstly, we consider scenarios of continuous Artificial Intelligence (AI) task
arrivals, like the object detection for video streams, and utilize a serial
queuing model for the accurate evaluation of End-to-End (E2E) delay in
cooperative edge inference. Secondly, to enhance the long-term performance of
inference systems, we formulate a multi-slot stochastic E2E delay optimization
problem that jointly considers model partitioning and multi-dimensional
resource allocation. Finally, to solve this problem, we introduce a
Lyapunov-guided Multi-Dimensional Optimization algorithm (LyMDO) that decouples
the original problem into per-slot deterministic problems, where Deep
Reinforcement Learning (DRL) and convex optimization are used for joint
optimization of partitioning decisions and complementary resource allocation.
Simulation results show that our approach effectively improves E2E delay while
balancing long-term resource constraints.
Comment: 7 pages, 9 figures, 1 table, 1 algorithm, to be published in IEEE 98th Vehicular Technology Conference (VTC2023-Fall).
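For intuition, the following is a minimal, hypothetical sketch (Python/NumPy) of the Lyapunov drift-plus-penalty idea that LyMDO builds on: a virtual queue tracks accumulated violations of a long-term resource budget, and each slot's decision minimizes delay plus a queue-weighted resource term. The per-slot candidate set, the trade-off parameter V, and the variable names are illustrative assumptions, not the authors' implementation (which uses DRL plus convex optimization for the per-slot problem).

```python
import numpy as np

# Hypothetical illustration of Lyapunov drift-plus-penalty decoupling:
# a long-term average resource constraint E[r_t] <= r_budget is handled by a
# virtual queue Q_t; each slot solves a deterministic subproblem trading off
# E2E delay against the queue backlog (V is an assumed tunable weight).

V = 10.0            # delay-vs-constraint trade-off parameter (assumed)
r_budget = 1.0      # per-slot average resource budget (assumed units)
Q = 0.0             # virtual queue backlog

rng = np.random.default_rng(0)

def per_slot_decision(Q, candidates):
    """Pick the (delay, resource) pair minimizing V*delay + Q*resource.

    In LyMDO this per-slot problem is solved with DRL (partition point)
    plus convex optimization (resource allocation); here we simply
    enumerate a few candidate decisions for illustration.
    """
    costs = [V * d + Q * r for (d, r) in candidates]
    return candidates[int(np.argmin(costs))]

for t in range(5):
    # Candidate (E2E delay, resource usage) pairs for this slot (random stand-ins).
    candidates = [(rng.uniform(0.5, 2.0), rng.uniform(0.5, 1.5)) for _ in range(4)]
    delay, resource = per_slot_decision(Q, candidates)
    # Virtual queue update: accumulates resource over-use, drains under-use.
    Q = max(Q + resource - r_budget, 0.0)
    print(f"slot {t}: delay={delay:.2f}, resource={resource:.2f}, Q={Q:.2f}")
```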
Task-Oriented Over-the-Air Computation for Multi-Device Edge AI
Departing from the classic paradigm of data-centric designs, 6G networks
supporting edge AI feature task-oriented techniques that focus on the
effective and efficient execution of AI tasks. Targeting end-to-end system
performance, such techniques are sophisticated as they aim to seamlessly
integrate sensing (data acquisition), communication (data transmission), and
computation (data processing). Aligned with the paradigm shift, a task-oriented
over-the-air computation (AirComp) scheme is proposed in this paper for a
multi-device split-inference system. In the considered system, local feature
vectors, which are extracted from the real-time noisy sensory data on devices,
are aggregated over-the-air by exploiting the waveform superposition in a
multiuser channel. Then the aggregated features as received at a server are fed
into an inference model with the result used for decision making or control of
actuators. To design inference-oriented AirComp, the transmit precoders at the
edge devices and the receive beamforming at the edge server are jointly optimized
to rein in the aggregation error and maximize the inference accuracy. The problem is made
tractable by measuring the inference accuracy using a surrogate metric called
discriminant gain, which measures the discernibility of two object classes in
the application of object/event classification. It is found that the
conventional AirComp beamforming design, which minimizes the mean square error
of the aggregation with respect to the noiseless case, may not lead to optimal
classification accuracy. The reason is that it overlooks the fact that feature
dimensions have different sensitivities to aggregation errors and are thus of
different importance for classification. This issue is addressed in this work
via a new task-oriented AirComp scheme designed by directly maximizing the
derived discriminant gain.
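As a rough illustration, the sketch below (Python/NumPy) computes a pairwise discriminant gain as the Mahalanobis distance between two class centroids under an assumed shared Gaussian feature covariance; the exact surrogate definition and the aggregation-error model in the paper may differ, so treat this only as intuition for why error-blind MSE designs can be suboptimal.

```python
import numpy as np

# Hypothetical sketch: pairwise "discriminant gain" for two classes, modeled
# here as the squared Mahalanobis distance between class centroids under a
# shared feature covariance. Larger gain <=> the classes are easier to tell
# apart after noisy over-the-air aggregation (assumed model, not the paper's).

def discriminant_gain(mu_a, mu_b, cov):
    """Squared Mahalanobis distance between two class centroids."""
    diff = mu_a - mu_b
    return float(diff @ np.linalg.solve(cov, diff))

rng = np.random.default_rng(1)
dim = 4
mu_cat, mu_dog = rng.normal(size=dim), rng.normal(size=dim)

# Aggregation error inflates the per-dimension variance; dimensions with small
# class separation relative to their noise contribute little to the gain,
# which is why an error-blind MSE design can hurt classification accuracy.
clean_cov = np.diag(np.full(dim, 0.5))
noisy_cov = clean_cov + np.diag(rng.uniform(0.1, 1.0, size=dim))  # assumed aggregation noise

print("gain (clean):", discriminant_gain(mu_cat, mu_dog, clean_cov))
print("gain (noisy):", discriminant_gain(mu_cat, mu_dog, noisy_cov))
```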
Integrated Sensing-Communication-Computation for Edge Artificial Intelligence
Edge artificial intelligence (AI) has been a promising solution towards 6G to
empower a series of advanced techniques such as digital twin, holographic
projection, semantic communications, and auto-driving, for achieving
intelligence of everything. The performance of edge AI tasks, including edge
learning and edge AI inference, depends on the quality of three highly coupled
processes, i.e., sensing for data acquisition, computation for information
extraction, and communication for information transmission. However, these
three modules need to compete for network resources for enhancing their own
quality-of-services. To this end, integrated sensing-communication-computation
(ISCC) is of paramount significance for improving resource utilization as well
as achieving the customized goals of edge AI tasks. By investigating the
interplay among the three modules, this article presents various kinds of ISCC
schemes for federated edge learning tasks and edge AI inference tasks in both
application and physical layers.
Integrated Sensing-Communication-Computation for Over-the-Air Edge AI Inference
Edge-device co-inference refers to deploying well-trained artificial
intelligence (AI) models at the network edge through the cooperation of devices
and edge servers to provide ambient intelligence services. To enhance the
utilization of limited network resources in edge-device co-inference tasks from
a systematic view, we propose a task-oriented scheme of integrated sensing,
computation and communication (ISCC) in this work. In this system, all devices
sense a target from the same wide view to obtain homogeneous noise-corrupted
sensory data, from which the local feature vectors are extracted. All local
feature vectors are aggregated at the server using over-the-air computation
(AirComp) in a broadband channel with the
orthogonal-frequency-division-multiplexing technique for suppressing the
sensing and channel noise. The aggregated denoised global feature vector is
further input to a server-side AI model for completing the downstream inference
task. A novel task-oriented design criterion, called maximum minimum pair-wise
discriminant gain, is adopted for classification tasks. It extends the distance
of the closest class pair in the feature space, leading to a balanced and
enhanced inference accuracy. Under this criterion, a problem of joint sensing
power assignment, transmit precoding and receive beamforming is formulated. The
challenge lies in three aspects: the coupling between sensing and AirComp, the
joint optimization of all feature dimensions' AirComp aggregation over a
broadband channel, and the complicated form of the maximum minimum pair-wise
discriminant gain. To solve this problem, a task-oriented ISCC scheme with
AirComp is proposed. Experiments based on a human motion recognition task are
conducted to verify the advantages of the proposed scheme over the existing
scheme and a baseline.
Comment: This work was accepted by IEEE Transactions on Wireless Communications on Aug. 12, 202
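To make the AirComp aggregation step concrete, here is a hypothetical NumPy sketch of analog over-the-air feature averaging on a single subcarrier: each device precodes its noisy local feature so the signals superpose coherently at the server, and a receive scaling recovers the average. The channel model, channel-inverting precoders, and omission of power constraints are illustrative assumptions rather than the paper's joint sensing/precoding/beamforming design; OFDM would repeat this per subcarrier.

```python
import numpy as np

# Hypothetical single-subcarrier AirComp sketch: K devices transmit noisy local
# features simultaneously; channel-inverting precoders make the signals add up
# to (approximately) the sum of features at the server, which also averages out
# part of the per-device sensing noise.

rng = np.random.default_rng(2)
K = 8                      # number of sensing devices (assumed)
true_feature = 1.5         # ground-truth feature value observed by all devices

# Local features corrupted by independent sensing noise.
local_feats = true_feature + 0.3 * rng.normal(size=K)

# Rayleigh fading channels with channel-inversion precoding (transmit power
# limits ignored here for simplicity).
h = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)
precoders = 1.0 / h

# Superposition at the server plus receiver noise, then scale by 1/K to average.
rx = np.sum(h * precoders * local_feats) + 0.05 * (rng.normal() + 1j * rng.normal())
aggregated = np.real(rx) / K

print("per-device noise std ~0.3, aggregated estimate:", round(aggregated, 3),
      "vs ground truth:", true_feature)
```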
Joint parameter-and-bandwidth allocation for improving the efficiency of partitioned edge learning
Abstract
To leverage the data and computation capabilities of mobile devices, machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models, resulting in the new paradigm of edge learning. In this paper, we consider the framework of partitioned edge learning for iteratively training a large-scale model using many resource-constrained devices (called workers). To this end, in each iteration, the model is dynamically partitioned into parametric blocks, which are downloaded to worker groups for updating using data subsets. Then, the local updates are uploaded to and cascaded by the server for updating a global model. To reduce resource usage by minimizing the total learning-and-communication latency, this work focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation (for downloading and uploading). Two design approaches are adopted. First, a practical sequential approach, called partially integrated parameter-and-bandwidth allocation (PABA), yields two schemes, namely bandwidth-aware parameter allocation and parameter-aware bandwidth allocation. The former minimizes the load for the slowest (in computing) of the worker groups, each training the same parametric block. The latter allocates the largest bandwidth to the worker that is the latency bottleneck. Second, parameter and bandwidth allocation are jointly optimized. Although the resulting problem is nonconvex, an efficient and optimal solution algorithm is derived by intelligently nesting a bisection search and solving a convex problem. Experimental results using real data demonstrate that integrating PABA can substantially improve the performance of partitioned edge learning in terms of latency (e.g., by 46%) and accuracy (e.g., by 4% given a latency of 100 seconds).
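As a rough illustration of the intuition behind bandwidth-aware parameter allocation, the sketch below (Python/NumPy) splits a model across workers so that all workers finish their block updates at the same time, which minimizes the latency of the slowest worker for a fixed total parameter count. The per-parameter latency model, the numbers, and the single-group setting are assumptions for illustration, not the paper's PABA formulation.

```python
import numpy as np

# Hypothetical sketch: given each worker's effective per-parameter latency
# (compute time plus communication time under its assigned bandwidth), split a
# model of P parameters so that all workers finish simultaneously. With linear
# latencies, equal finish times minimize the maximum (slowest-worker) latency.

P = 1_000_000                                         # parameters to update this round
compute_s_per_param = np.array([2e-6, 4e-6, 8e-6])    # per-worker compute time per parameter
comm_s_per_param = np.array([1e-6, 1e-6, 2e-6])       # per-worker up/downlink time per parameter

per_param_latency = compute_s_per_param + comm_s_per_param

# Equal-finish-time allocation: block size inversely proportional to per-parameter latency.
alloc = P * (1.0 / per_param_latency) / np.sum(1.0 / per_param_latency)
finish_times = alloc * per_param_latency

print("allocated parameters per worker:", np.round(alloc).astype(int))
print("finish times (s):", np.round(finish_times, 3))   # all (approximately) equal

# Compare with a naive uniform split, whose latency is set by the slowest worker.
naive_latency = np.max((P / len(per_param_latency)) * per_param_latency)
print("uniform-split latency (s):", round(naive_latency, 3))
```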