3,089 research outputs found

    An energy-efficient off-loading scheme for low latency in collaborative edge computing

    Mobile terminals, such as smartphones and laptops, frequently run computation-demanding applications but have limited battery power. Edge computing is introduced to offload terminals' tasks so that quality-of-service requirements such as low delay and low energy consumption can be met. By taking over offloaded computation tasks, edge servers enable terminals to run highly demanding applications collaboratively within acceptable delay bounds. However, existing schemes barely consider the characteristics of the edge servers, which leads to random assignment of tasks among servers, so that tasks with high computational intensity (referred to as "big tasks") may be assigned to servers with low capability. In this paper, a task is divided into several subtasks, and the subtasks are offloaded according to characteristics of the edge servers, such as transmission distance and central processing unit (CPU) capacity. With this multi-subtasks-to-multi-servers model, a low-complexity adaptive offloading scheme based on the Hungarian algorithm is proposed. Extensive simulations show that the scheme reduces offloading latency while keeping energy consumption low.
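    As a hedged illustration of the matching step, the sketch below uses SciPy's linear_sum_assignment (the Hungarian algorithm) to map subtasks to edge servers under a toy delay cost that mixes a distance-dependent transmission term with a CPU-capacity-dependent compute term; the cost model, function names, and all numbers are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical multi-subtasks-to-multi-servers assignment sketch.
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_subtasks(cycles, data_bits, distance, cpu_hz, bandwidth_bps, alpha=0.5):
    """Return a subtask->server mapping minimizing a weighted delay cost."""
    # Transmission delay: data size over an effective rate that shrinks with
    # distance (purely illustrative channel model).
    rate = bandwidth_bps[None, :] / (1.0 + distance)        # (n_tasks, n_servers)
    t_tx = data_bits[:, None] / rate
    # Compute delay: required CPU cycles over server CPU capacity.
    t_cpu = cycles[:, None] / cpu_hz[None, :]
    cost = alpha * t_tx + (1 - alpha) * t_cpu
    rows, cols = linear_sum_assignment(cost)                # Hungarian algorithm
    return list(zip(rows, cols)), cost[rows, cols].sum()

# Toy usage: 3 subtasks assigned to 3 edge servers.
mapping, total_delay = assign_subtasks(
    cycles=np.array([2e9, 5e8, 1e9]),
    data_bits=np.array([4e6, 1e6, 2e6]),
    distance=np.array([[10., 50., 5.], [30., 10., 20.], [5., 40., 15.]]),
    cpu_hz=np.array([2e9, 3e9, 1e9]),
    bandwidth_bps=np.array([1e7, 2e7, 5e6]),
)
print(mapping, total_delay)
```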

    JALAD: Joint Accuracy- and Latency-Aware Deep Structure Decoupling for Edge-Cloud Execution

    Recent years have witnessed a rapid growth of deep-network-based services and applications. A practical and critical problem has thus emerged: how to effectively deploy deep neural network models so that they can be executed efficiently. Conventional cloud-based approaches usually run the deep models in data center servers, causing large latency because a significant amount of data has to be transferred from the network edge to the data center. In this paper, we propose JALAD, a joint accuracy- and latency-aware execution framework, which decouples a deep neural network so that one part runs at edge devices and the other part inside the conventional cloud, while only a minimal amount of data has to be transferred between them. Though the idea seems straightforward, we face several challenges: i) how to find the best partition of a deep structure; ii) how to deploy the component at an edge device that has only limited computation power; and iii) how to minimize the overall execution latency. Our answers to these questions are a set of strategies in JALAD, including 1) a normalization-based in-layer data compression strategy that jointly considers compression rate and model accuracy; 2) a latency-aware deep decoupling strategy to minimize the overall execution latency; and 3) an edge-cloud structure adaptation strategy that dynamically changes the decoupling under different network conditions. Experiments demonstrate that our solution can significantly reduce execution latency: it speeds up overall inference execution while keeping the model accuracy loss within a guaranteed bound. Comment: conference paper; copyright transferred to IEEE.
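    A minimal sketch of a latency-aware split-point search, assuming per-layer edge/cloud timing profiles, activation sizes, and a fixed compression ratio that are all made up for illustration; JALAD's actual decoupling additionally couples this choice with an accuracy constraint.

```python
# Illustrative edge-cloud split search: layers 0..k-1 run on the edge,
# layers k.. run in the cloud, and the last edge activation is uploaded.
def best_split(edge_ms, cloud_ms, out_bytes, input_bytes, uplink_bps,
               compress_ratio=8.0):
    """Return (split_index, total_latency_ms) minimizing end-to-end latency."""
    n = len(edge_ms)
    best_k, best_lat = 0, float("inf")
    for k in range(n + 1):
        # Data sent is the activation of the last edge layer (or the raw input).
        sent = out_bytes[k - 1] if k > 0 else input_bytes
        transfer = (sent / compress_ratio) * 8.0 / uplink_bps * 1000.0  # ms
        lat = sum(edge_ms[:k]) + transfer + sum(cloud_ms[k:])
        if lat < best_lat:
            best_k, best_lat = k, lat
    return best_k, best_lat

# Toy usage with made-up per-layer profiles.
k, lat = best_split(
    edge_ms=[5, 8, 12, 20, 30],
    cloud_ms=[1, 1, 2, 3, 4],
    out_bytes=[600_000, 300_000, 150_000, 80_000, 4_000],
    input_bytes=1_200_000,
    uplink_bps=10e6,
)
print(k, lat)
```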

    GraphR: Accelerating Graph Processing Using ReRAM

    This paper presents GRAPHR, the first ReRAM-based graph processing accelerator. GRAPHR follows the principle of near-data processing and explores the opportunity of performing massively parallel analog operations with low hardware and energy cost. The analog computation is suitable for graph processing because: 1) the algorithms are iterative and can inherently tolerate imprecision; 2) both probability calculations (e.g., PageRank and Collaborative Filtering) and typical graph algorithms involving integers (e.g., BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a vertex program of a graph algorithm can be expressed as sparse matrix-vector multiplication (SpMV), it can be performed efficiently by a ReRAM crossbar. We show that this assumption is generally true for a large set of graph algorithms. GRAPHR is a novel accelerator architecture consisting of two components: memory ReRAM and graph engines (GEs). The core graph computations are performed in sparse matrix format in the GEs (ReRAM crossbars). Vector/matrix-based graph computation is not new, but ReRAM offers the unique opportunity to realize massive parallelism with unprecedented energy efficiency and low hardware cost. With small subgraphs processed by GEs, the gain of performing parallel operations outweighs the waste due to sparsity. Experimental results show that GRAPHR achieves a 16.01x (up to 132.67x) speedup and a 33.82x energy saving on geometric mean compared to a CPU baseline system. Compared to GPU, GRAPHR achieves 1.69x to 2.19x speedup and consumes 4.77x to 8.91x less energy. GRAPHR gains a speedup of 1.16x to 4.12x and is 3.67x to 10.96x more energy efficient compared to a PIM-based architecture. Comment: Accepted to HPCA 2018.
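    To make the SpMV view concrete, the toy below runs PageRank as repeated sparse matrix-vector multiplies in plain software. It only illustrates the vertex-program-as-SpMV formulation that GRAPHR maps onto ReRAM crossbars; the graph, damping factor, and iteration count are arbitrary, and the hardware blocking into crossbar-sized subgraphs is not modeled.

```python
# PageRank as SpMV: one iteration = one sparse matrix-vector multiply.
import numpy as np
from scipy.sparse import csr_matrix

edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]   # toy directed graph
n = 4
src, dst = zip(*edges)
out_deg = np.bincount(src, minlength=n).astype(float)
# Column-stochastic transition matrix: M[j, i] = 1/outdeg(i) for edge i -> j.
M = csr_matrix((1.0 / out_deg[list(src)], (dst, src)), shape=(n, n))

rank, damping = np.full(n, 1.0 / n), 0.85
for _ in range(30):
    rank = (1 - damping) / n + damping * (M @ rank)   # SpMV per iteration
print(rank)
```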

    Minimization of Energy and Service Latency Computation Offloading using Neural Network in 5G NOMA System

    The future Internet of Things (IoT) era is anticipated to support computation-intensive and time-critical applications using mobile edge computing (MEC), which is regarded as a promising technique. However, uplink transmission performance is strongly degraded by the hostile wireless channel, the low bandwidth, and the low transmission power of IoT devices. Offloading tasks through MEC has therefore become a crucial technology to reduce service latency for computation-intensive applications and to reduce the computational workload of mobile devices. Under the constraints of computation latency and cloud computing capacity, the goal is to reduce the overall energy consumption of all users, including both transmission energy and local computation energy. In this article, a Deep Q-Network Algorithm (DQNA) is applied to manage the data rates of the user base across different time slots of a 5G NOMA network. The DQNA is optimized over several cell structures (2, 4, 6, and 8 cells) and provides an optimal distribution of power among the three users in the 5G network, which increases the achievable data rates. Existing power distribution algorithms, such as frequent pattern (FP), weighted least squares mean error (WLSME), and random and maximal power allocation, are used to justify the proposed DQNA technique. The proposed technique achieves 81.6% higher data rates when the cell structure is increased to 8, which is about 25% more than the other algorithms (FP, WLSME, and random and maximal power allocation).
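    A minimal DQN-style skeleton, assuming an illustrative state of channel and queue features and a discrete set of power levels as actions; the article's exact state, action, and reward design is not reproduced here, and the network sizes and hyperparameters are placeholders.

```python
# Sketch of a DQN learner choosing a discrete power-allocation action.
import random
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 6, 8      # e.g., channel gains + queue states; 8 power levels

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, eps = 0.95, 0.1

def act(state):
    """Epsilon-greedy action selection over the discrete power levels."""
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state, dtype=torch.float32)).argmax())

def update(state, action, reward, next_state):
    """One temporal-difference update of the Q-network."""
    s = torch.tensor(state, dtype=torch.float32)
    s2 = torch.tensor(next_state, dtype=torch.float32)
    q = q_net(s)[action]
    with torch.no_grad():
        target = reward + gamma * q_net(s2).max()
    loss = (q - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```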

    A Review on Computational Intelligence Techniques in Cloud and Edge Computing

    Cloud computing (CC) is a centralized computing paradigm that accumulates resources centrally and provides these resources to users through the Internet. Although CC holds a large number of resources, it may not be suitable for real-time mobile applications because it is usually far away from users geographically. On the other hand, edge computing (EC), which distributes resources to the network edge, enjoys increasing popularity in applications with low-latency and high-reliability requirements. EC provides resources in a decentralized manner and can respond to users' requirements faster than normal CC, but with limited computing capacities. As both CC and EC are resource-sensitive, several major issues arise, such as how to conduct job scheduling, resource allocation, and task offloading, which significantly influence the performance of the whole system. To tackle these issues, many optimization problems have been formulated. These optimization problems usually have complex properties, such as non-convexity and NP-hardness, which may not be addressed by traditional convex-optimization-based solutions. Computational intelligence (CI), consisting of a set of nature-inspired computational approaches, has recently exhibited great potential in addressing these optimization problems in CC and EC. This article provides an overview of research problems in CC and EC and of recent progress in addressing them with the help of CI techniques. Informative discussions and future research trends are also presented, with the aim of offering insights to readers and motivating new research directions.
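    As one concrete instance of a nature-inspired CI technique applied to such an optimization problem, the toy genetic algorithm below searches binary offload-or-not decisions that minimize an illustrative sequential-latency cost; all latencies, the cost model, and the GA parameters are assumptions for demonstration only.

```python
# Toy genetic algorithm for a task-offloading decision: bit i says whether
# task i runs locally (0) or on the edge server (1).
import random

LOCAL_MS  = [40, 25, 60, 35, 50, 20]     # per-task local compute latency
EDGE_MS   = [10,  8, 15, 12, 14,  6]     # per-task edge compute latency
UPLOAD_MS = [20, 15, 30, 18, 25, 10]     # per-task transmission latency
N = len(LOCAL_MS)

def fitness(bits):
    # Total latency if tasks run sequentially under this offloading decision.
    return sum(EDGE_MS[i] + UPLOAD_MS[i] if b else LOCAL_MS[i]
               for i, b in enumerate(bits))

def evolve(pop_size=20, generations=50, p_mut=0.1):
    pop = [[random.randint(0, 1) for _ in range(N)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                  # lower latency = fitter
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N)
            child = a[:cut] + b[cut:]          # one-point crossover
            child = [bit ^ (random.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```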