644 research outputs found

    Adaptive Energy-aware Scheduling of Dynamic Event Analytics across Edge and Cloud Resources

    Full text link
    The growing deployment of sensors as part of the Internet of Things (IoT) is generating thousands of event streams. Complex Event Processing (CEP) queries offer a useful paradigm for rapid decision-making over such data sources. While often centralized in the Cloud, the deployment of capable edge devices in the field motivates the need for cooperative event analytics that span Edge and Cloud computing. Here, we identify a novel problem of query placement on Edge and Cloud resources for dynamically arriving and departing analytic dataflows. We define this as an optimization problem to minimize the total makespan for all event analytics while meeting the energy and compute constraints of the resources. We propose 4 adaptive heuristics and 3 rebalancing strategies for such dynamic dataflows, and validate them using detailed simulations for 100-1000 edge devices and VMs. The results show that our heuristics offer O(seconds) planning time, give a valid and high-quality solution in all cases, and reduce the number of query migrations. Furthermore, the rebalancing strategies, when applied with these heuristics, reduce the makespan by around 20-25%. Comment: 11 pages, 7 figures
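
    The placement heuristics themselves are not spelled out in the abstract; the sketch below is only one plausible greedy baseline for the stated problem: place each arriving query on the Edge or Cloud resource that minimizes its estimated completion time while respecting residual energy budgets. The Resource/Query classes and all numbers are illustrative assumptions, not the paper's model.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    compute_speed: float         # normalized processing speed (work units / s)
    energy_budget: float         # joules left on an edge device; infinite for Cloud VMs
    queue_time: float = 0.0      # time until the resource is free again

@dataclass
class Query:
    name: str
    compute_demand: float        # work units required
    energy_cost: float           # joules drawn from the hosting resource

def place_query(query: Query, resources: list[Resource]) -> Resource | None:
    """Greedy choice: feasible resource with the earliest estimated completion time."""
    best, best_finish = None, float("inf")
    for r in resources:
        if r.energy_budget < query.energy_cost:
            continue                                   # would violate the energy constraint
        finish = r.queue_time + query.compute_demand / r.compute_speed
        if finish < best_finish:
            best, best_finish = r, finish
    if best is not None:                               # commit the placement
        best.queue_time = best_finish
        best.energy_budget -= query.energy_cost
    return best

if __name__ == "__main__":
    pool = [Resource("edge-1", 1.0, 50.0), Resource("vm-1", 4.0, float("inf"))]
    print(place_query(Query("q1", 2.0, 5.0), pool).name)   # -> vm-1 (earlier finish)
```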

    Model-driven Scheduling for Distributed Stream Processing Systems

    Full text link
    Distributed Stream Processing frameworks are commonly used with the evolution of the Internet of Things (IoT). These frameworks are designed to adapt to dynamic input message rates by scaling in and out. Apache Storm, originally developed by Twitter, is a widely used stream processing engine; others include Flink and Spark Streaming. To run streaming applications successfully, the optimal resource requirement must be known, since over-estimating resources adds extra cost. We therefore need a strategy to determine the optimal resource requirement for a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the required resource allocation. This intuition also drives resource mapping, and helps narrow the gap between the estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources. Comment: 54 pages
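
    As a concrete illustration of the model-driven idea, the sketch below estimates, from an offline-measured peak throughput per task instance, how many parallel instances and VM slots a dataflow would need to sustain a target input rate. The task names, rates, and the slots-per-VM figure are assumptions for illustration, not the article's performance models.

```python
import math

def required_parallelism(target_rate: float, peak_rate_per_instance: float) -> int:
    """Instances needed so that aggregate throughput >= the target input rate."""
    return math.ceil(target_rate / peak_rate_per_instance)

def plan_allocation(dataflow: dict[str, float], target_rate: float,
                    slots_per_vm: int = 4) -> tuple[dict[str, int], int]:
    """Return per-task instance counts and the number of VMs to provision."""
    instances = {task: required_parallelism(target_rate, peak)
                 for task, peak in dataflow.items()}
    total_slots = sum(instances.values())
    return instances, math.ceil(total_slots / slots_per_vm)

if __name__ == "__main__":
    # Peak messages/sec that one instance of each task can process (benchmarked offline).
    dataflow = {"parse": 20000.0, "filter": 35000.0, "aggregate": 8000.0}
    print(plan_allocation(dataflow, target_rate=50000.0))
    # -> ({'parse': 3, 'filter': 2, 'aggregate': 7}, 3)
```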

    VirtFogSim: A parallel toolbox for dynamic energy-delay performance testing and optimization of 5G Mobile-Fog-Cloud virtualized platforms

    Get PDF
    It is expected that the pervasive deployment of multi-tier 5G-supported Mobile-Fog-Cloud technological computing platforms will constitute an effective means to support the real-time execution of future Internet applications by resource- and energy-limited mobile devices. Increasing interest in this emerging networking-computing technology demands the optimization and performance evaluation of several parts of the underlying infrastructures. However, field trials are challenging due to their operational costs, and, in any case, the obtained results can be difficult to repeat and customize. Indeed, these emerging Mobile-Fog-Cloud ecosystems still lack customizable software tools for the performance simulation of their computing-networking building blocks. Motivated by these considerations, in this contribution we present VirtFogSim. It is a MATLAB-supported software toolbox that allows the dynamic joint optimization and tracking of the energy and delay performance of Mobile-Fog-Cloud systems for the execution of applications described by general Directed Application Graphs (DAGs). In a nutshell, the main distinctive features of the proposed VirtFogSim toolbox are that: (i) it allows the joint dynamic energy-aware optimization of the placement of the application tasks and the allocation of the needed computing-networking resources under hard constraints on acceptable overall execution times; (ii) it allows the repeatable and customizable simulation of the resulting energy-delay performance of the overall system; (iii) it allows the dynamic tracking of the performed resource allocation under time-varying operational environments, such as those typically featuring mobile applications; (iv) it is equipped with a user-friendly Graphical User Interface (GUI) that supports a number of graphic formats for data rendering; and (v) its MATLAB code is optimized for running atop multi-core parallel execution platforms. To check both the actual optimization and scalability capabilities of the VirtFogSim toolbox, a number of experimental setups featuring different use cases and operational environments are simulated, and their performances are compared.
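
    VirtFogSim itself is a MATLAB toolbox; as a language-agnostic illustration of the kind of evaluation it automates, the Python sketch below walks a task DAG in topological order, accumulates execution delay and device-side energy for a given Mobile/Fog/Cloud placement, and checks a hard deadline. Tier speeds, powers, and the example DAG are made-up assumptions, and network transfer delays are ignored for brevity.

```python
from graphlib import TopologicalSorter

# Per-tier processing speed (ops/s) and device-side power (W); illustrative numbers.
TIERS = {"mobile": (1e9, 2.0), "fog": (4e9, 10.0), "cloud": (16e9, 0.0)}

def evaluate(dag: dict[str, list[str]], work: dict[str, float],
             placement: dict[str, str], deadline: float) -> tuple[float, float, bool]:
    """Return (makespan, device-side energy, deadline met?) for a given placement."""
    finish, energy = {}, 0.0
    for task in TopologicalSorter(dag).static_order():   # predecessors come first
        speed, power = TIERS[placement[task]]
        exec_time = work[task] / speed
        ready = max((finish[p] for p in dag.get(task, [])), default=0.0)
        finish[task] = ready + exec_time
        energy += power * exec_time
    makespan = max(finish.values())
    return makespan, energy, makespan <= deadline

if __name__ == "__main__":
    dag = {"t1": [], "t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}   # task -> predecessors
    work = {"t1": 2e9, "t2": 4e9, "t3": 1e9, "t4": 3e9}                # ops per task
    placement = {"t1": "mobile", "t2": "fog", "t3": "mobile", "t4": "cloud"}
    print(evaluate(dag, work, placement, deadline=5.0))
```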

    Design and Acceleration of Artificial Neural Networks for Resource-Constrained Environments

    Get PDF
    Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2022. Advisor: Sungroh Yoon. Deep learning methods have become very successful in various applications due to the availability of exponentially growing data and powerful computing resources. Owing to this remarkable performance, neural models have been applied to various edge devices such as mobile and embedded systems. These edge devices usually suffer from constrained resources, including computation and energy. To overcome this challenge, low-latency model design, model compression, and acceleration are widely researched on both the hardware and software sides. In this dissertation, we introduce two methods for low-latency model design on the algorithm side. Designing compact, low-latency models is important for reducing the required resources, so we introduce two model design methodologies based on neural architecture search (NAS) to find compact models: cell-based NAS and graph variational auto-encoder based NAS. Our cell-based NAS approach builds on Differentiable ARchiTecture Search (DARTS), a well-known differentiable NAS method. Despite the popularity of DARTS, it has been reported that DARTS often shows unrobustness and sub-optimality. Through extensive theoretical analysis and empirical observations, we reveal that this issue arises from the existence of unnormalized operations. Based on this finding, we propose a novel variance-stationary differentiable architecture search (VS-DARTS). VS-DARTS makes the architecture parameters a more reliable metric for deriving a desirable architecture without increasing the search cost. In addition, we derive comparable architectures using VS-DARTS with soft-constrained latency objectives. Another approach to finding low-latency models uses graph generative models, which have recently attracted attention because of their efficiency. We propose a novel graph variational auto-encoder (VAE) that shows dramatic improvements on cell-based search spaces. After the graph VAE extracts architectural information from a neural architecture, we conduct a novel multi-objective NAS that uses the extracted information for hardware-constrained environments. We show that the proposed multi-objective NAS can derive various models close to the Pareto optimum between latency and accuracy, and we evaluate the proposed method on various hardware platforms. In summary, this dissertation proposes two methods for improving the performance of NAS, enabling compact and low-latency neural models for computing-resource-constrained environments. The proposed methods were evaluated through various experimental studies.
    Contents: 1 Introduction; 2 Background; 3 Neural architecture search for resource-constrained environment; 4 Platform-aware Neural Architecture Search with Graph Variational Auto-Encoder; 5 Conclusion; Bibliography; Abstract (In Korean).
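
    As a rough illustration of the node-normalization idea behind VS-DARTS (keeping each cell node's output variance stationary so that the architecture parameters remain comparable across nodes), the PyTorch sketch below rescales a mixed operation's output to roughly unit variance. It is a hypothetical toy, not the dissertation's implementation, and assumes PyTorch is available.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Weighted sum of candidate operations, followed by variance normalization."""
    def __init__(self, ops: list[nn.Module]):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.alpha = nn.Parameter(torch.zeros(len(ops)))   # architecture parameters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = F.softmax(self.alpha, dim=0)
        mixed = sum(wi * op(x) for wi, op in zip(w, self.ops))
        # Node normalization: rescale the node output to (approximately) unit variance.
        return mixed / (mixed.std() + 1e-5)

if __name__ == "__main__":
    candidates = [nn.Identity(), nn.Conv2d(8, 8, 3, padding=1), nn.AvgPool2d(3, 1, 1)]
    node = MixedOp(candidates)
    out = node(torch.randn(2, 8, 16, 16))
    print(out.shape, out.std().item())   # std is close to 1 after normalization
```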

    μ-DDRL: A QoS-Aware Distributed Deep Reinforcement Learning Technique for Service Offloading in Fog Computing Environments

    Full text link
    Fog and Edge computing extend cloud services to the proximity of end users, enabling many Internet of Things (IoT) use cases, particularly latency-critical applications. Smart devices, such as traffic and surveillance cameras, often do not have sufficient resources to process computation-intensive and latency-critical services. Hence, the constituent parts of services can be offloaded to nearby Edge/Fog resources for processing and storage. However, making offloading decisions for complex services in highly stochastic and dynamic environments is an important yet difficult task. Recently, Deep Reinforcement Learning (DRL) has been used in many complex service offloading problems; however, existing techniques are most suitable for centralized environments, and their convergence to well-suited solutions is slow. In addition, constituent parts of services often have predefined data dependencies and quality of service constraints, which further intensify the complexity of service offloading. To solve these issues, we propose a distributed DRL technique following the actor-critic architecture, based on Asynchronous Proximal Policy Optimization (APPO), to achieve efficient and diverse distributed experience trajectory generation. We also employ PPO clipping and V-trace techniques for off-policy correction, for faster convergence to the most suitable service offloading solutions. The results obtained demonstrate that our technique converges quickly, offers high scalability and adaptability, and outperforms its counterparts by improving the execution time of heterogeneous services.
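
    For reference, the sketch below shows the standard PPO clipped surrogate loss that the clipping step mentioned above relies on; the paper's full APPO/V-trace machinery for off-policy correction is not reproduced here. The tensors in the usage example are arbitrary illustrative values, and PyTorch is assumed.

```python
import torch

def ppo_clip_loss(log_prob_new: torch.Tensor, log_prob_old: torch.Tensor,
                  advantage: torch.Tensor, clip_eps: float = 0.2) -> torch.Tensor:
    """Negative clipped surrogate objective (to be minimized by the actor)."""
    ratio = torch.exp(log_prob_new - log_prob_old)            # pi_new / pi_behavior
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()

if __name__ == "__main__":
    lp_new = torch.tensor([-0.9, -1.2, -0.4], requires_grad=True)
    lp_old = torch.tensor([-1.0, -1.0, -0.5])
    adv = torch.tensor([0.5, -0.3, 1.2])
    loss = ppo_clip_loss(lp_new, lp_old, adv)
    loss.backward()
    print(loss.item(), lp_new.grad)
```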

    Collaborative Reuse of Streaming Dataflows in IoT Applications

    Full text link
    Distributed Stream Processing Systems (DSPS) like Apache Storm and Spark Streaming enable composition of continuous dataflows that execute persistently over data streams. They are used by Internet of Things (IoT) applications to analyze sensor data from Smart City cyber-infrastructure and make active utility management decisions. As the ecosystem of such IoT applications that leverage shared urban sensor streams continues to grow, applications will perform duplicate pre-processing and analytics tasks. This offers the opportunity to collaboratively reuse the outputs of overlapping dataflows, thereby improving resource efficiency. In this paper, we propose dataflow reuse algorithms that, given a submitted dataflow, identify the intersection of reusable tasks and streams from a collection of running dataflows to form a merged dataflow. Similar algorithms to unmerge dataflows when they are removed are also proposed. We implement these algorithms for the popular Apache Storm DSPS and validate their performance and resource savings for 35 synthetic dataflows based on public OPMW workflows with diverse arrival and departure distributions, and on 21 real IoT dataflows from RIoTBench. Comment: To appear in IEEE eScience Conference 201
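
    A minimal, non-Storm illustration of the reuse idea: assign each task a signature derived from its operator and, recursively, its inputs' signatures, so that tasks of a newly submitted dataflow whose signatures already exist in a running dataflow can be merged rather than redeployed. The operator names and dataflows below are invented for the example.

```python
def signatures(dataflow: dict[str, tuple[str, list[str]]]) -> dict[str, str]:
    """Map task -> canonical signature built from its operator and input signatures."""
    sig: dict[str, str] = {}
    def resolve(task: str) -> str:
        if task not in sig:
            op, inputs = dataflow[task]
            sig[task] = f"{op}({','.join(sorted(resolve(i) for i in inputs))})"
        return sig[task]
    for task in dataflow:
        resolve(task)
    return sig

def reusable_tasks(running: dict, submitted: dict) -> dict[str, str]:
    """Tasks of the submitted dataflow that can reuse a running task's output."""
    running_sigs = {s: t for t, s in signatures(running).items()}
    return {t: running_sigs[s] for t, s in signatures(submitted).items()
            if s in running_sigs}

if __name__ == "__main__":
    running = {"src": ("kafka:traffic", []), "parse": ("parse_json", ["src"]),
               "clean": ("filter_nulls", ["parse"]), "agg": ("window_avg", ["clean"])}
    submitted = {"s": ("kafka:traffic", []), "p": ("parse_json", ["s"]),
                 "c": ("filter_nulls", ["p"]), "alert": ("threshold_alert", ["c"])}
    print(reusable_tasks(running, submitted))   # {'s': 'src', 'p': 'parse', 'c': 'clean'}
```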

    System Optimisation for Multi-access Edge Computing Based on Deep Reinforcement Learning

    Get PDF
    Multi-access edge computing (MEC) is an emerging and important distributed computing paradigm that aims to extend cloud services to the network edge to reduce network traffic and service latency. Proper system optimisation and maintenance are crucial to maintaining a high Quality of Service (QoS) for end-users. However, with the increasing complexity of the architecture of MEC and mobile applications, effectively optimising MEC systems is non-trivial. Traditional optimisation methods are generally based on simplified mathematical models and fixed heuristics, which rely heavily on expert knowledge. As a consequence, when facing dynamic MEC scenarios, considerable human effort and expertise are required to redesign the model and tune the heuristics, which is time-consuming. This thesis aims to develop deep reinforcement learning (DRL) methods to handle system optimisation problems in MEC. Instead of developing fixed heuristic algorithms for these problems, this thesis aims to design DRL-based methods that enable systems to learn optimal solutions on their own. This research demonstrates the effectiveness of DRL-based methods on two crucial system optimisation problems: task offloading and service migration. Specifically, this thesis first investigates the dependent task offloading problem, which considers the inner dependencies of tasks. This research builds a DRL-based method combining a sequence-to-sequence (seq2seq) neural network to address the problem. Experimental results demonstrate that our method outperforms existing heuristic algorithms and achieves near-optimal performance. To further enhance the learning efficiency of the DRL-based task offloading method for unseen learning tasks, this thesis then integrates meta reinforcement learning to handle the task offloading problem. Our method can adapt quickly to new environments with a small number of gradient updates and samples. Finally, this thesis exploits a DRL-based solution for the service migration problem in MEC, considering user mobility. This research models the service migration problem as a Partially Observable Markov Decision Process (POMDP) and proposes a tailored actor-critic algorithm combining Long Short-Term Memory (LSTM) to solve the POMDP. Results from extensive experiments based on real-world mobility traces demonstrate that our method consistently outperforms both heuristic and state-of-the-art learning-driven algorithms in various MEC scenarios.
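
    As a structural illustration of an LSTM-based actor-critic of the kind used for the POMDP formulation of service migration, the PyTorch sketch below summarizes a partially observed history with an LSTM and outputs a migration-action distribution plus a value estimate. Dimensions, observation contents, and layer sizes are assumptions, not the thesis's exact model.

```python
import torch
import torch.nn as nn

class LSTMActorCritic(nn.Module):
    def __init__(self, obs_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.actor = nn.Linear(hidden, num_actions)   # which edge server hosts the service next
        self.critic = nn.Linear(hidden, 1)            # state-value estimate

    def forward(self, obs_seq: torch.Tensor, state=None):
        out, state = self.lstm(obs_seq, state)        # (batch, time, hidden)
        last = out[:, -1]                             # summary of the observed history
        return torch.softmax(self.actor(last), dim=-1), self.critic(last), state

if __name__ == "__main__":
    net = LSTMActorCritic(obs_dim=6, num_actions=4)
    probs, value, _ = net(torch.randn(2, 10, 6))      # batch of 2 users, 10 time steps
    print(probs.shape, value.shape)
```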

    A Survey of Pipelined Workflow Scheduling: Models and Algorithms

    Get PDF
    A large class of applications needs to execute the same workflow on different data sets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task-, data-, pipelined-, and/or replicated-parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors, or optimization goals. This paper surveys the field by summing up and structuring known results and approaches.
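
    A small worked example of the two classic objectives in pipelined workflow scheduling for a linear chain of stages mapped onto processors: the period (inverse throughput) is set by the most loaded processor, while the latency of one data set is the sum of the stage times along the chain. Stage times and the mapping are illustrative values.

```python
def period_and_latency(stage_times: list[float], mapping: list[int]) -> tuple[float, float]:
    """stage_times[i] = time of stage i; mapping[i] = processor running stage i."""
    load: dict[int, float] = {}
    for t, p in zip(stage_times, mapping):
        load[p] = load.get(p, 0.0) + t
    period = max(load.values())          # a new data set can enter every `period` time units
    latency = sum(stage_times)           # time for one data set to traverse the whole chain
    return period, latency

if __name__ == "__main__":
    stage_times = [2.0, 5.0, 3.0, 4.0]
    print(period_and_latency(stage_times, mapping=[0, 1, 2, 2]))   # -> (7.0, 14.0)
```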

    Real-Time Wireless Sensor-Actuator Networks for Cyber-Physical Systems

    Get PDF
    A cyber-physical system (CPS) employs tight integration of, and coordination between, computational, networking, and physical elements. Wireless sensor-actuator networks provide a new communication technology for a broad range of CPS applications such as process control, smart manufacturing, and data center management. Sensing and control in these systems need to meet stringent real-time performance requirements on communication latency in challenging environments. There have been limited results on real-time scheduling theory for wireless sensor-actuator networks. Real-time transmission scheduling and analysis for wireless sensor-actuator networks require new methodologies to deal with the unique characteristics of wireless communication. Furthermore, the performance of a wireless control system involves intricate interactions between real-time communication and control. This thesis research tackles these challenges and makes a series of contributions to the theory and systems for wireless CPS. (1) We establish a new real-time scheduling theory for wireless sensor-actuator networks. (2) We develop a scheduling-control co-design approach for holistic optimization of control performance in a wireless control system. (3) We design and implement a wireless sensor-actuator network for CPS in data center power management. (4) We expand our research to develop scheduling algorithms and analyses for real-time parallel computing to support computation-intensive CPS.
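
    To give the flavor of the analysis involved, the sketch below iterates the classical fixed-priority response-time recurrence R_i = C_i + sum over higher-priority flows j of ceil(R_i / T_j) * C_j to a fixed point; real-time analyses for wireless sensor-actuator networks extend this kind of recurrence with transmission conflicts and retransmissions. The flow parameters are illustrative, and this is not the thesis's algorithm.

```python
from __future__ import annotations
import math

def response_time(c: float, higher_prio: list[tuple[float, float]],
                  deadline: float) -> float | None:
    """Iterate the recurrence to a fixed point; return None if the deadline is exceeded."""
    r = c
    while True:
        interference = sum(math.ceil(r / period) * cost for cost, period in higher_prio)
        r_next = c + interference
        if r_next > deadline:
            return None            # unschedulable at this priority level
        if r_next == r:
            return r               # fixed point reached
        r = r_next

if __name__ == "__main__":
    # Flow under analysis: 2 slots of transmission time, deadline of 20 slots.
    # Higher-priority flows given as (cost, period) in slots.
    print(response_time(2, [(1, 5), (3, 10)], deadline=20))   # -> 7
```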