QoS-Aware Resource Management for Multi-phase Serverless Workflows with Aquatope
Multi-stage serverless applications, i.e., workflows with many computation
and I/O stages, are becoming increasingly representative of FaaS platforms.
Despite their advantages in terms of fine-grained scalability and modular
development, these applications are subject to suboptimal performance, resource
inefficiency, and high costs to a larger degree than previous simple serverless
functions.
We present Aquatope, a QoS-and-uncertainty-aware resource scheduler for
end-to-end serverless workflows that takes into account the inherent
uncertainty present in FaaS platforms, and improves performance predictability
and resource efficiency. Aquatope uses a set of scalable and validated Bayesian
models to create pre-warmed containers ahead of function invocations, and to
allocate appropriate resources at function granularity to meet a complex
workflow's end-to-end QoS, while minimizing resource cost. Across a diverse set
of analytics and interactive multi-stage serverless workloads, Aquatope
significantly outperforms prior systems, reducing QoS violations by 5x, and
cost by 34% on average and up to 52% compared to other QoS-meeting methods.
Scheduling, Characterization and Prediction of HPC Workloads for Distributed Computing Environments
As High Performance Computing (HPC) has grown considerably and is expected to grow even more, effective resource management for distributed computing systems is needed more than ever. As computational workloads grow in quantity, it is becoming more crucial to apply efficient resource management and workload scheduling to use resources efficiently while keeping computational performance reasonably good. The problem of efficiently scheduling workloads on resources while meeting performance standards is hard. Additionally, non-clairvoyance of job dimensions makes resource management even harder in real-world scenarios. Our research methodology investigates the scheduling problem for HPC and examines the challenges of deploying scheduling in real-world scenarios using state-of-the-art machine learning and data science techniques. To this end, this Ph.D. dissertation makes the following core contributions: a) We perform a theoretical analysis of space-sharing, non-preemptive scheduling: we studied this scheduling problem and proposed scheduling algorithms with polynomial computation time. We also proved constant upper bounds for the performance of these algorithms. b) We studied the sensitivity of scheduling algorithms to the accuracy of runtime predictions and devised a meta-learning approach to estimate prediction accuracy for jobs newly submitted to the HPC system. c) We studied the runtime prediction problem for HPC applications. For this purpose, we studied the distribution of available public workloads and proposed two different solutions that can predict multi-modal distributions: switching state-space models and Mixture Density Networks. d) We studied the effectiveness of recent recurrent neural network models for CPU usage trace prediction for individual VM traces as well as aggregate CPU usage traces.
In this dissertation, we explore solutions to improve the performance of scheduling workloads on distributed systems. We begin by looking at the problem from the theoretical perspective. Modeling the problem mathematically, we first propose a scheduling algorithm that finds a constant approximation of the optimal solution for the problem in polynomial time. We prove that the performance of the algorithm (average completion time) is a constant approximation of the performance of the optimal schedule. We next look at the problem in real-world scenarios. Considering High-Performance Computing (HPC) workload computing environments as the most similar real-world equivalent of our mathematical model, we explore the problem of predicting application runtime. We propose an algorithm to handle the existing uncertainties in the real world and showcase its effectiveness in terms of response time and resource utilization. After looking at the uncertainty problem, we focus on improving the accuracy of existing prediction approaches for HPC application runtime. We propose two solutions, one based on Kalman filters and one based on deep mixture density networks. We showcase the effectiveness of our prediction approaches by comparing them with previous prediction approaches in terms of prediction accuracy and impact on scheduling performance. In the end, we focus on predicting resource usage for individual applications during their execution. We explore the application of recurrent neural networks for predicting resource usage of applications deployed on individual virtual machines. To validate our proposed models and solutions, we performed extensive trace-driven simulation and measured the effectiveness of our approaches.
A Survey of Deep Learning for Data Caching in Edge Network
The concept of edge caching provision in emerging 5G and beyond mobile
networks is a promising method both to deal with the traffic congestion problem
in the core network and to reduce latency to access popular content. In
that respect, end-user demand for popular content can be satisfied by
proactively caching it at the network edge, i.e., in close proximity to the
users. In addition to model-based caching schemes, learning-based edge caching
optimization has recently attracted significant attention, and the aim
hereafter is to capture these recent advances for both model-based and
data-driven techniques in the area of proactive caching. This paper summarizes the
utilization of deep learning for data caching in edge networks. We first outline
the typical research topics in content caching and formulate a taxonomy based
on network hierarchical structure. Then, a number of key types of deep learning
algorithms are presented, ranging from supervised learning to unsupervised
learning as well as reinforcement learning. Furthermore, a comparison of
state-of-the-art literature is provided from the aspects of caching topics and
deep learning methods. Finally, we discuss research challenges and future
directions of applying deep learning for caching.
DeepFT: Fault-tolerant edge computing using a self-supervised deep surrogate model
The emergence of latency-critical AI applications has been supported by the evolution of the edge computing paradigm. However, edge solutions are typically resource-constrained, posing reliability challenges due to heightened contention for compute capacities and faulty application behavior in the presence of overload conditions. Although a large amount of generated log data can be mined for fault prediction, labeling this data for training is a manual process and thus a limiting factor for automation. Due to this, many companies resort to unsupervised fault-tolerance models. Yet, failure models of this kind can incur a loss of accuracy when they need to adapt to non-stationary workloads and diverse host characteristics. Thus, we propose a novel modeling approach, DeepFT, to proactively avoid system overloads and their adverse effects by optimizing the task scheduling decisions. DeepFT uses a deep-surrogate model to accurately predict and diagnose faults in the system and co-simulation based self-supervised learning to dynamically adapt the model in volatile settings. Experimentation on an edge cluster shows that DeepFT can outperform state-of-the-art methods in fault-detection and QoS metrics. Specifically, DeepFT gives the highest F1 scores for fault-detection, reducing service deadline violations by up to 37% while also improving response time by up to 9%
Methods and Applications of Synthetic Data Generation
The advent of data mining and machine learning has highlighted the value of large and varied sources of data, while increasing the demand for synthetic data that captures the structural and statistical characteristics of the original data without revealing personal or proprietary information contained in the original dataset.
In this dissertation, we use examples from original research to show that, using appropriate models and input parameters, synthetic data that mimics the characteristics of real data can be generated with sufficient rate and quality to address the volume, structural complexity, and statistical variation requirements of research and development of digital information processing systems.
First, we present a progression of research studies using a variety of tools to generate synthetic network traffic patterns, enabling us to observe relationships between network latency and communication pattern benchmarks at all levels of the network stack.
We then present a framework for synthesizing large scale IoT data with complex structural characteristics in a scalable extraction and synthesis framework, and demonstrate the use of generated data in the benchmarking of IoT middleware.
Finally, we detail research on synthetic image generation for deep learning models using 3D modeling. We find that synthetic images can be an effective technique for augmenting limited sets of real training data, and in use cases that benefit from incremental training or model specialization, we find that pretraining on synthetic images provided a usable base model for transfer learning
Landscape of IoT security
The last two decades have experienced a steady rise in the production and deployment of sensing-and-connectivity-enabled electronic devices, replacing “regular” physical objects. The resulting Internet-of-Things (IoT) will soon become indispensable for many application domains. Smart objects are continuously being integrated within factories, cities, buildings, health institutions, and private homes.
Approximately 30 years after the birth of IoT, society is confronted with significant challenges regarding IoT security. Due to the interconnectivity and ubiquitous use of IoT devices, cyberattacks have widespread impacts on multiple stakeholders. Past events show that the IoT domain holds various vulnerabilities, exploited to generate physical, economic, and health damage. Despite many of these threats, manufacturers struggle to secure IoT devices properly.
Thus, this work overviews the IoT security landscape with the intention to emphasize the demand for secured IoT-related products and applications. Therefore, (a) a list of key challenges of securing IoT devices is determined by examining their particular characteristics, (b) major security objectives for secured IoT systems are defined, (c) a threat taxonomy is introduced, which outlines potential security gaps prevalent in current IoT systems, and (d) key countermeasures against the aforementioned threats are summarized for selected IoT security-related technologies available on the market
Toward a Live BBU Container Migration in Wireless Networks
Cloud Radio Access Networks (Cloud-RANs) have recently emerged as a promising architecture to meet the increasing demands and expectations of future wireless networks. Such an architecture can enable dynamic and flexible network operations to address significant challenges, such as higher mobile traffic volumes and increasing network operation costs. However, the implementation of compute-intensive signal processing Network Functions (NFs) on the General Purpose Processors (GPPs) that are typically found in data centers could lead to performance complications, such as in the case of overloaded servers. There is therefore a need for methods that ensure the availability and continuity of critical wireless network functionality in such circumstances.
Motivated by the goal of providing highly available and fault-tolerant functionality in Cloud-RAN-based networks, this paper proposes the design, specification, and implementation of live migration of containerized Baseband Units (BBUs) in two wireless network settings, namely Long Range Wide Area Network (LoRaWAN) and Long Term Evolution (LTE) networks. Driven by the requirements and critical challenges of live migration, the approach shows that in the case of LoRaWAN networks, the migration of BBUs is currently possible with relatively low downtimes to support network continuity. The analysis and comparison of the performance of functional splits and cell configurations in both networks were performed in terms of fronthaul throughput requirements. The results obtained from such an analysis can be used by both service providers and network operators in the deployment and optimization of Cloud-RANs services, in order to ensure network reliability and continuity in cloud environments
CILP: Co-simulation based imitation learner for dynamic resource provisioning in cloud computing environments
Intelligent Virtual Machine (VM) provisioning is central to cost- and resource-efficient computation in cloud computing environments. As bootstrapping VMs is time-consuming, a key challenge for latency-critical tasks is to predict future workload demands to provision VMs proactively. However, existing AI-based solutions tend not to consider holistically all crucial aspects such as provisioning overheads, heterogeneous VM costs and Quality of Service (QoS) of the cloud system. To address this, we propose a novel method, called CILP, that formulates the VM provisioning problem as two sub-problems of prediction and optimization, where the provisioning plan is optimized based on predicted workload demands. CILP leverages a neural network as a surrogate model to predict future workload demands with a co-simulated digital twin of the infrastructure to compute QoS scores. We extend the neural network to also act as an imitation learner that dynamically decides the optimal VM provisioning plan. A transformer-based neural model reduces training and inference overheads, while our novel two-phase decision-making loop facilitates informed provisioning decisions. Crucially, we address limitations of prior work by including resource utilization, deployment costs and provisioning overheads to inform the provisioning decisions in our imitation learning framework. Experiments with three public benchmarks demonstrate that CILP gives up to 22% higher resource utilization, 14% higher QoS scores and 44% lower execution costs compared to the current online and offline optimization-based state-of-the-art methods