Scalable and Distributed Resource Management Protocols for Cloud and Big Data Clusters
Cloud data centers require an operating system to manage resources and to satisfy operational requirements and management objectives. The growing popularity of cloud services has given rise to a new spectrum of services with sophisticated workload and resource management requirements. Data centers are also growing through the addition of diverse hardware to accommodate ever-increasing user demand. Today, a large share of cloud resources run data-intensive applications that exhibit continuously fluctuating workloads and require specialized resource management. To this end, cluster computing frameworks are shifting towards distributed resource management for better scalability and faster decision making; such systems benefit from parallelized control and are resilient to failures. Throughout this thesis we investigate algorithms, protocols, and techniques to address these challenges in large-scale data centers. We introduce a distributed resource management framework that consolidates virtual machines onto as few servers as possible to reduce the energy consumption of the data center and thereby lower costs for cloud providers. The framework characterizes the workload of virtual machines and can thus efficiently balance the trade-off between energy consumption and customers' Service Level Agreements (SLAs). The algorithm is highly scalable, incurs low maintenance cost under dynamic workloads, and aims to minimize virtual machine migration costs. We also introduce a scalable, distributed probe-based scheduling algorithm for Big Data analytics frameworks. This algorithm efficiently addresses the job heterogeneity that has emerged in workloads as the level of parallelism within jobs has increased; it is massively scalable and significantly reduces average job completion times compared with the state of the art. Finally, we propose a probabilistic fault-tolerance technique as part of the scheduling algorithm.
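Probe-based schedulers of this kind typically sample a small number of workers per task and place the task on the least loaded one, avoiding any central queue. The sketch below illustrates this general power-of-d-choices probing idea; it is a hypothetical, simplified illustration, not the thesis's actual protocol, and all names (`probe_based_schedule`, `num_probes`, the worker dictionaries) are invented for the example.

```python
import random

def probe_based_schedule(task, workers, num_probes=2):
    """Place a task by probing a few randomly chosen workers and
    enqueueing it at the one with the shortest queue.
    (Illustrative sketch of power-of-d-choices probing only.)"""
    probed = random.sample(workers, min(num_probes, len(workers)))
    target = min(probed, key=lambda w: len(w["queue"]))
    target["queue"].append(task)
    return target

# Toy cluster of 8 workers, each with an initially empty task queue.
workers = [{"id": i, "queue": []} for i in range(8)]
for t in range(20):
    probe_based_schedule(f"task-{t}", workers)
```

Because each placement decision touches only `num_probes` workers, many schedulers can run in parallel without coordinating, which is what gives such designs their scalability.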
Contributions to Desktop Grid Computing: From High Throughput Computing to Data-Intensive Sciences on Hybrid Distributed Computing Infrastructures
Since the mid-1990s, Desktop Grid Computing, i.e. the idea of using a large number of remote PCs distributed over the Internet to execute large parallel applications, has proved to be an efficient paradigm for providing large computational power at a fraction of the cost of a dedicated computing infrastructure. This document presents my contributions over the last decade to broadening the scope of Desktop Grid Computing. My research has followed three directions. The first established new methods to observe and characterize Desktop Grid resources and developed experimental platforms to test and validate our approach under conditions close to reality. The second focused on integrating Desktop Grids into e-science Grid infrastructures (e.g. EGI), which requires addressing many challenges such as security, scheduling, quality of service, and more. The third investigated how to support large-scale data management and data-intensive applications on such infrastructures, including support for new and emerging data-oriented programming models. This manuscript reports not only on the scientific achievements and the technologies developed to support these objectives, but also on the international collaborations and projects I have been involved in, as well as the scientific mentoring that motivates my candidature for the Habilitation à Diriger des Recherches.
Simulating resource management in fog computing systems
The fog computing paradigm was introduced to address the new challenges and requirements posed by the Internet of Things (IoT). It extends the cloud to the edge of the network, thereby facilitating the processing and storage of massive amounts of data where they are created and used. This novel computing paradigm is widely studied in both academia and industry, primarily through simulation. Today, a large variety of edge and fog computing simulators exist and have been reviewed in several surveys. These reviews, however, mainly offer high-level comparisons of the simulators and often make contradictory statements, which makes it difficult to assess which studies are feasible with a given simulation tool.
To address these challenges, we focus on a single state-of-the-art fog simulation tool, iFogSim2. In this paper, we provide an in-depth review of the simulator and examine its model, assumptions, and technical characteristics. Our analysis describes the details of the fog resource management mechanisms implemented by iFogSim2 and discusses what it is capable of and where its limitations lie. We construct a case study to assess the tool's suitability for a mobile 5G scenario, namely road surface weather analysis with smart vehicles. The case study is used to obtain qualitative results on what is feasible with the tool and what is not. We demonstrate that iFogSim2 has a valid locality model for the mobile 5G use case, but that it is not suitable for experimenting with vehicular fog computing, dynamic placement, server-side service discovery, or load balancing. In addition, we present a modeling and analytics framework, built for iFogSim2, to improve the simulation software and facilitate future research with the tool.
Rigorous results on the effectiveness of some heuristics for the consolidation of virtual machines in a cloud data center
Dynamic consolidation of virtual machines (VMs) in a cloud data center can be used to minimize power consumption. Beloglazov et al. have proposed the MM (Minimization of Migrations) heuristic for selecting the VMs to migrate from under- or over-utilized hosts, as well as the MBFD (Modified Best Fit Decreasing) heuristic for deciding the placement of the migrated VMs. According to their simulation results, these heuristics work very well in practice. In this paper, we investigate what performance guarantees can be rigorously proven for these heuristics. In particular, we establish that MM is optimal with respect to the number of VMs selected from an over-utilized host and that it is a 1.5-approximation with respect to the decrease in utilization. On the other hand, we show that the result of MBFD can be arbitrarily far from the optimum. Moreover, we show that even if both MM and MBFD deliver optimal results, their combination does not necessarily yield an optimal VM consolidation, although approximation results can be proven under suitable technical conditions. To the best of our knowledge, these are the first rigorously proven results on the effectiveness of practically useful heuristic algorithms for the VM consolidation problem.
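The core idea of MM is to migrate as few VMs as possible while bringing an over-utilized host back below its upper utilization threshold. The sketch below captures that selection loop in simplified form: the function name `mm_select` and its signature are invented for the example, and real MM as described by Beloglazov et al. operates on CPU utilization histories rather than single utilization values.

```python
def mm_select(vm_utils, host_util, upper_threshold):
    """Simplified Minimization of Migrations (MM) selection:
    repeatedly migrate a VM until the host's utilization drops
    below the upper threshold, preferring the smallest VM whose
    utilization alone covers the current overload, so that the
    number of migrations stays minimal.
    (Illustrative sketch; not the authors' exact algorithm.)"""
    to_migrate = []
    vms = list(vm_utils)
    while host_util > upper_threshold and vms:
        overload = host_util - upper_threshold
        # Smallest VM that alone covers the overload; if none is
        # large enough, fall back to the largest remaining VM.
        candidates = [u for u in vms if u >= overload]
        chosen = min(candidates) if candidates else max(vms)
        vms.remove(chosen)
        to_migrate.append(chosen)
        host_util -= chosen
    return to_migrate
```

For instance, with VM utilizations [0.3, 0.2, 0.1], a host utilization of 0.95, and an upper threshold of 0.8, a single migration of the 0.2 VM suffices, which matches the intuition behind the optimality result for the number of selected VMs.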
- …