266 research outputs found

    A Literature Survey on Resource Management Techniques, Issues and Challenges in Cloud Computing

    Get PDF
    Cloud computing is a large scale distributed computing which provides on demand services for clients. Cloud Clients use web browsers, mobile apps, thin clients, or terminal emulators to request and control their cloud resources at any time and anywhere through the network. As many companies are shifting their data to cloud and as many people are being aware of the advantages of storing data to cloud, there is increasing number of cloud computing infrastructure and large amount of data which lead to the complexity management for cloud providers. We surveyed the state-of-the-art resource management techniques for IaaS (infrastructure as a service) in cloud computing. Then we put forward different major issues in the deployment of the cloud infrastructure in order to avoid poor service delivery in cloud computing

    Classification and Performance Study of Task Scheduling Algorithms in Cloud Computing Environment

    Get PDF
    Cloud computing is becoming very common in recent years and is growing rapidly due to its attractive benefits and features such as resource pooling, accessibility, availability, scalability, reliability, cost saving, security, flexibility, on-demand services, pay-per-use services, use from anywhere, quality of service, resilience, etc. With this rapid growth of cloud computing, there may exist too many users that require services or need to execute their tasks simultaneously by resources provided by service providers. To get these services with the best performance, and minimum cost, response time, makespan, effective use of resources, etc. an intelligent and efficient task scheduling technique is required and considered as one of the main and essential issues in the cloud computing environment. It is necessary for allocating tasks to the proper cloud resources and optimizing the overall system performance. To this end, researchers put huge efforts to develop several classes of scheduling algorithms to be suitable for the various computing environments and to satisfy the needs of the various types of individuals and organizations. This research article provides a classification of proposed scheduling strategies and developed algorithms in cloud computing environment along with the evaluation of their performance. A comparison of the performance of these algorithms with existing ones is also given. Additionally, the future research work in the reviewed articles (if available) is also pointed out. This research work includes a review of 88 task scheduling algorithms in cloud computing environment distributed over the seven scheduling classes suggested in this study. Each article deals with a novel scheduling technique and the performance improvement it introduces compared with previously existing task scheduling algorithms. Keywords: Cloud computing, Task scheduling, Load balancing, Makespan, Energy-aware, Turnaround time, Response time, Cost of task, QoS, Multi-objective. DOI: 10.7176/IKM/12-5-03 Publication date:September 30th 2022

    Performance optimization and energy efficiency of big-data computing workflows

    Get PDF
    Next-generation e-science is producing colossal amounts of data, now frequently termed as Big Data, on the order of terabyte at present and petabyte or even exabyte in the predictable future. These scientific applications typically feature data-intensive workflows comprised of moldable parallel computing jobs, such as MapReduce, with intricate inter-job dependencies. The granularity of task partitioning in each moldable job of such big data workflows has a significant impact on workflow completion time, energy consumption, and financial cost if executed in clouds, which remains largely unexplored. This dissertation conducts an in-depth investigation into the properties of moldable jobs and provides an experiment-based validation of the performance model where the total workload of a moldable job increases along with the degree of parallelism. Furthermore, this dissertation conducts rigorous research on workflow execution dynamics in resource sharing environments and explores the interactions between workflow mapping and task scheduling on various computing platforms. A workflow optimization architecture is developed to seamlessly integrate three interrelated technical components, i.e., resource allocation, job mapping, and task scheduling. Cloud computing provides a cost-effective computing platform for big data workflows where moldable parallel computing models are widely applied to meet stringent performance requirements. Based on the moldable parallel computing performance model, a big-data workflow mapping model is constructed and a workflow mapping problem is formulated to minimize workflow makespan under a budget constraint in public clouds. This dissertation shows this problem to be strongly NP-complete and designs i) a fully polynomial-time approximation scheme for a special case with a pipeline-structured workflow executed on virtual machines of a single class, and ii) a heuristic for a generalized problem with an arbitrary directed acyclic graph-structured workflow executed on virtual machines of multiple classes. The performance superiority of the proposed solution is illustrated by extensive simulation-based results in Hadoop/YARN in comparison with existing workflow mapping models and algorithms. Considering that large-scale workflows for big data analytics have become a main consumer of energy in data centers, this dissertation also delves into the problem of static workflow mapping to minimize the dynamic energy consumption of a workflow request under a deadline constraint in Hadoop clusters, which is shown to be strongly NP-hard. A fully polynomial-time approximation scheme is designed for a special case with a pipeline-structured workflow on a homogeneous cluster and a heuristic is designed for the generalized problem with an arbitrary directed acyclic graph-structured workflow on a heterogeneous cluster. This problem is further extended to a dynamic version with deadline-constrained MapReduce workflows to minimize dynamic energy consumption in Hadoop clusters. This dissertation proposes a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and also develops corresponding system modules for algorithm implementation in the Hadoop ecosystem. The performance superiority of the proposed solutions in terms of dynamic energy saving and deadline missing rate is illustrated by extensive simulation results in comparison with existing algorithms, and further validated through real-life workflow implementation and experiments using the Oozie workflow engine in Hadoop/YARN systems

    TimeTrader: Exploiting Latency Tail to Save Datacenter Energy for On-line Data-Intensive Applications

    Get PDF
    Datacenters running on-line, data-intensive applications (OLDIs) consume significant amounts of energy. However, reducing their energy is challenging due to their tight response time requirements. A key aspect of OLDIs is that each user query goes to all or many of the nodes in the cluster, so that the overall time budget is dictated by the tail of the replies' latency distribution; replies see latency variations both in the network and compute. Previous work proposes to achieve load-proportional energy by slowing down the computation at lower datacenter loads based directly on response times (i.e., at lower loads, the proposal exploits the average slack in the time budget provisioned for the peak load). In contrast, we propose TimeTrader to reduce energy by exploiting the latency slack in the sub- critical replies which arrive before the deadline (e.g., 80% of replies are 3-4x faster than the tail). This slack is present at all loads and subsumes the previous work's load-related slack. While the previous work shifts the leaves' response time distribution to consume the slack at lower loads, TimeTrader reshapes the distribution at all loads by slowing down individual sub-critical nodes without increasing missed deadlines. TimeTrader exploits slack in both the network and compute budgets. Further, TimeTrader leverages Earliest Deadline First scheduling to largely decouple critical requests from the queuing delays of sub- critical requests which can then be slowed down without hurting critical requests. A combination of real-system measurements and at-scale simulations shows that without adding to missed deadlines, TimeTrader saves 15-19% and 41-49% energy at 90% and 30% loading, respectively, in a datacenter with 512 nodes, whereas previous work saves 0% and 31-37%.Comment: 13 page
    • …
    corecore