428 research outputs found

    Scheduling Real-time Divisible Loads in Cluster Computing Environment

    Cluster computing is of tremendous significance in solving massively parallel workloads. Divisible Load Theory (DLT) has proven very successful at optimizing the use of system resources by partitioning arbitrarily divisible loads appropriately among the cluster nodes. Arbitrarily divisible loads have significant real-world applications in high-energy and particle physics. In this thesis, various algorithms for a cluster computing environment are studied, including those based on divisible load theory, confirming that DLT-based algorithms perform better in most cases. The loads considered in this thesis are hard real-time tasks with associated deadlines. Specifically, a comparison is made between clusters in which the head node does not participate in processing the workloads and clusters in which it does. A new mathematical formula is derived for the task execution time in the new scenario, where the head node possesses front-end processing capability. The existing Real-Time Divisible Load Theory algorithms are then implemented using this new formula to compare scheduling performance in this new scenario against the conventional scenario, where the head node lacks front-end processing capability.

    Static Scheduling Strategies for Heterogeneous Systems

    In this paper, we consider static scheduling techniques for heterogeneous systems, such as clusters and grids. We deal in turn with minimum makespan scheduling, divisible load scheduling, and steady-state scheduling. Finally, we discuss the limitations of static scheduling approaches.

    Divisible load scheduling of image processing applications on the heterogeneous star and tree networks using a new genetic algorithm

    This paper addresses the divisible load scheduling of image processing applications on heterogeneous star and multi-level tree networks. In our platforms, processors and network links have different speeds. In addition, computation and communication overheads are considered. A new genetic algorithm for minimizing the processing time of low-level image applications using divisible load theory is introduced. The closed-form solution for the processing time, the image fractions that should be allocated to each processor, the optimum number of participating processors, and the optimal sequence for load distribution are derived. The new concept of an equivalent processor in a tree network is introduced, and the effect of different image and kernel sizes on processing time and speedup is investigated. Finally, several numerical experiments are presented to demonstrate the efficiency of our algorithm.

    Efficient Parallel Video Encoding on Heterogeneous Systems

    Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014. In this study we propose an efficient method for collaborative H.264/AVC inter-loop encoding in heterogeneous CPU+GPU systems. This method relies on a specifically developed, extensive library of highly optimized parallel algorithms for both CPU and GPU architectures and for all inter-loop modules. To minimize the overall encoding time, the method integrates adaptive load balancing for the most computationally intensive inter-prediction modules, based on dynamically built functional performance models of the heterogeneous devices and inter-loop modules. The proposed method also introduces efficient communication-aware techniques, which maximize data reuse and decrease the overhead of expensive data transfers in collaborative video encoding. The experimental results show that the proposed method is capable of achieving real-time video encoding for very demanding video coding parameters, i.e., full HD video format, a 64×64-pixel search area, and exhaustive motion estimation. This work was supported by national funds through FCT – Fundação para a Ciência e a Tecnologia, under projects PEst-OE/EEI/LA0021/2013, PTDC/EEI-ELC/3152/2012 and PTDC/EEA-ELC/117329/2010.

    Application-centric Resource Provisioning for Amazon EC2 Spot Instances

    In late 2009, Amazon introduced spot instances to offer their unused resources at lower cost with reduced reliability. Amazon's spot instances allow customers to bid on unused Amazon EC2 capacity and run those instances for as long as their bid exceeds the current spot price. The spot price changes periodically based on supply and demand, and customers whose bids exceed it gain access to the available spot instances. Customers may expect to run their services at lower cost with spot instances than with on-demand or reserved instances. However, reliability is compromised, since the instances (IaaS) providing the service (SaaS) may become unavailable at any time without any notice to the customer. Checkpointing and migration schemes are of great use in coping with such situations. In this paper we study various checkpointing schemes that can be used with spot instances. We also devise algorithms for checkpointing on top of an application-centric resource provisioning framework that increase reliability while reducing cost significantly.
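
    The bidding rule described in this abstract (an instance runs only while the bid exceeds the current spot price) can be illustrated with a minimal sketch. The function name `running_intervals`, the price trace, and the bid value are hypothetical and chosen only for illustration; they are not taken from the paper, and real spot pricing involves details this sketch ignores.

```python
from typing import List, Tuple


def running_intervals(prices: List[float], bid: float) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges during which bid > spot price.

    `prices` is an assumed, evenly sampled spot-price trace; each gap between
    consecutive ranges corresponds to the instance becoming unavailable.
    """
    intervals: List[Tuple[int, int]] = []
    start = None  # index where the current running period began, if any
    for t, price in enumerate(prices):
        if bid > price:
            if start is None:
                start = t  # instance becomes available
        elif start is not None:
            intervals.append((start, t))  # instance is revoked at time t
            start = None
    if start is not None:
        intervals.append((start, len(prices)))  # still running at end of trace
    return intervals


if __name__ == "__main__":
    trace = [0.10, 0.12, 0.20, 0.11, 0.09, 0.25]  # illustrative prices only
    print(running_intervals(trace, bid=0.15))
```

    Each gap between intervals is a point where work would be lost without a checkpoint, which is the situation the checkpointing schemes in the abstract are meant to handle.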

    Highly scalable algorithms for scheduling tasks and provisioning machines on heterogeneous computing systems

    Includes bibliographical references. 2015 Summer. As high performance computing systems increase in size, new and more efficient algorithms are needed to schedule work on the machines, understand the performance trade-offs inherent in the system, and determine which machines to provision. The extreme scale of these newer systems requires unique task scheduling algorithms that are capable of handling millions of tasks and thousands of machines. A highly scalable scheduling algorithm is developed that computes high-quality schedules, especially for large problem sizes. Large-scale computing systems also consume vast amounts of electricity, leading to high operating costs. Through the use of novel resource allocation techniques, system administrators can examine this trade-off space to quantify how much a given performance level will cost in electricity, or to see what kind of performance can be expected under a given energy budget. Trading off energy and makespan is often difficult for companies because it is unclear how each affects profit. A monetary-based model of high performance computing is presented, and a highly scalable algorithm is developed to quickly find the schedule that maximizes the profit per unit time. As more high performance computing needs are being met with cloud computing, algorithms are needed to determine the types of machines that are best suited to a particular workload. An algorithm is designed to find the best set of computing resources to allocate to the workload, taking into account the uncertainty in the task arrival rates, task execution times, and power consumption. Reward rate, cost, failure rate, and power consumption can be optimized, as desired, to trade off these conflicting objectives.

    Scalable dimensioning of resilient Lambda Grids
