5 research outputs found

    A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment

    Get PDF
    We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Task applications running on cloud resources with the non-proportional cost to performance ratios. Our analytical solutions perform in both known and unknown running time of the given application. It tries to optimize users' utility by choosing the most desirable tradeoff between the make-span and the total incurred expense. We propose a schema to provide a near-optimal deployment of BoT application regarding users' preferences. Our approach is to provide user with a set of Pareto-optimal solutions, and then she may select one of the possible scheduling points based on her internal utility function. Our framework can cope with uncertainty in the tasks' execution time using two methods, too. First, an estimation method based on a Monte Carlo sampling called AA algorithm is presented. It uses the minimum possible number of sampling to predict the average task running time. Second, assuming that we have access to some code analyzer, code profiling or estimation tools, a hybrid method to evaluate the accuracy of each estimation tool in certain interval times for improving resource allocation decision has been presented. We propose approximate deployment strategies that run on hybrid cloud. In essence, proposed strategies first determine either an estimated or an exact optimal schema based on the information provided from users' side and environmental parameters. Then, we exploit dynamic methods to assign tasks to resources to reach an optimal schema as close as possible by using two methods. A fast yet simple method based on First Fit Decreasing algorithm, and a more complex approach based on the approximation solution of the transformed problem into a subset sum problem. Extensive experiment results conducted on a hybrid cloud platform confirm that our framework can deliver a near optimal solution respecting user's utility function

    ADEPT Runtime/Scalability Predictor in support of Adaptive Scheduling

    Get PDF
    A job scheduler determines the order and duration of the allocation of resources, e.g. CPU, to the tasks waiting to run on a computer. Round-Robin and First-Come-First-Serve are examples of algorithms for making such resource allocation decisions. Parallel job schedulers make resource allocation decisions for applications that need multiple CPU cores, on computers consisting of many CPU cores connected by different interconnects. An adaptive parallel scheduler is a parallel scheduler that is capable of adjusting its resource allocation decisions based on the current resource usage and demand. Adaptive parallel schedulers that decide the numbers of CPU cores to allocate to a parallel job provide more flexibility and potentially improve performance significantly for both local and grid job scheduling compared to non-adaptive schedulers. A major reason why adaptive schedulers are not yet used practically is due to lack of knowledge of the scalability curves of the applications, and high cost of existing white-box approaches for scalability prediction. We show that a runtime and scalability prediction tool can be developed with 3 requirements: accuracy comparable to white-box methods, applicability, and robustness. Applicability depends only on knowledge feasible to gain in a production environment. Robustness addresses anomalous behaviour and unreliable predictions. We present ADEPT, a speedup and runtime prediction tool that satisfies all criteria for both single problem size and across different problem sizes of a parallel application. ADEPT is also capable of handling anomalies and judging reliability of its predictions. We demonstrate these using experiments with MPI and OpenMP implementations of NAS benchmarks and seven real applications

    Autonomous grid scheduling using probabilistic job runtime scheduling

    Get PDF
    Computational Grids are evolving into a global, service-oriented architecture – a universal platform for delivering future computational services to a range of applications of varying complexity and resource requirements. The thesis focuses on developing a new scheduling model for general-purpose, utility clusters based on the concept of user requested job completion deadlines. In such a system, a user would be able to request each job to finish by a certain deadline, and possibly to a certain monetary cost. Implementing deadline scheduling is dependent on the ability to predict the execution time of each queued job, and on an adaptive scheduling algorithm able to use those predictions to maximise deadline adherence. The thesis proposes novel solutions to these two problems and documents their implementation in a largely autonomous and self-managing way. The starting point of the work is an extensive analysis of a representative Grid workload revealing consistent workflow patterns, usage cycles and correlations between the execution times of jobs and its properties commonly collected by the Grid middleware for accounting purposes. An automated approach is proposed to identify these dependencies and use them to partition the highly variable workload into subsets of more consistent and predictable behaviour. A range of time-series forecasting models, applied in this context for the first time, were used to model the job execution times as a function of their historical behaviour and associated properties. Based on the resulting predictions of job runtimes a novel scheduling algorithm is able to estimate the latest job start time necessary to meet the requested deadline and sort the queue accordingly to minimise the amount of deadline overrun. The testing of the proposed approach was done using the actual job trace collected from a production Grid facility. The best performing execution time predictor (the auto-regressive moving average method) coupled to workload partitioning based on three simultaneous job properties returned the median absolute percentage error centroid of only 4.75%. This level of prediction accuracy enabled the proposed deadline scheduling method to reduce the average deadline overrun time ten-fold compared to the benchmark batch scheduler. Overall, the thesis demonstrates that deadline scheduling of computational jobs on the Grid is achievable using statistical forecasting of job execution times based on historical information. The proposed approach is easily implementable, substantially self-managing and better matched to the human workflow making it well suited for implementation in the utility Grids of the future

    Workload Modeling for Computer Systems Performance Evaluation

    Full text link
    corecore