3 research outputs found

    Simulation of a workflow execution as a real Cloud by adding noise

    Cloud computing provides a cheap and elastic platform for executing large scientific workflow applications, but it raises two challenges for predicting the makespan (total execution time): the performance instability of Cloud instances and the varying decisions of dynamic schedulers. IT managers need makespan estimates in order to calculate the cost of execution, and Cloud simulators can provide them. However, an ideal simulated environment produces the same output for the same workflow schedule and input parameters, and thus cannot reproduce the Cloud's variable behavior. In this paper, we define a model and a methodology that add noise to the simulation so that its behavior matches the Cloud's. We propose several metrics that model the Cloud's fluctuating behavior; injecting them into the simulator makes it behave close to the real Cloud. Instead of naively applying a normal distribution built from the mean and standard deviation of the workflow tasks' runtimes, we inject two noise components into the tasks' runtimes: the noisiness of tasks within a workflow (defined as the average runtime deviation) and the noisiness provoked by the environment over the whole workflow (defined as the average environmental deviation). To measure the quality of the simulation, we introduce the parameter inaccuracy, which quantifies the relative difference between the simulated and measured values. A series of experiments with different workflows and Cloud resources was conducted to evaluate our model and methodology. The results show that the inaccuracy of the makespan's mean value was reduced by up to 59 times compared with naively using the normal distribution. Additionally, we analyse the impact of particular workflow and Cloud parameters, which shows that the Cloud performance instability is simulated more accurately for the small instance type (inaccuracy of up to 11.5%) than for the medium one (inaccuracy of up to 35%), regardless of the workflow. Since our approach requires executing the workflow in the Cloud to collect data and learn its behavior, we conduct a comprehensive sensitivity analysis. We determine the minimum amount of data that must be collected, i.e. the minimum number of test cases that must be repeated per experiment, to keep the inaccuracy of our noising parameter below 12%. The sensitivity analysis also shows that the correctness of our model is independent of the workflow parallel section size, which reduces the number of experiments needed and clarifies the model's dependence on Cloud resource and workflow parameters. With this sensitivity analysis, we show that we can reduce the inaccuracy of the naive approach with only 40% of the total number of executions per experiment in the learning phase (in our case, 20 executions per experiment instead of 50) and only half of all experiments, i.e. down to 20% overall, or 120 test cases instead of 600.
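
    The abstract does not spell out the formulas, but the core idea of injecting a per-task noise and a per-run environmental noise into simulated runtimes, and then scoring the simulation with a relative inaccuracy, can be sketched as follows (a minimal illustration in Python; the multiplicative form of the noise, the uniform sampling, and all names are assumptions rather than the authors' implementation):

```python
import numpy as np

def noisy_runtimes(base_runtimes, task_dev, env_dev, rng=None):
    """Perturb simulated task runtimes with two noise components.

    task_dev: average runtime deviation of tasks within the workflow
              (task-level noisiness), e.g. 0.08 for roughly +/-8%.
    env_dev:  average environmental deviation applied to the whole run
              (environment-level noisiness), e.g. 0.05.
    The multiplicative form and the uniform sampling are illustrative
    assumptions only.
    """
    rng = rng or np.random.default_rng()
    env_factor = 1.0 + rng.uniform(-env_dev, env_dev)            # one draw per run
    task_factors = 1.0 + rng.uniform(-task_dev, task_dev,
                                     size=len(base_runtimes))    # one draw per task
    return np.asarray(base_runtimes) * task_factors * env_factor

def inaccuracy(simulated, measured):
    """Relative difference between a simulated and a measured value."""
    return abs(simulated - measured) / measured

# Toy example: a sequential workflow, so the makespan is simply the sum of runtimes.
base = [12.0, 30.0, 18.0, 25.0]                  # hypothetical task runtimes in seconds
sims = [noisy_runtimes(base, task_dev=0.08, env_dev=0.05).sum() for _ in range(50)]
print(inaccuracy(np.mean(sims), measured=88.4))  # measured makespan is made up
```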

    Analysing the Performance Instability Correlation with Various Workflow and Cloud Parameters

    The Cloud is an ecosystem in which virtual machine instances start and terminate asynchronously, either on user demand or automatically when the load rapidly increases or decreases. Although this dynamic environment allows renting computing or storage resources more cheaply than buying them, it does not guarantee the stable execution over time that a traditional physical environment provides. This is even more pronounced for workflow execution, since workflows contain many data and control dependencies, which make the makespan unstable when a workflow is executed in the Cloud at different points in time. In this paper we analyse several workflow and cloud-environment parameters that are expected to affect workflow execution instability and investigate the correlation between them. The cloud parameters include the number of instances and their type, as well as the correlation with efficient or inefficient execution of workflow parallel sections. We conduct a series of experiments, repeating each experiment over 30 test cases, in order to evaluate the instability for different cloud and workflow parameters. The results show a negligible correlation between each pair of parameters, as well as between the tasks and file transfers within the workflow. Contrary to expectations, the distribution of the makespan per experiment does not always follow the normal distribution, and this deviation is also not correlated with a particular cloud or workflow parameter.
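
    A rough sketch of the kind of analysis described, i.e. correlating a cloud parameter with the observed makespan instability and checking each experiment's makespan distribution for normality (the data layout, the column names, and the choice of Pearson correlation and the Shapiro-Wilk test are assumptions made for this illustration):

```python
import pandas as pd
from scipy import stats

# Hypothetical results table: one row per test case (30 per experiment in the paper).
runs = pd.DataFrame({
    "experiment":    ["e1"]*3 + ["e2"]*3 + ["e3"]*3,
    "num_instances": [2, 2, 2, 4, 4, 4, 8, 8, 8],
    "makespan_s":    [310.2, 298.7, 325.9, 190.4, 171.8, 201.3, 120.4, 141.8, 119.9],
})

# Instability per experiment: coefficient of variation of the makespan.
summary = runs.groupby("experiment").agg(
    num_instances=("num_instances", "first"),
    instability=("makespan_s", lambda m: m.std() / m.mean()),
)

# Correlation between a cloud parameter and the observed instability.
r, p = stats.pearsonr(summary["num_instances"], summary["instability"])
print(f"Pearson r = {r:.2f}, p = {p:.3f}")

# Normality check of the makespan distribution within each experiment.
for name, group in runs.groupby("experiment"):
    _, p_norm = stats.shapiro(group["makespan_s"])
    print(f"{name}: Shapiro-Wilk p = {p_norm:.3f} (p < 0.05 suggests non-normal)")
```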

    The Workflow Trace Archive: Open-access data from public and private computing infrastructures

    Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. In this work, we focus on traces of workflows, which are common in datacenters, clouds, and HPC infrastructures. We show that the state of the art in using workflow traces raises important issues: (1) the use of realistic traces is infrequent, and (2) the use of realistic, open-access traces even more so. To alleviate these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures, together with tooling to parse, validate, and analyze traces. The WTA includes >48 million workflows captured from >10 computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields.
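
    The WTA's actual trace format and tooling API are not described in the abstract, so the sketch below only illustrates the kind of per-workflow characterization such an analysis might perform, using a hypothetical flat task table (all column names and the trace structure are assumptions):

```python
import pandas as pd

# Hypothetical workflow trace: one row per task, parents listed by task id.
tasks = pd.DataFrame({
    "workflow_id": [1, 1, 1, 2, 2],
    "task_id":     [10, 11, 12, 20, 21],
    "runtime_s":   [4.0, 6.0, 2.0, 9.0, 3.0],
    "parents":     [[], [10], [10, 11], [], [20]],
})

def critical_path(group: pd.DataFrame) -> float:
    """Longest runtime-weighted path through one workflow's task DAG."""
    runtime = dict(zip(group["task_id"], group["runtime_s"]))
    parents = dict(zip(group["task_id"], group["parents"]))
    finish = {}                                  # earliest finish time per task
    for tid in group["task_id"]:                 # assumes tasks appear in topological order
        start = max((finish[p] for p in parents[tid]), default=0.0)
        finish[tid] = start + runtime[tid]
    return max(finish.values())

# Per-workflow characteristics of the kind compared across trace sources.
rows = []
for wf_id, g in tasks.groupby("workflow_id"):
    rows.append({
        "workflow_id": wf_id,
        "num_tasks": len(g),
        "total_work_s": g["runtime_s"].sum(),
        "critical_path_s": critical_path(g),
    })
print(pd.DataFrame(rows).set_index("workflow_id"))
```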