8 research outputs found
The Workflow Trace Archive: Open-Access Data from Public and Private Computing Infrastructures -- Technical Report
Realistic, relevant, and reproducible experiments often need input traces
collected from real-world environments. We focus in this work on traces of
workflows---common in datacenters, clouds, and HPC infrastructures. We show
that the state-of-the-art in using workflow-traces raises important issues: (1)
the use of realistic traces is infrequent, and (2) the use of realistic, {\it
open-access} traces even more so. Alleviating these issues, we introduce the
Workflow Trace Archive (WTA), an open-access archive of workflow traces from
diverse computing infrastructures and tooling to parse, validate, and analyze
traces. The WTA includes million workflows captured from
computing infrastructures, representing a broad diversity of trace domains and
characteristics. To emphasize the importance of trace diversity, we
characterize the WTA contents and analyze in simulation the impact of trace
diversity on experiment results. Our results indicate significant differences
in characteristics, properties, and workflow structures between workload
sources, domains, and fields.Comment: Technical repor
Automatic deployment and reproducibility of workflow on the Cloud using container virtualization
PhD ThesisCloud computing is a service-oriented approach to distributed computing that has
many attractive features, including on-demand access to large compute resources. One
type of cloud applications are scientific work
ows, which are playing an increasingly
important role in building applications from heterogeneous components. Work
ows are
increasingly used in science as a means to capture, share, and publish computational
analysis. Clouds can offer a number of benefits to work
ow systems, including the
dynamic provisioning of the resources needed for computation and storage, which has
the potential to dramatically increase the ability to quickly extract new results from
the huge amounts of data now being collected.
However, there are increasing number of Cloud computing platforms, each with different
functionality and interfaces. It therefore becomes increasingly challenging to
de ne work
ows in a portable way so that they can be run reliably on different clouds.
As a consequence, work
ow developers face the problem of deciding which Cloud to
select and - more importantly for the long-term - how to avoid vendor lock-in.
A further issue that has arisen with work
ows is that it is common for them to stop
being executable a relatively short time after they were created. This can be due to
the external resources required to execute a work
ow - such as data and services -
becoming unavailable. It can also be caused by changes in the execution environment
on which the work
ow depends, such as changes to a library causing an error when a
work
ow service is executed. This "work
ow decay" issue is recognised as an impediment
to the reuse of work
ows and the reproducibility of their results. It is becoming
a major problem, as the reproducibility of science is increasingly dependent on the
reproducibility of scientific work
ows.
In this thesis we presented new solutions to address these challenges. We propose a new
approach to work
ow modelling that offers a portable and re-usable description of the
work
ow using the TOSCA specification language. Our approach addresses portability
by allowing work
ow components to be systematically specifed and automatically
- v -
deployed on a range of clouds, or in local computing environments, using container
virtualisation techniques.
To address the issues of reproducibility and work
ow decay, our modelling and deployment
approach has also been integrated with source control and container management
techniques to create a new framework that e ciently supports dynamic work
ow deployment,
(re-)execution and reproducibility.
To improve deployment performance, we extend the framework with number of new
optimisation techniques, and evaluate their effect on a range of real and synthetic
work
ows.Ministry of Higher Education and
Scientific Research in Iraq and Mosul Universit