2 research outputs found
Automatic deployment and reproducibility of workflow on the Cloud using container virtualization
PhD Thesis
Cloud computing is a service-oriented approach to distributed computing that has
many attractive features, including on-demand access to large compute resources. One
type of cloud application is the scientific workflow, which plays an increasingly
important role in building applications from heterogeneous components. Workflows are
increasingly used in science as a means to capture, share, and publish computational
analysis. Clouds can offer a number of benefits to workflow systems, including the
dynamic provisioning of the resources needed for computation and storage, which has
the potential to dramatically increase the ability to quickly extract new results from
the huge amounts of data now being collected.
However, there is an increasing number of Cloud computing platforms, each with different
functionality and interfaces. It therefore becomes increasingly challenging to
define workflows in a portable way so that they can be run reliably on different clouds.
As a consequence, workflow developers face the problem of deciding which Cloud to
select and, more importantly for the long term, how to avoid vendor lock-in.
A further issue that has arisen with workflows is that it is common for them to stop
being executable a relatively short time after they were created. This can be due to
the external resources required to execute a workflow, such as data and services,
becoming unavailable. It can also be caused by changes in the execution environment
on which the workflow depends, such as changes to a library causing an error when a
workflow service is executed. This "workflow decay" issue is recognised as an impediment
to the reuse of workflows and the reproducibility of their results. It is becoming
a major problem, as the reproducibility of science is increasingly dependent on the
reproducibility of scientific workflows.
In this thesis we present new solutions to address these challenges. We propose a new
approach to workflow modelling that offers a portable and re-usable description of the
workflow using the TOSCA specification language. Our approach addresses portability
by allowing workflow components to be systematically specified and automatically
deployed on a range of clouds, or in local computing environments, using container
virtualisation techniques.
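As a rough illustration of the portability idea above, the following minimal Python sketch describes workflow components independently of any deployment target and derives a deployment order from their dependencies. It is not the thesis's TOSCA model or tooling: the `Component` class, the `deployment_order` and `plan` helpers, the image names, and the target labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Component:
    """Hypothetical, target-independent description of one workflow component."""
    name: str
    image: str                 # container image that packages the component
    depends_on: tuple = ()     # names of components that must be deployed first


def deployment_order(components):
    """Return component names so that every dependency precedes its dependents."""
    graph = {c.name: set(c.depends_on) for c in components}
    return list(TopologicalSorter(graph).static_order())


def plan(components, target):
    """Render the same description as deployment steps for a given target."""
    by_name = {c.name: c for c in components}
    return [
        f"{target}: run {by_name[name].image} as {name}"
        for name in deployment_order(components)
    ]


# Illustrative three-step workflow; the names and images are made up.
WORKFLOW = [
    Component("import-data", "example/import:1.0"),
    Component("align", "example/align:2.1", depends_on=("import-data",)),
    Component("report", "example/report:0.3", depends_on=("align",)),
]

if __name__ == "__main__":
    # The description is unchanged; only the target label differs.
    for target in ("local", "cloud-a"):
        print("\n".join(plan(WORKFLOW, target)))
```

The point of the sketch is only that one description can drive deployment to several targets; a real system would replace the string rendering with calls to a container runtime or cloud API.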
To address the issues of reproducibility and workflow decay, our modelling and deployment
approach has also been integrated with source control and container management
techniques to create a new framework that efficiently supports dynamic workflow deployment,
(re-)execution and reproducibility.
To improve deployment performance, we extend the framework with a number of new
optimisation techniques, and evaluate their effect on a range of real and synthetic
workflows.
Ministry of Higher Education and Scientific Research in Iraq and Mosul Universit
Gathering solutions and providing APIs for their orchestration to implement continuous software delivery
In traditional IT environments, it is common for software updates and new releases to take up to several weeks or even months to be eventually available to end users. Therefore, many IT vendors and providers of software products and services face the challenge of delivering updates considerably more frequently. This is because users, customers, and other stakeholders expect accelerated feedback loops and significantly faster responses to changing demands and issues that arise. Thus, taking this challenge seriously is of utmost economic importance for IT organizations if they wish to remain competitive. Continuous software delivery is an emerging paradigm adopted by an increasing number of organizations in order to address this challenge. It aims to drastically shorten release cycles while ensuring the delivery of high-quality software. Adopting continuous delivery essentially means to make it economical to constantly deliver changes in small batches. Infrequent high-risk releases with lots of accumulated changes are thereby replaced by a continuous stream of small and low-risk updates.
To gain from the benefits of continuous delivery, a high degree of automation is required. This is technically achieved by implementing continuous delivery pipelines consisting of different application-specific stages (build, test, production, etc.) to automate most parts of the application delivery process. Each stage relies on a corresponding application environment such as a build environment or production environment. This work presents concepts and approaches to implement continuous delivery pipelines based on systematically gathered solutions to be used and orchestrated as building blocks of application environments. Initially, the presented Gather'n'Deliver method is centered around a shared knowledge base to provide the foundation for gathering, utilizing, and orchestrating diverse solutions such as deployment scripts, configuration definitions, and Cloud services. Several classification dimensions and taxonomies are discussed in order to facilitate a systematic categorization of solutions, in addition to expressing application environment requirements that are satisfied by those solutions. The presented GatherBase framework enables the collaborative and automated gathering of solutions through solution repositories. These repositories are the foundation for building diverse knowledge base variants that provide fine-grained query mechanisms to find and retrieve solutions, for example, to be used as building blocks of specific application environments. Combining and integrating diverse solutions at runtime is achieved by orchestrating their APIs. Since some solutions such as lower-level executable artifacts (deployment scripts, configuration definitions, etc.) do not immediately provide their functionality through APIs, additional APIs need to be supplied. This issue is addressed by different approaches, such as the presented Any2API framework that is intended to generate individual APIs for such artifacts. 
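The pipeline structure described above, with ordered stages that each rely on their own application environment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Gather'n'Deliver method or the GatherBase and Any2API frameworks: the `Stage` class, the `run_pipeline` function, and the example stage actions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """Hypothetical pipeline stage bound to the environment it runs in."""
    name: str
    environment: str                 # e.g. a build or production environment
    action: Callable[[str], bool]    # returns True if the change passes the stage


def run_pipeline(stages: List[Stage], change: str) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that the change passed.
    """
    completed = []
    for stage in stages:
        if not stage.action(change):
            break                    # a failing stage blocks all later stages
        completed.append(stage.name)
    return completed


# Illustrative stages; the actions stand in for real build, test, and deploy steps.
PIPELINE = [
    Stage("build", "build-env", lambda change: True),
    Stage("test", "test-env", lambda change: "broken" not in change),
    Stage("production", "prod-env", lambda change: True),
]

if __name__ == "__main__":
    print(run_pipeline(PIPELINE, "fix-typo"))
    print(run_pipeline(PIPELINE, "broken-change"))
```

Stopping at the first failing stage is what keeps each small change low-risk: a change that does not pass the test stage never reaches production.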
An integrated architecture in conjunction with corresponding prototype implementations aims to demonstrate the technical feasibility of the presented approaches. Finally, various validation scenarios evaluate the approaches within the scope of continuous delivery and application environments and even beyond.