2 research outputs found

    Automatic deployment and reproducibility of workflow on the Cloud using container virtualization

    PhD Thesis. Cloud computing is a service-oriented approach to distributed computing that has many attractive features, including on-demand access to large compute resources. One type of cloud application is the scientific workflow, which plays an increasingly important role in building applications from heterogeneous components. Workflows are increasingly used in science as a means to capture, share, and publish computational analysis. Clouds can offer a number of benefits to workflow systems, including the dynamic provisioning of the resources needed for computation and storage, which has the potential to dramatically increase the ability to quickly extract new results from the huge amounts of data now being collected. However, there is an increasing number of Cloud computing platforms, each with different functionality and interfaces. It therefore becomes increasingly challenging to define workflows in a portable way so that they can be run reliably on different clouds. As a consequence, workflow developers face the problem of deciding which Cloud to select and, more importantly for the long term, how to avoid vendor lock-in. A further issue that has arisen with workflows is that it is common for them to stop being executable a relatively short time after they were created. This can be due to the external resources required to execute a workflow, such as data and services, becoming unavailable. It can also be caused by changes in the execution environment on which the workflow depends, such as changes to a library causing an error when a workflow service is executed. This "workflow decay" issue is recognised as an impediment to the reuse of workflows and the reproducibility of their results. It is becoming a major problem, as the reproducibility of science is increasingly dependent on the reproducibility of scientific workflows. In this thesis we present new solutions to address these challenges.
We propose a new approach to workflow modelling that offers a portable and re-usable description of the workflow using the TOSCA specification language. Our approach addresses portability by allowing workflow components to be systematically specified and automatically deployed on a range of clouds, or in local computing environments, using container virtualisation techniques. To address the issues of reproducibility and workflow decay, our modelling and deployment approach has also been integrated with source control and container management techniques to create a new framework that efficiently supports dynamic workflow deployment, (re-)execution and reproducibility. To improve deployment performance, we extend the framework with a number of new optimisation techniques, and evaluate their effect on a range of real and synthetic workflows. Ministry of Higher Education and Scientific Research in Iraq and Mosul University

    Gathering solutions and providing APIs for their orchestration to implement continuous software delivery

    In traditional IT environments, it is common for software updates and new releases to take several weeks or even months to become available to end users. Many IT vendors and providers of software products and services therefore face the challenge of delivering updates considerably more frequently, because users, customers, and other stakeholders expect accelerated feedback loops and significantly faster responses to changing demands and emerging issues. Taking this challenge seriously is of utmost economic importance for IT organizations that wish to remain competitive. Continuous software delivery is an emerging paradigm adopted by an increasing number of organizations in order to address this challenge. It aims to drastically shorten release cycles while ensuring the delivery of high-quality software. Adopting continuous delivery essentially means making it economical to constantly deliver changes in small batches. Infrequent high-risk releases with many accumulated changes are thereby replaced by a continuous stream of small, low-risk updates. To gain the benefits of continuous delivery, a high degree of automation is required. This is technically achieved by implementing continuous delivery pipelines consisting of different application-specific stages (build, test, production, etc.) to automate most parts of the application delivery process. Each stage relies on a corresponding application environment such as a build environment or production environment. This work presents concepts and approaches to implement continuous delivery pipelines based on systematically gathered solutions to be used and orchestrated as building blocks of application environments. Initially, the presented Gather'n'Deliver method is centered around a shared knowledge base to provide the foundation for gathering, utilizing, and orchestrating diverse solutions such as deployment scripts, configuration definitions, and Cloud services.
Several classification dimensions and taxonomies are discussed in order to facilitate a systematic categorization of solutions, in addition to expressing application environment requirements that are satisfied by those solutions. The presented GatherBase framework enables the collaborative and automated gathering of solutions through solution repositories. These repositories are the foundation for building diverse knowledge base variants that provide fine-grained query mechanisms to find and retrieve solutions, for example, to be used as building blocks of specific application environments. Combining and integrating diverse solutions at runtime is achieved by orchestrating their APIs. Since some solutions such as lower-level executable artifacts (deployment scripts, configuration definitions, etc.) do not immediately provide their functionality through APIs, additional APIs need to be supplied. This issue is addressed by different approaches, such as the presented Any2API framework, which is intended to generate individual APIs for such artifacts. An integrated architecture in conjunction with corresponding prototype implementations aims to demonstrate the technical feasibility of the presented approaches. Finally, various validation scenarios evaluate the approaches within the scope of continuous delivery and application environments and even beyond.