384 research outputs found
Managing Workflows on top of a Cloud Computing Orchestrator for using heterogeneous environments on e-Science
[EN] Scientific workflows (SWFs) are widely used to model processes
in e-Science. SWFs are executed by means of workflow management systems
(WMSs), which orchestrate the workload on top of computing infrastructures.
The advent of cloud computing infrastructures has opened the door to using
on-demand infrastructures to complement or even replace local infrastructures.
However, new issues have arisen, such as the integration of hybrid resources
or the compromise between infrastructure reutilisation and elasticity. In this
article, we present an ad hoc solution for managing workflows exploiting the
capabilities of cloud orchestrators to deploy resources on demand according to
the workload and to combine heterogeneous cloud providers (such as on-premise
clouds and public clouds) and traditional infrastructures (clusters) to minimise
costs and response time. The work does not propose yet another WMS but
demonstrates the benefits of the integration of cloud orchestration when running
complex workflows. The article shows several configuration experiments from
a realistic comparative genomics workflow called Orthosearch, to migrate
memory-intensive workload to public infrastructures while keeping other blocks
of the experiment running locally. The article reports running times and costs, suggesting best practices.
This paper acknowledges the support of the EUBrazilCC project, funded by the European Commission (STREP 614048) and the Brazilian MCT/CNPq N. 13/2012, for the use of its infrastructure. The authors would also like to thank the Spanish 'Ministerio de Economia y Competitividad' for the project 'Clusters Virtuales Elasticos y Migrables sobre Infraestructuras Cloud Hibridas' (TIN2013-44390-R).
Carrión Collado, AA.; Caballer Fernández, M.; Blanquer Espert, I.; Kotowski, N.; Jardim, R.; Dávila, AMR. (2017). Managing Workflows on top of a Cloud Computing Orchestrator for using heterogeneous environments on e-Science. International Journal of Web and Grid Services, 13(4):375-402. doi:10.1504/IJWGS.2017.10003225
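The abstract's central idea, migrating memory-intensive blocks to public infrastructure while keeping the rest local, can be sketched as a simple placement rule. This is an illustrative sketch only, not the paper's actual scheduler; all names, the block list, and the 16 GB threshold are hypothetical.

```python
# Illustrative sketch: route each workflow block to an infrastructure
# based on its memory demand. Block names and the threshold are
# hypothetical, not taken from the Orthosearch experiments.

def place_blocks(blocks, local_mem_gb=16):
    """Assign each block to 'local' or 'public-cloud'.

    blocks: list of (name, mem_gb) tuples. Memory-hungry blocks go to
    the public cloud; the rest stay on the local cluster to avoid
    on-demand costs.
    """
    placement = {}
    for name, mem_gb in blocks:
        placement[name] = "public-cloud" if mem_gb > local_mem_gb else "local"
    return placement

workflow = [("hmmer-search", 64), ("parse-results", 2), ("build-tree", 8)]
print(place_blocks(workflow))
# {'hmmer-search': 'public-cloud', 'parse-results': 'local', 'build-tree': 'local'}
```

A real cloud orchestrator would additionally weigh data-transfer time and per-hour cost, but the threshold rule captures the hybrid-placement trade-off the article explores.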
StreamFlow: cross-breeding cloud with HPC
Workflows are among the most commonly used tools in a variety of execution
environments. Many of them target a specific environment; few of them make it
possible to execute an entire workflow in different environments, e.g.
Kubernetes and batch clusters. We present a novel approach to workflow
execution, called StreamFlow, that complements the workflow graph with the
declarative description of potentially complex execution environments, and that
makes it possible to execute a workflow across multiple sites that do not share
a common data space. StreamFlow is then exemplified on a novel bioinformatics
pipeline for single-cell transcriptomic data analysis.
Comment: 30 pages - 2020 IEEE Transactions on Emerging Topics in Computing
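The key idea, keeping the workflow graph separate from a declarative binding of each step to an execution environment, can be sketched as follows. The binding schema and step names here are hypothetical illustrations, not StreamFlow's real configuration format.

```python
# Hedged sketch of the StreamFlow idea: the workflow graph stays
# environment-agnostic, while a separate declarative mapping binds each
# step to a target environment (e.g. Kubernetes vs. a Slurm batch
# cluster). All names are illustrative.

bindings = {
    "align-reads": {"target": "kubernetes", "image": "aligner:1.0"},
    "cluster-cells": {"target": "slurm", "partition": "bigmem"},
}

def dispatch(step):
    """Return where a step should run. Because sites need not share a
    common data space, a real runtime would also stage each step's
    inputs and outputs between sites."""
    env = bindings.get(step, {"target": "local"})
    return env["target"]

print(dispatch("align-reads"))  # kubernetes
```

Separating the two descriptions means the same pipeline can be retargeted to a different mix of sites by editing only the bindings.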
Scalable Multi-cloud Platform to Support Industry and Scientific Applications
Cloud computing offers resources on demand and without large capital investments. As such, it is attractive to many industry and scientific application areas that require large computation and storage facilities. Although Infrastructure as a Service (IaaS) clouds provide elasticity and on-demand resource access, the challenges represented by multi-cloud capabilities and application-level scalability are still largely unsolved. The CloudSME Simulation Platform (CSSP), extended with the Microservices-based Cloud Application-level Dynamic Orchestrator (MiCADO), addresses such issues. CSSP is a generic multi-cloud access platform for the development and execution of large-scale industry and scientific simulations on heterogeneous cloud resources. MiCADO provides application-level scalability to optimise execution time and costs. This paper outlines how these technologies have been developed in various European research projects, and showcases several application case studies from manufacturing, engineering and the life sciences where these tools have been successfully utilised to execute large-scale simulations in an optimised way on heterogeneous cloud infrastructures.
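Application-level scaling of the kind MiCADO provides boils down to choosing a worker count that meets a time target at acceptable cost. The sketch below is purely illustrative, not MiCADO's actual policy engine, and assumes perfectly linear scaling of a batch of independent jobs.

```python
# Hedged sketch of deadline-driven scale-out: pick the smallest worker
# count whose estimated runtime meets the deadline, and report the
# resulting on-demand cost. Assumes ideal linear speed-up; all
# parameters are hypothetical.

def choose_workers(jobs, secs_per_job, deadline_s, cost_per_worker_h,
                   max_workers=32):
    for n in range(1, max_workers + 1):
        runtime = jobs * secs_per_job / n          # seconds, ideal scaling
        if runtime <= deadline_s:
            cost = n * cost_per_worker_h * runtime / 3600
            return n, round(cost, 2)
    return None  # deadline unreachable within max_workers

print(choose_workers(100, 60, 600, 1.0))  # (10, 1.67)
```

A production orchestrator would refresh this decision continuously as queue length and measured job times change, rather than computing it once.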
Collaborative Cloud Computing Framework for Health Data with Open Source Technologies
The proliferation of sensor technologies and advancements in data collection
methods have enabled the accumulation of very large amounts of data.
Increasingly, these datasets are considered for scientific research. However,
designing a system architecture that achieves high performance in terms of
parallelization, query processing time, and aggregation of heterogeneous data
types (e.g., time series, images, and structured data), while also supporting
reproducible scientific research, remains a major challenge. This is especially
true for health sciences research, where the systems must be i) easy to use
with the flexibility to manipulate data at the most granular level, ii)
agnostic of programming language kernel, iii) scalable, and iv) compliant with
the HIPAA privacy law. In this paper, we review the existing literature for
such big data systems for scientific research in health sciences and identify
the gaps of the current system landscape. We propose a novel architecture for
software-hardware-data ecosystem using open source technologies such as Apache
Hadoop, Kubernetes and JupyterHub in a distributed environment. We also
evaluate the system using a large clinical data set of 69M patients.
Comment: This paper was accepted at ACM-BCB 202
Deployment and Operation of Complex Software in Heterogeneous Execution Environments
This open access book provides an overview of the work developed within the SODALITE project, which aims at facilitating the deployment and operation of distributed software on top of heterogeneous infrastructures, including cloud, HPC and edge resources. The experts participating in the project describe how SODALITE works and how it can be exploited by end users. While multiple languages and tools are available in the literature to support DevOps teams in the automation of deployment and operation steps, these activities still require specific know-how and skills that cannot be found in average teams. The SODALITE framework tackles this problem by offering modelling and smart editing features that allow those we call Application Ops Experts to work without knowing low-level details about the adopted, potentially heterogeneous, infrastructures. The framework also offers mechanisms to verify the quality of the defined models, generate the corresponding executable infrastructural code, automatically wrap application components within proper execution containers, orchestrate all activities concerned with the deployment and operation of all system components, and support on-the-fly self-adaptation and refactoring.
Network Service Orchestration: A Survey
Business models of network service providers are undergoing an evolving
transformation fueled by vertical customer demands and technological advances
such as 5G, Software Defined Networking~(SDN), and Network Function
Virtualization~(NFV). Emerging scenarios call for agile network services
consuming network, storage, and compute resources across heterogeneous
infrastructures and administrative domains. Coordinating resource control and
service creation across interconnected domains and diverse technologies becomes
a grand challenge. Research and development efforts are being devoted to
enabling orchestration processes to automate, coordinate, and manage the
deployment and operation of network services. In this survey, we delve into the
topic of Network Service Orchestration~(NSO) by reviewing the historical
background, relevant research projects, enabling technologies, and
standardization activities. We define key concepts and propose a taxonomy of
NSO approaches and solutions to pave the way towards a common understanding of
the various ongoing efforts around the realization of diverse NSO application
scenarios. Based on the analysis of the state of affairs, we present a series
of open challenges and research opportunities, altogether contributing to a
timely and comprehensive survey on the vibrant and strategic topic of network
service orchestration.
Comment: Accepted for publication at Computer Communications Journal
INDIGO-Datacloud: foundations and architectural description of a Platform as a Service oriented to scientific computing
In this paper we describe the architecture of a Platform as a Service (PaaS) oriented to computing and data analysis. In order to clarify the choices we made, we explain the features using practical examples, applied to several known usage patterns in the area of HEP computing. The proposed architecture is devised to provide researchers with a unified view of distributed computing infrastructures, focusing on facilitating seamless access. In this respect, the Platform is able to profit from the most recent developments for computing and processing large amounts of data, and to exploit current storage and preservation technologies, with the appropriate mechanisms to ensure security and privacy.
INDIGO-DataCloud is co-funded by the Horizon 2020 Framework Programme. Peer reviewed
Component-aware Orchestration of Cloud-based Enterprise Applications, from TOSCA to Docker and Kubernetes
Enterprise IT is currently facing the challenge of coordinating the
management of complex, multi-component applications across heterogeneous cloud
platforms. Containers and container orchestrators provide a valuable solution
to deploy multi-component applications over cloud platforms, by coupling the
lifecycle of each application component to that of its hosting container. We
hereby propose a solution for going beyond such a coupling, based on the OASIS
standard TOSCA and on Docker: a novel approach for deploying multi-component
applications on top of existing container orchestrators that allows each
component to be managed independently of the container used to run it. We also
present prototype tools implementing our approach, and we show how we
effectively exploited them to carry out a concrete case study.
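The decoupling this abstract describes, a component lifecycle managed separately from the lifecycle of its hosting container, can be illustrated with a minimal object model. The class and state names below are illustrative stand-ins, not the paper's actual TOSCA node types.

```python
# Hedged sketch of component/container decoupling: a component tracks
# its own lifecycle state, so it can be reconfigured in place without
# destroying and recreating the container that hosts it. Names are
# hypothetical.

class Component:
    def __init__(self, name):
        self.name = name
        self.state = "created"

    def configure(self):
        self.state = "configured"

    def start(self):
        self.state = "started"

class Container:
    """A running container that can host several components."""
    def __init__(self, image):
        self.image = image
        self.components = []

    def deploy(self, component):
        self.components.append(component)
        component.configure()
        component.start()

host = Container("ubuntu:22.04")
api = Component("api-server")
host.deploy(api)
# Reconfigure the component in place: the container keeps running.
api.configure()
print(api.state)  # configured
```

With the coupled model criticised in the abstract, that last reconfiguration step would instead require replacing the whole container.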