    Cloud-native RStudio on Kubernetes for Hopsworks

    To benefit fully from cloud computing, services are designed following the "multi-tenant" architectural model, which aims to maximize resource sharing among users. However, multi-tenancy introduces challenges of security, performance isolation, scaling, and customization. RStudio Server is an open-source Integrated Development Environment (IDE) for the R programming language that is accessed through a web browser. We present the design and implementation of a multi-user distributed system on Hopsworks, a data-intensive AI platform, that follows the multi-tenant model and provides RStudio as Software as a Service (SaaS). We use the most popular cloud-native technologies, Docker and Kubernetes, to solve the problems of performance isolation, security, and scaling that arise in a multi-tenant environment. We further enable secure data sharing between RStudio Server instances to provide data privacy and allow collaboration among RStudio users. We integrate our system with Apache Spark, which can scale to handle Big Data processing workloads. We also provide a UI in which users can supply custom configurations and have full control of their own RStudio Server instances. Our system was tested on a Google Cloud Platform cluster with four worker nodes, each allocated 30 GB of RAM. The tests on this cluster showed that 44 RStudio servers, each with 2 GB of RAM, can run concurrently. Our system can scale out to potentially support hundreds of concurrently running RStudio servers by adding more resources (CPUs and RAM) to the cluster. Comment: 8 pages, 4 figure
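    The memory figures quoted in the abstract can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the constant and function names are hypothetical, and the assumption that every gigabyte on a worker node is usable is an idealisation that the abstract does not make.

```python
# Back-of-the-envelope memory check using the figures quoted in the abstract.
# All names are illustrative; nothing here comes from the Hopsworks code base.

WORKER_NODES = 4          # worker nodes in the test cluster
RAM_PER_NODE_GB = 30      # RAM allocated to each worker node
RAM_PER_SERVER_GB = 2     # RAM given to each RStudio server
REPORTED_SERVERS = 44     # concurrent servers observed in the tests


def memory_upper_bound(nodes: int, ram_per_node_gb: int, ram_per_server_gb: int) -> int:
    """Servers that would fit if all node RAM were usable (an idealisation)."""
    # A server cannot be split across nodes, so divide per node first.
    return nodes * (ram_per_node_gb // ram_per_server_gb)


total_ram_gb = WORKER_NODES * RAM_PER_NODE_GB
used_by_servers_gb = REPORTED_SERVERS * RAM_PER_SERVER_GB
upper_bound = memory_upper_bound(WORKER_NODES, RAM_PER_NODE_GB, RAM_PER_SERVER_GB)

print(f"total worker RAM:        {total_ram_gb} GB")
print(f"RAM used by 44 servers:  {used_by_servers_gb} GB")
print(f"idealised upper bound:   {upper_bound} servers")
```

    Under these assumptions, the 44 reported servers account for 88 GB of the 120 GB of worker RAM, below the idealised bound of 60 servers. The abstract does not say what accounts for the difference; system services and other per-node overhead are a plausible guess, not a reported finding.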

    From Bare Metal to Virtual: Lessons Learned when a Supercomputing Institute Deploys its First Cloud

    As the primary provider of research computing services at the University of Minnesota, the Minnesota Supercomputing Institute (MSI) has long been responsible for serving the needs of a user base numbering in the thousands. In recent years, MSI, like many other HPC centers, has observed a growing need for self-service, on-demand, data-intensive research, as well as the emergence of many new controlled-access datasets for research purposes. In light of this, MSI constructed a new on-premise cloud service, named Stratus, which is architected from the ground up to easily satisfy data-use agreements and fill four gaps left by traditional HPC. The resulting OpenStack cloud, constructed from HPC-specific compute nodes and backed by Ceph storage, is designed to fully comply with controls set forth by the NIH Genomic Data Sharing Policy. Herein, we present twelve lessons learned during the ambitious sprint to take Stratus from inception into production in less than 18 months. Important, and often overlooked, components of this timeline included the development of new leadership roles, staff and user training, and user support documentation. Along the way, the lessons learned extended well beyond the technical challenges often associated with acquiring, configuring, and maintaining large-scale systems. Comment: 8 pages, 5 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US

    Implementing and evaluating an ICON orchestrator

    Over the last 20 years, the cloud computing paradigm has risen to the task of bringing powerful computational services to the masses. Centralizing computer hardware in a few large data centers has brought large monetary savings, but at the cost of a greater geographical distance between the server and the client. As a new generation of thin clients has emerged, e.g. smartphones and IoT devices, the larger latencies induced by these greater distances can limit the applications that could benefit from the vast resources available in cloud computing. Not long after the explosive growth of cloud computing, a new paradigm, edge computing, has risen. Edge computing aims to bring the resources generally found in cloud computing closer to the edge, where many of the end users, clients and data producers reside. In this thesis, I present the edge computing concept as well as the technologies enabling it. Furthermore, I show a few edge computing concepts and architectures, including multi-access edge computing (MEC), fog computing and intelligent containers (ICON). Finally, I present a new edge orchestrator, the ICON Python Orchestrator (IPO), that enables intelligent containers to migrate closer to the users. The IPO tests the feasibility of the ICON concept and provides performance measurements that can be compared to other contemporary edge computing implementations. I present the IPO architecture design, including challenges encountered during the implementation phase and solutions to specific problems, and I show the testing and validation setup. Using the artificial testing and validation network, client migration speeds were measured in three different cases: redirection, cache-hot ICON migration and cache-cold ICON migration. While there is room for improvement, the migration speeds measured are on par with other edge computing implementations

    Elastic, Interoperable and Container-based Cloud Infrastructures for High Performance Computing

    Thesis by compendium. Scientific applications generally involve a variable and unpredictable computational workload that institutions must address by dynamically adjusting the allocation of resources to their different computational needs. Scientific applications may require high capacity, e.g. the concurrent usage of computational resources for processing several independent jobs (High Throughput Computing or HTC), or high capability, by using high-performance resources to solve complex problems (High Performance Computing or HPC). The computational resources required by this type of application usually have a very high cost that may exceed the availability of the institution's resources, or they may not be well suited to the scientific applications, especially in the case of infrastructures prepared for the execution of HPC applications. Indeed, the different parts that compose an application may require different types of computational resources. Nowadays, cloud service platforms have become an efficient solution to meet the needs of HTC applications, as they provide a wide range of computing resources accessible on demand. For this reason, the number of hybrid computational infrastructures has increased in recent years. Hybrid computational infrastructures combine infrastructures hosted on cloud platforms with computational resources hosted in the institutions themselves, which are called on-premise infrastructures. As scientific applications can be processed on different infrastructures, application delivery has become a key issue. 
Nowadays, containers are probably the most popular technology for application delivery, as they ease reproducibility, traceability, versioning, isolation, and portability. The main objective of this thesis is to provide an architecture and a set of services for building hybrid processing infrastructures that fit the needs of different workloads. Hence, the thesis considers aspects such as elasticity and federation. First, the use of vertical and horizontal elasticity was explored by developing a proof of concept that provides vertical elasticity on top of an elastic cloud architecture for data analytics. Afterwards, an elastic cloud architecture comprising heterogeneous computational resources was implemented for medical image processing, using multiple processing queues for jobs with different requirements. This architecture was developed in collaboration with the company QUIBIM. In the last part of the thesis, the previous work was evolved to design and implement an elastic, multi-site and multi-tenant cloud architecture for medical image processing within the framework of the European project PRIMAGE. This architecture uses distributed storage and integrates external services for authentication and authorization based on OpenID Connect (OIDC). The tool kube-authorizer was developed to provide access control to the resources of the processing infrastructure automatically, by creating policies and roles from the information obtained during authentication. Finally, another tool, hpc-connector, was developed to enable the integration of HPC processing infrastructures into cloud infrastructures without requiring modifications to either the HPC infrastructure or the cloud architecture. 
It should be noted that, during this thesis, contributions were made to open-source container and job management technologies by developing open-source tools and components, together with recipes for the automated configuration of the designed architectures from a DevOps perspective. The results obtained support the feasibility of combining vertical and horizontal elasticity to implement deadline-based QoS policies, as well as the feasibility of the federated authentication model for combining public and on-premise clouds. López Huguet, S. (2021). Elastic, Interoperable and Container-based Cloud Infrastructures for High Performance Computing [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/172327

    Containerization in Cloud Computing: performance analysis of virtualization architectures

    The growing adoption of the cloud is strongly influenced by the emergence of technologies that aim to improve the development and deployment processes of enterprise-level applications. The goal of this thesis is to analyze one of these solutions, called “containerization”, and to evaluate in detail how this technology can be adopted in cloud infrastructures as an alternative to complementary solutions such as virtual machines. Until now, the traditional “virtual machine” model has been the predominant solution on the market. The significant architectural difference that containers offer has led to the rapid adoption of this technology, because it greatly improves the management and sharing of resources and delivers significant improvements in the provisioning of individual instances. The thesis examines containerization from both the infrastructure and the application points of view. For the first aspect, performance is analyzed by comparing LXD, Docker and KVM as hypervisors of the OpenStack cloud infrastructure, while the second concerns the development of enterprise-level applications that must be installed on a set of distributed servers. In that case, high-level services such as orchestration are needed. Therefore, the performance of the following solutions is compared: Kubernetes, Docker Swarm, Apache Mesos and Cattle

    Evaluating Security Aspects for Building a Secure Virtual Machine

    One of the essential characteristics of cloud computing that revolutionized the IT business is the sharing of computing resources. Despite all the benefits, security is a major concern in a cloud virtualization environment. One such security issue is the secure management of Virtual Machine (VM) images, which contain operating systems, configured platforms, and data. The confidentiality, availability, and integrity of such images are major concerns because they determine the overall security of the virtual machines. This paper identifies and discusses the attributes that define the degree of security of VM images. It addresses the problem by explaining the different methods and frameworks developed in the past for implementing secure VM images. Finally, the paper analyses the security issues and attributes and proposes a framework that includes an approach for developing secure VM images. This work aims to enhance the security of cloud environments

    Degrees of tenant isolation for cloud-hosted software services : a cross-case analysis

    A challenge when implementing multi-tenancy in a cloud-hosted software service is how to ensure that the performance and resource consumption of one tenant do not adversely affect other tenants. Software designers and architects must achieve an optimal degree of tenant isolation for their chosen application requirements. The objective of this research is to reveal the trade-offs, commonalities, and differences to be considered when implementing the required degree of tenant isolation. This research uses a cross-case analysis of selected open source cloud-hosted software engineering tools to empirically evaluate varying degrees of isolation between tenants. Our research reveals five commonalities across the case studies: disk space reduction, use of locking, low cloud resource consumption, customization and use of plug-in architecture, and choice of multi-tenancy pattern. Two of these common factors compromise tenant isolation: the degree of isolation is reduced when there is no strategy to reduce disk space and when customization and a plug-in architecture are not adopted. In contrast, the degree of isolation improves when careful consideration is given to how to handle a high workload, when locking of data and processes is used to prevent clashes between multiple tenants, and when an appropriate multi-tenancy pattern is selected. The research also revealed five case study differences: size of generated data, cloud resource consumption, sensitivity to workload changes, the effect of the software process, client latency and bandwidth, and type of software process. The degree of isolation is impaired, in our results, by the large size of generated data, high resource consumption by certain software processes, high or fluctuating workload, low client latency, and bandwidth when transferring multiple files between repositories. 
Additionally, this research provides a novel explanatory framework for (i) mapping tenant isolation to different software development processes, cloud resources and layers of the cloud stack; and (ii) explaining the trade-offs that affect tenant isolation (i.e. resource sharing, the number of users/requests, customizability, the size of generated data, the scope of control of the cloud application stack and business constraints) when implementing multi-tenant cloud-hosted software services. This research suggests that software architects have to pay attention to the trade-offs, commonalities, and differences we identify in order to achieve their required degree of tenant isolation

    Security Aware Virtual Machine Allocation Policy to Improve QoS

    Cloud service providers regard managing datacentre energy consumption as a critical operation. The rapidly rising number of datacentres consumes significant energy. To address this challenge, datacentres attempt to reduce the number of active physical servers through virtual machine consolidation. However, because security measures for identifying hostile cloud users are inadequate, security threats on multi-tenant cloud platforms have escalated. In this paper, we propose energy-efficient virtual machine consolidation using a priority-based, security-aware virtual machine allocation policy to improve datacentre security. The proposed solution considers the host threat score before virtual machine placement, which reduces the threat of co-residency attacks without impacting datacentre energy consumption
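    The idea of checking a host threat score before placement can be illustrated with a small sketch. This is not the paper's algorithm: the `Host` fields, the 0-to-1 threat scale, the `max_threat` threshold and the tie-breaking order are all assumptions made for illustration, and the priority-based aspect mentioned in the abstract is not modelled.

```python
# Hypothetical sketch of threat-aware VM placement during consolidation.
# Field names, the threat scale and the threshold are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    free_cpu: int        # spare CPU cores on the host
    threat_score: float  # assumed scale: 0.0 (trusted) to 1.0 (hostile)
    active: bool         # whether the host is already powered on


def place_vm(hosts: List[Host], cpu_needed: int, max_threat: float = 0.5) -> Optional[Host]:
    """Pick a host with enough capacity whose threat score is acceptable."""
    candidates = [
        h for h in hosts
        if h.free_cpu >= cpu_needed and h.threat_score <= max_threat
    ]
    if not candidates:
        return None  # no safe host with capacity; the caller decides what to do
    # Prefer already-active hosts (keeps idle servers off), then the lowest
    # threat score, then the tightest fit.
    best = min(
        candidates,
        key=lambda h: (not h.active, h.threat_score, h.free_cpu - cpu_needed),
    )
    best.free_cpu -= cpu_needed
    best.active = True
    return best


hosts = [
    Host("a", free_cpu=8, threat_score=0.9, active=True),
    Host("b", free_cpu=4, threat_score=0.2, active=True),
    Host("c", free_cpu=16, threat_score=0.1, active=False),
]
chosen = place_vm(hosts, cpu_needed=2)
print(chosen.name if chosen else "no suitable host")
```

    In this sketch, preferring active hosts stands in for the energy goal and the threat filter stands in for the security goal. A real policy would need to define how threat scores are computed and updated, which the abstract does not describe.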

    A Backup-as-a-Service (BaaS) software solution

    Dissertation (master's), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2018. A backup is a replica of any data that can be used to restore its original form. However, the total amount of digital data created worldwide more than doubles every two years and is expected to reach 44 trillion gigabytes in 2020, bringing constant new challenges to backup processes. Enterprise backup is one of the oldest and most frequently performed tasks of infrastructure and operations professionals. Still, most backup systems have been designed and optimized for outdated environments and use cases. This generates frustration over current backup challenges and leads to a greater willingness to modernize and to consider new technologies. Traditional backup and archive solutions are no longer able to meet users' current needs. The ideal modern backup and recovery software product should not only provide features that serve a traditional data center, but also allow the integration and exploitation of the growing cloud, including “backup client as a service” and “backup storage as a service”. The present study aims to propose and deploy a Backup as a Service software solution. To achieve that, the cloud/backup parameters, cloud backup challenges, researched architectures and BaaS system requirements are determined. Then, we select a set of desired BaaS features to be developed, which results in the first truly cloud-based Backup-as-a-Service interface built on a REST API, named “bcloud”. We conduct an online usability survey with a significant number of users and analyze the results. The overall average score on the objective questions, rated from zero to ten, was 8.29%, indicating a very satisfactory user perception of the bcloud BaaS interface prototype

    Secure Cloud Connectivity for Scientific Applications

    Cloud computing improves utilization and flexibility in allocating computing resources while reducing infrastructure costs. However, in many cases cloud technology is still proprietary and tainted by security issues rooted in the multi-user and hybrid cloud environment. A lack of secure connectivity in a hybrid cloud environment hinders the adoption of clouds by scientific communities that need to scale out their local infrastructure using publicly available resources for large-scale experiments. In this article, we present a case study of the DII-HEP secure cloud infrastructure and propose an approach to securely scale out a private cloud deployment to public clouds in order to support hybrid cloud scenarios. A challenge in such scenarios is that cloud vendors may offer varying and possibly incompatible ways to isolate and interconnect virtual machines located in different cloud networks. Our approach is tenant-driven in the sense that the tenant provides its own connectivity mechanism. We provide a qualitative and quantitative analysis of a number of alternatives for solving this problem. We have chosen one of the standardized alternatives, Host Identity Protocol, for further experimentation in a production system because it supports legacy applications in a topologically independent and secure way. Peer reviewe