
    High availability using virtualization

    High availability has always been one of the main concerns of a data center. Until now, high availability has been achieved through host-by-host redundancy, a highly expensive approach in terms of hardware and human costs. Virtualization offers a new approach to the problem: using virtualization, it is possible to build a redundancy system for all the services running in a data center. This approach allows the running virtual machines to be distributed over the physical servers that are up and running, by exploiting the features of the virtualization layer: starting, stopping and moving virtual machines between physical hosts. The system (3RC) is based on a finite state machine with hysteresis, providing the possibility to restart each virtual machine on any physical host, or to reinstall it from scratch. A complete infrastructure has been developed to install operating system and middleware in a few minutes. To virtualize the main servers of a data center, a new procedure has been developed to migrate physical hosts to virtual ones. The whole Grid data center SNS-PISA is currently running in a virtual environment under the high availability system. As an extension of the 3RC architecture, several storage solutions, from NAS to SAN, have been tested to store and centralize all the virtual disks, to ensure data safety and access from everywhere. Exploiting virtualization and the ability to automatically reinstall a host, we provide a sort of host on-demand, where an action on a virtual machine is performed only when a disaster occurs. Comment: PhD Thesis in Information Technology Engineering: Electronics, Computer Science, Telecommunications, pp. 94, University of Pisa [Italy]
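
    The thesis code itself is not part of this abstract; the following Python sketch (invented names, simplified states) only illustrates the idea of a finite state machine with hysteresis deciding when a monitored virtual machine should be restarted on another host or reinstalled from scratch.

        # Minimal sketch (not the 3RC implementation) of a recovery controller
        # driven by a finite state machine with hysteresis: a VM is escalated to
        # "restart" or "reinstall" only after several consecutive failed health
        # checks, and de-escalated only after several consecutive successes.
        from dataclasses import dataclass

        @dataclass
        class VmRecoveryFsm:
            fail_threshold: int = 3      # consecutive failures before escalating
            ok_threshold: int = 2        # consecutive successes before de-escalating
            state: str = "RUNNING"       # RUNNING -> RESTART -> REINSTALL
            _fails: int = 0
            _oks: int = 0

            def observe(self, healthy: bool) -> str:
                """Feed one health-check result and return the action to take."""
                if healthy:
                    self._fails, self._oks = 0, self._oks + 1
                    if self._oks >= self.ok_threshold:
                        self.state, self._oks = "RUNNING", 0
                    return "none"
                self._oks, self._fails = 0, self._fails + 1
                if self._fails < self.fail_threshold:
                    return "none"                      # hysteresis: wait for more evidence
                self._fails = 0
                if self.state == "RUNNING":
                    self.state = "RESTART"
                    return "restart_vm_on_any_host"    # hypothetical action name
                self.state = "REINSTALL"
                return "reinstall_vm_from_scratch"     # hypothetical action name

        fsm = VmRecoveryFsm()
        for check in [True, False, False, False, False, False, False, True, True]:
            action = fsm.observe(check)
            if action != "none":
                print(fsm.state, action)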

    A Pattern-Based Approach to Scaffold the IT Infrastructure Design Process

    Context. The design of Information Technology (IT) infrastructures is a challenging task, since it requires proficiency in several areas that are rarely mastered by a single person, thus raising communication problems among those in charge of conceiving, deploying, operating and maintaining/managing them. Most IT infrastructure designs are based on proprietary models, known as blueprints or product-oriented architectures, defined by vendors to facilitate the configuration of a particular solution based upon their services and products portfolio. Existing blueprints can be facilitators in the design of solutions for a particular vendor or technology. However, since organizations may have infrastructure components from multiple vendors, the use of blueprints aligned with commercial product(s) may cause integration problems among these components and can lead to vendor lock-in. Additionally, these blueprints have a short lifecycle, due to their association with product version(s) or a specific technology, which hampers their usage as a tool for the reuse of IT infrastructure knowledge. Objectives. The objectives of this dissertation are (i) to mitigate the inability to reuse knowledge in terms of best practices in the design of IT infrastructures and (ii) to simplify the usage of this knowledge, making IT infrastructure designs simpler, quicker and better documented, while facilitating the integration of components from different vendors and minimizing the communication problems between teams. Method. We conducted an online survey and performed a systematic literature review to support the state of the art and to provide evidence that this research was relevant and had not been conducted before. A model-driven approach was also used for the formalization and empirical validation of well-formedness rules to enhance the overall process of designing IT infrastructures. To simplify and support the design process, a modeling tool, including its abstract and concrete syntaxes, was also extended to include the main contributions of this dissertation. Results. We obtained 123 responses to the online survey, the majority from people with more than 15 years of experience with IT infrastructures. The respondents confirmed our claims regarding the lack of formality and the documentation problems in knowledge transfer, and only 19% considered their current practices for representing IT infrastructures efficient. A language for modeling IT infrastructures, including an abstract and a concrete syntax, is proposed to address the problem of informality in their design. A catalog of IT infrastructure patterns is also proposed to allow best practices in their design to be expressed. The modeling tool was also evaluated: according to 84% of the respondents, this approach decreases the effort associated with IT infrastructure design, and 89% considered that the use of a repository of infrastructure patterns will help to improve the overall quality of IT infrastructure representations. A controlled experiment was also performed to assess the effectiveness of both the proposed language and the pattern-based IT infrastructure design process supported by the tool. Conclusion. With this work, we contribute to improving the current state of the art in the design of IT infrastructures, replacing ad-hoc methods with more formal ones to address the problems of ambiguity, traceability and documentation, among others, that characterize most IT infrastructure representations.
Categories and Subject Descriptors:C.0 [Computer Systems Organization]: System architecture; D.2.10 [Software Engineering]: Design-Methodologies; D.2.11 [Software Engineering]: Software Architectures-Patterns
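
    The dissertation's modeling language and pattern catalog are not reproduced in this abstract. As a purely illustrative sketch (all names and the rule below are invented), the following Python shows the general idea of encoding an infrastructure design as data and checking it against one machine-readable well-formedness rule.

        # Illustrative sketch only: NOT the dissertation's language or tool, just an
        # example of encoding an infrastructure design and one well-formedness rule
        # so that designs can be checked automatically instead of by inspection.
        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Node:
            name: str
            kind: str          # e.g. "server", "switch", "firewall"

        @dataclass
        class Design:
            nodes: list
            links: set         # set of frozenset({name_a, name_b})

            def neighbors(self, name: str):
                return {next(iter(l - {name})) for l in self.links if name in l}

        def redundant_uplinks_rule(design: Design) -> list:
            """Hypothetical well-formedness rule: every server must be linked
            to at least two distinct switches."""
            switches = {n.name for n in design.nodes if n.kind == "switch"}
            errors = []
            for n in design.nodes:
                if n.kind == "server" and len(design.neighbors(n.name) & switches) < 2:
                    errors.append(f"{n.name}: fewer than two switch uplinks")
            return errors

        design = Design(
            nodes=[Node("web1", "server"), Node("sw1", "switch"), Node("sw2", "switch")],
            links={frozenset({"web1", "sw1"})},
        )
        print(redundant_uplinks_rule(design))   # ['web1: fewer than two switch uplinks']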

    Enhancing the Programmability of Cloud Object Storage

    In a world that is increasingly dependent on technology, digital data is generated at an unprecedented scale. This leads companies that require large storage space, such as Netflix or Dropbox, to use cloud object storage solutions, mainly thanks to their built-in characteristics such as simplicity, scalability and high availability. However, cloud object stores face three main challenges: 1) Flexible management of multi-tenant workloads. Commonly, cloud object stores are multi-tenant systems, meaning that all tenants share the same system resources, which can lead to interference problems. Furthermore, it is complex to manage heterogeneous storage policies at massive scale. 2) Data self-management. Cloud object stores do not offer much flexibility regarding data self-management by tenants. Typically, they are rigid systems, which prevents tenants from handling the specific requirements of their objects. 3) Elastic computation close to the data. Placing computations close to the data can be useful to reduce data transfers, but the challenge is how to achieve elasticity in those computations without provoking resource contention and interference in the storage layer. In this thesis, we present three novel research contributions that solve these challenges. Firstly, we introduce the first Software-defined Storage (SDS) architecture for cloud object stores that separates the control plane from the data plane, allowing multi-tenant workloads to be managed in a flexible and dynamic way, for example by applying different bandwidth service levels to different tenants. Secondly, we design a novel policy abstraction called microcontroller that transforms common objects into smart objects, enabling tenants to programmatically manage their behavior; for example, a content-level access control microcontroller attached to a specific object can filter its content depending on who is accessing it. Finally, we present the first elastic data-driven serverless computing platform that mitigates the resource contention problem of placing computation close to the data.
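
    The microcontroller abstraction is only described at a high level here; the following Python sketch (names and API invented for illustration, not the thesis implementation) conveys the flavor of attaching a content-level access-control policy to an individual object so that it runs on the data path when the object is read.

        # Hedged sketch of the "microcontroller" idea: per-object logic modeled as a
        # callable policy that the object store would run on GET requests.
        import json

        def content_level_acl(obj_bytes: bytes, tenant: str) -> bytes:
            """Hypothetical microcontroller: redact salary fields unless the reader
            belongs to the 'hr' tenant."""
            records = json.loads(obj_bytes)
            if tenant != "hr":
                for rec in records:
                    rec.pop("salary", None)
            return json.dumps(records).encode()

        class SmartObject:
            def __init__(self, data: bytes, microcontrollers=()):
                self.data = data
                self.microcontrollers = list(microcontrollers)   # run on every GET

            def get(self, tenant: str) -> bytes:
                out = self.data
                for mc in self.microcontrollers:
                    out = mc(out, tenant)
                return out

        obj = SmartObject(
            json.dumps([{"name": "Ana", "salary": 50000}]).encode(),
            microcontrollers=[content_level_acl],
        )
        print(obj.get("analytics"))   # b'[{"name": "Ana"}]'  (salary redacted)
        print(obj.get("hr"))          # salary preserved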

    Optimization, high availability and redundancy in information communications and technology using networks and systems virtualization architectures

    MSc in Informatics Engineering. In an era marked by the so-called technological revolution, in which computing power has become a critical production factor, just as manpower was in the industrial revolution, the dispersal of computing power across numerous facilities and sites quickly became unsustainable, both economically and in terms of maintenance and management. This constraint led to the need to adopt new strategies for providing information systems that are equally capable but more consolidated, while also increasing the desired levels of the classic set of information assurances: availability, confidentiality, authenticity and integrity. At this point enters a topic not as new as it may seem, virtualization, which after two decades of oblivion presents itself as the best candidate to solve the problem technology faces today. Throughout this thesis, it will be explained what this technology represents and what approach should be taken in the evolution from a classical systems architecture model to a model based on optimization, high availability, redundancy and consolidation of information and communications technology, using network and systems virtualization. A number of good management principles to adopt will also be taken into account to ensure quality standards in service provision, culminating in a practical application of this model to the infrastructure of the company Metro do Porto, SA. Briefly, this document can almost be classified as a step-by-step manual for understanding and implementing this model, transversal to any infrastructure without extremely particular needs or constraints that would prevent the conversion.

    Development and application in clinical practice of Computer-aided Diagnosis systems for the early detection of lung cancer

    Lung cancer is the main cause of cancer-related deaths in both Europe and the United States, because it is often diagnosed at late stages of the disease, when the survival rate is very low compared to the first, asymptomatic stage. Lung cancer screening with annual low-dose Computed Tomography (CT) reduces lung cancer 5-year mortality by about 20% in comparison to annual screening with chest radiography. However, the detection of pulmonary nodules in low-dose chest CT scans is a very difficult task for radiologists, because of the large number (300-500) of slices to be analyzed. In order to support radiologists, researchers have developed Computer-aided Detection (CAD) algorithms for the automated detection of pulmonary nodules in chest CT scans. Despite the proven benefits of those systems on radiologists' detection sensitivity, the use of CADs in clinical practice has not yet become widespread. The main objective of this thesis is to investigate and tackle the issues underlying this inconsistency. In particular, in Chapter 2 we introduce M5L, a fully automated Web- and Cloud-based CAD for the automated detection of pulmonary nodules in chest CT scans. This system introduces a new paradigm in clinical practice, by making CAD systems available without requiring radiologists to install any additional software or hardware. The proposed solution provides an innovative, cost-effective approach for clinical structures. In Chapter 3 we present our international challenge aiming at a large-scale validation of state-of-the-art CAD systems. We also show that the combination of different CAD systems reaches a performance much higher than that of any stand-alone system developed so far. Our results open the possibility of introducing into clinical practice very high-performing CAD systems, which miss only a tiny fraction of clinically relevant nodules. Finally, we tested the performance of M5L on clinical data sets. In Chapter 4 we present the results of its clinical validation, which prove the positive impact of CAD as a second reader in the diagnosis of pulmonary metastases in oncological patients with extra-thoracic cancers. The proposed approaches have the potential to exploit at their best the features of different algorithms, developed independently, for any possible clinical application, setting up a collaborative environment for algorithm comparison, combination, clinical validation and, if all of the above are successful, clinical practice.
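
    As a purely illustrative sketch of the combination idea (not the M5L or challenge code; the merging radius and the fusion rule are invented), candidates from several CAD systems can be merged by spatial proximity so that a finding reported by multiple systems ranks higher than a single-system one.

        # Toy combiner for CAD candidate nodules given as (x, y, z, score) in mm.
        import math

        def combine_cad_candidates(cad_outputs, merge_radius_mm=5.0):
            """cad_outputs: list (one per CAD system) of candidate lists.
            Returns merged candidates ranked by a simple vote-weighted score."""
            merged = []   # each entry: [x, y, z, [scores]]
            for candidates in cad_outputs:
                for x, y, z, s in candidates:
                    for m in merged:
                        if math.dist((x, y, z), (m[0], m[1], m[2])) <= merge_radius_mm:
                            m[3].append(s)        # same finding seen by another CAD
                            break
                    else:
                        merged.append([x, y, z, [s]])
            # Hypothetical fusion rule: mean score boosted by the number of votes.
            ranked = [(x, y, z, sum(ss) / len(ss) * len(ss) ** 0.5, len(ss))
                      for x, y, z, ss in merged]
            return sorted(ranked, key=lambda c: c[3], reverse=True)

        cad_a = [(10.0, 20.0, 30.0, 0.7), (80.0, 15.0, 40.0, 0.4)]
        cad_b = [(11.0, 21.0, 29.0, 0.6)]
        for x, y, z, fused, votes in combine_cad_candidates([cad_a, cad_b]):
            print(f"({x:.0f},{y:.0f},{z:.0f}) fused={fused:.2f} votes={votes}")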

    Bayesian Prognostic Framework for High-Availability Clusters

    Critical services from domains as diverse as finance, manufacturing and healthcare are often delivered by complex enterprise applications (EAs). High-availability clusters (HACs) are software-managed IT infrastructures that enable these EAs to operate with minimum downtime. To that end, HACs monitor the health of EA layers (e.g., application servers and databases) and resources (i.e., components), and attempt to reinitialise or restart failed resources swiftly. When this is unsuccessful, HACs try to failover (i.e., relocate) the resource group to which the failed resource belongs to another server. If the resource group failover is also unsuccessful, or when a system-wide critical failure occurs, HACs initiate a complete system failover. Despite the availability of multiple commercial and open-source HAC solutions, these HACs (i) disregard important sources of historical and runtime information, and (ii) have limited reasoning capabilities. Therefore, they may conservatively perform unnecessary resource group or system failovers or delay justified failovers for longer than necessary. This thesis introduces the first HAC taxonomy, uses it to carry out an extensive survey of current HAC solutions, and develops a novel Bayesian prognostic (BP) framework that addresses the significant HAC limitations that are mentioned above and are identified by the survey. The BP framework comprises four modules. The first module is a technique for modelling high availability using a combination of established and new HAC characteristics. The second is a suite of methods for obtaining and maintaining the information required by the other modules. The third is a HAC-independent Bayesian decision network (BDN) that predicts whether resource failures can be managed locally (i.e., without failovers). The fourth is a method for constructing a HAC-specific Bayesian network for the fast prediction of resource group and system failures. Used together, these modules reduce the downtime of HAC-protected EAs significantly. The experiments presented in this thesis show that the BP framework can deliver downtimes between 5.5 and 7.9 times smaller than those obtained with an established open-source HAC.
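
    As an illustration of the kind of decision the BDN supports (this is not the thesis's network; the structure and all probabilities below are invented), a toy Bayesian calculation deciding between a local restart and a failover might look as follows.

        # Toy Bayesian decision: can this resource failure be handled locally
        # (restart) or does it need a failover? Single hidden cause: transient
        # fault vs. persistent fault. All numbers are made up for illustration.
        def p_local_recovery(consecutive_failures: int, cpu_saturated: bool) -> float:
            """P(local restart succeeds | evidence), via Bayes' rule."""
            p_transient = 0.7   # prior
            p_ev_given_transient = (0.6 ** consecutive_failures) * (0.2 if cpu_saturated else 0.8)
            p_ev_given_persistent = (0.9 ** consecutive_failures) * (0.6 if cpu_saturated else 0.4)
            num = p_ev_given_transient * p_transient
            den = num + p_ev_given_persistent * (1 - p_transient)
            p_trans_post = num / den
            # Local restart succeeds mostly for transient faults.
            return 0.95 * p_trans_post + 0.10 * (1 - p_trans_post)

        def decide(consecutive_failures: int, cpu_saturated: bool, threshold: float = 0.5) -> str:
            p = p_local_recovery(consecutive_failures, cpu_saturated)
            return f"local restart (p={p:.2f})" if p >= threshold else f"failover (p={p:.2f})"

        print(decide(1, cpu_saturated=False))   # likely local restart
        print(decide(4, cpu_saturated=True))    # likely failover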

    Dependability Benchmarking of Network Function Virtualization

    Network Function Virtualization (NFV) is an emerging networking paradigm that aims to reduce costs and time-to-market, improve manageability, and foster competition and innovative services. NFV exploits virtualization and cloud computing technologies to turn physical network functions into Virtualized Network Functions (VNFs), which are implemented in software and run as Virtual Machines (VMs) on commodity hardware located in high-performance data centers, namely Network Function Virtualization Infrastructures (NFVIs). The NFV paradigm relies on cloud computing and virtualization technologies to provide carrier-grade services, i.e., services that are highly reliable and available, with fast and automatic failure recovery mechanisms. The availability of many virtualization solutions for NFV poses the question of which virtualization technology should be adopted in order to fulfill the requirements described above. Currently, there are limited solutions for analyzing, in quantitative terms, the performance and reliability trade-offs, which are important concerns for the adoption of NFV. This thesis deals with the assessment of the reliability and performance of NFV systems. It proposes a methodology, including context, measures, and faultloads, for conducting dependability benchmarks in NFV, according to the general principles of dependability benchmarking. To this aim, a fault injection framework has been designed and implemented for the virtualization technologies used as case studies in this thesis. This framework is successfully used to conduct an extensive experimental campaign, in which we compare two candidate virtualization technologies for NFV adoption: the commercial, hypervisor-based virtualization platform VMware vSphere, and the open-source, container-based virtualization platform Docker. These technologies are assessed in the context of a high-availability, NFV-oriented IP Multimedia Subsystem (IMS). The analysis of the experimental results reveals that i) fault management mechanisms are crucial in NFV in order to provide accurate failure detection and start the subsequent failover actions, and ii) fault injection proves to be a valuable way to introduce uncommon scenarios in the NFVI, which can be fundamental to providing a highly reliable service in production.
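
    A minimal sketch of a fault injection experiment in this spirit (this is not the thesis's framework; the container name and probe URL below are assumptions) kills a Docker container to emulate a VNF crash and measures how long an external health probe stays unavailable.

        # Inject a crash fault into a container-based VNF and measure the outage.
        import subprocess, time, urllib.request

        def service_up(url: str, timeout: float = 1.0) -> bool:
            try:
                with urllib.request.urlopen(url, timeout=timeout) as r:
                    return r.status == 200
            except Exception:
                return False

        def inject_crash_and_measure(container: str, probe_url: str, max_wait: float = 120.0) -> float:
            """Inject a crash fault and return the measured service outage in seconds."""
            subprocess.run(["docker", "kill", container], check=True)   # fault injection
            t0 = time.time()
            while time.time() - t0 < max_wait:
                if service_up(probe_url):                               # recovery/failover done
                    return time.time() - t0
                time.sleep(0.5)
            return float("inf")                                         # recovery did not happen

        if __name__ == "__main__":
            # Hypothetical IMS component name and health endpoint.
            outage = inject_crash_and_measure("ims-bono-1", "http://localhost:8080/ping")
            print(f"measured outage: {outage:.1f} s")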

    A new MDA-SOA based framework for intercloud interoperability

    Cloud computing has been one of the most important topics in Information Technology, aiming to assure scalable and reliable on-demand services over the Internet. Expanding the application scope of cloud services requires cooperation between clouds from different providers that have heterogeneous functionalities. This collaboration between different cloud vendors can provide better Quality of Service (QoS) at a lower price. However, current cloud systems have been developed without concern for seamless cloud interconnection, and they do not currently support intercloud interoperability to enable collaboration between cloud service providers. Hence, this PhD work addresses the interoperability issue between cloud providers as a challenging research objective. This thesis proposes a new framework which supports inter-cloud interoperability in a heterogeneous cloud computing resource environment, with the goal of dispatching the workload to the most effective clouds available at runtime. Analysing the different methodologies that have been applied to resolve various interoperability-related problem scenarios led us to adopt Model Driven Architecture (MDA) and Service Oriented Architecture (SOA) methods as appropriate approaches for our inter-cloud framework. Moreover, since distributing operations in a cloud-based environment is an NP-complete problem, a Genetic Algorithm (GA) based job scheduler is proposed as part of the interoperability framework, offering workload migration with the best performance at the least cost. A new Agent-Based Simulation (ABS) approach is proposed to model the inter-cloud environment with three types of agents: Cloud Subscriber agents, Cloud Provider agents, and Job agents. The ABS model is used to evaluate the proposed framework. Funded by Fundação para a Ciência e a Tecnologia (FCT) (grant reference SFRH/BD/33965/2009) and by the EC 7th Framework Programme under grant agreement n° FITMAN 604674 (http://www.fitman-fi.eu).
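
    A toy GA-based scheduler in the spirit described above (not the thesis's scheduler; the job sizes, cloud parameters and the fitness weighting are invented) evolves an assignment of jobs to clouds that trades completion time against cost.

        # Toy genetic algorithm: chromosomes assign each job to one cloud.
        import random
        random.seed(0)

        JOBS = [4, 8, 2, 6, 3]                       # job sizes (arbitrary work units)
        CLOUDS = [                                    # hypothetical cloud characteristics
            {"speed": 2.0, "price": 1.0},
            {"speed": 4.0, "price": 3.0},
        ]

        def fitness(assign):                          # lower is better
            loads = [0.0] * len(CLOUDS)
            cost = 0.0
            for job, c in zip(JOBS, assign):
                loads[c] += job / CLOUDS[c]["speed"]
                cost += job * CLOUDS[c]["price"]
            return max(loads) + 0.2 * cost            # makespan + weighted cost

        def evolve(pop_size=30, generations=50, mutation=0.1):
            pop = [[random.randrange(len(CLOUDS)) for _ in JOBS] for _ in range(pop_size)]
            for _ in range(generations):
                pop.sort(key=fitness)
                parents = pop[: pop_size // 2]        # truncation selection
                children = []
                while len(children) < pop_size - len(parents):
                    a, b = random.sample(parents, 2)
                    cut = random.randrange(1, len(JOBS))
                    child = a[:cut] + b[cut:]         # one-point crossover
                    if random.random() < mutation:    # mutation: reassign one job
                        child[random.randrange(len(JOBS))] = random.randrange(len(CLOUDS))
                    children.append(child)
                pop = parents + children
            return min(pop, key=fitness)

        best = evolve()
        print("best assignment:", best, "fitness:", round(fitness(best), 2))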

    Market driven elastic secure infrastructure

    In today’s Data Centers, a combination of factors leads to the static allocation of physical servers and switches into dedicated clusters, such that it is difficult to add or remove hardware from these clusters for short periods of time. This silofication of the hardware leads to inefficient use of clusters. This dissertation proposes a novel architecture for improving the efficiency of clusters by enabling them to add or remove bare-metal servers for short periods of time. We demonstrate, by implementing a working prototype of the architecture, that such silos can be broken and that it is possible to share servers between clusters that are managed by different tools, have different security requirements, and are operated by tenants of the Data Center that may not trust each other. Physical servers and switches in a Data Center are grouped for a combination of reasons. They are used for different purposes (staging, production, research, etc.); host applications required for servicing specific workloads (HPC, Cloud, Big Data, etc.); and/or are configured to meet stringent security and compliance requirements. Additionally, the provisioning systems and tools, such as OpenStack Ironic, MaaS and Foreman, that are used to manage these clusters take control of the servers, making it difficult to add or remove hardware from their control. Moreover, these clusters are typically stood up with sufficient capacity to meet the anticipated peak workload. This leads to inefficient usage of the clusters: they are under-utilized during off-peak hours, and in cases where demand exceeds capacity the clusters suffer from degraded quality of service (QoS) or may violate service level objectives (SLOs). Although today’s clouds offer huge benefits in terms of on-demand elasticity, economies of scale, and a pay-as-you-go model, many organizations are reluctant to move their workloads to the cloud. Organizations that (i) need total control of their hardware, (ii) have custom deployment practices, (iii) need to meet stringent security and compliance requirements, or (iv) do not want to pay the high costs incurred from running workloads in the cloud prefer to own their hardware and host it in a data center. This includes a large section of the economy, including financial companies, medical institutions, and government agencies that continue to host their own clusters outside of the public cloud. Since all the clusters are unlikely to undergo peak demand at the same time, there is an opportunity to improve the efficiency of clusters by sharing resources between them. The dissertation describes the design and implementation of the Market Driven Elastic Secure Infrastructure (MESI) as an alternative to the public cloud and as an architecture for the lowest layer of the public cloud to improve its efficiency. It allows mutually non-trusting physically deployed services to share the physical servers of a data center efficiently. The approach proposed here is to build a system composed of a set of services, each fulfilling a specific functionality. A tenant of MESI has to trust only a minimal functionality of the tenant that offers the hardware resources; the rest of the services can be deployed by each tenant themselves. MESI is based on the idea of enabling tenants to share hardware they own with tenants they may not trust and between clusters with different security requirements.
The architecture provides control and freedom of choice to the tenants, whether they wish to deploy and manage these services themselves or use them from a trusted third party. MESI services fit into three layers that build on each other to provide: 1) Elastic Infrastructure, 2) Elastic Secure Infrastructure, and 3) Market-driven Elastic Secure Infrastructure. 1) The Hardware Isolation Layer (HIL), the bottommost layer of MESI, is designed for moving nodes between the multiple tools and schedulers used for managing the clusters. HIL controls the layer-2 switches and bare-metal servers so that tenants can elastically adjust the size of their clusters in response to changing workload demand, and it enables the movement of nodes between clusters with minimal to no modifications to the tools and workflows used for managing these clusters. 2) The Elastic Secure Infrastructure (ESI) builds on HIL to enable the sharing of servers between clusters with different security requirements and between mutually non-trusting tenants of the Data Center. ESI enables the borrowing tenant to minimize its trust in the node provider and to take control of the trade-offs between cost, performance, and security. This enables the sharing of nodes between tenants that are not only part of the same organization but may belong to different organizations in a co-located Data Center. 3) The Bare-metal Marketplace is an incentive-based system that uses the economic principles of the marketplace to encourage tenants to share their servers with others, not just when they do not need them but also when others need them more. It gives tenants the ability to define their own cluster objectives and sharing constraints, and the freedom to decide the number of nodes they wish to share with others. MESI is evaluated using prototype implementations at each layer of the architecture. (i) The HIL prototype, implemented in only 3000 Lines of Code (LOC), is able to support many provisioning tools and schedulers with little to no modification, adds no overhead to the performance of the clusters, and is in active production use at the MOC, managing over 150 servers and 11 switches. (ii) The ESI prototype builds on the HIL prototype and adds to it an attestation service, a provisioning service, and a deterministically built open-source firmware. Results demonstrate that it is possible to build a cluster that is secure, elastic, and fairly quick to set up, while the tenant requires only minimal trust in the provider for the availability of the node. (iii) The MESI prototype demonstrates the feasibility of a one-of-a-kind multi-provider marketplace for trading bare-metal servers in which providers also use the nodes. It uses agents that trade bare-metal servers in the marketplace to meet the requirements of their clusters, and its evaluation shows that all the clusters benefit from participating in the marketplace. Results show that, compared to operating as silos, individual clusters see a 50% improvement in the total work done, up to a 75% reduction in queue waiting times, and up to a 60% improvement in the aggregate utilization of the test bed. This dissertation makes the following contributions: (i) It defines the MESI architecture, which allows mutually non-trusting tenants of the data center to share resources between clusters with different security requirements.
(ii) It demonstrates that it is possible to design a service that breaks the silos of static cluster allocation yet has a small Trusted Computing Base (TCB) and adds no overhead to the performance of the clusters. (iii) It provides a unique architecture that puts the tenant in control of its own security and minimizes the trust needed in the provider for sharing nodes. (iv) It delivers a working prototype of a multi-provider marketplace for bare-metal servers, a first proof of concept demonstrating that it is possible to trade real bare-metal nodes at practical time scales, such that moving nodes between clusters is fast enough to get useful work done. (v) Finally, results show that it is possible to encourage even mutually non-trusting tenants to share their nodes with each other without any central authority making allocation decisions. Many smart, dedicated engineers and researchers have contributed to this work over the years. I have jointly led the efforts to design the HIL and ESI layers, and led the design and implementation of the bare-metal marketplace and the overall MESI architecture.
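
    A toy sketch of the marketplace idea (not the MESI implementation; the agent logic and numbers are invented): per-cluster agents offer idle bare-metal nodes and request extra nodes when demand exceeds their own capacity, and a naive matcher moves nodes between them.

        # Per-cluster agents plus a greedy matcher for lending idle bare-metal nodes.
        from dataclasses import dataclass

        @dataclass
        class ClusterAgent:
            name: str
            owned_nodes: int
            borrowed: int = 0
            demand: int = 0            # nodes' worth of queued work

            def offer(self) -> int:    # idle nodes the agent is willing to lend
                return max(self.owned_nodes - self.demand, 0)

            def request(self) -> int:  # extra nodes it would like to borrow
                return max(self.demand - self.owned_nodes - self.borrowed, 0)

        def match(agents):
            """Greedy matching: move offered nodes to the agents that request them."""
            pool = {a.name: a.offer() for a in agents}
            for borrower in sorted(agents, key=lambda a: a.request(), reverse=True):
                need = borrower.request()
                for lender in agents:
                    if lender is borrower or need == 0:
                        continue
                    give = min(pool[lender.name], need)
                    pool[lender.name] -= give
                    borrower.borrowed += give
                    need -= give
                    if give:
                        print(f"{lender.name} lends {give} node(s) to {borrower.name}")

        agents = [ClusterAgent("hpc", owned_nodes=10, demand=4),
                  ClusterAgent("cloud", owned_nodes=6, demand=9)]
        match(agents)   # hpc lends 3 node(s) to cloud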

    Matching distributed file systems with application workloads

    Modern storage systems have a large number of configurable parameters, distributed over many layers of abstraction. The number of combinations of these parameters that can be altered to create an instance of such a system is enormous. In practice, many of these parameters are never altered; instead, default values, intended to support generic workloads and access patterns, are used. As systems become larger and evolve to support different workloads, the appropriateness of using default parameters in this way comes into question. This thesis examines the implications of changing some of these parameters and explores the effects these changes have on performance. As part of that work, multiple contributions have been made, including a structured method to create and evaluate different storage configurations, the choice of appropriate access sizes for the evaluation, the selection of representative cloud workloads and the capture of storage traces for further analysis, the extraction of the workloads' storage characteristics, the creation of logical partitions of the distributed file system used for the optimization, the creation of heterogeneous storage pools within the homogeneous system, and the mapping and evaluation of the chosen workloads against the examined configurations.
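
    As a purely illustrative sketch of mapping extracted workload characteristics to a storage configuration (the thresholds, pool names and parameter names are invented and are not the thesis's method), a simple rule-based mapping might look like this.

        # Map a few workload characteristics to a hypothetical pool configuration.
        from dataclasses import dataclass

        @dataclass
        class WorkloadProfile:
            read_fraction: float      # fraction of I/O that is reads
            avg_request_kb: float     # mean request size
            working_set_gb: float

        def choose_pool_config(w: WorkloadProfile) -> dict:
            config = {"replication": 3, "stripe_kb": 1024, "pool": "hdd-capacity"}
            if w.read_fraction > 0.9 and w.working_set_gb < 500:
                config["pool"] = "ssd-read-optimized"       # small, read-mostly working set
            if w.avg_request_kb < 64:
                config["stripe_kb"] = 128                   # small requests: smaller stripes
            if w.read_fraction < 0.5:
                config["replication"] = 2                   # write-heavy: cut replication cost
            return config

        print(choose_pool_config(WorkloadProfile(0.95, 32.0, 200.0)))
        # {'replication': 3, 'stripe_kb': 128, 'pool': 'ssd-read-optimized'}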