100 research outputs found

    Optimizing Virtual Machine I/O Performance in Cloud Environments

    Get PDF
    Maintaining closeness between data sources and data consumers is crucial for workload I/O performance. In cloud environments, this kind of closeness can be violated by system administrative events and storage architecture barriers. VM migration events are frequent in cloud environments. VM migration changes VM runtime inter-connection or cache contexts, significantly degrading VM I/O performance. Virtualization is the backbone of cloud platforms. I/O virtualization adds additional hops to workload data access path, prolonging I/O latencies. I/O virtualization overheads cap the throughput of high-speed storage devices and imposes high CPU utilizations and energy consumptions to cloud infrastructures. To maintain the closeness between data sources and workloads during VM migration, we propose Clique, an affinity-aware migration scheduling policy, to minimize the aggregate wide area communication traffic during storage migration in virtual cluster contexts. In host-side caching contexts, we propose Successor to recognize warm pages and prefetch them into caches of destination hosts before migration completion. To bypass the I/O virtualization barriers, we propose VIP, an adaptive I/O prefetching framework, which utilizes a virtual I/O front-end buffer for prefetching so as to avoid the on-demand involvement of I/O virtualization stacks and accelerate the I/O response. Analysis on the traffic trace of a virtual cluster containing 68 VMs demonstrates that Clique can reduce inter-cloud traffic by up to 40%. Tests of MPI Reduce_scatter benchmark show that Clique can keep VM performance during migration up to 75% of the non-migration scenario, which is more than 3 times of the Random VM choosing policy. In host-side caching environments, Successor performs better than existing cache warm-up solutions and achieves zero VM-perceived cache warm-up time with low resource costs. At system level, we conducted comprehensive quantitative analysis on I/O virtualization overheads. Our trace replay based simulation demonstrates the effectiveness of VIP for data prefetching with ignorable additional cache resource costs

    Autonomic Overload Management For Large-Scale Virtualized Network Functions

    Get PDF
    The explosion of data traffic in telecommunication networks has been impressive in the last few years. To keep up with the high demand and staying profitable, Telcos are embracing the Network Function Virtualization (NFV) paradigm by shifting from hardware network appliances to software virtual network functions, which are expected to support extremely large scale architectures, providing both high performance and high reliability. The main objective of this dissertation is to provide frameworks and techniques to enable proper overload detection and mitigation for the emerging virtualized software-based network services. The thesis contribution is threefold. First, it proposes a novel approach to quickly detect performance anomalies in complex and large-scale VNF services. Second, it presents NFV-Throttle, an autonomic overload control framework to protect NFV services from overload within a short period of time, allowing to preserve the QoS of traffic flows admitted by network services in response to both traffic spikes (up to 10x the available capacity) and capacity reduction due to infrastructure problems (such as CPU contention). Third, it proposes DRACO, to manage overload problems arising in novel large-scale multi-tier applications, such as complex stateful network functions in which the state is spread across modern key-value stores to achieve both scalability and performance. DRACO performs a fine-grained admission control, by tuning the amount and type of traffic according to datastore node dependencies among the tiers (which are dynamically discovered at run-time), and to the current capacity of individual nodes, in order to mitigate overloads and preventing hot-spots. This thesis presents the implementation details and an extensive experimental evaluation for all the above overload management solutions, by means of a virtualized IP Multimedia Subsystem (IMS), which provides modern multimedia services for Telco operators, such as Videoconferencing and VoLTE, and which is one of the top use-cases of the NFV technology

    YOLO: AccĂ©lĂ©ration du temps de dĂ©marrage de la machine virtuelleen rĂ©duisant les opĂ©rations d’I/O

    Get PDF
    Several works have shown that the time to boot one virtual machine (VM) can last up to a fewminutes in high consolidated cloud scenarios. This time is critical as VM boot duration defines how anapplication can react w.r.t. demands’ fluctuations (horizontal elasticity). To limit as much as possible thetime to boot a VM, we design the YOLO mechanism (You Only Load Once). YOLO optimizes the numberof I/O operations generated during a VM boot process by relying on the boot image abstraction, a subsetof the VM image (VMI) that contains data blocks necessary to complete the boot operation. Whenevera VM is booted, YOLO intercepts all read accesses and serves them directly from the boot image, whichhas been locally stored on fast access storage devices (e.g., memory, SSD, etc.). Creating boot imagesfor 900+ VMIs from Google Cloud shows that only 40 GB is needed to store all the mandatory data.Experiments show that YOLO can speed up VM boot duration 2-13 times under different resourcescontention with a negligible overhead on the I/O path. Finally, we underline that although YOLO hasbeen validated with a KVM environment, it does not require any modification on the hypervisor, theguest kernel nor the VM image (VMI) structure and can be used for several kinds of VMIs (in this study,Linux and Windows VMIs have been tested)Plusieurs travaux ont montrĂ© que le temps de dĂ©marrage d’une machine virtuelle (VM)peut s’étale sur plusieurs minutes dans des scĂ©narios fortement consolidĂ©s. Ce dĂ©lai est critique car ladurĂ©e de dĂ©marrage d’une VM dĂ©finit la rĂ©activitĂ© d’une application en fonction des fluctuations decharge (Ă©lasticitĂ© horizontale). Pour limiter au maximum le temps de dĂ©marrage d’une VM, nous avonsconçu le mĂ©canisme YOLO (You Only Load Once). YOLO optimise le nombre d’opĂ©rations “disque”gĂ©nĂ©rĂ©es pendant le processus de dĂ©marrage. Pour ce faire, il s’appuie sur une nouvelle abstractionintitulĂ©e “image de dĂ©marrage” et correspondant Ă  un sous-ensemble des donnĂ©es de l’image de la VM.Chaque fois qu’une machine virtuelle est dĂ©marrĂ©e, YOLO intercepte l’ensemble des accĂšs en lectureafin de les satisfaire directement Ă  partir de l’image de dĂ©marrage, qui a Ă©tĂ© stockĂ©e prĂ©alablement surdes pĂ©riphĂ©riques de stockage Ă  accĂšs rapide (par exemple, mĂ©moire, SSD, etc.). La crĂ©ation d’imagede dĂ©marrage pour les 900 types des VMs proposĂ©es sur l’infrastructure Cloud de Google reprĂ©senteseulement 40 Go, ce qui est une quantitĂ© de donnĂ©es qui peut tout Ă  fait ĂȘtre stockĂ©e sur chacundes noeuds de calculs. Les expĂ©riences rĂ©alisĂ©es montrent que YOLO permet accĂ©lĂ©rer la durĂ©e dedĂ©marrage d’un facteur allant de 2 Ă  13 selon les diffĂ©rents scĂ©narios de consolidation. Nous soulignonsque bien que YOLO ait Ă©tĂ© validĂ© avec un environnement KVM, il ne nĂ©cessite aucune modificatfionsur l’hyperviseur, le noyau invitĂ© ou la structure d’image de la VM et peut donc ĂȘtre utilisĂ© pourplusieurs types d’images (dans cette Ă©tude, nous testons des images Linux et Windows)

    Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting

    Get PDF
    Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost the overall capacity during peak utilization) can be a cost-effective way to deal with the increasing complexity of big data analytics, especially for iterative applications. However, the low throughput, high latency network link between the on-premise and off-premise resources (“weak link”) makes maintaining scalability difficult. While several data locality techniques have been designed for big data bursting on hybrid clouds, their effectiveness is difficult to estimate in advance. Yet such estimations are critical, because they help users decide whether the extra pay-as-you-go cost incurred by using the off-premise resources justifies the runtime speed-up. To this end, the current paper presents a performance model and methodology to estimate the runtime of iterative MapReduce applications in a hybrid cloud-bursting scenario. The paper focuses on the overhead incurred by the weak link at fine granularity, for both the map and the reduce phases. This approach enables high estimation accuracy, as demonstrated by extensive experiments at scale using a mix of real-world iterative MapReduce applications from standard big data benchmarking suites that cover a broad spectrum of data patterns. Not only are the produced estimations accurate in absolute terms compared with experimental results, but they are also up to an order of magnitude more accurate than applying state-of-art estimation approaches originally designed for single-site MapReduce deployments

    Privacy-Enhanced Query Processing in a Cloud-Based Encrypted DBaaS (Database as a Service)

    Get PDF
    In this dissertation, we researched techniques to support trustable and privacy enhanced solutions for on-line applications accessing to “always encrypted” data in remote DBaaS (data-base-as-a-service) or Cloud SQL-enabled backend solutions. Although solutions for SQL-querying of encrypted databases have been proposed in recent research, they fail in providing: (i) flexible multimodal query facilities includ ing online image searching and retrieval as extended queries to conventional SQL-based searches, (ii) searchable cryptographic constructions for image-indexing, searching and retrieving operations, (iii) reusable client-appliances for transparent integration of multi modal applications, and (iv) lack of performance and effectiveness validations for Cloud based DBaaS integrated deployments. At the same time, the study of partial homomorphic encryption and multimodal searchable encryption constructions is yet an ongoing research field. In this research direction, the need for a study and practical evaluations of such cryptographic is essential, to evaluate those cryptographic methods and techniques towards the materialization of effective solutions for practical applications. The objective of the dissertation is to design, implement and perform experimental evaluation of a security middleware solution, implementing a client/client-proxy/server appliance software architecture, to support the execution of applications requiring on line multimodal queries on “always encrypted” data maintained in outsourced cloud DBaaS backends. In this objective we include the support for SQL-based text-queries enhanced with searchable encrypted image-retrieval capabilities. We implemented a prototype of the proposed solution and we conducted an experimental benchmarking evaluation, to observe the effectiveness, latency and performance conditions in support ing those queries. The dissertation addressed the envisaged security middleware solution, as an experimental and usable solution that can be extended for future experimental testbench evaluations using different real cloud DBaaS deployments, as offered by well known cloud-providers.Nesta dissertação foram investigadas tĂ©cnicas para suportar soluçÔes com garantias de privacidade para aplicaçÔes que acedem on-line a dados que sĂŁo mantidos sempre cifrados em nuvens que disponibilizam serviços de armazenamento de dados, nomeadamente soluçÔes do tipo bases de dados interrogĂĄveis por SQL. Embora soluçÔes para suportar interrogaçÔes SQL em bases de dados cifradas tenham sido propostas anteriormente, estas falham em providenciar: (i) capacidade de efectuar pesquisas multimodais que possam incluir pesquisa combinada de texto e imagem com obtenção de imagens online, (ii) suporte de privacidade com base em construçÔes criptograficas que permitam operaçÔes de indexacao, pesquisa e obtenção de imagens como dados cifrados pesquisĂĄveis, (iii) suporte de integração para aplicaçÔes de gestĂŁo de dados em contexto multimodal, e (iv) ausĂȘncia de validaçÔes experimentais com benchmarking dobre desempenho e eficiĂȘncia em soluçÔes DBaaS em que os dados sejam armazenados e manipulados na sua forma cifrada. A pesquisa de soluçÔes de privacidade baseada em primitivas de cifras homomĂłrficas parciais, tem sido vista como uma possĂ­vel solução prĂĄtica para interrogação de dados e bases de dados cifradas. No entanto, este Ă© ainda um campo de investigação em desenvolvimento. Nesta direção de investigação, a necessidade de estudar e efectuar avaliaçÔes experimentais destas primitivas em bibliotecas de cifras homomĂłrficas, reutilizĂĄveis em diferentes contextos de aplicação e como solução efetiva para uso prĂĄtico mais generalizado, Ă© um aspeto essencial. O objectivo da dissertação e desenhar, implementar e efectuar avaliçÔes experimentais de uma proposta de solução middleware para suportar pesquisas multimodais em bases de dados mantidas cifradas em soluçÔes de nuvens de armazenamento. Esta proposta visa a concepção e implementação de uma arquitectura de software client/client-proxy/server appliance para suportar execução eficiente de interrogaçÔes online sobre dados cifrados, suportando operaçÔes multimodais sobre dados mantidos protegidos em serviços de nuvens de armazenamento. Neste objectivo incluĂ­mos o suporte para interrogaçÔes estendidas de SQL, com capacidade para pesquisa e obtenção de dados cifrados que podem incluir texto e pesquisa de imagens por similaridade. Foi implementado um prototipo da solução proposta e foi efectuada uma avaliação experimental do mesmo, para observar as condiçÔes de eficiencia, latencia e desempenho do suporte dessas interrogaçÔes. Nesta avaliação incluĂ­mos a anĂĄlise experimental da eficiĂȘncia e impacto de diferentes construçÔes criptogrĂĄficas para pesquisas cifradas (searchable encryption) e cifras parcialmente homomĂłrficas e que sĂŁo usadas como componentes da solução proposta. A dissertaçao aborda a soluçao de seguranca projectada, como uma solução experimental que pode ser estendida e utilizavel para futuras aplcaçÔes e respetivas avaliaçÔes experimentais. Estas podem vir a adoptar soluçÔes do tipo DBaaS, oferecidos como serviços na nuvem, por parte de diversos provedores ou fornecedores

    Scalable and Highly Available Database Systems in the Cloud

    Get PDF
    Cloud computing allows users to tap into a massive pool of shared computing resources such as servers, storage, and network. These resources are provided as a service to the users allowing them to “plug into the cloud” similar to a utility grid. The promise of the cloud is to free users from the tedious and often complex task of managing and provisioning computing resources to run applications. At the same time, the cloud brings several additional benefits including: a pay-as-you-go cost model, easier deployment of applications, elastic scalability, high availability, and a more robust and secure infrastructure. One important class of applications that users are increasingly deploying in the cloud is database management systems. Database management systems differ from other types of applications in that they manage large amounts of state that is frequently updated, and that must be kept consistent at all scales and in the presence of failure. This makes it difficult to provide scalability and high availability for database systems in the cloud. In this thesis, we show how we can exploit cloud technologies and relational database systems to provide a highly available and scalable database service in the cloud. The first part of the thesis presents RemusDB, a reliable, cost-effective high availability solution that is implemented as a service provided by the virtualization platform. RemusDB can make any database system highly available with little or no code modifications by exploiting the capabilities of virtualization. In the second part of the thesis, we present two systems that aim to provide elastic scalability for database systems in the cloud using two very different approaches. The three systems presented in this thesis bring us closer to the goal of building a scalable and reliable transactional database service in the cloud

    Improving energy efficiency of virtualized datacenters

    Get PDF
    Nowadays, many organizations choose to increasingly implement the cloud computing approach. More specifically, as customers, these organizations are outsourcing the management of their physical infrastructure to data centers (or cloud computing platforms). Energy consumption is a primary concern for datacenter (DC) management. Its cost represents about 80% of the total cost of ownership and it is estimated that in 2020, the US DCs alone will spend about $13 billion on energy bills. Generally, the datacenter servers are manufactured in such a way that they achieve high energy efficiency at high utilizations. Thereby for a low cost per computation all datacenter servers should push the utilization as high as possible. In order to fight the historically low utilization, cloud computing adopted server virtualization. The latter allows a physical server to execute multiple virtual servers (called virtual machines) in an isolated way. With virtualization, the cloud provider can pack (consolidate) the entire set of virtual machines (VMs) on a small set of physical servers and thereby, reduce the number of active servers. Even so, the datacenter servers rarely reach utilizations higher than 50% which means that they operate with sets of longterm unused resources (called 'holes'). My first contribution is a cloud management system that dynamically splits/fusions VMs such that they can better fill the holes. This solution is effective only for elastic applications, i.e. applications that can be executed and reconfigured over an arbitrary number of VMs. However the datacenter resource fragmentation stems from a more fundamental problem. Over time, cloud applications demand more and more memory but the physical servers provide more an more CPU. In nowadays datacenters, the two resources are strongly coupled since they are bounded to a physical sever. My second contribution is a practical way to decouple the CPU-memory tuple that can simply be applied to a commodity server. Thereby, the two resources can vary independently, depending on their demand. My third and my forth contribution show a practical system which exploit the second contribution. The underutilization observed on physical servers is also true for virtual machines. It has been shown that VMs consume only a small fraction of the allocated resources because the cloud customers are not able to correctly estimate the resource amount necessary for their applications. My third contribution is a system that estimates the memory consumption (i.e. the working set size) of a VM, with low overhead and high accuracy. Thereby, we can now consolidate the VMs based on their working set size (not the booked memory). However, the drawback of this approach is the risk of memory starvation. If one or multiple VMs have an sharp increase in memory demand, the physical server may run out of memory. This event is undesirable because the cloud platform is unable to provide the client with the booked memory. My fourth contribution is a system that allows a VM to use remote memory provided by a different rack server. Thereby, in the case of a peak memory demand, my system allows the VM to allocate memory on a remote physical server
    • 

    corecore