52 research outputs found
Effective Resource and Workload Management in Data Centers
The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of datacenter applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the possibility to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, for achieving high application performance, and for reducing the cost of data center management.
For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm, AWAIT, for handling overload conditions in multi-tier web services. AWAIT places requests of accepted sessions on hold and refuses to admit new sessions when the system experiences a sudden workload surge. To meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is dynamically determined according to the workload burstiness.
Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilization with standard tools cannot always lead to accurate estimates. A directed factor graph (DFG) model is defined to model the dependencies among multiple types of resources across physical and virtual layers.
Virtualized data centers always enable sharing of resources among hosted applications to achieve high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workload patterns change over time. AppRM, an automated management system, not only allocates the right amount of resources to applications to meet their performance targets but also adjusts to dynamic workloads using an adaptive model.
Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme for multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement.
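The admission policy attributed to AWAIT in this abstract (hold requests of accepted sessions, refuse new sessions during a surge, serve held requests first) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the class and method names, the externally supplied `overloaded` flag, the fixed `queue_limit` (the abstract says the queue size is set dynamically from burstiness), and the "drop" outcome for a full queue are all assumptions made for illustration.

```python
from collections import deque


class AdmissionController:
    """Toy AWAIT-style admission control; all names here are hypothetical."""

    def __init__(self, queue_limit):
        # Assumption: a fixed limit. The abstract sizes the queue dynamically.
        self.queue_limit = queue_limit
        self.accepted_sessions = set()
        self.blocking_queue = deque()  # held requests of accepted sessions

    def on_request(self, session_id, request, overloaded):
        """Return 'serve', 'hold', 'reject' or 'drop' for one request."""
        if not overloaded:
            # Normal operation: admit the session and serve the request.
            self.accepted_sessions.add(session_id)
            return "serve"
        if session_id not in self.accepted_sessions:
            # Surge: refuse to admit new sessions.
            return "reject"
        if len(self.blocking_queue) < self.queue_limit:
            # Surge: put the request of an accepted session on hold.
            self.blocking_queue.append((session_id, request))
            return "hold"
        # Assumption: the abstract does not say what happens when the queue is full.
        return "drop"

    def next_to_serve(self, fresh_request=None):
        """Held requests take priority over a newly arrived one."""
        if self.blocking_queue:
            return self.blocking_queue.popleft()
        return fresh_request
```

In this sketch the overload signal is an input; how the surge is detected is outside its scope.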
On the Energy Efficiency of Networked Systems
Energy is a first-class resource for datacenter operators since its cost is the biggest limiting factor in scaling a large computing facility. The solution embraced by major operators is to build their facilities in strategic geographical locations and to abandon expensive specialized hardware for cheap commodity systems. However, such systems are not efficient when it comes to energy and a considerable amount of research effort has been put in finding a solution to this problem. Furthermore, the need for more programmable and flexible networking devices is pushing the need for hardware commoditization also within the datacenter network.
In this thesis we propose two solutions aimed at improving the overall energy efficiency of a datacenter facility. The first addresses efficiency in computing by proposing a different hardware architecture for server systems. We propose a hybrid architecture that blends traditional server processors with very-low-power processors from the mobile devices world. The second solution envisions the usage of current server platforms as network switches or routers and provides guidelines for the implementation of power saving algorithms that do not affect peak performance while saving up to 50% power.
This work is based on both theoretical modeling and simulation, and on experimentation with real-world prototypes.
Strong Temporal Isolation among Containers in OpenStack for NFV Services
In this paper, the problem of temporal isolation among containerized software components running in shared cloud infrastructures is tackled, proposing an approach based on hierarchical real-time CPU scheduling. This allows for reserving a precise share of the available computing power for each container deployed in a multi-core server, so as to provide it with stable performance, independent of the load of other co-located containers. The proposed technique enables the use of reliable modeling techniques for end-to-end service chains that are effective in controlling the application-level performance. An implementation of the technique within the well-known OpenStack cloud orchestration software is presented, focusing on a use-case framed in the context of network function virtualization. The modified OpenStack is capable of leveraging the special real-time scheduling features made available in the underlying Linux operating system through a patch to the in-kernel process scheduler. The effectiveness of the technique is validated by gathering performance data from two applications running in a real test-bed with the mentioned modifications to OpenStack and the Linux kernel. A performance model is developed that tightly models the application behavior under a variety of conditions. Extensive experimentation shows that the proposed mechanism is successful in guaranteeing isolation of individual containerized activities on the platform.
Optimising data centre operation by removing the transport bottleneck
Data centres lie at the heart of almost every service on the Internet. Data centres are used to provide search results, to power social media, to store and index email, to host “cloud” applications, for online retail and to provide a myriad of other web services. Consequently, the more efficient they can be made, the better for all of us. The power of modern data centres is in combining commodity off-the-shelf server hardware and network equipment to provide what Google’s Barroso and Hölzle describe as “warehouse scale” computers.
Data centres rely on TCP, a transport protocol that was originally designed for use in the Internet. Like other such protocols, TCP has been optimised to maximise throughput, usually by filling up queues at the bottleneck. However, for most applications within a data centre network latency is more critical than throughput. Consequently the choice of transport protocol becomes a bottleneck for performance. My thesis is that the solution to this is to move away from the use of one-size-fits-all transport protocols towards ones that have been designed to reduce latency across the data centre and which can dynamically respond to the needs of the applications.
This dissertation focuses on optimising the transport layer in data centre networks. In particular I address the question of whether any single transport mechanism can be flexible enough to cater to the needs of all data centre traffic. I show that one leading protocol (DCTCP) has been heavily optimised for certain network conditions. I then explore approaches that seek to minimise latency for applications that care about it while still allowing throughput-intensive applications to receive a good level of service. My key contributions to this are Silo and Trevi.
Trevi is a novel transport system for storage traffic that utilises fountain coding to maximise throughput and minimise latency while being agnostic to drop, thus allowing storage traffic to be pushed out of the way when latency sensitive traffic is present in the network. Silo is an admission control system that is designed to give tenants of a multi-tenant data centre guaranteed low latency network performance. Both of these were developed in collaboration with others.
Dynamic service chain composition in virtualised environment
Network Function Virtualisation (NFV) has contributed to improving the flexibility of network service provisioning and reducing the time to market of new services. NFV leverages virtualisation technology to decouple the software implementation of network appliances from the physical devices on which they run. However, with the emergence of this paradigm, providing data centre applications with adequate network performance becomes challenging. For instance, virtualised environments cause network congestion, decrease the throughput and hurt the end user experience. Moreover, applications usually communicate through multiple sequences of virtual network functions (VNFs), also known as service chains, for policy enforcement and for performance and security enhancement, which increases the management complexity at the network level.
To address this problem, existing studies have proposed high-level approaches to VNF chaining and placement that improve service chain performance. They consider the VNFs as homogeneous entities regardless of their specific characteristics. They have overlooked their distinct behaviour toward the traffic load and how their underpinning implementation can intervene in defining resource usage. Our research aims at filling this gap by identifying particular patterns in production and widely used VNFs, and by proposing a categorisation that helps in reducing network latency at the chains.
Based on experimental evaluation, we have classified firewalls, NAT, IDS/IPS, Flow monitors into I/O- and CPU-bound functions. The former category is mainly sensitive to the throughput, in packets per second, while the performance of the latter is primarily affected by the network bandwidth, in bits per second. By doing so, we correlate the VNF category with the traversing traffic characteristics and this will dictate how the service chains would be composed.
We propose a heuristic called Natif, for a VNF-Aware VNF insTantIation and traFfic distribution scheme, to reconcile the discrepancy in VNF requirements based on the category they belong to and to eventually reduce network latency. We have deployed Natif in an OpenStack-based environment and have compared it to a network-aware VNF composition approach. Our results show a decrease in latency by around 188% on average without sacrificing the throughput.
Optimization of energy efficiency in data and WEB hosting centers
International Mention in the doctoral degree.
This thesis tackles the optimization of energy efficiency in data centers in terms of network
and server utilization.
Regarding network utilization, the work focuses on Energy Efficient Ethernet
(EEE) - IEEE 802.3az standard - which is the energy-aware alternative to legacy Ethernet, and an
important component of current and future green data centers. More specifically the first contribution
of this thesis consists in deriving an analytical model of gigabit EEE links with coalescing
using M/G/1 queues with sleep and wake-up periods. Packet coalescing has been proposed to save
energy by extending the sojourn in the Low Power Idle state of EEE. The model presented in this
thesis approximates with a good accuracy both the energy saving and the average packet delay by
using a few significant traffic descriptors. While coalescing improves by far the energy efficiency
of EEE, it is still far from achieving energy consumption proportional to traffic. Moreover, coalescing
can introduce high delays. To this end, by using sensitivity analysis the thesis evaluates
the impact of coalescing timers and buffer sizes, and sheds light on the delay incurred by adopting
coalescing schemes. Accordingly, the design and study of a first family of dynamic algorithms,
namely measurement-based coalescing control (MBCC), is proposed. MBCC schemes tune the
coalescing parameters on-the-fly, according to the instantaneous load and the coalescing delay
experienced by the packets. The thesis also discusses a second family of dynamic algorithms,
namely NT-policy coalescing control (NTCC), that adjusts the coalescing parameters based on
the sole occurrence of timeouts and buffer fill-ups. Furthermore, the performance of static as well
as dynamic coalescing schemes is investigated using real traffic traces. The results reported in this
work show that, by relying on run-time delay measurements, simple and practical MBCC adaptive
coalescing schemes outperform traditional static and dynamic coalescing while the adoption
of NTCC coalescing schemes has practically no advantages with respect to static coalescing when
delay guarantees have to be provided. Notably, MBCC schemes double the energy saving benefit
of legacy EEE coalescing and allow the coalescing delay to be controlled.
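The trade-off between coalescing timers, buffer sizes and delay discussed above can be made concrete with a minimal sketch of static packet coalescing. This is not the thesis's M/G/1 model nor the MBCC or NTCC algorithms: it assumes the link stays in Low Power Idle until either `max_packets` packets are buffered or `timer` time units have passed since the first buffered packet, and it ignores wake-up, sleep and transmission times. The function and parameter names are hypothetical.

```python
def coalesce(arrivals, max_packets, timer):
    """Simulate static coalescing on sorted arrival times.

    Returns (wake_times, delays): when the link wakes up, and how long each
    packet waited in the buffer. Transition and transmission times are
    ignored (a simplifying assumption of this sketch).
    """
    wake_times, delays = [], []
    buffer = []

    def flush(now):
        # Wake the link, send everything buffered, go back to sleep.
        wake_times.append(now)
        delays.extend(now - a for a in buffer)
        buffer.clear()

    for t in arrivals:
        # The timer started by the oldest buffered packet expired earlier.
        if buffer and t >= buffer[0] + timer:
            flush(buffer[0] + timer)
        buffer.append(t)
        # The buffer filled up before the timer expired.
        if len(buffer) == max_packets:
            flush(t)
    if buffer:
        # Remaining packets leave when their timer expires.
        flush(buffer[0] + timer)
    return wake_times, delays


wakes, waits = coalesce([0, 1, 2, 10, 11, 30], max_packets=3, timer=5)
print(len(wakes), sum(waits) / len(waits))
```

Fewer wake-ups mean longer stays in the low-power state, at the cost of a larger mean wait; setting `max_packets=1` wakes the link for every packet and removes the coalescing delay.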
Regarding server utilization, the thesis presents an exhaustive empirical characterization
of the power requirements of multiple components of data center servers. The characterization
is the second key contribution of this thesis, and is achieved by devising different experiments
to stress server components, taking into account the multiple available CPU frequencies and the
presence of multicore servers. The described experiments make it possible to measure the energy consumption of server components and to identify their optimal operational points. The study proves that the
curve defining the minimal CPU power utilization, as a function of the load expressed in Active
Cycles Per Second, is neither concave nor purely convex. Instead, it definitively shows a superlinear
dependence on the load. The results illustrate how to improve the efficiency of network
cards and disks. Finally, the accuracy of the model derived from the server components consumption
characterization is validated by comparing the real energy consumed by two Hadoop
applications - PageRank and WordCount - with the estimation from the model, obtaining errors
below 4.1%, on average.
This work has been partially supported by IMDEA Networks Institute and the Greek State Scholarships Foundation.
Official Doctoral Programme in Telematics Engineering.
President: Marco Giuseppe Ajmone Marsan. Secretary: Jose Luis Ayala Rodrigo. Member: Gianluca Antonio Rizz
AUGURES : profit-aware web infrastructure management
Over the last decade, advances in technology, together with the increasing use of the Internet for everyday tasks, have been causing profound changes in end-users, as well as in businesses and technology providers. The widespread adoption of high-speed and ubiquitous Internet access is also changing the way users interact with Web applications and their expectations in terms of Quality-of-Service (QoS) and User eXperience (UX). Recently, Cloud computing has been rapidly adopted to host and manage Web applications, due to its inherent cost effectiveness and on-demand scaling of infrastructures. However, system administrators still need to make manual decisions about the parameters that affect the business results of their applications, i.e., setting QoS targets and defining metrics for scaling the number of servers during the day. Therefore, understanding the workload and user behavior (the demand) poses new challenges for capacity planning and scalability (the supply), and ultimately for the success of a Web site.
This thesis contributes to the current state of the art of Web infrastructure management by providing: i) a methodology for predicting Web session revenue; ii) a methodology to determine the effect of high response time on sales; and iii) a policy for profit-aware resource management that relates server capacity to QoS and sales. The approach leverages Machine Learning (ML) techniques on custom, real-life datasets from an Ecommerce retailer featuring popular Web applications. The experimentation shows how user behavior and server performance models can be built from offline information to determine how demand and supply relate as resources are consumed. This produces economic metrics that are consumed by profit-aware policies, which allow the self-configuration of cloud infrastructures to an optimal number of servers under a variety of conditions. At the same time, the thesis provides several insights applicable to improving autonomic infrastructure management and the profitability of Ecommerce applications.
Improving Application Performance in the Emerging Hyper-converged Infrastructure
University of Minnesota Ph.D. dissertation. April 2019. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); viii, 118 pages.
In today's world, the hyper-converged infrastructure is emerging as a new type of infrastructure. In the hyper-converged infrastructure, service providers deploy compute, network and storage services on inexpensive hardware rather than expensive proprietary hardware. It allows the service providers to customize the services they can provide by deploying applications in Virtual Machines (VMs) or containers. They have control over all resources, including compute, network and storage. In this hyper-converged infrastructure, improving the application performance is an important issue. Throughout my Ph.D. research, I have been studying how to improve the performance of applications in the emerging hyper-converged infrastructure. I have been focusing on improving the performance of applications in VMs and in containers when accessing data, and on how to improve the performance of applications in the networked storage environment. In the hyper-converged infrastructure, administrators can provide desktop services by deploying Virtual Desktop Infrastructure (VDI) applications based on VMs. We first investigate how to identify storage requirements and determine how to meet such requirements with minimal storage resources for VDI applications. We create a model to describe the behavior of VDI, and collect real VDI traces to populate this model. The model allows us to identify the storage requirements of VDI and determine the potential bottlenecks in storage. Based on this information, we can tell what capacity and minimum capability a storage system needs in order to support and satisfy a given VDI configuration. We show that our model can describe more fine-grained storage requirements of VDI compared with the rules of thumb which are currently used in industry.
In the hyper-converged infrastructure, more and more applications are running in containers. We design and implement a system, called k8sES (k8s Enhanced Storage), that efficiently supports applications with various storage SLOs (Service Level Objectives) along with all other requirements deployed in the Kubernetes environment which is based on containers. Kubernetes (k8s) is a system for managing containerized applications across multiple hosts. The current storage support for containerized applications in k8s is limited. To satisfy users' SLOs, k8s administrators must manually configure storage in advance, and users must know the configurations and capabilities of different types of the provided storage. In k8sES, storage resources are dynamically allocated based on users' requirements. Given users' SLOs, k8sES will select the correct node and storage that can meet their requirements when scheduling applications. The storage allocation mechanism in k8sES also improves the storage utilization efficiency. In addition, we provide a tool to monitor the I/O activities of both applications and storage devices in Kubernetes. With the capabilities of controlling client, network and storage with hyper-convergence, we study how to coordinate different components along the I/O path to ensure latency SLOs for applications in the networked storage environment. We propose and implement JoiNS, a system trying to ensure latency SLOs for applications that access data on remote networked storage. JoiNS carefully considers all the components along the I/O path and controls them in a coordinated fashion. JoiNS has both global network and storage visibility with a logically centralized controller which keeps monitoring the status of each involved component. JoiNS coordinates these components and adjusts the priority of I/Os in each component based on the latency SLO, network and storage status, time estimation, and characteristics of each I/O request
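The abstract states that k8sES selects a node and storage that can meet a user's SLOs at scheduling time. A minimal sketch of that kind of selection follows. It does not use the Kubernetes API and is not the k8sES algorithm: the `Storage` and `StorageSLO` types, their fields, and the best-fit tie-breaking rule (prefer the feasible candidate with the least spare capacity) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Storage:
    """Hypothetical description of the storage offered by one node."""
    node: str
    free_gb: int
    iops: int
    latency_ms: float


@dataclass
class StorageSLO:
    """Hypothetical storage SLO requested by an application."""
    capacity_gb: int
    min_iops: int
    max_latency_ms: float


def select_storage(candidates: List[Storage], slo: StorageSLO) -> Optional[Storage]:
    """Filter candidates that satisfy the SLO, then pick the tightest fit."""
    feasible = [
        s for s in candidates
        if s.free_gb >= slo.capacity_gb
        and s.iops >= slo.min_iops
        and s.latency_ms <= slo.max_latency_ms
    ]
    if not feasible:
        return None  # no node can meet the SLO
    # Assumed policy: best fit on capacity, to leave larger volumes free.
    return min(feasible, key=lambda s: s.free_gb - slo.capacity_gb)


nodes = [
    Storage("n1", free_gb=500, iops=10000, latency_ms=1.0),
    Storage("n2", free_gb=100, iops=5000, latency_ms=2.0),
    Storage("n3", free_gb=50, iops=20000, latency_ms=0.5),
]
choice = select_storage(nodes, StorageSLO(capacity_gb=80, min_iops=4000, max_latency_ms=2.0))
print(choice.node if choice else "unschedulable")
```

A real scheduler would combine this storage filter with the application's other requirements (CPU, memory, placement constraints), which the sketch leaves out.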
Wireless Resource Management in Industrial Internet of Things
Wireless communications are in high demand in the Industrial Internet of Things (IIoT) to realize the vision of future flexible, scalable and customized manufacturing. Despite academic research and on-going standardization efforts, there are still many challenges for IIoT, including the ultra-high reliability and low latency requirements, spectrum shortage, and limited energy supply. To tackle the above challenges, this thesis focuses on wireless resource management in IIoT by designing novel frameworks, analyzing performance and optimizing wireless resources. We first propose a bandwidth reservation scheme for the Tactile Internet in the local area network of IIoT. Specifically, we minimize the reserved bandwidth, taking into account the classification errors, while ensuring the latency and reliability requirements. We then extend to the more challenging long distance communications for IIoT, which can support the global skill-set delivery network. We propose to predict the future system state and send it to the receiver in advance, thus reducing the delay experienced by the user. The bandwidth usage is analysed and minimized while ensuring the delay and reliability requirements. Finally, we address the issue of energy supply in IIoT, where radio frequency energy harvesting (RFEH) is used to charge unattended low-power IIoT devices remotely and continuously. To motivate the third-party chargers, a contract theory-based framework is proposed, where the optimal contract is derived to maximize the social welfare.