Policy Conflict Management in Distributed SDN Environments
abstract: The ease of programmability in Software-Defined Networking (SDN) makes it a great platform for implementation of various initiatives that involve application deployment, dynamic topology changes, and decentralized network management in a multi-tenant data center environment. However, implementing security solutions in such an environment is fraught with policy conflicts and consistency issues with the hardness of this problem being affected by the distribution scheme for the SDN controllers.
In this dissertation, a formalism for flow rule conflicts in SDN environments is introduced. This formalism is realized in Brew, a security policy analysis framework implemented on an OpenDaylight SDN controller. Brew has comprehensive conflict detection and resolution modules to ensure that no two flow rules in a distributed SDN-based cloud environment conflict at any layer, thereby assuring consistent, conflict-free security policy implementation and preventing information leakage. Techniques for global prioritization of flow rules in a decentralized environment are presented, with which all SDN flow rule conflicts are recognized and classified. Strategies for unassisted resolution of these conflicts are also detailed. Alternatively, if administrator input is desired to resolve conflicts, a novel visualization scheme is implemented to help administrators view the conflicts in an aesthetic manner. The correctness, feasibility and scalability of the Brew proof-of-concept prototype are demonstrated. Flow rule conflict avoidance using a buddy address space management technique is studied as an alternative to conflict detection and resolution in highly dynamic cloud systems attempting to implement SDN-based Moving Target Defense (MTD) countermeasures.
Dissertation/Thesis; Doctoral Dissertation; Computer Science; 201
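To make the notion of a flow rule conflict in the abstract above concrete, here is a minimal Python sketch. It is not Brew's formalism or code: the `FlowRule` fields, the two-field match (source and destination prefix) and the "overlapping match, different action" test are simplifying assumptions chosen for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class FlowRule:
    # Hypothetical, simplified rule: real flow rules match on many more fields.
    priority: int
    src: str     # source prefix in CIDR notation
    dst: str     # destination prefix in CIDR notation
    action: str  # e.g. "allow" or "drop"


def overlaps(a: FlowRule, b: FlowRule) -> bool:
    """True if at least one packet could match both rules."""
    return (ip_network(a.src).overlaps(ip_network(b.src))
            and ip_network(a.dst).overlaps(ip_network(b.dst)))


def find_conflicts(rules):
    """Return pairs of rules whose match spaces overlap but whose actions differ."""
    conflicts = []
    for i, a in enumerate(rules):
        for b in rules[i + 1:]:
            if overlaps(a, b) and a.action != b.action:
                conflicts.append((a, b))
    return conflicts


rules = [
    FlowRule(10, "10.0.0.0/24", "10.0.1.0/24", "allow"),
    FlowRule(20, "10.0.0.0/16", "10.0.1.5/32", "drop"),
    FlowRule(30, "192.168.0.0/24", "10.0.1.0/24", "allow"),
]
conflicts = find_conflicts(rules)
for a, b in conflicts:
    print(f"conflict: priority {a.priority} ({a.action}) vs priority {b.priority} ({b.action})")
```

The pairwise scan is quadratic in the number of rules, which is acceptable for a sketch but would likely need indexing in a controller holding many rules.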
Elastic Resource Management in Distributed Clouds
The ubiquitous nature of computing devices and their increasing reliance on remote resources have driven and shaped public cloud platforms into unprecedented large-scale, distributed data centers. Concurrently, a plethora of cloud-based applications are experiencing multi-dimensional workload dynamics---workload volumes that vary along both time and space axes and with higher frequency.
The interplay of diverse workload characteristics and distributed clouds raises several key challenges for efficiently and dynamically managing server resources. First, current cloud platforms impose certain restrictions that might hinder some resource management tasks. Second, an application-agnostic approach might not capture appropriate performance goals and therefore needs to be complemented by numerous application-specific methods. Third, provisioning resources outside the LAN boundary might incur a large delay, which would reduce the desired agility.
In this dissertation, I investigate the above challenges and present the design of automated systems that manage resources for various applications in distributed clouds. The intermediate goal of these automated systems is to fully exploit potential benefits such as reduced network latency offered by increasingly distributed server resources. The ultimate goal is to improve end-to-end user response time with novel resource management approaches, within a certain cost budget.
Centered around these two goals, I first investigate how to optimize the location and performance of virtual machines in distributed clouds. I use virtual desktops, mostly serving a single user, as an example use case for developing a black-box approach that ranks virtual machines based on their dynamic latency requirements. Those with high latency sensitivity have a higher priority for being placed in or migrated to the cloud location closest to their users. Next, I relax the assumption of well-provisioned virtual machines and look at how to provision enough resources for applications that exhibit both temporal and spatial workload fluctuations. I propose an application-agnostic queueing model that captures resource utilization and server response time. Building upon this model, I present a geo-elastic provisioning approach (referred to as geo-elasticity) for replicable multi-tier applications that can spin up an appropriate amount of server resources in any cloud location. Last, I explore the benefits of providing geo-elasticity for database clouds, a popular platform for hosting application backends. Performing geo-elastic provisioning for backend database servers entails several challenges that are specific to database workloads and therefore requires tailored solutions. In addition, cloud platforms offer resources at different prices in different locations. To this end, I propose a cost-aware geo-elasticity approach that combines a regression-based workload model and a queueing network capacity model for database clouds.
In summary, hosting a diverse set of applications in an increasingly distributed cloud makes it interesting and necessary to develop new, efficient and dynamic resource management approaches.
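The queueing-based provisioning idea in this abstract can be illustrated with a small Python sketch. The dissertation's actual model is not given here, so this assumes the simplest possible stand-in: each server behaves as an M/M/1 queue and the load at a location is split evenly across its servers. The function names, rates and the response-time target are all hypothetical.

```python
import math


def response_time(arrival_rate, service_rate, n_servers):
    """Mean M/M/1 response time per server when load is split evenly (assumption)."""
    per_server = arrival_rate / n_servers
    if per_server >= service_rate:
        return math.inf  # the queue is unstable
    return 1.0 / (service_rate - per_server)


def servers_needed(arrival_rate, service_rate, target):
    """Smallest server count whose mean response time meets the target."""
    if target <= 1.0 / service_rate:
        raise ValueError("target is not above the bare service time")
    # 1 / (mu - lambda / n) <= T   <=>   n >= lambda / (mu - 1 / T)
    return max(1, math.ceil(arrival_rate / (service_rate - 1.0 / target)))


# Hypothetical per-location demand in requests per second.
demand = {"us-east": 170.0, "eu-west": 30.0}
SERVICE_RATE = 50.0  # requests per second one server can handle (assumed)
TARGET = 0.1         # response-time target in seconds (assumed)

plan = {loc: servers_needed(lam, SERVICE_RATE, TARGET) for loc, lam in demand.items()}
for loc, n in plan.items():
    print(loc, n, round(response_time(demand[loc], SERVICE_RATE, n), 4))
```

Provisioning per location rather than globally is what makes the sketch "geo-elastic" in spirit: each location gets only as many servers as its own local demand requires.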
A Design Framework for Efficient Distributed Analytics on Structured Big Data
Distributed analytics architectures are often comprised of two elements: a compute engine and a storage system. Conventional distributed storage systems usually store data in the form of files or key-value pairs. This abstraction simplifies how the data is accessed and reasoned about by an application developer. However, the separation of compute and storage systems makes it difficult to optimize costly disk and network operations. By design the storage system is isolated from the workload and its performance requirements such as block co-location and replication. Furthermore, optimizing fine-grained data access requests becomes difficult as the storage layer is hidden away behind such abstractions.
Using a clean-slate approach, this thesis proposes a modular distributed analytics system design centered around a unified interface for distributed data objects, named the DDO. The interface couples key mechanisms that utilize storage, memory, and compute resources. This coupling makes it well suited to optimizing data access requests across all memory hierarchy levels with respect to the workload and its performance requirements. In addition to the DDO, a complementary DDO controller implementation controls the logical view of DDOs, their replication, and their distribution across the cluster. A proof-of-concept implementation shows an improvement in mean query time of 3-6x on the TPC-H and TPC-DS benchmarks, and more than an order of magnitude improvement in many cases.
Minimal deployable endpoint-driven network forwarding: principle, designs and applications
Networked systems now have a significant impact on human lives: the Internet, connecting the world globally, is the foundation of our information age; data centers, running hundreds of thousands of servers, drive the era of cloud computing; and even the Tor project, a networked system providing online anonymity, now serves millions of daily users.
Guided by the end-to-end principle, many computer networks have been designed with a simple and flexible core offering a general data transfer service, whereas the bulk of the application-level functionality has been implemented on endpoints attached to the edge of the network. Although the end-to-end design principle has brought these networked systems tremendous success, a number of new requirements have emerged for computer networks and the applications running on them, including the untrustworthiness of endpoints, the privacy requirements of endpoints, more demanding applications, the rise of third-party intermediaries, and the asymmetric capabilities of endpoints. These emerging requirements have created various challenges in different networked systems.
To address these challenges, there are no obvious solutions that avoid adding in-network functions to the network core. However, no design principle has ever been proposed for guiding the implementation of in-network functions. In this thesis, we propose the first such principle and apply it to propose four designs in three different networked systems to address four separate challenges. We demonstrate through detailed implementation and extensive evaluations that the proposed principle can live in harmony with the end-to-end principle, and that a combination of the two principles offers more complete, effective and accurate guidance for innovating modern computer networks and their applications.
Improving the performance of software-defined networks using dynamic flow installation and management techniques
As computer networks evolve, they become more complex, introducing several challenges in the areas of performance and management. Such problems can lead to stagnation in network innovation. The Software Defined Networks (SDN) framework could be one of the best candidates for improving and revolutionising networking by giving network administrators full control to implement new management and performance optimisation techniques.
This thesis examines performance issues faced in SDN due to the introduction of the SDN Controller. These issues include the extra delay due to the round-trip time between the switch and the controller as well as the fact that some packets arrive at the destination out-of-order.
We propose a novel dynamic flow installation and management algorithm (OFPE) using the SDN protocol OpenFlow, which keeps the controller's CPU in a non-overloaded state and allows it to dynamically add and adjust flow table rules to reduce packet delay and out-of-order packets. In addition, we propose OFPEX, an extension to the OFPE algorithm that includes techniques for managing multi-switch environments as well as methods that use packet interarrival times to categorise and serve packet flows. Such techniques allow topology awareness, helping the controller to install flow table rules so as to form optimal routes for high-priority flows, thus increasing network performance. For the performance evaluation of the proposed algorithms, both hardware testbed and emulation experiments have been conducted. The performance results indicate that the OFPE algorithm achieves a significant enhancement in performance in the form of delay reduced by up to 92.56% (depending on the scenario), packet loss reduced by up to 55.32% and out-of-order packets reduced by up to 69.44%.
Furthermore, we propose a novel placement algorithm for distributed Mininet implementations which uses weights to distribute the experiment components across the available machines. The proposed algorithm uses static code analysis to examine the experimental code and measures the capabilities of the physical components in order to create a weights table, which is then used to distribute the experiment components appropriately. The evaluation of the proposed algorithm indicated reductions in delay and packet loss of up to 65.51% and 86.35% respectively, as well as a decrease in the standard deviation of CPU usage of up to 88.63%. These results indicate that the proposed algorithm distributes the experiment components evenly across the available resources.
Finally, we propose a series of benchmarking tests that can be used to rate all the available SDN experimental platforms. These tests allow the selection of the appropriate experimental platform according to the needs of the scenario, and they also indicate the resources needed by each platform.
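The weight-based placement described in this abstract can be pictured with a short Python sketch. The thesis's actual algorithm, its static code analysis and the format of its weights table are not given here, so everything below is an assumption: component costs and machine capacity weights are supplied by hand, and a greedy heuristic places each component on the machine whose capacity-normalised load would stay lowest.

```python
def place(components, capacities):
    """Greedy weighted placement (illustrative heuristic, not the thesis algorithm).

    components: {name: estimated cost}, e.g. derived from analysing the experiment.
    capacities: {machine: capacity weight}, e.g. derived from measuring the hardware.
    Returns (placement, load) where load is the total cost assigned per machine.
    """
    load = {machine: 0.0 for machine in capacities}
    placement = {}
    # Place the most expensive components first; break ties by name for determinism.
    for name, cost in sorted(components.items(), key=lambda kv: (-kv[1], kv[0])):
        best = min(capacities,
                   key=lambda m: ((load[m] + cost) / capacities[m], m))
        placement[name] = best
        load[best] += cost
    return placement, load


# Hypothetical experiment: two switches and three hosts on two unequal machines.
components = {"s1": 4, "s2": 3, "h1": 1, "h2": 1, "h3": 1}
capacities = {"big": 2.0, "small": 1.0}

placement, load = place(components, capacities)
print(placement)
print({m: load[m] / capacities[m] for m in capacities})
```

Dividing by the capacity weight is the step that lets a more capable machine absorb proportionally more of the experiment.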
Minimization of Resource Consumption through Workload Consolidation in Large-Scale Distributed Data Platforms
The rapid increase in the data volumes encountered in many application domains has led to widespread adoption of parallel and distributed data management systems like parallel databases and MapReduce-based frameworks (e.g., Hadoop) in recent years. Use of such parallel and distributed frameworks is expected to accelerate in the coming years, putting further strain on already-scarce resources like compute power, network bandwidth, and energy. To reduce total execution times, there is a trend towards increasing execution parallelism by spreading out data across a large number of machines. However, this often increases the total resource consumption, and especially energy consumption, significantly because of process startup costs and other overheads (e.g., communication overheads). In this dissertation, we develop several data management techniques to minimize resource consumption through workload consolidation.
We introduce a key metric called query span, i.e., the number of machines involved in the execution of a query or a job. To minimize per-query resource consumption, we propose to minimize query span. To that end, we develop several workload-driven data partitioning and replica selection algorithms that attempt to minimize the average query span by exploiting the fact that most distributed environments need to use replication for fault tolerance. Extensive experiments on various datasets show that judicious data placement and replication can dramatically reduce average query spans, resulting in significant reductions in resource consumption. We show our results primarily on two applications: a distributed data warehouse system and distributed information retrieval. In the first case, we show that minimizing average query spans can minimize overall resource consumption for a given workload and can also improve the performance of complex analytical queries. In the second case, our approach minimizes the overall search cost and effectively trades off search cost against load imbalance.
The best case of resource efficiency for any underlying data processing system is achieved when the job or the query can be run efficiently on a single machine (i.e., query span = 1). In the final part of the dissertation, we discuss an in-memory MapReduce system optimized for performing complex analytics tasks on input data sizes that fit in a single machine's memory. We argue that systems like Hadoop that are designed to operate across a large number of machines are not optimal in performance for small and medium-sized complex analytics tasks because of high startup costs, heavy disk activity, and wasteful checkpointing. We have developed a prototype runtime called HONE that is API-compatible with standard (distributed) Hadoop. In other words, we can take existing Hadoop code and run it, without modification, on a multi-core shared-memory machine. This allows us to take existing Hadoop algorithms and find the most suitable runtime environment for execution on datasets of varying sizes.
Overall, the key contributions of this dissertation include the identification of the query span metric and its relationship with overall resource consumption in scale-out architectures. We introduce several workload-aware techniques to optimize this metric and demonstrate the effectiveness of query span minimization in different application scenarios. To take advantage of scale-up architectures effectively, we develop HONE, a novel in-memory MapReduce system for a single machine. Our thorough experiments on real and synthetic datasets demonstrate the efficacy of the proposed approaches.
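The query span metric defined in this abstract can be illustrated with a small Python sketch. The dissertation's own partitioning and replica selection algorithms are not reproduced here; this uses a standard greedy set-cover heuristic as a stand-in, and the `replicas` layout, machine names and queries are made up for the example. It assumes every requested item has at least one replica.

```python
def select_replicas(query_items, replicas):
    """Greedily pick machines until every item in the query is covered.

    replicas: {item: set of machines holding a copy of that item}.
    Returns the chosen machines; the size of this set is the query span
    achieved by the heuristic (an upper bound on the minimum span).
    """
    remaining = set(query_items)
    chosen = set()
    while remaining:
        # For each machine, collect the still-uncovered items it could serve.
        coverage = {}
        for item in remaining:
            for machine in replicas[item]:
                coverage.setdefault(machine, set()).add(item)
        # Pick the machine covering the most items; sorting makes ties deterministic.
        best = max(sorted(coverage), key=lambda m: len(coverage[m]))
        chosen.add(best)
        remaining -= coverage[best]
    return chosen


# Hypothetical placement: each item is replicated on one or two machines.
replicas = {
    "a": {"m1", "m2"},
    "b": {"m2", "m3"},
    "c": {"m1", "m3"},
    "d": {"m2"},
}
workload = [["a", "b", "d"], ["a", "b", "c"]]

spans = [len(select_replicas(q, replicas)) for q in workload]
average_span = sum(spans) / len(spans)
print(spans, average_span)
```

The first query is served entirely by `m2` because replication happens to co-locate its items there, which is the effect that workload-driven placement tries to produce deliberately.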
Towards an Automatic Microservices Manager for Hybrid Cloud Edge Environments
Cloud computing emerged to make computing resources easier to access, enabling
faster deployment of applications and services that benefit from the scalability provided by
the service providers. The volume of data received by the cloud has grown exponentially.
This is because almost every device used in everyday
life is connected to the internet and shares information on a global scale (e.g., smartwatches,
clocks, cars, industrial equipment). The growing data volume increases
latency in client applications, degrading the QoS (Quality of Service).
In response to these problems, hybrid systems emerged that integrate cloud resources
with the various devices located between the cloud and the edge, an approach known as Fog/Edge computing. These
devices are very heterogeneous, with different resource capabilities (such as memory
and computational power), and are geographically distributed.
Software architectures have also evolved, and the microservice architecture emerged to make
application development more flexible and to increase scalability. The microservices
architecture decomposes monolithic applications into small services, each
with a specific functionality, that can be independently developed, deployed and
scaled. Due to their small size, microservices are adequate for deployment on hybrid
cloud/edge infrastructures. However, the heterogeneity of those deployment locations
makes the management and monitoring of microservices rather complex. Monitoring, in particular,
is essential when considering that microservices may be replicated and migrated
in the cloud/edge infrastructure.
This dissertation aims to contribute to building an automatic
microservices management system that can be deployed on hybrid cloud/fog
infrastructures. Such an automatic system will allow edge-enabled applications to have an
adaptive deployment at runtime in response to variations in workloads and available computational
resources. To this end, this work is a first step in integrating two existing
projects that, combined, may support such an automatic system. One project performs the automatic
management of microservices but uses only a heavy monitor, Prometheus, as a cloud
monitor. The second project is a light adaptive monitor. This thesis integrates the light
monitor into the automatic manager of microservices.
Cloud computing emerged as a way to simplify access to computational resources,
allowing faster deployment of applications and services as a result
of the scalability supported by service providers.
Cloud computing emerged to ease access to computing resources, making
the deployment of applications/services easier and letting them benefit from the
scalability offered by service providers. The volume of data received by the cloud has
grown exponentially. This increase is due to the fact that
almost all devices used in our daily lives are connected to the internet
(examples include watches, industrial machines and cars). This increase in data volume
results in higher latency for client applications, and thus in a
degradation of the quality of service (QoS).
These problems gave rise to hybrid systems, born from the integration of
cloud resources with the various devices on the path between the cloud and
the periphery, known as Edge/Fog computing. These
devices are highly heterogeneous and geographically widely
distributed.
System architectures have also evolved, with the emergence of the microservice architecture,
which makes application development more flexible
and increases scalability. The microservice architecture consists of
decomposing monolithic applications into small services, each of which
has a specific functionality and can be developed, deployed and migrated
independently. Due to their size, microservices are suitable for
deployment in hybrid infrastructure environments (cloud and edge). However,
the heterogeneity of the deployment locations makes the management and monitoring
of microservices considerably more complex. Monitoring, in particular, is essential
when we consider that microservices can be replicated and migrated across these
cloud and edge infrastructures.
The problem addressed in this dissertation is contributing to the construction of an automatic
microservice management system that can be deployed on hybrid infrastructures.
This automatic system will make it possible for applications at the edge to have an
adaptive deployment while running, in response to variations in the available computational
resources and their loads. To this end, this work is the first step in integrating two existing projects that, together, can support
an automatic system. One of them performs automatic management of microservices but uses
only Prometheus as a cloud monitor, while the second project is a lightweight
adaptive monitor. This thesis therefore integrates a lightweight monitor with an automatic
microservices manager.
- …