    Evaluation of Forwarding Efficiency in NFV-Nodes Toward Predictable Service Chain Performance

    Design, implementation and evaluation of a broadband network gateway using a programmable packet processor

Advisor: Christian Rodolfo Esteve Rothenberg
Dissertation (Master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
Abstract: The Broadband Network Gateway (BNG), also known as the Broadband Remote Access Server (BRAS), plays a crucial role in today's Internet: it handles the majority of access network traffic and implements the network policies and services that an Internet Service Provider (ISP) defines per subscriber. However, this network device is normally expensive, proprietary, limited, and slow to upgrade, and it becomes a point of failure when deploying new functionality or correcting issues in the network without disrupting normal service operation. This work designs, implements, and evaluates a flexible and optimized BNG software (S/W) switch using the Programming Protocol-independent Packet Processors (P4) language, taking advantage of its features to describe packet processing with target-agnostic data plane programmability, which gives developers (e.g., network operators) an alternative to existing packet handling schemes. We propose to use the Multi-Architecture Compiler System for Abstract Dataplanes (MACSAD) compiler, which combines the P4 abstraction with the OpenDataPlane (ODP) APIs to support Linux-based network applications with high performance across several architectures (ARM, Intel, MIPS, and PowerPC). The current version of MACSAD supports the previous release of the P4 language (P4_14). Therefore, part of this work is to add MACSAD support for the current release (P4_16), on which the proposed BNG data plane is built. We also carry out a functional and performance evaluation of the BNG S/W switch in a realistic scenario, using two open-source traffic generators, in hardware (H/W) and software (S/W).
The results show how target parameters such as the number of cores, burst sizes, packet I/O, and workloads affect performance in terms of throughput and latency, and they identify the best parameter configuration for our implementation.
Master's degree. Computer Engineering. Master in Electrical Engineering

    Hardware-Aware Algorithm Designs for Efficient Parallel and Distributed Processing

The introduction and widespread adoption of the Internet of Things, together with emerging new industrial applications, bring new requirements in data processing. Specifically, the need for timely processing of data that arrives at high rates creates a challenge for the traditional cloud computing paradigm, where data collected at various sources is sent to the cloud for processing. As an approach to this challenge, processing algorithms and infrastructure are distributed from the cloud to multiple tiers of computing, closer to the sources of data. This creates a wide range of devices for algorithms to be deployed on and software designs to adapt to.
In this thesis, we investigate how hardware-aware algorithm designs on a variety of platforms lead to algorithm implementations that efficiently utilize the underlying resources. We design, implement and evaluate new techniques for representative applications that involve the whole spectrum of devices, from resource-constrained sensors in the field to highly parallel servers. At each tier of processing capability, we identify key architectural features that are relevant for applications and propose designs that make use of these features to achieve high-rate, timely and energy-efficient processing.
In the first part of the thesis, we focus on high-end servers and utilize two main approaches to achieve high-throughput processing: vectorization and thread parallelism. We employ vectorization for the case of pattern matching algorithms used in security applications. We show that re-thinking the design of algorithms to better utilize the resources available in the platforms they are deployed on, such as vector processing units, can bring significant speedups in processing throughput. We then show how thread-aware data distribution and proper inter-thread synchronization allow scalability, especially for the problem of high-rate network traffic monitoring. We design a parallelization scheme for sketch-based algorithms that summarize traffic information, which allows them to handle incoming data at high rates and to answer queries on that data efficiently, without overheads.
In the second part of the thesis, we target the intermediate tier of computing devices and focus on the typical examples of hardware found there. We show how single-board computers with embedded accelerators can be used to handle the computationally heavy part of applications and showcase it specifically for pattern matching in security-related processing. We further identify key hardware features that affect the performance of pattern matching algorithms on such devices, present a co-evaluation framework to compare algorithms, and design a new algorithm that efficiently utilizes the hardware features.
In the last part of the thesis, we shift the focus to the low-power, resource-constrained tier of processing devices. We target wireless sensor networks and study distributed data processing algorithms where the processing happens on the same devices that generate the data. Specifically, we focus on a continuous monitoring algorithm (geometric monitoring) that aims to minimize communication between nodes. By deploying that algorithm under realistic environments, we demonstrate that the interplay between the network protocol and the application plays an important role in this layer of devices. Based on that observation, we co-design a continuous monitoring application with a modern network stack and augment it further with an in-network aggregation technique. In this way, we show that awareness of the underlying network stack is important to realize the full potential of the continuous monitoring algorithm.
The techniques and solutions presented in this thesis contribute to better utilization of hardware characteristics across a wide spectrum of platforms. We employ these techniques on problems that are representative examples of current and upcoming applications and contribute an outlook of emerging possibilities that can build on the results of the thesis.
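As a rough illustration of what thread-aware data distribution for a traffic-summarizing sketch can look like, the following is a minimal sketch under stated assumptions, not the scheme designed in the thesis. It assumes a Count-Min sketch (the abstract does not name the sketch type), gives each worker its own private sketch so that updates need no locking, and merges the partial sketches by element-wise addition before answering queries. All names (`CountMinSketch`, `summarize`, `parallel_summary`) are hypothetical, and Python threads are used only to show the structure, not to demonstrate real speedups.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


class CountMinSketch:
    """Fixed-size frequency summary; estimates never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))

    def merge(self, other):
        # Sketches with identical dimensions and hashes add element-wise.
        for r in range(self.depth):
            for c in range(self.width):
                self.rows[r][c] += other.rows[r][c]


def summarize(chunk):
    """Build a private sketch for one worker's share of the keys."""
    sketch = CountMinSketch()
    for key in chunk:
        sketch.add(key)
    return sketch


def parallel_summary(keys, workers=4):
    """Split keys across workers, then merge the per-worker sketches."""
    chunks = [keys[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize, chunks))
    merged = CountMinSketch()
    for partial in partials:
        merged.merge(partial)
    return merged
```

Because the sketch is linear in its input, the merged result is identical to a sketch built sequentially over the same keys. The trade-off in this simple variant is that a query must first pay for the merge, which is one kind of overhead a more careful design would try to avoid.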

    MACSAD: Multi-Architecture Compiler System for Abstract Dataplanes

Advisor: Christian Rodolfo Esteve Rothenberg
Thesis (Doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
Abstract: Software-Defined Networking (SDN) strives for a programmable data plane in addition to flexible and scalable control and application planes. Despite having received less attention than the control and application aspects of SDN, the data plane is a critical piece of the SDN puzzle. We envision a flexible data plane with four characteristics, namely Programmability, Portability, Performance, and Scalability (3PS), as different aspects of flexibility. While Programmability and Portability concern the architecture and design of the data plane, Performance and Scalability emerge during its evaluation. We extend the focus of data plane evolution beyond Programmability, as emphasized by the SDN school of thought, to include Portability as an aspect of flexibility. A programmable data plane conforms to a protocol-independent nature, whereas Portability addresses the multi-architecture requirements of data plane design. The P4 language, a new entrant, is a protocol-independent and target-independent high-level programming language capable of taking data plane evolution to the next level by unlocking the desired facets of data plane flexibility.
To bring this level of flexibility to a data plane, a multi-architecture compiler system is needed that can compile a P4 program while preserving the protocol and target independence of P4. However, such a unified compiler system is what we currently lack. The main contribution of this thesis, the Multi-Architecture Compiler System for Abstract Dataplanes (MACSAD), is an effort to fill this gap by combining the top-down approach of P4 towards programmability with the bottom-up approach of OpenDataPlane (ODP) towards target independence, which relies on low-level but cross-platform (HW & SW) APIs. We strengthen the contributions of this thesis by also covering the Performance and Scalability aspects of flexibility in our evaluation of MACSAD in multiple realistic scenarios.
Doctorate. Computer Engineering. Doctor in Electrical Engineering