16 research outputs found

    PINT: Probabilistic In-band Network Telemetry

    Get PDF
    © 2020 ACM. Commodity network devices support adding in-band telemetry measurements into data packets, enabling a wide range of applications, including network troubleshooting, congestion control, and path tracing. However, including such information on packets adds significant overhead that impacts both flow completion times and application-level performance. We introduce PINT, an in-band network telemetry framework that bounds the amount of information added to each packet. PINT encodes the requested data on multiple packets, allowing per-packet overhead limits that can be as low as one bit. We analyze PINT and prove performance bounds, including cases when multiple queries run simultaneously. PINT is implemented in P4 and can be deployed on network devices. Using real topologies and traffic characteristics, we show that PINT concurrently enables applications such as congestion control, path tracing, and computing tail latencies, using only sixteen bits per packet, with performance comparable to the state of the art.
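
    For intuition, the per-packet encoding idea can be sketched in a few lines of Python. This is an illustrative reservoir-sampling variant for path tracing, not PINT's actual encoding scheme; the function names and the collector loop are hypothetical.

```python
import hashlib

def hop_hash(packet_id: int, hop_index: int) -> float:
    """Deterministic pseudo-random value in [0, 1) derived from the packet ID and
    the hop's position, so every switch can compute it without shared state."""
    digest = hashlib.sha256(f"{packet_id}:{hop_index}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def process_at_hop(telemetry_slot, packet_id, hop_index, hop_id):
    """Reservoir-style sampling: hop i overwrites the packet's single telemetry
    slot with probability 1/i, so every hop is equally likely to survive."""
    if hop_hash(packet_id, hop_index) < 1.0 / hop_index:
        return hop_id
    return telemetry_slot

def trace_path(path, num_packets=1000):
    """Collector side: aggregate the one value carried by each packet; over many
    packets the set of observed values converges to the set of hops on the path."""
    observed = set()
    for packet_id in range(num_packets):
        slot = None
        for hop_index, hop_id in enumerate(path, start=1):
            slot = process_at_hop(slot, packet_id, hop_index, hop_id)
        observed.add(slot)
    return observed

print(trace_path(["s1", "s7", "s3", "s9"]))  # -> {'s1', 's3', 's7', 's9'} with high probability
```

    The sketch recovers only the set of hop identifiers, not their order, and spends a full identifier per packet; PINT's actual schemes additionally compress the carried values and multiplex several concurrent queries.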

    Architecting Data Centers for High Efficiency and Low Latency

    Full text link
    Modern data centers, housing remarkably powerful computational capacity, are built at massive scale and consume a huge amount of energy. The energy consumption of data centers has mushroomed from virtually nothing to about three percent of the global electricity supply in the last decade, and will continue to grow. Unfortunately, a significant fraction of this energy is wasted due to the inefficiency of current data center architectures, and one of the key reasons behind this inefficiency is the stringent response latency requirements of the user-facing services hosted in these data centers, such as web search and social networks. To deliver such low response latency, data center operators often have to overprovision resources to handle peaks in user load and unexpected load spikes, resulting in low efficiency. This dissertation investigates data center architecture designs that reconcile high system efficiency and low response latency. To increase efficiency, we propose techniques that understand both microarchitectural resource sharing and system-level resource usage dynamics to enable highly efficient co-location of latency-critical services and low-priority batch workloads. We investigate resource sharing on real simultaneous multithreading (SMT) processors to enable SMT co-locations by precisely predicting performance interference. We then leverage historical resource usage patterns to further optimize the task scheduling algorithm and data placement policy and improve the efficiency of workload co-locations. Moreover, we introduce methodologies to better manage response latency by automatically attributing the sources of tail latency to low-level architectural and system configurations, in both offline load-testing and online production environments. We design and develop a response latency evaluation framework with microsecond-level precision for data center applications, with which we construct statistical inference procedures to attribute the sources of tail latency. Finally, we present an approach that proactively enacts carefully designed causal-inference micro-experiments to diagnose the root causes of response latency anomalies and automatically corrects them to reduce response latency.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/144144/1/yunqi_1.pd
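
    As a small point of reference for the tail-latency methodology described above, the tail of a latency distribution is conventionally summarized by a high percentile such as the 99th. The following is a minimal sketch using the nearest-rank method; the function and the sample values are illustrative and are not taken from the dissertation.

```python
def percentile(latencies_us, p):
    """Return the p-th percentile (0-100) of latencies in microseconds,
    using the nearest-rank method."""
    if not latencies_us:
        raise ValueError("no samples")
    ordered = sorted(latencies_us)
    rank = max(1, -(-len(ordered) * p // 100))  # ceil(n * p / 100), at least 1
    return ordered[rank - 1]

samples_us = [120, 95, 430, 110, 2500, 105, 98, 115]  # per-request latencies in microseconds
print(percentile(samples_us, 50), percentile(samples_us, 99))  # median vs. tail: 110 vs. 2500
```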

    Empowering Cloud Data Centers with Network Programmability

    Get PDF
    Cloud data centers are a critical infrastructure for modern Internet services such as web search, social networking, and e-commerce. However, the gradual slow-down of Moore’s law has constrained the growth of data centers’ performance and energy efficiency. In addition, the rise of millisecond-scale and microsecond-scale tasks places higher throughput and latency demands on cloud applications. Today’s server-based solutions struggle to meet these performance requirements in many scenarios, such as resource management, scheduling, and high-speed traffic monitoring and testing. In this dissertation, we study these problems from a network perspective. We investigate a new architecture that leverages the programmability of new-generation network switches to improve the performance and reliability of clouds. Because programmable switches provide only very limited memory and functionality, we exploit compact data structures and deeply co-design software and hardware to make the best use of these resources. More specifically, this dissertation presents four systems: (i) NetLock: A new centralized lock management architecture that co-designs programmable switches and servers to simultaneously achieve high performance and rich policy support. It provides orders-of-magnitude higher throughput than existing systems with microsecond-level latency, and supports many commonly used policies such as performance isolation. (ii) HCSFQ: A scalable and practical solution for implementing hierarchical fair queueing on commodity hardware at line rate. Instead of relying on a hierarchy of queues with complex queue management, HCSFQ keeps no per-flow state and uses only one queue to achieve hierarchical fair queueing. (iii) AIFO: A new approach to programmable packet scheduling that uses only a single FIFO queue. AIFO uses an admission control mechanism to approximate PIFO (push-in first-out) queues, which are theoretically ideal but hard to implement on commodity devices. (iv) Lumina: A tool that enables fine-grained analysis of hardware network stacks. By exploiting network programmability to emulate various network scenarios, Lumina helps users understand the micro-behaviors of hardware network stacks.
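
    Of the four systems, AIFO lends itself to the most compact illustration: a single FIFO can approximate a PIFO queue by admitting an arriving packet only when its scheduling rank is low relative to recently observed ranks, and by tightening that threshold as the queue fills. The Python sketch below is a simplified user-space analogue of this idea, not the P4 implementation; the class name, window size, and headroom parameter are assumptions made for illustration.

```python
from collections import deque

class AifoLikeQueue:
    """Approximate a PIFO with one FIFO: keep a sliding window of recent packet
    ranks and drop an arriving packet whose rank falls in the top fraction of
    that window, where the admitted fraction shrinks as the queue fills up."""

    def __init__(self, capacity=32, window=64, burst_headroom=0.1):
        self.fifo = deque()
        self.capacity = capacity
        self.recent_ranks = deque(maxlen=window)
        self.burst_headroom = burst_headroom

    def enqueue(self, packet, rank):
        self.recent_ranks.append(rank)
        occupancy = len(self.fifo) / self.capacity
        # Nearly empty queue: admit (almost) everything; nearly full queue:
        # admit only packets whose rank is among the lowest seen recently.
        admitted_fraction = max(0.0, 1.0 - occupancy + self.burst_headroom)
        rank_position = sum(1 for r in self.recent_ranks if r <= rank) / len(self.recent_ranks)
        if len(self.fifo) < self.capacity and rank_position <= admitted_fraction:
            self.fifo.append((rank, packet))
            return True
        return False  # dropped by admission control

    def dequeue(self):
        return self.fifo.popleft() if self.fifo else None

q = AifoLikeQueue()
print([q.enqueue(f"pkt{r}", rank=r) for r in (1, 5, 2, 50, 3)])  # permissive while nearly empty
```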

    Analyzing Data-center Application Performance Via Constraint-based Models

    Get PDF
    Hyperscale Data Centers (HDCs) are the largest distributed computing machines ever constructed. They serve as the backbone for many popular applications, such as YouTube, Netflix, Meta, and Airbnb, which involve millions of users and generate billions in revenue. As the networking infrastructure plays a pivotal role in determining the performance of HDC applications, understanding and optimizing their networking performance is critical. This thesis proposes and evaluates a constraint-based approach to characterizing the networking performance of HDC applications. Through extensive evaluations conducted in both controlled settings and real-world case studies within a production HDC, I demonstrate that the constraint-based approach can handle the immense volume of performance data in HDCs, achieve substantial dimensionality reduction, and provide useful interpretability.
    Doctor of Philosophy
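
    To make the notion of a constraint-based characterization concrete, one much-simplified reading is that each measurement interval is explained by whichever candidate constraint (link capacity, CPU budget, receiver window, and so on) is closest to being binding, which is what yields the dimension reduction and interpretability. The constraint names, limits, and code below are purely illustrative and are not taken from the thesis.

```python
def binding_constraint(measurement, limits):
    """Return the constraint whose limit the measured usage comes closest to,
    i.e. the most plausible bottleneck for this interval, plus all utilizations."""
    utilization = {name: measurement[name] / limit for name, limit in limits.items()}
    return max(utilization, key=utilization.get), utilization

# Hypothetical per-interval measurements against hypothetical resource limits.
limits = {"link_gbps": 40.0, "cpu_cores": 16.0, "rcv_window_mb": 8.0}
interval = {"link_gbps": 9.5, "cpu_cores": 15.2, "rcv_window_mb": 2.1}

bottleneck, util = binding_constraint(interval, limits)
print(bottleneck, f"{util[bottleneck]:.0%}")  # -> cpu_cores 95%
```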

    Every timestamp counts: accurate tracking of network latencies using reconcilable difference aggregator

    Get PDF
    © IEEE. User-facing services deployed in data centers must respond quickly to user actions, so the measurement of network latencies is of paramount importance. Recently, a new family of compact data structures has been proposed to estimate one-way latencies. To achieve scalability, these methods rely on timestamp aggregation. Unfortunately, this approach suffers from serious accuracy problems in the presence of packet loss and reordering, given that a single lost or out-of-order packet may invalidate a huge number of aggregated samples. In this paper, we cast the detection of lost and reordered packets within the set reconciliation framework. Although the set reconciliation approach and the data structures for aggregating packet timestamps were previously known, the combination of these two principles is novel. We present a space-efficient synopsis called the reconcilable difference aggregator (RDA). RDA maximizes the percentage of packets usable for latency measurement by mapping packets to multiple banks and repairing aggregated samples that have been damaged by lost and reordered packets. RDA simultaneously obtains the average and the standard deviation of the latency. We provide formal performance guarantees and derive optimized parameters. We further design and implement a user-space passive latency measurement system that addresses the practical issues of integrating RDA into the network stack. Our extensive evaluation shows that, compared with existing methods, our approach improves the relative error of the average latency estimate by 10-15 orders of magnitude, and the relative error of the standard deviation by 0.5-6 orders of magnitude.
    Peer reviewed. Postprint (author's final draft).
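
    The timestamp-aggregation principle that RDA builds on can be sketched as follows. The sketch mirrors the earlier lossy-difference-aggregator style of design, discarding damaged banks instead of repairing them as RDA does, and the class and function names are illustrative.

```python
import hashlib

NUM_BANKS = 8

def bank_of(packet_id: int) -> int:
    """Both endpoints hash each packet to the same bank without coordination."""
    return int(hashlib.sha256(str(packet_id).encode()).hexdigest(), 16) % NUM_BANKS

class TimestampAggregator:
    """Per-bank packet count and timestamp sum; one instance per endpoint."""
    def __init__(self):
        self.count = [0] * NUM_BANKS
        self.ts_sum = [0.0] * NUM_BANKS

    def record(self, packet_id: int, timestamp: float):
        b = bank_of(packet_id)
        self.count[b] += 1
        self.ts_sum[b] += timestamp

def average_latency(sender: TimestampAggregator, receiver: TimestampAggregator) -> float:
    """A bank is usable only if both sides saw the same packets (equal counts);
    banks damaged by loss or reordering are simply discarded in this sketch."""
    usable, latency_sum = 0, 0.0
    for b in range(NUM_BANKS):
        if sender.count[b] == receiver.count[b] and sender.count[b] > 0:
            usable += sender.count[b]
            latency_sum += receiver.ts_sum[b] - sender.ts_sum[b]
    return latency_sum / usable if usable else float("nan")

tx, rx = TimestampAggregator(), TimestampAggregator()
for pid in range(1000):
    tx.record(pid, pid * 0.001)                   # send timestamp in seconds
    if pid != 137:                                # packet 137 is lost in transit
        rx.record(pid, pid * 0.001 + 0.0002)      # one-way latency of 200 microseconds
print(average_latency(tx, rx))                    # ~0.0002, from the undamaged banks
```

    A single loss here sacrifices an entire bank's worth of samples, which is exactly the accuracy problem the paper targets; RDA instead reconciles the two sides' views to repair such banks.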

    A monitoring and threat detection system using stream processing as a virtual function for big data

    Get PDF
    The late detection of security threats significantly increases the risk of irreparable damage, disabling any attempt at defense. As a consequence, fast, real-time threat detection is mandatory for security guarantees. In addition, Network Function Virtualization (NFV) provides new opportunities for efficient and low-cost security solutions. We propose a fast and efficient threat detection system based on stream processing and machine learning algorithms. The main contributions of this work are: i) a novel monitoring and threat detection system based on stream processing; ii) two datasets, the first a synthetic security dataset containing both legitimate and malicious traffic, and the second a week of real traffic from a telecommunications operator in Rio de Janeiro, Brazil; iii) a data pre-processing algorithm, comprising a normalization algorithm and a fast feature-selection algorithm based on the correlation between variables; iv) a virtualized network function on an open-source platform that provides a real-time threat detection service; v) near-optimal sensor placement through a proposed heuristic that strategically positions a minimum number of sensors in the network infrastructure; and, finally, vi) a greedy algorithm that allocates a sequence of virtual network functions on demand.
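
    Contribution iii) mentions fast feature selection based on the correlation between variables; a common filter of this kind drops one feature from every highly correlated pair. The sketch below follows that generic approach and is not necessarily the exact algorithm proposed in the work; the threshold and toy data are made up.

```python
import numpy as np

def select_features(X: np.ndarray, names: list, threshold: float = 0.95) -> list:
    """Greedy correlation filter: keep a feature only if its absolute Pearson
    correlation with every already-kept feature is below the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j, _ in enumerate(names):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Toy traffic features: packet count is nearly a linear function of byte count,
# so one of the two is redundant; flow duration carries independent information.
rng = np.random.default_rng(0)
flow_bytes = rng.normal(1000, 100, 500)
flow_packets = flow_bytes / 1500 + rng.normal(0, 0.01, 500)
flow_duration = rng.normal(2.0, 0.5, 500)
X = np.column_stack([flow_bytes, flow_packets, flow_duration])
print(select_features(X, ["bytes", "packets", "duration"]))  # -> ['bytes', 'duration']
```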

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex, data-intensive, continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representations become increasingly demanding in computational and data resources. High Performance Computing, on the other hand, typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore required to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.
