
    Incast mitigation in a data center storage cluster through a dynamic fair-share buffer policy

    Incast is a phenomenon in which many devices communicate with a single device at the same time: multiple storage senders overflow either the switch buffer or the memory of the single receiver. This pattern forces all concurrent senders to stop and wait for buffer/memory availability, and leads to packet loss and retransmission, resulting in high latency. We present a software-defined technique for tackling this many-to-one communication pattern (Incast) in a data center storage cluster. Our proposed method decouples the default TCP windowing mechanism from the storage servers and delegates it to the software-defined storage controller. The method removes the TCP saw-tooth behavior, provides global flow awareness, and implements a dynamic fair-share buffer policy over the end-to-end I/O path. It considers all I/O stages (applications, device drivers, NICs, switches/routers, file systems, I/O schedulers, main memory, and physical disks) while achieving the maximum I/O throughput. The policy allocates fair-share bandwidth to all storage servers, and priority queues handle the most important data flows. In addition, the proposed method offers better manageability and maintainability than traditional storage networks, where the data plane and control plane reside in the same device.
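
    As a rough sketch of the fair-share idea described above (not the authors' implementation), the snippet below lets a controller divide a bottleneck buffer among concurrent senders in proportion to priority weights, so that the sum of the senders' in-flight bytes can never exceed the buffer. All names, weights, and sizes are illustrative assumptions.

```python
# Illustrative sketch, assuming the controller knows the bottleneck buffer
# size and the set of active senders. Not the paper's actual algorithm.

def fair_share_windows(buffer_bytes, senders):
    """senders: list of (sender_id, priority_weight) tuples.
    Returns sender_id -> allowed in-flight bytes."""
    total_weight = sum(w for _, w in senders)
    return {sid: buffer_bytes * w // total_weight for sid, w in senders}

# Example: a 1 MiB switch buffer shared by three storage servers,
# one of which carries a higher-priority flow (weight 2).
windows = fair_share_windows(1 << 20, [("s1", 1), ("s2", 1), ("s3", 2)])
print(windows)  # {'s1': 262144, 's2': 262144, 's3': 524288}
```

    Because the controller assigns windows directly, their sum never exceeds the buffer, which is what removes the loss-driven saw-tooth of per-sender TCP window growth.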

    Satellite Networks: Architectures, Applications, and Technologies

    Because communication satellites' unique networking characteristics are moving global satellite networks to the forefront of national and global information infrastructures, a workshop was organized to assess the progress made to date and to chart the future. The workshop provided a forum to assess the current state of the art, identify key issues, and highlight emerging trends in next-generation architectures, data protocol development, communication interoperability, and applications. This collection assembles the presentations: overviews, the state of the art in research, development, deployment, and applications, and future trends in satellite networks.

    High-precision packet time-stamping using the NetFPGA 10G platform

    High-precision network measurement is an area of high interest, as network performance affects the quality and the cost of the service between a Network Service Provider (NSP) and the customer. As network speeds increase, software-based measurements become unreliable despite their low cost and high configurability; reliable measurement at high network speeds requires a hardware system that can guarantee consistently high performance. The NetFPGA is an open-source, low-cost networking platform that makes network systems easy to implement thanks to the wide range of reference components it offers. The second version of the NetFPGA platform, designed at Stanford University, has four 10GigE SFP+ interfaces and a powerful FPGA, enabling network systems over copper and optical fiber at 1 Gbps and 10 Gbps. This NetFPGA 10G project measures network parameters with high precision using time-stamping, with a GPS system guaranteeing the precision of the time source. Dynamic generation of back-to-back packets gives flexibility for measurements without replaying previously captured flows, which no other system provides, and saving the captured packets makes further off-line analysis of the network possible.
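
    As a minimal sketch of what GPS-disciplined hardware time-stamping enables, the following computes one-way delays and jitter from matched send/receive timestamps. The nanosecond fields and values are illustrative assumptions, not the project's actual record format.

```python
# Minimal sketch, assuming each packet carries a hardware timestamp in
# nanoseconds taken at the NetFPGA PHY and synchronised via GPS.

def one_way_delays(tx_stamps_ns, rx_stamps_ns):
    """Pairwise one-way delays for matched packets, in nanoseconds."""
    return [rx - tx for tx, rx in zip(tx_stamps_ns, rx_stamps_ns)]

tx = [1_000_000, 2_000_000, 3_000_000]   # send timestamps (ns)
rx = [1_050_300, 2_050_100, 3_050_700]   # receive timestamps (ns)
delays = one_way_delays(tx, rx)
jitter = max(delays) - min(delays)
print(delays, jitter)  # [50300, 50100, 50700] 600
```

    With software timestamps the same computation would be dominated by scheduling noise; hardware stamping at the PHY is what makes sub-microsecond figures like these meaningful.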

    An Efficient Framework of Congestion Control for Next-Generation Networks

    The success of the Internet can partly be attributed to the congestion control algorithm in the Transmission Control Protocol (TCP). However, with the tremendous increase in the diversity of networked systems and applications, TCP performance limitations are becoming increasingly problematic and the need for new transport protocol designs has become increasingly important. Prior research has focused on the design of either end-to-end protocols (e.g., CUBIC) that rely on implicit congestion signals such as loss and/or delay, or network-based protocols (e.g., XCP) that use precise per-flow feedback from the network. While the former category of schemes has performance limitations, the latter are hard to deploy, can introduce high per-packet overhead, and open up new security challenges. This dissertation explores the middle ground between these designs and makes four contributions. First, we study the interplay between performance and feedback in congestion control protocols. We argue that congestion feedback in the form of aggregate load can provide the richness needed to meet the challenges of next-generation networks and applications. Second, we present the design, analysis, and evaluation of an efficient framework for congestion control called Binary Marking Congestion Control (BMCC). BMCC uses aggregate load feedback to achieve efficient and fair bandwidth allocations on high bandwidth-delay networks while minimizing packet loss rates and average queue length. BMCC reduces flow completion times by up to 4x over TCP and uses only the existing Explicit Congestion Notification bits. Next, we consider the incremental deployment of BMCC. We study the bandwidth sharing properties of BMCC and TCP over different partial deployment scenarios. We then present algorithms for ensuring safe co-existence of BMCC and TCP on the Internet. Finally, we consider the performance of BMCC over Wireless LANs. We show that the time-varying nature of the capacity of a WLAN can lead to significant performance issues for protocols that require capacity estimates for feedback computation. Using a simple model, we characterize the capacity of a WLAN and propose using the average service rate experienced by network-layer packets as an estimate of capacity. Through extensive evaluation, we show that the resulting estimates provide good performance.
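
    BMCC's exact update rules are not given in this abstract, so the sketch below only illustrates the general shape of load-factor-based congestion control: the network feeds back an aggregate load estimate (demand over capacity), and senders grow their window when the bottleneck is underloaded and shed load when it is overloaded. The thresholds and constants are illustrative assumptions, not the published algorithm.

```python
# Simplified illustration of load-factor congestion control in the spirit
# of BMCC; constants are illustrative only, not the paper's parameters.

def update_cwnd(cwnd, load_factor):
    """Grow when the bottleneck is underloaded, hold near full
    utilisation, and back off proportionally under overload."""
    if load_factor < 0.95:        # underload: probe for bandwidth
        return cwnd * 1.25
    elif load_factor <= 1.0:      # near full utilisation: hold
        return cwnd
    else:                         # overload: shed load proportionally
        return max(1.0, cwnd / load_factor)

cwnd = 10.0
for lf in [0.5, 0.8, 1.0, 1.3]:
    cwnd = update_cwnd(cwnd, lf)
    print(f"load={lf:.1f} -> cwnd={cwnd:.2f}")
```

    The appeal of this middle ground is that the feedback is a single coarse quantity that can be carried in the existing ECN bits, rather than the precise per-flow rates that schemes like XCP require.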

    High Performance Network Evaluation and Testing


    Improving software middleboxes and datacenter task schedulers

    Over the last decades, shared systems have contributed to the popularity of many technologies. From operating systems to the Internet, they have brought significant cost savings by allowing the underlying infrastructure to be shared. A common challenge in these systems is to ensure that resources are fairly divided without compromising utilization efficiency. In this thesis, we look at problems in two shared systems, software middleboxes and datacenter task schedulers, and propose ways of improving both efficiency and fairness. We begin by presenting Sprayer, a system that uses packet spraying to load balance packets across cores in software middleboxes. Sprayer eliminates the imbalance problems of per-flow solutions and addresses the new challenges of handling shared flow state that come with packet spraying. We show that Sprayer significantly improves fairness and seamlessly uses the entire capacity, even when there is a single flow in the system. After that, we present Stateful Dominant Resource Fairness (SDRF), a task scheduling policy for datacenters that looks at past allocations and enforces fairness in the long run. We prove that SDRF keeps the fundamental properties of DRF, the allocation policy it builds on, while benefiting users with lower usage. To implement SDRF efficiently, we also introduce the live tree, a general-purpose data structure that keeps elements with predictable time-varying priorities sorted. Our trace-driven simulations indicate that SDRF reduces users' waiting time on average. This improves fairness by increasing the number of completed tasks for users with lower demands, with small impact on high-demand users.
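
    To make the underlying policy concrete: DRF, the published policy SDRF builds on, always serves the user with the smallest dominant share, i.e. their largest per-resource fraction of the cluster. The snippet below shows that selection step; SDRF's discounting of past usage is not reproduced here, and the capacities and demands are made-up examples.

```python
# Dominant-share selection as in DRF (the base policy SDRF extends).
# SDRF would additionally weigh in each user's historical usage.

def dominant_share(usage, capacity):
    """A user's largest fraction of any single cluster resource."""
    return max(usage[r] / capacity[r] for r in capacity)

capacity = {"cpu": 100, "mem_gb": 400}
users = {
    "alice": {"cpu": 20, "mem_gb": 40},   # dominant share 0.20 (cpu)
    "bob":   {"cpu": 10, "mem_gb": 120},  # dominant share 0.30 (mem)
}
next_user = min(users, key=lambda u: dominant_share(users[u], capacity))
print(next_user)  # alice: the lowest dominant share gets the next task
```

    Keeping users sorted by a quantity like this, one that changes predictably over time once past usage decays, is exactly the job the live tree data structure is introduced to do efficiently.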

    A Quality of Service framework for upstream traffic in LTE across an XG-PON backhaul

    Passive Optical Networks (PON) are promising as a transport network technology due to the high network capacity, long reach, and strong QoS support in the latest PON standards. Long Term Evolution (LTE) is a popular wireless technology thanks to its large data rates in the last mile. The natural integration of LTE and XG-PON, one of the latest PON standards, presents several challenges for XG-PON in satisfying the backhaul QoS requirements of aggregated upstream LTE applications. This thesis proves that a dedicated XG-PON-based backhaul is capable of ensuring the QoS treatment required by different upstream application types in LTE, by means of standard-compliant Dynamic Bandwidth Allocation (DBA) mechanisms. First, the thesis presents the design and evaluation of a standard-compliant, robust, and fast XG-PON simulation module developed for the state-of-the-art ns-3 network simulator. This XG-PON simulation module forms a trustworthy, large-scale simulation platform for the evaluations in the rest of the thesis and has been released for use by the scientific community. The design and implementation details of the XGIANT DBA, which provides standard-compliant QoS treatment in an isolated XG-PON network, are then presented along with comparative evaluations against the recently published EBU DBA. The evaluations explored the ability of both XGIANT and EBU in terms of queuing-delay and throughput assurances for different classes of simplified (deterministic) traffic models, over a range of upstream loads in XG-PON. The XGIANT and EBU DBAs are then evaluated in the context of a dedicated XG-PON backhaul for LTE, with regard to the influence of standard-compliant, QoS-aware DBAs on the performance of large-scale, UDP-based applications. These evaluations disqualify both the XGIANT and EBU DBAs from providing prioritised queuing-delay performance for three upstream application types (conversational voice, peer-to-peer video, and best-effort Internet) in LTE; they also indicate the need for more dynamic and efficient QoS policies, along with an improved fairness policy, in a DBA used in the dedicated XG-PON backhaul to ensure the QoS requirements of the upstream LTE applications. Finally, the thesis presents the design and implementation details of two standard-compliant DBAs, Deficit XGIANT (XGIANT-D) and Proportional XGIANT (XGIANT-P), which provide the required QoS treatment in the dedicated XG-PON backhaul for all three application types in the LTE upstream. Evaluations of the XGIANT-D and XGIANT-P DBAs prove the ability of the fine-tuned QoS and fairness policies in the DBAs to ensure prioritised and fair queuing delay and throughput efficiency for UDP- and TCP-based applications, generated and aggregated under realistic conditions in the LTE upstream.
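
    For readers unfamiliar with deficit-based allocation, the sketch below shows one round of a generic deficit-style DBA (not the actual XGIANT-D): each traffic class receives a per-round quantum proportional to its weight, and unserved quanta carry over in a deficit counter so lower classes are not starved. Class names, weights, and byte counts are illustrative assumptions.

```python
# One round of a generic deficit-style DBA; illustrative only, not the
# thesis's XGIANT-D algorithm.

def dba_round(classes, grant_bytes):
    """classes: dict name -> {'weight', 'demand', 'deficit'}.
    Mutates demand/deficit in place and returns this round's grants."""
    total_w = sum(c["weight"] for c in classes.values())
    grants = {}
    for name, c in classes.items():
        c["deficit"] += grant_bytes * c["weight"] // total_w  # add quantum
        served = min(c["demand"], c["deficit"])               # serve backlog
        c["demand"] -= served
        c["deficit"] -= served                                # carry the rest
        grants[name] = served
    return grants

classes = {
    "voice":       {"weight": 4, "demand": 3000, "deficit": 0},
    "video":       {"weight": 2, "demand": 9000, "deficit": 0},
    "best_effort": {"weight": 1, "demand": 9000, "deficit": 0},
}
print(dba_round(classes, 7000))
# {'voice': 3000, 'video': 2000, 'best_effort': 1000}
```

    Note how voice, with low demand, leaves part of its quantum as a deficit it can spend later, while the heavier classes are rate-limited to their weighted shares; this is the fairness mechanism that pure strict-priority DBAs lack.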

    Enhancing programmability for adaptive resource management in next generation data centre networks

    Recently, Data Centre (DC) infrastructures have been growing rapidly to support a wide range of emerging services and to provide the underlying connectivity and compute resources that facilitate the "*-as-a-Service" model. This has led to a multitude of services being multiplexed over a few very large-scale centralised infrastructures. In order to cope with the ebb and flow of users, services, and traffic, infrastructures have been provisioned for peak demand, so the average utilisation of resources is low. This overprovisioning has been further motivated by the complexity of predicting traffic demands over diverse timescales and by the severe economic impact of outages. At the same time, the emergence of Software Defined Networking (SDN) offers new means to monitor and manage the network infrastructure to address this underutilisation. This dissertation aims to show how measurement-based resource management can improve performance and resource utilisation by adaptively tuning the infrastructure to changing operating conditions. To achieve this dynamicity, the infrastructure must be able to centrally monitor, notify, and react based on the current operating state, from per-packet dynamics to long-standing traffic trends and topological changes. However, the management and orchestration abilities of current SDN realisations are too limiting and must evolve for next-generation networks. The current focus has been on logically centralising the routing and forwarding decisions; however, to achieve the necessary fine-grained insight, the data plane of each individual device must be programmable so it can collect and disseminate the metrics of interest. The results of this work demonstrate that a logically centralised controller can dynamically collect and measure network operating metrics and subsequently compute and disseminate fine-tuned, environment-specific settings. They show how this approach can prevent TCP incast throughput collapse and improve TCP performance by an order of magnitude for partition-aggregate traffic patterns. Furthermore, the paradigm is generalised to show the benefits for other services widely used in DCs, such as routing, telemetry, and security.
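
    The measure-compute-disseminate loop described above might look like the sketch below. The controller API here (poll_utilisation, push_config) is a hypothetical stand-in, not a real SDN framework, and deriving an ECN marking threshold from utilisation is just one example of an environment-specific setting.

```python
# Hypothetical controller loop: poll operating metrics, compute a tuned
# setting, and push it to the data plane. API names are stand-ins.

import time

def ecn_threshold(utilisation, base_pkts=20, max_pkts=200):
    """Shrink the ECN marking threshold as utilisation rises so queues
    stay short under load, and relax it when the fabric is idle."""
    scale = max(0.0, 1.0 - utilisation)
    return int(base_pkts + (max_pkts - base_pkts) * scale)

def control_loop(switches, poll_utilisation, push_config, period_s=1.0):
    """poll_utilisation(sw) -> link utilisation in [0, 1];
    push_config(sw, settings) installs the new threshold on the switch."""
    while True:
        for sw in switches:
            util = poll_utilisation(sw)
            push_config(sw, {"ecn_mark_pkts": ecn_threshold(util)})
        time.sleep(period_s)
```

    Keeping queues short under partition-aggregate bursts is precisely the lever that mitigates incast collapse, which is why a marking threshold makes a natural example of a setting worth tuning adaptively.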

    Towards automatic traffic classification and estimation for available bandwidth in IP networks

    Growing rapidly, today's Internet is becoming more difficult to manage. A good understanding of what kinds of traffic classes are consuming network resources, as well as how much network resource is available, is important for many management tasks such as QoS provisioning and traffic engineering. In light of these objectives, two measurement mechanisms are explored in this thesis. The thesis first explores a new type of traffic classification scheme with automatic and accurate identification capability. The novel concept of the IP flow profile, a unique identifier for the associated traffic class, is proposed, together with a model built on five IP-header-based contexts. The thesis then shows that the key statistical features of each context in the IP flow profile follow a Gaussian distribution and explores how to use a Kohonen Neural Network (KNN) to automatically produce an IP flow profile map. To improve classification accuracy, the thesis investigates and evaluates the use of PCA for feature selection, which makes the produced patterns as tight as possible, since tight patterns lead to less overlap among patterns. In addition, Linear Discriminant Analysis and alternative KNN maps are investigated to deal with the overlap between produced patterns. The entirety of this process represents a novel addition to the quest for automatic traffic classification in IP networks. The thesis also develops a fast available bandwidth measurement scheme. It first addresses the dynamic problem of one-way delay (OWD) trend detection. To deal with this issue, a novel model, the asymptotic OWD Comparison (AOC) model, is proposed for OWD trend detection. Three statistical metrics, SOT (Sum of Trend), PTC (Positive Trend Checking), and CTC (Complete Trend Comparison), are then proposed to develop the AOC algorithms. To validate the proposed AOC model, an available bandwidth estimation tool called Pathpair has been developed and evaluated in the PlanetLab environment.
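
    The abstract names the SOT, PTC, and CTC statistics without defining them, so the sketch below is only a generic pairwise-comparison illustration of OWD trend detection, not the thesis's AOC algorithm: if one-way delays of a probe train keep increasing, the probing rate likely exceeds the available bandwidth. The threshold and sample values are illustrative assumptions.

```python
# Generic OWD trend test for available-bandwidth probing; illustrative
# only, not the AOC model's SOT/PTC/CTC metrics.

def increasing_fraction(owds):
    """Fraction of consecutive OWD pairs that increase; values near 1
    suggest a rising trend, i.e. the probe rate exceeds avail-bw."""
    pairs = list(zip(owds, owds[1:]))
    rises = sum(1 for a, b in pairs if b > a)
    return rises / len(pairs)

probe_owds_ms = [10.1, 10.4, 10.3, 10.9, 11.2, 11.8]
if increasing_fraction(probe_owds_ms) > 0.6:   # 4 of 5 pairs rise here
    print("OWD trend detected: probe rate above available bandwidth")
```

    A tool in this family then searches over probe rates, reporting the highest rate at which no sustained OWD trend appears as its available-bandwidth estimate.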