Search CORE

1,314 research outputs found

On the Efficacy of Fine-Grained Traffic Splitting Protocols in Data Center Networks

Author: Dixit Advait
Kompella Ramana Rao
Prakash Pawan
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2011
Field of study

Multi-rooted tree topologies are commonly used to construct high-bandwidth data center network fabrics. In these networks, switches typically rely on equal-cost multipath (ECMP) routing techniques to split traffic across multiple paths, such that packets within a flow traverse the same end-to-end path. Unfortunately, since ECMP splits traffic based on flow-granularity, it can cause load imbalance across paths resulting in poor utilization of network resources. More finegrained traffic splitting techniques are typically not preferred because they can cause packet reordering that can, according to conventional wisdom, lead to severe TCP throughput degradation. In this work, we revisit this fact in the context of regular data center topologies such as fat-tree architectures. We argue that packet-level traffic splitting, where packets of a flow are sprayed through all available paths, would lead to a better load-balanced network, which in turn leads to significantly more balanced queues and much higher throughput compared to ECMP

CiteSeerX

Purdue E-Pubs

Deep fused flow and topology features for botnet detection basing on pretrained GCN

Author: bo Lang
Xiaoyuan Meng
Publication venue
Publication date: 22/07/2023
Field of study

Nowadays, botnets have become one of the major threats to cyber security. The characteristics of botnets are mainly reflected in bots network behavior and their intercommunication relationships. Existing botnet detection methods use flow features or topology features individually, which overlook the other type of feature. This affects model performance. In this paper, we propose a botnet detection model which uses graph convolutional network (GCN) to deeply fuse flow features and topology features for the first time. We construct communication graphs from network traffic and represent nodes with flow features. Due to the imbalance of existing public traffic flow datasets, it is impossible to train a GCN model on these datasets. Therefore, we use a balanced public communication graph dataset to pretrain a GCN model, thereby guaranteeing its capacity for identify topology features. We then feed the communication graph with flow features into the pretrained GCN. The output from the last hidden layer is treated as the fusion of flow and topology features. Additionally, by adjusting the number of layers in the GCN network, the model can effectively detect botnets under both C2 and P2P structures. Validated on the public ISCX2014 dataset, our approach achieves a remarkable recall rate 92.90% and F1-score 92.76% for C2 botnets, alongside recall rate 94.66% and F1-score of 92.35% for P2P botnets. These results not only demonstrate the effectiveness of our method, but also outperform the performance of the currently leading detection models

arXiv.org e-Print Archive

Recommended from our members

Next Generation Cloud Computing Architectures: Performance and Pricing

Author: Mahajan Kunal
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2021
Field of study

Cloud providers need to optimize the container deployments to efficiently utilize their network, compute and storage resources. In addition, they require an attractive pricing strategy for the compute services like containers, virtual machines, and serverless computing in order to attract users, maximize their profits and achieve a desired utilization of their resources. This thesis aims to tackle the twofold challenge of achieving high performance in container deployments and identifying the pricing for compute services. For performance, the thesis presents a transport-adaptive network architecture (D-TAIL) improving tail latencies. Existing transport protocols such as Homa, pFabric [1, 2] utilize Shortest Remaining Processing Time (SRPT) scheduling policy which is known to have starvation issues for long flows as SRPT prioritizes short flows. D-TAIL addresses this limitation by taking age of the flow in consideration while deciding the priority. D-TAIL shows a maximum reduction of 72%, 29.66% and 28.39% in 99th-percentile FCT for transport protocols like DCTCP, pFabric and Homa respectively. In addition, the thesis also presents a container deployment design utilizing peer-to-peer network and virtual file system with content-addressable storage to address the problem of cold starts in existing container deployment systems. The proposed deployment design increases compute availability, reduces storage requirement and prevents network bottlenecks. For pricing, the thesis studies the tradeoffs between serverless computing (SC) and traditional cloud computing (virtual machine, VM) using realistic cost models, queueing theoretic performance models, and a game theoretic formulation. For customers, we identify their workload distribution between SC and VM to minimize their cost while maintaining a particular performance constraint. For cloud provider, we identify the SC and VM prices to maximize its profit. The main result is the identification and characterization of three optimal operational regimes for both customers and the provider, that leverage either SC or VM only, or both, in a hybrid configuration

Columbia University Academic Commons

Utilizing Public Transport Networks for Bulk Data Transfer

Author: Ahmad Ayesha
Publication venue: Helsingfors universitet
Publication date: 01/01/2019
Field of study

Public transport networks are a subset of vehicular networks with some important distinctions; the actors in the network include buses and bus-stops, they are predictable and they provide reliable physical coverage of an area. The Public Transport Network of a city can also be interpreted as an opportunistic network where nodes are bus-stops and communication between these nodes occurs when a bus travels between two bus-stops. How will a data communication network perform when built upon the opportunistic network formed by the public transport system of a city? In this thesis we explore this question basing our analysis on Helsinki Region’s public bus transport system as a real example. We explore the performance of a public transport network when used for communication of data using both simulation of the network and graph analysis. The key performance factors studied are the data delivery ratio and data delivery time. Additional issues considered are the kind of applications such a system is suited for, the important characteristics governing the reliability and efficacy of such a data communications system, and the design guidelines for building such an application. The results demonstrate that data transfer applications can be built over a city’s Public Transport Network

Helsingin yliopiston digitaalinen arkisto

Delivering Consistent Network Performance in Multi-tenant Data Centers

Author: Haitjema Mart Albert
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2013
Field of study

Data centers are growing rapidly in size and have recently begun acquiring a new role as cloud hosting platforms, allowing outside developers to deploy their own applications on large scales. As a result, today\u27s data centers are multi-tenant environments that host an increasingly diverse set of applications, many of which have very demanding networking requirements. This has prompted research into new data center architectures that offer increased capacity by using topologies that introduce multiple paths between servers. To achieve consistent network performance in these networks, traffic must be effectively load balanced among the available paths. In addition, some form of system-wide traffic regulation is necessary to provide performance guarantees to tenants. To address these issues, this thesis introduces several software-based mechanisms that were inspired by techniques used to regulate traffic in the interconnects of scalable Internet routers. In particular, we borrow two key concepts that serve as the basis for our approach. First, we investigate packet-level routing techniques that are similar to those used to balance load effectively in routers. This work is novel in the data center context because most existing approaches route traffic at the level of flows to prevent their packets from arriving out-of-order. We show that routing at the packet-level allows for far more efficient use of the network\u27s resources and we provide a novel resequencing scheme to deal with out-of-order arrivals. Secondly, we introduce distributed scheduling as a means to engineer traffic in data centers. In routers, distributed scheduling controls the rates between ports on different line cards enabling traffic to move efficiently through the interconnect. We apply the same basic idea to schedule rates between servers in the data center. We show that scheduling can prevent congestion from occurring and can be used as a flexible mechanism to support network performance guarantees for tenants. In contrast to previous work, which relied on centralized controllers to schedule traffic, our approach is fully distributed and we provide a novel distributed algorithm to control rates. In addition, we introduce an optimization problem called backlog scheduling to study scheduling strategies that facilitate more efficient application execution

CiteSeerX

Washington University St. Louis: Open Scholarship

Characterization and Identification of Cloudified Mobile Network Performance Bottlenecks

Author: Elmokashfi Ahmed
Foukas Xenofon
Marina Mahesh K.
Patounas Georgios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/07/2020
Field of study

This study is a first attempt to experimentally explore the range of performance bottlenecks that 5G mobile networks can experience. To this end, we leverage a wide range of measurements obtained with a prototype testbed that captures the key aspects of a cloudified mobile network. We investigate the relevance of the metrics and a number of approaches to accurately and efficiently identify bottlenecks across the different locations of the network and layers of the system architecture. Our findings validate the complexity of this task in the multi-layered architecture and highlight the need for novel monitoring approaches that intelligently fuse metrics across network layers and functions. In particular, we find that distributed analytics performs reasonably well both in terms of bottleneck identification accuracy and incurred computational and communication overhead.Comment: 17 pages, 16 figures, documentclass[journal,comsoc]{IEEEtran}, corrected titl

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Securing the software-defined networking control plane by using control and data dependency techniques

Author: Ujcich Benjamin E
Publication venue
Publication date: 01/08/2020
Field of study

Software-defined networking (SDN) fundamentally changes how network and security practitioners design, implement, and manage their networks. SDN decouples the decision-making about traffic forwarding (i.e., the control plane) from the traffic being forwarded (i.e., the data plane). SDN also allows for network applications, or apps, to programmatically control network forwarding behavior and policy through a logically centralized control plane orchestrated by a set of SDN controllers. As a result of logical centralization, SDN controllers act as network operating systems in the coordination of shared data plane resources and comprehensive security policy implementation. SDN can support network security through the provision of security services and the assurances of policy enforcement. However, SDN’s programmability means that a network’s security considerations are different from those of traditional networks. For instance, an adversary who manipulates the programmable control plane can leverage significant control over the data plane’s behavior. In this dissertation, we demonstrate that the security posture of SDN can be enhanced using control and data dependency techniques that track information flow and enable understanding of application composability, control and data plane decoupling, and control plane insight. We support that statement through investigation of the various ways in which an attacker can use control flow and data flow dependencies to influence the SDN control plane under different threat models. We systematically explore and evaluate the SDN security posture through a combination of runtime, pre-runtime, and post-runtime contributions in both attack development and defense designs. We begin with the development a conceptual accountability framework for SDN. We analyze the extent to which various entities within SDN are accountable to each other, what they are accountable for, mechanisms for assurance about accountability, standards by which accountability is judged, and the consequences of breaching accountability. We discover significant research gaps in SDN’s accountability that impact SDN’s security posture. In particular, the results of applying the accountability framework showed that more control plane attribution is necessary at different layers of abstraction, and that insight motivated the remaining work in this dissertation. Next, we explore the influence of apps in the SDN control plane’s secure operation. We find that existing access control protections that limit what apps can do, such as role-based access controls, prove to be insufficient for preventing malicious apps from damaging control plane operations. The reason is SDN’s reliance on shared network state. We analyze SDN’s shared state model to discover that benign apps can be tricked into acting as “confused deputies”; malicious apps can poison the state used by benign apps, and that leads the benign apps to make decisions that negatively affect the network. That violates an implicit (but unenforced) integrity policy that governs the network’s security. Because of the strong interdependencies among apps that result from SDN’s shared state model, we show that apps can be easily co-opted as “gadgets,” and that allows an attacker who minimally controls one app to make changes to the network state beyond his or her originally granted permissions. We use a data provenance approach to track the lineage of the network state objects by assigning attribution to the set of processes and agents responsible for each control plane object. We design the ProvSDN tool to track API requests from apps as they access the shared network state’s objects, and to check requests against a predefined integrity policy to ensure that low-integrity apps cannot poison high-integrity apps. ProvSDN acts as both a reference monitor and an information flow control enforcement mechanism. Motivated by the strong inter-app dependencies, we investigate whether implicit data plane dependencies affect the control plane’s secure operation too. We find that data plane hosts typically have an outsized effect on the generation of the network state in reactive-based control plane designs. We also find that SDN’s event-based design, and the apps that subscribe to events, can induce dependencies that originate in the data plane and that eventually change forwarding behaviors. That combination gives attackers that are residing on data plane hosts significant opportunities to influence control plane decisions without having to compromise the SDN controller or apps. We design the EventScope tool to automatically identify where such vulnerabilities occur. EventScope clusters apps’ event usage to decide in which cases unhandled events should be handled, statically analyzes controller and app code to understand how events affect control plane execution, and identifies valid control flow paths in which a data plane attacker can reach vulnerable code to cause unintended data plane changes. We use EventScope to discover 14 new vulnerabilities, and we develop exploits that show how such vulnerabilities could allow an attacker to bypass an intended network (i.e., data plane) access control policy. This research direction is critical for SDN security evaluation because such vulnerabilities could be induced by host-based malware campaigns. Finally, although there are classes of vulnerabilities that can be removed prior to deployment, it is inevitable that other classes of attacks will occur that cannot be accounted for ahead of time. In those cases, a network or security practitioner would need to have the right amount of after-the-fact insight to diagnose the root causes of such attacks without being inundated with too much informa- tion. Challenges remain in 1) the modeling of apps and objects, which can lead to overestimation or underestimation of causal dependencies; and 2) the omission of a data plane model that causally links control and data plane activities. We design the PicoSDN tool to mitigate causal dependency modeling challenges, to account for a data plane model through the use of the data plane topology to link activities in the provenance graph, and to account for network semantics to appropriately query and summarize the control plane’s history. We show how prior work can hinder investigations and analysis in SDN-based attacks and demonstrate how PicoSDN can track SDN control plane attacks.Ope

Illinois Digital Environment for Access to Learning and Scholarship Repository

Automatic Parallelization of Software Network Functions

Author: Pedrosa Luis
Pereira Francisco
Ramos Fernando M. V.
Publication venue
Publication date: 13/10/2023
Field of study

Software network functions (NFs) trade-off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale-up linearly until bottlenecked by PCIe when using small packets or by 100Gbps line-rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.Comment: 21 pages, 14 figures, to be published in NSDI2

arXiv.org e-Print Archive

Recommended from our members

Understanding the characteristics of Internet traffic and designing an efficient RaptorQ-based data transport protocol for modern data centres

Author: Alasmar Mohammed
Publication venue
Publication date: 18/12/2019
Field of study

This thesis is the amalgamation of research on efficient data transport protocols for data centres and a comprehensive and systematic study of Internet traffic, which came as a result of the need to understand traffic patterns and workloads in modern computer networks. The first part of the thesis is on the development of efficient data transport pro- tocols for data centres. We study modern data transport protocols for data centres through large scale simulations using the OMNeT++ simulator. We developed and experimented with an OMNeT++ model of NDP. This has led to the identification of limitations of the state of the art and the formulation of research questions with respect to data transport protocols for modern data centres. The developed model includes an implementation of a Fat-tree topology and per-packet ECMP load bal- ancing. We discuss how we integrated the model with the INET Framework and validated it by running various experiments that test different model parameters and components. This work revealed limitations of NDP with respect to efficient one-to-many and many-to-one communication in data centres, which led to the de- velopment of SCDP, a novel and general-purpose data transport protocol for data centres that, in contrast to all other protocols proposed to date, natively supports one-to-many and many-to-one data communication, which is extremely common in modern data centres. SCDP does so without compromising on efficiency for short and long unicast flows. SCDP achieves this by integrating RaptorQ codes with receiver-driven data transport, in-network packet trimming and Multi-Level Feed- back Queuing (MLFQ); (1) RaptorQ codes enable efficient one-to-many and many- to-one data transport; (2) on top of RaptorQ codes, receiver- driven flow control, in combination with in-network packet trimming, enable efficient usage of network re- sources as well as multi-path transport and packet spraying for all transport modes. Incast and Outcast are eliminated; (3) the systematic nature of RaptorQ codes, in combination with MLFQ, enable fast, decoding-free completion of short flows. We extensively evaluated SCDP in a wide range of simulated scenarios with realistic data centre workloads. For one-to-many and many-to-one transport sessions, SCDP performs significantly better than NDP. For short and long unicast flows, SCDP performs equally well or better compared to NDP. In the second part of the thesis, we extensively study Internet traffic. Getting good statistical models of traffic on network links is a well-known, often-studied problem. A lot of attention has been given to correlation patterns and flow duration. The distribution of the amount of traffic per unit time is an equally important but less studied problem. We study a large number of traffic traces from many different networks including academic, commercial and residential networks using state-of-the-art statistical techniques. We show that the log-normal distribution is a better fit than the Gaussian distribution. We also investigate a second, heavy- tailed distribution and show that its performance is better than Gaussian but worse than log-normal. We examine anomalous traces which are a poor fit for all tested distributions and show that this is often due to traffic outages or links that hit maximum capacity. Stationarity tests showed that the traffic is stationary at some range of aggregation times. We demonstrate the utility of the log-normal distribution in two contexts: predicting the proportion of time traffic will exceed a given level (for link capacity estimation) and predicting 95th percentile pricing. We also show the log-normal distribution is a better predictor than Gaussian orWeibull distributions

Sussex Research Online