
    Multi-Gbps HTTP traffic analysis in commodity hardware based on local knowledge of TCP streams

    In this paper we propose and implement novel techniques for performance evaluation of web traffic (response time, response code, etc.) without reassembling the underlying TCP connections, a step that otherwise severely restricts traffic analysis throughput. Our HTTP traffic analysis software runs on commodity hardware, which makes it very cost-effective. In addition, we present sub-TCP-connection load balancing techniques that significantly increase throughput at the expense of losing very few HTTP transactions. These techniques yield performance evaluation statistics that are indistinguishable from those of the single-threaded alternative with full TCP connection reassembly. © 2017 Elsevier B.V.
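
    The abstract does not detail the exact dispatching scheme; as an illustration of the general idea of balancing traffic below the TCP-connection level, the sketch below hashes each packet to a worker using the flow 4-tuple plus a coarse TCP sequence-number bucket, so that segments of one connection can be spread over several analysis threads. The bucket size and hash choice are assumptions for illustration, not the authors' design.

```python
# Illustrative sketch only: per-packet worker selection that can split a single
# TCP connection across analysis workers. Bucketing by TCP sequence number is a
# hypothetical choice, not necessarily the scheme used in the paper.
import zlib

NUM_WORKERS = 8
SEQ_BUCKET = 1 << 20  # group roughly 1 MB of each stream per worker (assumption)

def pick_worker(src_ip: str, dst_ip: str, sport: int, dport: int, tcp_seq: int) -> int:
    """Return the index of the worker thread that should analyse this packet."""
    key = f"{src_ip}:{sport}-{dst_ip}:{dport}-{tcp_seq // SEQ_BUCKET}".encode()
    return zlib.crc32(key) % NUM_WORKERS

# Consecutive segments of the same connection may land on different workers
# once the sequence number crosses a bucket boundary.
print(pick_worker("10.0.0.1", "10.0.0.2", 40000, 80, 1_000_000))
print(pick_worker("10.0.0.1", "10.0.0.2", 40000, 80, 3_000_000))
```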

    Diluting the Scalability Boundaries: Exploring the Use of Disaggregated Architectures for High-Level Network Data Analysis

    Traditional data centers are designed with a rigid architecture of fit-for-purpose servers that provision resources beyond the average workload in order to deal with occasional peaks of data. Heterogeneous data centers are pushing towards more cost-efficient architectures with better resource provisioning. In this paper we study the feasibility of using disaggregated architectures for intensive data applications, in contrast to the monolithic approach of server-oriented architectures. In particular, we have tested a proactive network analysis system in which the workload demands are highly variable. In the context of the dReDBox disaggregated architecture, the results show that the overhead caused by using remote memory resources is significant, between 66% and 80%, but we have also observed that memory usage is one order of magnitude higher in the stress case than under average workloads. Therefore, dimensioning memory for the worst case in conventional systems results in a notable waste of resources. Finally, we found that, for the selected use case, parallelism is limited by memory; using a disaggregated architecture therefore allows for increased parallelism, which in turn mitigates the overhead caused by remote memory. Pre-print (8 pages, 6 figures, 2 tables, 32 references); presented at the IEEE International Conference on High Performance Computing and Communications, Bangkok, Thailand, 18-20 December 2017, to be published in the conference proceedings.
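
    The abstract's provisioning argument can be made concrete with a back-of-the-envelope calculation that uses only the figures quoted above (a 66-80% remote-memory access overhead and a stress-case footprint one order of magnitude above the average); the absolute memory sizes below are hypothetical placeholders, not measurements from the paper.

```python
# Rough illustration of the provisioning trade-off, not the paper's evaluation.
avg_mem_gb = 32            # hypothetical per-node average working set
stress_factor = 10         # "one order of magnitude" (from the abstract)
remote_overhead = 0.80     # worst-case remote-memory slowdown (abstract: 66-80%)

# Monolithic server: every node must be sized for the stress case.
monolithic_per_node = avg_mem_gb * stress_factor

# Disaggregated rack: nodes keep the average locally and borrow the rest
# from a shared pool only while a peak lasts.
local_per_node = avg_mem_gb
borrowed_at_peak = monolithic_per_node - local_per_node

print(f"Monolithic provisioning per node : {monolithic_per_node} GB (mostly idle)")
print(f"Disaggregated local per node     : {local_per_node} GB")
print(f"Borrowed from the pool at peak   : {borrowed_at_peak} GB, "
      f"paying up to {remote_overhead:.0%} extra access cost only during the peak")
```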

    Network Traffic Processing with PFQ

    This paper presents PFQ (Packet Family Queue), a high-performance framework for packet processing designed to flexibly handle the parallelism of network applications and to make traffic processing safe and easy. PFQ is an open-source module for the Linux kernel that combines software-accelerated packet I/O with in-kernel early-stage packet processing and fine-grained distribution to network applications and physical devices. PFQ does not require any modification to network device drivers and exposes programming interfaces both to multi-threaded applications natively designed to run on top of it and to legacy monitoring tools using the pcap library. The results show that the flexibility and backward compatibility provided by PFQ do not impact its processing performance, which in fact reaches line-rate figures both in pure speed tests and in practical monitoring use cases on 10+ Gb/s links.
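
    PFQ's native programming interface is not described in the abstract; as a sketch of the legacy path it preserves, the snippet below is an ordinary pcap-style capture loop, the kind of monitoring code the abstract says can keep running on top of PFQ unmodified. It uses scapy's sniff purely as a stand-in for a pcap-based tool, and the interface name is an assumption.

```python
# Minimal pcap-style capture loop, standing in for a legacy monitoring tool.
# Requires scapy (pip install scapy) and root privileges; the interface name
# "eth0" is an assumption. PFQ's own native API is not used or shown here.
from collections import Counter
from scapy.all import sniff, IP

proto_counts = Counter()

def handle(pkt):
    """Count packets per IP protocol number, a typical lightweight monitor."""
    if IP in pkt:
        proto_counts[pkt[IP].proto] += 1

# Capture 100 packets, then print a tiny per-protocol summary.
sniff(iface="eth0", prn=handle, store=False, count=100)
print(dict(proto_counts))
```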

    Accurate and Resource-Efficient Monitoring for Future Networks

    Monitoring functionality is a key component of any network management system. It is essential for profiling network resource usage, detecting attacks, and capturing the performance of a multitude of services using the network. Traditional monitoring solutions operate on long timescales, producing periodic reports that are mostly used for manual and infrequent network management tasks. However, these practices have recently been questioned by the advent of Software Defined Networking (SDN). By empowering management applications with the right tools to perform automatic, frequent, and fine-grained network reconfigurations, SDN has made these applications more dependent than before on the accuracy and timeliness of monitoring reports. As a result, monitoring systems are required to collect considerable amounts of heterogeneous measurement data, process them in real time, and expose the resulting knowledge on short timescales to network decision-making processes. Satisfying these requirements is extremely challenging given today's larger network scales, massive and dynamic traffic volumes, and the stringent constraints on time availability and hardware resources. This PhD thesis tackles this important challenge by investigating how an accurate and resource-efficient monitoring function can be realised in the context of future, software-defined networks. Novel monitoring methodologies, designs, and frameworks are provided in this thesis, which scale with increasing network sizes and automatically adjust to changes in the operating conditions. These achieve the goals of efficient measurement collection and reporting, lightweight measurement-data processing, and timely delivery of monitoring knowledge.

    The Identification of Major Factors in the Deployment of a Science DMZ at Small Institutions

    The Science DMZ is a network research tool offering superior transmission of large science data sets between two locations. Through a network design that places the Science DMZ at the edge of the campus network, it defines a network path that avoids packet-inspecting devices (firewalls, packet shapers) and produces near line-rate transmission results for large data sets between institutions. Small institutions of higher education (public and private small colleges) seeking to participate in data exchange with other institutions are inhibited in the construction of Science DMZs by the high costs of deployment. While the National Science Foundation made 18 awards in the Campus Cyberinfrastructure program to investigate the designs, methods, costs, and results of deploying Science DMZs at small institutions, there is no cohort-level view of the success factors and options that must be considered in developing the most impactful solution for any given small institution. This research examined the decisions and results of the 18 NSF Science DMZ projects, recording a series of major factors in the small-institution deployments and establishing the Science DMZ Capital Framework (SCF), a model to be considered prior to starting a small-institution Science DMZ project.

    Enhancing HPC on Virtual Systems in Clouds through Optimizing Virtual Overlay Networks

    Virtual Ethernet overlays provide a powerful model for realizing virtual distributed and parallel computing systems with strong isolation, portability, and recoverability properties. However, in extremely high-throughput, low-latency networks, such overlays can suffer from bandwidth and latency limitations, which is of particular concern in HPC environments. Through a careful and quantitative analysis, I identify three core issues limiting performance: delayed and excessive virtual interrupt delivery into guests, copies between host and guest data buffers during encapsulation, and the semantic gap between virtual Ethernet features and underlying physical network features. I propose three novel optimizations in response: optimistic timer-free virtual interrupt injection, zero-copy cut-through data forwarding, and virtual TCP offload. These optimizations improve the latency and bandwidth of the overlay network on 10 Gbps Ethernet and InfiniBand interconnects, resulting in near-native performance for a wide range of microbenchmarks and MPI application benchmarks.
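
    The abstract does not name a concrete encapsulation format; as a generic illustration of the "copies during encapsulation" cost it identifies, the sketch below wraps a guest Ethernet frame in a VXLAN-style UDP packet in user space, where building the outer packet inevitably copies the frame into a new host-side buffer. The VNI, endpoint address, and frame contents are hypothetical, and this is not the thesis's own data path.

```python
# Illustrative Ethernet-over-UDP (VXLAN-style) encapsulation in user space.
# Each send copies the guest frame into a freshly built outer packet, which is
# the kind of per-packet cost that zero-copy cut-through forwarding avoids.
import socket
import struct

VXLAN_PORT = 4789          # standard VXLAN UDP port
VNI = 42                   # hypothetical virtual network identifier

def encapsulate(inner_frame: bytes, vni: int = VNI) -> bytes:
    """Prepend an 8-byte VXLAN header: flags=0x08, 24-bit VNI, reserved byte."""
    header = struct.pack("!B3xI", 0x08, vni << 8)
    return header + inner_frame   # the copy happens here

def send_overlay(inner_frame: bytes, remote_host: str) -> None:
    """Ship the encapsulated frame to the remote overlay endpoint over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(encapsulate(inner_frame), (remote_host, VXLAN_PORT))

# Example with a dummy 60-byte frame and a placeholder endpoint address.
send_overlay(b"\x00" * 60, "192.0.2.1")
```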

    Disk-to-Disk Data Transfer using A Software Defined Networking Solution

    There have been efforts towards improving network performance using software-defined networking solutions. One such work is Steroid OpenFlow Service (SOS), which utilizes multiple parallel TCP connections to enhance network performance transparently to the user. SOS has shown significant improvements in memory-to-memory data transfer throughput; however, its performance for disk-to-disk data transfer has not been studied. For computing applications involving big data, the data files are stored on non-volatile storage devices separate from the computing servers. Before computing can occur, large volumes of data must be fetched from the "remote" storage devices to the computing server's local storage device. Since hard drives are the most commonly adopted storage devices today, the process is often called "disk-to-disk" data transfer. Production high-performance computing facilities provide specialized high-throughput data transfer software for users to copy the data first to a data transfer node before copying it to the computing server. Disk-to-disk data transfer throughput depends on the network throughput between servers and on the disk access performance between each server and its storage device. Due to large data sizes, the storage devices are typically parallel file systems spanning multiple disks. Disk operations in a disk-to-disk transfer include reads and writes: the read operation pulls the data from the disks into memory, the data is then sent to the network through the network interface, and data reaching the destination server is finally stored to disk. The transfer therefore faces multiple delays and can be limited at each step. To date, one commonly adopted data transfer solution is GridFTP, developed by Argonne National Laboratory; it requires custom application installations and configurations on the hosts. SOS, on the other hand, is a transparent network application that needs no special user software. In this thesis, disk-to-disk data transfer performance is studied with both GridFTP and SOS. The thesis focuses on two topics: a detailed analysis of the transfer components of each tool, and a systematic experiment comparing the two. The experimentation and analysis of the results show that configuring the data nodes and the network with the correct parameters yields maximum performance for disk-to-disk data transfer. GridFTP, for example, gets close to 7 Gbps using four parallel connections with a TCP buffer size of 16 MB; it achieves this maximum performance by filling the network pipe, a 10 Gbps end-to-end link with a round-trip time (RTT) of 53 ms.
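
    The 7 Gbps figure is consistent with a quick bandwidth-delay-product check using only the numbers quoted in the abstract (10 Gbps link, 53 ms RTT, four connections with 16 MB TCP buffers); the sketch below is a sanity-check calculation, not part of the thesis.

```python
# Bandwidth-delay product check for the GridFTP configuration in the abstract.
link_rate_bps = 10e9          # 10 Gbps end-to-end link
rtt_s = 0.053                 # 53 ms round-trip time
connections = 4
buffer_bytes = 16e6           # 16 MB TCP buffer per connection (decimal MB assumed)

bdp_bytes = link_rate_bps * rtt_s / 8      # bytes that must be in flight to fill the pipe
in_flight = connections * buffer_bytes     # bytes the four connections can keep in flight
window_limited_bps = in_flight * 8 / rtt_s

print(f"Path BDP             : {bdp_bytes / 1e6:.1f} MB")      # ~66 MB
print(f"Aggregate TCP window : {in_flight / 1e6:.1f} MB")      # 64 MB
print(f"Window-limited rate  : {min(link_rate_bps, window_limited_bps) / 1e9:.1f} Gbps")
```

    The aggregate window of four 16 MB buffers roughly matches the ~66 MB bandwidth-delay product of the path, which is why this configuration can approach link speed; the gap down to the measured ~7 Gbps is plausibly disk and protocol overhead.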

    Parallel and distributed processing in high speed traffic monitoring

    This thesis presents a parallel and distributed approach to processing network traffic at high speeds. The proposed architecture provides the processing power required to run one or more traffic processing applications at line rate, processing full packets at multi-gigabit speeds in a parallel and distributed processing environment. Moreover, the architecture is flexible and scalable to future needs, supporting heterogeneous processing nodes such as different hardware architectures or different generations of the same hardware architecture. In addition to these processing, flexibility, and scalability features, our architecture provides an easy-to-use environment with the help of a new programming language, called FPL, for traffic processing in a distributed environment. The language and its compiler hide the programming details specific to heterogeneous systems and distributed environments.

    High-speed analysis of SMB2 file sharing traffic without TCP stream reconstruction

    Paper presented at the 5th IEEE International Symposium on Measurements and Networking (M&N) 2019, Italy, 2019. This paper presents a file sharing traffic analysis methodology for Server Message Block (SMB), a common protocol in corporate environments. The design focuses on improving the traffic analysis rate that can be obtained per CPU core in the analysis machine. SMB is most commonly transported over the Transmission Control Protocol (TCP) and therefore its analysis normally requires TCP stream reconstruction. We evaluate a traffic analysis design that does not require stream reconstruction and compare the results with a reference full-reconstruction analysis, both in measurement accuracy and in maximum rate per CPU core. We achieve a 30% increase in traffic processing rate at the expense of a small loss of accuracy in computing the probability distribution function of protocol response times. This work was supported by the Spanish MINECO through project PIT (TEC2015-69417-C2-2-R).
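
    The abstract does not detail how response times are obtained without stream reconstruction; one plausible reading, sketched below, is to parse the SMB2 header directly from any TCP segment whose payload starts a new SMB2 message and to pair requests with responses by MessageId, which works whenever the header is not split across segments. The field offsets follow the public MS-SMB2 header layout, but the matching logic is an assumption, not necessarily the authors' method.

```python
# Per-flow SMB2 request/response matching on raw TCP payloads, with no stream
# reassembly: a segment is only inspected if its payload starts with the SMB2
# Direct-TCP transport length field followed by the \xfeSMB protocol id.
import struct
from collections import defaultdict

SMB2_MAGIC = b"\xfeSMB"
FLAG_RESPONSE = 0x00000001           # SMB2_FLAGS_SERVER_TO_REDIR

# flow must be a direction-independent connection key (e.g. a sorted endpoint
# pair) so that a request and its response map to the same entry.
pending = defaultdict(dict)          # flow -> {message_id: request timestamp}
response_times = []

def process_segment(flow: tuple, payload: bytes, ts: float) -> None:
    """Inspect one TCP segment; record a response time when a pair completes."""
    if len(payload) < 4 + 64 or payload[4:8] != SMB2_MAGIC:
        return                       # not the start of an SMB2 message; skip it
    hdr = payload[4:]                # skip the 4-byte transport length field
    flags = struct.unpack_from("<I", hdr, 16)[0]
    message_id = struct.unpack_from("<Q", hdr, 24)[0]
    if flags & FLAG_RESPONSE:
        start = pending[flow].pop(message_id, None)
        if start is not None:
            response_times.append(ts - start)
    else:
        pending[flow][message_id] = ts

# Usage: feed (flow key, TCP payload, packet timestamp) for every captured
# segment, then build the response-time distribution from response_times.
```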