
    Mind the Gap: A Comparison of Software Packet Generators

    Network research relies on packet generators to assess the performance and correctness of new ideas. Software-based generators in particular are widely used by academic researchers because of their flexibility, affordability, and open-source nature. The rise of new frameworks for fast I/O on commodity hardware is making them even more attractive. Longstanding throughput differences between software and hardware generation are no longer as big a concern as they were a few years ago. This paper investigates the properties of several high-performance software packet generators and the implications for their precision when a given traffic pattern needs to be generated. We believe that the evaluation strategy presented in this paper helps in understanding the actual limitations of high-performance software packet generation, thus helping the research community to build better tools.

    ANCHOR: logically-centralized security for Software-Defined Networks

    While the centralization of SDN brought advantages such as a faster pace of innovation, it also disrupted some of the natural defenses of traditional architectures against different threats. The literature on SDN has mostly been concerned with the functional side, despite some specific works concerning non-functional properties like 'security' or 'dependability'. Though addressing the latter in an ad hoc, piecemeal way may work, it will most likely lead to efficiency and effectiveness problems. We claim that the enforcement of non-functional properties as a pillar of SDN robustness calls for a systemic approach. As a general concept, we propose ANCHOR, a subsystem architecture that promotes the logical centralization of non-functional properties. To show the effectiveness of the concept, we focus on 'security' in this paper: we identify the current security gaps in SDNs and we populate the architecture middleware with the appropriate security mechanisms, in a global and consistent manner. Essential security mechanisms provided by ANCHOR include reliable entropy and resilient pseudo-random generators, and protocols for secure registration and association of SDN devices. We claim and justify in the paper that centralizing such mechanisms is key for their effectiveness, by allowing us to: define and enforce global policies for those properties; reduce the complexity of controllers and forwarding devices; ensure higher levels of robustness for critical services; foster interoperability of the non-functional property enforcement mechanisms; and promote the security and resilience of the architecture itself. We discuss design and implementation aspects, and we prove and evaluate our algorithms and mechanisms, including the formalisation of the main protocols and the verification of their core security properties using the Tamarin prover. Comment: 42 pages, 4 figures, 3 tables, 5 algorithms, 139 references

    Scalable parallel communications

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low-cost solution for providing multiple 100 Mbps on current machines.
    In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (this also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, while other TCPs running in parallel provide high-bandwidth service to a single application); and (3) coarse-grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism), also with near-linear speed-ups.

    A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials

    We introduce PVSC-DTM (Parallel Vectorized Stencil Code for Dirac and Topological Materials), a library and code generator based on a domain-specific language tailored to implement the specific stencil-like algorithms that can describe Dirac and topological materials such as graphene and topological insulators in a matrix-free way. The generated hybrid-parallel (MPI+OpenMP) code is fully vectorized using Single Instruction Multiple Data (SIMD) extensions. It is significantly faster than matrix-based approaches on the node level and performs in accordance with the roofline model. We demonstrate the chip-level performance and distributed-memory scalability of basic building blocks such as sparse matrix-(multiple-) vector multiplication on modern multicore CPUs. As an application example, we use the PVSC-DTM scheme to (i) explore the scattering of a Dirac wave on an array of gate-defined quantum dots, to (ii) calculate a bunch of interior eigenvalues for strong topological insulators, and to (iii) discuss the photoemission spectra of a disordered Weyl semimetal. Comment: 16 pages, 2 tables, 11 figures

    Packet Transactions: High-level Programming for Line-Rate Switches

    Many algorithms for congestion control, scheduling, network measurement, active queue management, security, and load balancing require custom processing of packets as they traverse the data plane of a network switch. To run at line rate, these data-plane algorithms must be in hardware. With today's switch hardware, algorithms cannot be changed, nor new algorithms installed, after a switch has been built. This paper shows how to program data-plane algorithms in a high-level language and compile those programs into low-level microcode that can run on emerging programmable line-rate switching chipsets. The key challenge is that these algorithms create and modify algorithmic state. The key idea to achieve line-rate programmability for stateful algorithms is the notion of a packet transaction: a sequential code block that is atomic and isolated from other such code blocks. We have developed this idea in Domino, a C-like imperative language to express data-plane algorithms. We show with many examples that Domino provides a convenient and natural way to express sophisticated data-plane algorithms, and show that these algorithms can be run at line rate with modest estimated die-area overhead. Comment: 16 pages
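
The packet-transaction idea described above can be illustrated with a minimal Python sketch. This is not Domino syntax and is not taken from the paper: the `Packet` and `SampleEveryNth` names, and the "mark every Nth packet" algorithm itself, are hypothetical choices made only for illustration. The point is that the body of `transaction` reads and updates persistent state and runs to completion for one packet before the next packet is handled, which is what "atomic and isolated" means here.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # Hypothetical packet with one header field and one metadata flag.
    flow_id: int
    sample: bool = False


class SampleEveryNth:
    """Hypothetical stateful data-plane algorithm: mark every Nth packet."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.count = 0  # algorithmic state that persists across packets

    def transaction(self, pkt: Packet) -> Packet:
        # Sequential code block executed once per packet. In this
        # single-threaded sketch it trivially runs to completion before
        # the next packet is processed, so no other packet can observe
        # a half-updated counter.
        self.count += 1
        if self.count == self.n:
            pkt.sample = True
            self.count = 0
        return pkt


algo = SampleEveryNth(3)
marks = [algo.transaction(Packet(flow_id=i)).sample for i in range(6)]
print(marks)  # [False, False, True, False, False, True]
```

In software the sequential semantics come for free; the abstract's claim is that a compiler can preserve the same semantics while running at line rate in switch hardware.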

    Performance measurement methodology for integrated services networks

    With the emergence of advanced integrated services networks, the need for effective performance analysis techniques has become extremely important. Further advancements in these networks can only be possible if the practical performance issues of the existing networks are clearly understood. This thesis is concerned with the design and development of a measurement system which has been implemented on a large experimental network. The measurement system is based on dedicated traffic generators which have been designed and implemented on the Project Unison network. The Unison project is a multisite networking experiment for conducting research into the interconnection and interworking of local area network based multi-media application systems. The traffic generators were first developed for the Cambridge Ring based Unison network. Once their usefulness and effectiveness were proven, high-performance traffic generators using transputer technology were built for the Cambridge Fast Ring based Unison network. The measurement system is capable of measuring the conventional performance parameters such as throughput and packet delay, and is able to characterise the operational performance of network bridging components under various loading conditions. In particular, the measurement system has been used in a 'measure and tune' fashion in order to improve the performance of a complex bridging device. Accurate measurement of packet delay in wide area networks is a recognised problem. The problem is associated with the synchronisation of the clocks between the distant machines. A chronological timestamping technique has been introduced in which the clocks are synchronised using a broadcast synchronisation technique. Rugby time clock receivers have been interfaced to each generator for the purpose of synchronisation. In order to design network applications, an accurate knowledge of the expected network performance under different loading conditions is essential.
    Using the measurement system, this has been achieved by examining the network characteristics at the network/user interface. Also, the generators are capable of emulating a variety of application traffic which can be injected into the network along with the traffic from real applications, thus enabling user-oriented performance parameters to be evaluated in a mixed traffic environment. A number of performance measurement experiments have been conducted using the measurement system. Experimental results obtained from the Unison network serve to emphasise the power and effectiveness of the measurement methodology.

    FloWatcher-DPDK: lightweight line-rate flow-level monitoring in software

    In the last few years, several software-based solutions have proven very efficient for high-speed packet processing, traffic generation and monitoring, and can be considered valid alternatives to expensive and inflexible hardware-based solutions. In our work, we first benchmark heterogeneous design choices for software-based packet monitoring systems in terms of achievable performance and required resources (i.e., the number of CPU cores). Building on this extensive analysis, we design FloWatcher-DPDK, a DPDK-based high-speed software traffic monitor that we provide to the community as an open-source project. In a nutshell, FloWatcher-DPDK provides tunable fine-grained statistics at packet and flow levels. Experimental results demonstrate that FloWatcher-DPDK sustains per-flow statistics with 5-nines precision at high speed (e.g., 14.88 Mpps) using a limited amount of resources. Finally, we showcase the usage of FloWatcher-DPDK by configuring it to analyze the performance of two open-source prototypes for stateful flow-level end-host and in-network packet processing.
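
The 14.88 Mpps figure quoted above matches the commonly cited maximum packet rate of a 10 Gbit/s Ethernet link carrying minimum-size frames. The abstract does not state the link speed or frame size, so both are assumptions in the sketch below, as is the helper name `line_rate_pps`. Each 64-byte frame is assumed to occupy an extra 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap).

```python
def line_rate_pps(link_bps: float, frame_bytes: int, overhead_bytes: int = 20) -> float:
    """Maximum packets per second on an Ethernet link.

    overhead_bytes covers preamble (7), start-of-frame delimiter (1)
    and inter-frame gap (12), which occupy the wire but are not part
    of the frame itself.
    """
    wire_bits = (frame_bytes + overhead_bytes) * 8
    return link_bps / wire_bits


# Assumed scenario: 10 Gbit/s link, 64-byte minimum-size frames.
rate = line_rate_pps(10e9, 64)
print(f"{rate / 1e6:.2f} Mpps")  # 14.88 Mpps
```

At this rate a new packet arrives roughly every 67 ns, which gives a sense of the per-packet time budget a software monitor has to work within.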