    Optimising the data-collection time of a large-scale data-acquisition system

    Data-acquisition systems are a fundamental component of modern scientific experiments. Large-scale experiments, particularly in the field of particle physics, comprise millions of sensors and produce petabytes of data per day. Their data-acquisition systems digitise, collect, filter, and store experimental signals for later analysis. The performance and reliability of these systems are critical to the operation of the experiment: insufficient performance and failures result in the loss of valuable scientific data. By its very nature, data acquisition is a synchronous many-to-one operation: every time a phenomenon is observed by the experiment, data from its various sensors must be assembled into a single coherent dataset. This characteristic yields a particularly challenging traffic pattern for computer networks dedicated to data acquisition. If no corrective measures are taken, this pattern, known as incast, results in a significant underutilisation of the network resources, with a direct impact on a data-acquisition system's throughput. This thesis presents effective and feasible approaches to maximising network utilisation in data-acquisition systems, avoiding the incast problem without sacrificing throughput. Rather than using abstract models, it focuses on an existing large-scale experiment, used as a case study: the ATLAS detector at the Large Hadron Collider. First, the impact of incast on data-acquisition performance is characterised through a series of measurements performed on the actual data-acquisition system of the ATLAS experiment. As the size of the data sent synchronously by multiple sources to the same destination grows past the size of the network buffers, the throughput falls. A simple but effective mitigation is proposed and tested: at the application layer, the data-collection receivers can limit the number of senders they simultaneously collect data from.
    This solution recovers a large part of the throughput lost to incast, but introduces some performance losses of its own. Further investigations are enabled by the development of a complete packet-level model of the ATLAS data-acquisition network in an event-based simulation framework. By comparing real-world measurements and simulation results, the model is shown to be accurate enough to be used for studying the incast phenomenon in a data-acquisition system. Leveraging the simulation model, various optimisations are analysed. The focus is kept on practical software changes that can realistically be deployed on otherwise unmodified existing systems. Receiver-side traffic shaping, incast- and traffic-shaping-aware work-scheduling policies, tuning of TCP's timeouts, and centralised scheduling of network packet injection are evaluated alone and in combination. Used together, the first three techniques yield a very significant increase in the system's throughput, which comes within 10% of the ideal maximum performance, even under a high network traffic load.
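
    The application-layer mitigation described above (a receiver capping the number of senders it collects from at once) can be sketched as follows. This is an illustrative model only, not ATLAS code; the class and method names are invented, and the real system exchanges request messages over the network rather than method calls.

    ```python
    # Sketch of receiver-side incast mitigation: the data-collection
    # receiver keeps at most `max_in_flight` outstanding requests, so
    # the aggregate synchronous burst stays within network buffers.
    # All names here are illustrative, not from the ATLAS code base.
    from collections import deque

    class IncastLimitedCollector:
        def __init__(self, sources, max_in_flight):
            self.pending = deque(sources)   # sources not yet asked for data
            self.in_flight = set()          # sources currently sending
            self.max_in_flight = max_in_flight
            self.collected = {}             # fragments received so far

        def issue_requests(self):
            # Top up outstanding requests without exceeding the cap.
            while self.pending and len(self.in_flight) < self.max_in_flight:
                src = self.pending.popleft()
                self.in_flight.add(src)     # in reality: send a request message

        def on_fragment(self, src, fragment):
            # A fragment arrived: record it and free a request slot,
            # allowing the next sender to be asked for its data.
            self.collected[src] = fragment
            self.in_flight.discard(src)

        def done(self):
            return not self.pending and not self.in_flight
    ```

    With, say, 10 sources and a cap of 3, at most 3 sources ever transmit simultaneously, trading a small amount of pipeline parallelism for freedom from buffer overflow.
    
    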

    A Lossless Switch for Data Acquisition Networks

    No full text
    The recent trends in software-defined networking (SDN) and network function virtualization (NFV) are boosting the advance of software-based packet processing and forwarding on commodity servers. Although performance has traditionally been the challenge of this approach, the situation is changing with modern server platforms. High-performance load balancers, proxies, virtual switches, and other network functions can now be implemented in software rather than being limited to specialized commercial hardware, thus reducing cost and increasing flexibility. In this paper we design a lossless software-based switch for high-bandwidth data acquisition (DAQ) networks, using the ATLAS experiment at CERN as a case study. We prove that it can effectively solve the incast pathology arising from the many-to-one communication pattern present in DAQ networks by providing extremely high buffering capabilities. We evaluate this on a commodity server equipped with twelve 10 Gbps Ethernet interfaces providing a total bandwidth of 120 Gbps.
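
    The buffering idea behind such a switch can be illustrated with a toy model: packets headed for a congested output port are queued in host DRAM (effectively gigabytes of buffer) instead of being dropped when a shallow hardware buffer would overflow. The sketch below is hypothetical and ignores the real data-plane machinery (NIC drivers, batching, multiple ports); it only contrasts buffer sizing.

    ```python
    # Toy model of per-output-port buffering in a software switch.
    # A hardware-sized buffer drops part of an incast burst, while a
    # host-memory-sized buffer absorbs it losslessly. Illustrative only.
    from collections import deque

    class OutputPort:
        def __init__(self, buffer_bytes):
            self.queue = deque()
            self.buffer_bytes = buffer_bytes
            self.used = 0
            self.dropped = 0

        def enqueue(self, packet: bytes) -> bool:
            # Tail-drop when the burst exceeds the buffer capacity.
            if self.used + len(packet) > self.buffer_bytes:
                self.dropped += 1
                return False
            self.queue.append(packet)
            self.used += len(packet)
            return True

        def dequeue(self) -> bytes:
            pkt = self.queue.popleft()
            self.used -= len(pkt)
            return pkt

    # An incast burst: 1000 full-size Ethernet frames arrive at once.
    hw = OutputPort(128 * 1024)          # shallow hardware-style buffer
    sw = OutputPort(1024 * 1024 * 1024)  # DRAM-backed software buffer
    for _ in range(1000):
        frame = b"x" * 1500
        hw.enqueue(frame)
        sw.enqueue(frame)
    ```

    The shallow buffer drops most of the burst, forcing TCP retransmission timeouts at the senders; the DRAM-backed buffer queues every frame, which is the losslessness the paper exploits.
    
    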
