
    Management, Optimization and Evolution of the LHCb Online Network

    The LHCb experiment is one of the four large particle detectors running at the Large Hadron Collider (LHC) at CERN. It is a forward single-arm spectrometer dedicated to testing the Standard Model through precision measurements of Charge-Parity (CP) violation and rare decays in the b quark sector. The LHCb experiment will operate at a luminosity of 2×10^32 cm^-2 s^-1, with a proton-proton bunch crossing rate of approximately 10 MHz. To select the interesting events, a two-level trigger scheme is applied: the first level trigger (L0) and the high level trigger (HLT). The L0 trigger is implemented in custom hardware, while the HLT is implemented in software running on the CPUs of the Event Filter Farm (EFF). The L0 trigger rate is defined at about 1 MHz, and the event size is about 35 kByte, so handling the resulting data rate (35 GByte/s) is a serious challenge. The Online system is a key part of the LHCb experiment, providing all the IT services. It consists of three major components: the Data Acquisition (DAQ) system, the Timing and Fast Control (TFC) system and the Experiment Control System (ECS). To provide these services, two large dedicated networks based on Gigabit Ethernet are deployed: one for the DAQ and another for the ECS, collectively referred to as the Online network. A large network needs sophisticated monitoring for its successful operation. Commercial network management systems are quite expensive and difficult to integrate into the LHCb ECS. A custom network monitoring system has therefore been implemented based on the Supervisory Control And Data Acquisition (SCADA) system PVSS, which is used by the LHCb ECS, making the monitoring system a homogeneous part of the LHCb ECS. In this thesis, it is demonstrated how a large-scale network can be monitored and managed using tools originally made for industrial supervisory control. The thesis is organized as follows: Chapter 1 gives a brief introduction to the LHC and B physics at the LHC, then describes all sub-detectors and the trigger and DAQ system of LHCb, from structure to performance. Chapter 2 first introduces the LHCb Online system and the dataflow, then focuses on the Online network design and its optimization. In Chapter 3, the SCADA system PVSS is introduced briefly, then the architecture and implementation of the network monitoring system are described in detail, including the front-end processes, the data communication and the supervisory layer. Chapter 4 first discusses packet sampling theory and one of the packet sampling mechanisms, sFlow, then demonstrates the applications of sFlow to network troubleshooting, traffic monitoring and anomaly detection. In Chapter 5, the upgrade of the LHC and LHCb is introduced, the possible architecture of the DAQ is discussed, and two candidate internetworking technologies (high speed Ethernet and InfiniBand) are compared in different aspects for the DAQ; three schemes based on 10 Gigabit Ethernet are presented and studied. Chapter 6 is a general summary of the thesis.
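
The two headline numbers above can be tied together with a short back-of-the-envelope sketch. The Python snippet below (illustrative only; the function names are not from the thesis) multiplies the quoted L0 accept rate by the quoted event size to reproduce the 35 GByte/s figure, and applies the standard 1-in-N sFlow scaling rule used to estimate total traffic from sampled packets, which is the basic idea behind the sFlow-based monitoring discussed in Chapter 4; the sample counts are made-up example values.

```python
def daq_throughput(trigger_rate_hz: float, event_size_bytes: float) -> float:
    """Aggregate DAQ throughput in bytes per second."""
    return trigger_rate_hz * event_size_bytes


def sflow_estimate(sampled_packets: int, sampled_bytes: int, sampling_rate: int):
    """Scale sFlow sample counters up by the 1-in-N sampling rate."""
    return sampled_packets * sampling_rate, sampled_bytes * sampling_rate


if __name__ == "__main__":
    # L0 accept rate ~1 MHz and event size ~35 kByte give ~35 GByte/s into the EFF.
    rate = daq_throughput(1.0e6, 35e3)
    print(f"Aggregate DAQ rate: {rate / 1e9:.1f} GByte/s")

    # Example: a port sampling 1 packet in 2048 reports 10,000 samples totalling
    # 12 MByte over an interval; scaling gives the estimated totals.
    pkts, byts = sflow_estimate(10_000, 12_000_000, 2048)
    print(f"Estimated traffic: {pkts:,} packets, {byts / 1e9:.2f} GByte")
```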

    Harnessing low-level tuning in modern architectures for high-performance network monitoring in physical and virtual platforms

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Date of defense: 02-07-201

    WDM/TDM PON bidirectional networks single-fiber/wavelength RSOA-based ONUs layer 1/2 optimization

    This thesis proposes the design and optimization of a hybrid WDM/TDM PON at the L1 (PHY) and L2 (MAC) layers, in terms of minimum deployment cost and enhanced performance for Greenfield NGPON. The particular case of RSOA-based ONUs and an ODN using a single fibre and a single wavelength is analysed in depth, and the relevant parameters of this WDM/TDM PON are optimized. Special attention is given to the main noise impairment in this type of network: the Rayleigh Backscattering effect, which cannot be prevented. To understand its behaviour and mitigate its effects, a novel mathematical model for Rayleigh Backscattering in burst-mode transmission is presented for the first time and used to optimize the RSOA-based WDM/TDM PON. In addition, a cost-effective, simple-design SCM WDM/TDM PON with RSOA-based ONUs was optimized and implemented. This prototype was successfully tested, showing high performance, robustness, versatility and reliability: the system is able to serve up to 1280 users at 2.5 Gb/s downstream / 1.25 Gb/s upstream over 20 km, while remaining compatible with the GPON ITU-T recommendation. This precedent has enabled the SARDANA network to extend the design, architecture and capabilities of a WDM/TDM PON to a long-reach metro-access network (100 km). A proposal for an agile Transmission Convergence sub-layer is presented as another relevant contribution of this work. It is based on the optimization of the GPON and XG-PON standards (for compatibility), but applied to a long-reach metro-access TDM/WDM PON RSOA-based network with a higher client count. Finally, a proposal for the physical implementation of the SARDANA layer 2 and possible configurations for SARDANA internetworking with the metro network and core transport network are presented.
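
As a rough illustration of the capacity sharing implied by the figures above, the sketch below computes average per-user rates for a hybrid WDM/TDM split. Only the 1280-user count and the 2.5 Gb/s / 1.25 Gb/s line rates come from the abstract; the 32-wavelength by 40-user split is an assumed example, not the SARDANA design.

```python
# Illustrative capacity-sharing arithmetic for a hybrid WDM/TDM PON.
LINE_RATE_DOWN_GBPS = 2.5   # downstream line rate per wavelength (from abstract)
LINE_RATE_UP_GBPS = 1.25    # upstream line rate per wavelength (from abstract)
WAVELENGTHS = 32            # assumed WDM channel count
USERS_PER_WAVELENGTH = 40   # assumed TDM split per wavelength

total_users = WAVELENGTHS * USERS_PER_WAVELENGTH   # 1280 users overall
avg_down_mbps = LINE_RATE_DOWN_GBPS / USERS_PER_WAVELENGTH * 1e3
avg_up_mbps = LINE_RATE_UP_GBPS / USERS_PER_WAVELENGTH * 1e3

print(f"Users served: {total_users}")
print(f"Average sustained rate per user: {avg_down_mbps:.1f} Mb/s down, "
      f"{avg_up_mbps:.1f} Mb/s up (burst rate is the full line rate)")
```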

    The ATLAS ROBIN – A High-Performance Data-Acquisition Module

    This work presents the re-configurable processor ROBIN, a key element of the data acquisition system of the ATLAS experiment, located at the new LHC at CERN. The ATLAS detector provides data over 1600 channels simultaneously towards the DAQ system. The ATLAS dataflow model follows the “PULL” strategy, in contrast to the commonly used “PUSH” strategy. The data volume transported is thereby reduced by a factor of 10; however, the data must be temporarily stored at the entry to the DAQ system. The input layer consists of approx. 160 ROS read-out units, each comprising 1 PC and 4 ROBIN modules. Each ROBIN device acquires detector data via 3 input channels and performs local buffering. Board control is done via a 64-bit PCI interface. Event selection and data transmission run via PCI in the baseline bus-based ROS. Alternatively, a local GE interface can take over part or all of the data traffic in the switch-based ROS, in order to reduce the load on the host PC. The performance of the ROBIN module stems from the close cooperation of a fast embedded processor with a complex FPGA. The efficient task distribution lets the processor handle all complex management functionality, programmed in “C”, while all movement of data is performed by the FPGA via multiple, concurrently operating DMA engines. The ROBIN project was carried out by an international team and comprises the design specification, the development of the ROBIN hardware, firmware (VHDL and C code), host code (C++), prototyping, volume production and installation of 700 boards. The project was led by the author of this thesis. The hardware platform is an evolution of an FPGA processor previously designed by the author. He contributed elementary concepts of the communication mechanisms and the “C”-coded embedded application software. He also organised and supervised the prototype and series productions, including the various design reports and presentations. The results show that the ROBIN module is able to meet its ambitious requirements of a 100 kHz incoming fragment rate per channel with a concurrent outgoing fragment rate of 21 kHz per channel. At the system level, each ROS unit (12 channels) operates at the same rates, however only for a subset of the channels. The ATLAS DAQ system – with 640 ROBIN modules installed – performed a successful data-taking phase at the start-up of the LHC in September
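
A quick consistency check of the figures quoted above (160 ROS units, 4 ROBIN modules per ROS, 3 input channels per ROBIN, 100 kHz incoming and 21 kHz outgoing fragments per channel) can be sketched as follows; the variable names are illustrative and not taken from the thesis.

```python
# Channel and rate arithmetic using only numbers quoted in the abstract.
ROS_UNITS = 160             # read-out system PCs
ROBINS_PER_ROS = 4          # ROBIN modules per ROS unit
CHANNELS_PER_ROBIN = 3      # input channels per ROBIN
IN_RATE_HZ = 100_000        # incoming fragment rate per channel
OUT_RATE_HZ = 21_000        # concurrent outgoing fragment rate per channel

channels = ROS_UNITS * ROBINS_PER_ROS * CHANNELS_PER_ROBIN
print(f"Input channels covered: {channels}")  # ~1920, consistent with 'over 1600 channels'

per_robin_in = CHANNELS_PER_ROBIN * IN_RATE_HZ
per_robin_out = CHANNELS_PER_ROBIN * OUT_RATE_HZ
print(f"Per ROBIN module: {per_robin_in / 1e3:.0f} kHz in, {per_robin_out / 1e3:.0f} kHz out")
```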

    Data Movement Challenges and Solutions with Software Defined Networking

    With the recent rise in cloud computing, applications routinely access and interact with data on remote resources. Interaction with such remote resources for the operation of media-rich applications in mobile environments is also on the rise. As a result, the performance of the underlying network infrastructure can have a significant impact on the quality of service experienced by the user. Despite receiving significant attention from both academia and industry, computer networks still face a number of challenges. Users often report poor experiences with their devices and applications, which can frequently be attributed to network performance when downloading or uploading application data. This dissertation investigates problems that arise with data movement across computer networks and proposes novel solutions to address these issues through software defined networking (SDN). SDN is lauded as the paradigm of choice for next generation networks. While academia explores use cases in various contexts, industry has focused on data center and wide area networks. There is a significant range of complex and application-specific network services that can potentially benefit from SDN, but the introduction and adoption of such solutions remains slow in production networks. One impeding factor is the lack of a simple yet sufficiently expressive framework applicable to all SDN services across production network domains. Without a uniform framework, SDN developers create disjoint solutions, resulting in untenable management and maintenance overhead. The SDN-based solutions developed in this dissertation make use of a common agent-based approach. The architecture facilitates application-oriented SDN design with an abstraction composed of software agents on top of the underlying network. There are three key components modern and future networks require to deliver exceptional data transfer performance to the end user: (1) user and application mobility, (2) high throughput data transfer, and (3) efficient and scalable content distribution. Meeting these key components will not only ensure the network can provide robust and reliable end-to-end connectivity, but also that network resources are used efficiently.

First, mobility support is critical for user applications to maintain connectivity to remote, cloud-based resources. Today's network users frequently access such resources while on the go, transitioning from network to network with the expectation that their applications will continue to operate seamlessly. As users perform handovers between heterogeneous networks or between networks across administrative domains, the application becomes responsible for maintaining or establishing new connections to remote resources. Although application developers often account for such handovers, the result is oftentimes visible to the user through diminished quality of service (e.g. rebuffering in video streaming applications). Many intra-domain handover solutions exist for handovers in WiFi and cellular networks, such as mobile IP, but they are architecturally complex and have not been integrated to form a scalable, inter-domain solution. A scalable framework is proposed that leverages SDN features to implement both horizontal and vertical handovers for heterogeneous wireless networks within and across administrative domains. User devices can select an appropriate network using an on-board virtual SDN implementation that manages the available network interfaces.
An SDN-based counterpart operates in the network core and edge to handle user migrations as they transition from one edge attachment point to another. The framework was developed and deployed as an extension to the Global Environment for Network Innovations (GENI) testbed; however, it can be deployed on any OpenFlow-enabled network. Evaluation revealed that users can maintain existing application connections without breaking the sockets and requiring the application to recover.

Second, high throughput data transfer is essential for user applications to acquire large remote data sets. As data sizes become increasingly large, often combined with their locations being far from the applications, the well-known impact of lower Transmission Control Protocol (TCP) throughput over large delay-bandwidth product paths becomes more significant to these applications (see the sketch after this abstract). While myriad solutions exist to alleviate the problem, they require specialized software and/or network stacks at both the application host and the remote data server, making it hard to scale up to a large range of applications and execution environments. This results in high throughput data transfer that is available to only a select subset of network users who have access to such specialized software. An SDN-based solution called Steroid OpenFlow Service (SOS) is proposed as a network service that transparently increases the throughput of TCP-based data transfers across large networks. SOS shifts the complexity of high performance data transfer from the end user to the network; users do not need to configure anything on the client and server machines participating in the data transfer. The SOS architecture supports seamless high performance data transfer at scale for multiple users and for high bandwidth connections. Emphasis is placed on the use of SOS as part of a larger, richer data transfer ecosystem, complementing and compounding the efforts of existing data transfer solutions. Non-TCP-based solutions, such as Aspera, can operate seamlessly alongside an SOS deployment, while those based on TCP, such as wget, curl, and GridFTP, can leverage SOS for throughput improvement beyond what a single TCP connection can provide. Through extensive evaluation in real-world environments, the SOS architecture is shown to be flexibly deployable on a variety of network architectures, from cloud-based, to production networks, to scaled-up, high performance data center environments. Evaluation showed that the SOS architecture scales linearly with the addition of SOS “agents” to the deployment, providing data transfer performance improvement to multiple users simultaneously. An individual data transfer enhanced by SOS was shown to reach nearly forty times the throughput of the same data transfer without SOS assistance.

Third, efficient and scalable video content distribution is imperative as the demand for multimedia content over the Internet increases. Current state-of-the-art solutions consist of vast content distribution networks (CDNs) where content is oftentimes hosted in duplicate at various geographically distributed locations. Although CDNs are useful for the dissemination of static content, they do not provide a clear and scalable model for the on-demand production and distribution of live, streaming content. IP multicast is a popular solution to scalable video content distribution; however, it is seldom used due to deployment and operational complexity.
Inspired by the distributed design of today's CDNs and the distribution trees used by IP multicast, an SDN-based framework called GENI Cinema (GC) is proposed to allow for the distribution of live video content at scale. GC allows for the efficient management and distribution of live video content at scale without the added architectural complexity and inefficiencies inherent to contemporary solutions such as IP multicast. GC has been deployed as an experimental, nation-wide live video distribution service using the GENI network, broadcasting live and prerecorded video streams from conferences for remote attendees, from the classroom for distance education, and for live sporting events. GC clients can easily and efficiently switch back and forth between video streams with improved switching latency over cable, satellite, and other live video providers. The real-world deployments and evaluation of the proposed solutions show how SDN can be used as a novel way to solve current data transfer problems across computer networks. In addition, this dissertation is expected to provide guidance for designing, deploying, and debugging SDN-based applications across a variety of network topologies.
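
The single-connection TCP limitation that motivates SOS can be illustrated with the standard bound of one window per round-trip time. The sketch below applies that bound and then models an SOS-like service as opening several parallel connections through its agents; the parallel-connection model, the window size, and the RTT values are assumptions for illustration, not a description of the actual SOS implementation.

```python
# TCP throughput over long fat networks: bounded by window_size / RTT.
def tcp_throughput_bps(window_bytes: float, rtt_s: float, link_bps: float) -> float:
    """Upper bound on one connection's sustained throughput over a path of given RTT."""
    return min(8 * window_bytes / rtt_s, link_bps)


if __name__ == "__main__":
    link = 10e9            # 10 Gb/s path (assumed)
    window = 64 * 1024     # modest 64 KiB window (assumed)
    for rtt_ms in (1, 10, 50, 100):
        single = tcp_throughput_bps(window, rtt_ms / 1e3, link)
        print(f"RTT {rtt_ms:3d} ms: single connection <= {single / 1e6:8.1f} Mb/s")

    # With N parallel connections (as an SOS-like agent might open), the aggregate
    # bound scales roughly linearly until the link itself saturates.
    n = 40
    aggregate = min(n * tcp_throughput_bps(window, 0.1, link), link)
    print(f"RTT 100 ms, {n} connections: aggregate <= {aggregate / 1e9:.2f} Gb/s")
```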

    Two Challenges of Software Networks: Forwarding by Name and Table Verification (Deux défis des Réseaux Logiciels : Relayage par le Nom et Vérification des Tables)

    The Internet has changed the lives of network users: not only does it affect users' habits, but it is also increasingly being shaped by network users' behavior. Several new services have been introduced during the past decades (e.g. file sharing, video streaming, cloud computing) to meet users' expectations. As a consequence, although the Internet infrastructure provides a good best-effort service for exchanging information in a point-to-point fashion, this is not the principal need of today's users. Current networks require some major architectural changes in order to follow the upcoming requirements, but the experience of the past decades shows that bringing new features to the existing infrastructure may be slow. In this thesis work, we identify two main aspects of the Internet evolution: a “behavioral” aspect, which refers to a change in the way users interact with the network, and a “structural” aspect, related to the evolution problem from an architectural point of view. The behavioral perspective states that there is a mismatch between the usage of the network and the actual functions it provides. While network devices implement the simple primitives of sending and receiving generic packets, users are really interested in different primitives, such as retrieving or consuming content. The structural perspective suggests that the problem of the slow evolution of the Internet infrastructure lies in its architectural design, which has been shown to be hardly upgradeable. On the one hand, to address the new network usage, the research community proposed the Named-Data Networking (NDN) paradigm, which brings content-based functionalities to network devices. On the other hand, Software-Defined Networking (SDN) can be adopted to simplify the architectural evolution and shorten the upgrade time thanks to its centralized software control plane, at the cost of higher network complexity that can easily introduce bugs. SDN verification is a novel research direction aiming to check the consistency and safety of network configurations by providing formal or empirical validation. The thesis consists of two parts. In the first part, we focus on the behavioral aspect by presenting the design and evaluation of “Caesar”, a content router that advances the state of the art by implementing content-based functionalities that can coexist with real network environments. In the second part, we target network misconfiguration diagnosis, and we present a framework for the analysis of the network topology and forwarding tables, which can be used to detect the presence of a loop in real time and in real network environments.

This thesis addresses problems related to two major aspects of the evolution of the Internet: the “behavioral” aspect, which corresponds to the new interactions between users and the network, and the “structural” aspect, related to the changes of the Internet from an architectural point of view. The manuscript is composed of an introductory chapter that presents the main research directions of this thesis work, followed by a chapter devoted to the state of the art on the two aspects mentioned above.
Among the solutions proposed by the scientific community to adapt to the evolution of the Internet, two new network paradigms are described in particular: Information-Centric Networking (ICN) and Software-Defined Networking (SDN). The thesis continues with the proposal of “Caesar”, a network device, inspired by ICN, capable of managing content distribution using forwarding primitives based on the name of the data rather than the addresses of the servers. Caesar is presented in two chapters, which describe the architecture and two of its main modules: forwarding and the management of request traceability. The rest of the manuscript describes a mathematical tool for the efficient detection of loops in an SDN network from a theoretical point of view. The improvements of the proposed algorithm with respect to the state of the art are discussed. The thesis concludes with a summary of the main results obtained and a presentation of ongoing and future work.
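
As a minimal illustration of the loop-detection problem addressed in the second part, the sketch below treats the forwarding state for one equivalence class of packets as a next-hop function per switch and follows it until a node repeats. This is a generic cycle check over a hypothetical topology, not the algorithm developed in the thesis.

```python
# Follow per-switch next-hop pointers for one class of packets; since each node
# has a single next hop, any walk that revisits a node has found a forwarding loop.
from typing import Dict, Optional, List


def find_loop(next_hop: Dict[str, Optional[str]], start: str) -> Optional[List[str]]:
    """Return the cycle reachable from `start`, or None if the packet is delivered."""
    seen, path, node = set(), [], start
    while node is not None and node not in seen:
        seen.add(node)
        path.append(node)
        node = next_hop.get(node)      # None means the packet leaves the network
    if node is None:
        return None                    # reached an egress: no loop on this path
    return path[path.index(node):]     # the nodes forming the loop


# Example forwarding state for one destination prefix (hypothetical switches s1..s4).
tables = {"s1": "s2", "s2": "s3", "s3": "s1", "s4": None}
print(find_loop(tables, "s1"))   # ['s1', 's2', 's3']
print(find_loop(tables, "s4"))   # None
```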