412 research outputs found

    マルチレベル並列化とアプリケーション指向データレイアウトを用いるハードウェアアクセラレータの設計と実装

    Get PDF
    学位の種別: 課程博士審査委員会委員 : (主査)東京大学教授 稲葉 雅幸, 東京大学教授 須田 礼仁, 東京大学教授 五十嵐 健夫, 東京大学教授 山西 健司, 東京大学准教授 稲葉 真理, 東京大学講師 中山 英樹University of Tokyo(東京大学

    Data Movement Challenges and Solutions with Software Defined Networking

    Get PDF
    With the recent rise in cloud computing, applications are routinely accessing and interacting with data on remote resources. Interaction with such remote resources for the operation of media-rich applications in mobile environments is also on the rise. As a result, the performance of the underlying network infrastructure can have a significant impact on the quality of service experienced by the user. Despite receiving significant attention from both academia and industry, computer networks still face a number of challenges. Users oftentimes report and complain about poor experiences with their devices and applications, which can oftentimes be attributed to network performance when downloading or uploading application data. This dissertation investigates problems that arise with data movement across computer networks and proposes novel solutions to address these issues through software defined networking (SDN). SDN is lauded to be the paradigm of choice for next generation networks. While academia explores use cases in various contexts, industry has focused on data center and wide area networks. There is a significant range of complex and application-specific network services that can potentially benefit from SDN, but introduction and adoption of such solutions remains slow in production networks. One impeding factor is the lack of a simple yet expressive enough framework applicable to all SDN services across production network domains. Without a uniform framework, SDN developers create disjoint solutions, resulting in untenable management and maintenance overhead. The SDN-based solutions developed in this dissertation make use of a common agent-based approach. The architecture facilitates application-oriented SDN design with an abstraction composed of software agents on top of the underlying network. There are three key components modern and future networks require to deliver exceptional data transfer performance to the end user: (1) user and application mobility, (2) high throughput data transfer, and (3) efficient and scalable content distribution. Meeting these key components will not only ensure the network can provide robust and reliable end-to-end connectivity, but also that network resources will be used efficiently. First, mobility support is critical for user applications to maintain connectivity to remote, cloud-based resources. Today\u27s network users are frequently accessing such resources while on the go, transitioning from network to network with the expectation that their applications will continue to operate seamlessly. As users perform handovers between heterogeneous networks or between networks across administrative domains, the application becomes responsible for maintaining or establishing new connections to remote resources. Although application developers often account for such handovers, the result is oftentimes visible to the user through diminished quality of service (e.g. rebuffering in video streaming applications). Many intra-domain handover solutions exist for handovers in WiFi and cellular networks, such as mobile IP, but they are architecturally complex and have not been integrated to form a scalable, inter-domain solution. A scalable framework is proposed that leverages SDN features to implement both horizontal and vertical handovers for heterogeneous wireless networks within and across administrative domains. User devices can select an appropriate network using an on-board virtual SDN implementation that manages available network interfaces. An SDN-based counterpart operates in the network core and edge to handle user migrations as they transition from one edge attachment point to another. The framework was developed and deployed as an extension to the Global Environment for Network Innovations (GENI) testbed; however, the framework can be deployed on any OpenFlow enabled network. Evaluation revealed users can maintain existing application connections without breaking the sockets and requiring the application to recover. Second, high throughput data transfer is essential for user applications to acquire large remote data sets. As data sizes become increasingly large, often combined with their locations being far from the applications, the well known impact of lower Transmission Control Protocol (TCP) throughput over large delay-bandwidth product paths becomes more significant to these applications. While myriads of solutions exist to alleviate the problem, they require specialized software and/or network stacks at both the application host and the remote data server, making it hard to scale up to a large range of applications and execution environments. This results in high throughput data transfer that is available to only a select subset of network users who have access to such specialized software. An SDN based solution called Steroid OpenFlow Service (SOS) has been proposed as a network service that transparently increases the throughput of TCP-based data transfers across large networks. SOS shifts the complexity of high performance data transfer from the end user to the network; users do not need to configure anything on the client and server machines participating in the data transfer. The SOS architecture supports seamless high performance data transfer at scale for multiple users and for high bandwidth connections. Emphasis is placed on the use of SOS as a part of a larger, richer data transfer ecosystem, complementing and compounding the efforts of existing data transfer solutions. Non-TCP-based solutions, such as Aspera, can operate seamlessly alongside an SOS deployment, while those based on TCP, such as wget, curl, and GridFTP, can leverage SOS for throughput improvement beyond what a single TCP connection can provide. Through extensive evaluation in real-world environments, the SOS architecture is proven to be flexibly deployable on a variety of network architectures, from cloud-based, to production networks, to scaled up, high performance data center environments. Evaluation showed that the SOS architecture scales linearly through the addition of SOS “agents†to the SOS deployment, providing data transfer performance improvement to multiple users simultaneously. An individual data transfer enhanced by SOS was shown to have increased throughput nearly forty times the same data transfer without SOS assistance. Third, efficient and scalable video content distribution is imperative as the demand for multimedia content over the Internet increases. Current state of the art solutions consist of vast content distribution networks (CDNs) where content is oftentimes hosted in duplicate at various geographically distributed locations. Although CDNs are useful for the dissemination of static content, they do not provide a clear and scalable model for the on demand production and distribution of live, streaming content. IP multicast is a popular solution to scalable video content distribution; however, it is seldom used due to deployment and operational complexity. Inspired from the distributed design of todays CDNs and the distribution trees used by IP multicast, a SDN based framework called GENI Cinema (GC) is proposed to allow for the distribution of live video content at scale. GC allows for the efficient management and distribution of live video content at scale without the added architectural complexity and inefficiencies inherent to contemporary solutions such as IP multicast. GC has been deployed as an experimental, nation-wide live video distribution service using the GENI network, broadcasting live and prerecorded video streams from conferences for remote attendees, from the classroom for distance education, and for live sporting events. GC clients can easily and efficiently switch back and forth between video streams with improved switching latency latency over cable, satellite, and other live video providers. The real world dep loyments and evaluation of the proposed solutions show how SDN can be used as a novel way to solve current data transfer problems across computer networks. In addition, this dissertation is expected to provide guidance for designing, deploying, and debugging SDN-based applications across a variety of network topologies

    PiCo: A Domain-Specific Language for Data Analytics Pipelines

    Get PDF
    In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models—for which only informal (and often confusing) semantics is generally provided—all share a common under- lying model, namely, the Dataflow model. Using this model as a starting point, it is possible to categorize and analyze almost all aspects about Big Data analytics tools from a high level perspective. This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics. By putting clear separations between all levels of abstraction (i.e., from the runtime to the user API), it is easier for a programmer or software designer to avoid mixing low level with high level aspects, as we are often used to see in state-of-the-art Big Data analytics frameworks. From the user-level perspective, we think that a clearer and simple semantics is preferable, together with a strong separation of concerns. For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, that is on top of a stack of layers that build a prototypical framework for Big Data analytics. The contribution of this thesis is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm, Google Dataflow), thus making it easier to understand high-level data-processing applications written in such frameworks. As result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level. Second, we propose a programming environment based on such layered model in the form of a Domain-Specific Language (DSL) for processing data collections, called PiCo (Pipeline Composition). The main entity of this programming model is the Pipeline, basically a DAG-composition of processing elements. This model is intended to give the user an unique interface for both stream and batch processing, hiding completely data management and focusing only on operations, which are represented by Pipeline stages. Our DSL will be built on top of the FastFlow library, exploiting both shared and distributed parallelism, and implemented in C++11/14 with the aim of porting C++ into the Big Data world

    Joustavaa ja läpinäkyvää puskurointia radiotähtitieteen mittauksissa: VLBI-streamer ja Flexbuff

    Get PDF
    Radio astronomical data recording poses three challenges for its users: High volume, high bandwidth and large geographical distances. The solutions have been custom hardware and physical shipping of data sets. Performance of commercial off-the-shelf hardware for streaming data recording has been steadily improving and is starting to surpass the challenges. If the use of custom hardware can be dropped, the solution can benefit from successive hardware generations without added development efforts. This thesis will present VLBI-streamer software for simultaneous read and write data streaming and Flexbuff as the hardware housing this software. Volatile and non-volatile memory are used as pools of resources in a user space confined besteffort relaxed architecture. Since hardware will keep changing and it takes time, developers and money to port software to different hardware, VLBI-streamer was not bound to any specific hardware. VLBI-streamer will run on virtually any Linux-system with a few package requirements and was made modular to support different stream types (UDP/TCP/RDMA) and disk I/O backends. In this thesis Flexbuff combined with the software VLBI-streamer developed by the author is shown to be capable of 10 Gb/s and beyond recording and streaming, meeting current radio astronomical data recording needs.Radiotähtitieteellisten kokeiden nauhoittamisessa on kolme haastetta: Suuri datamäärä, korkea kaistanleveys ja suuret maantieteelliset etäisyydet. Ratkaisuna ovat ennen olleet räätälöidyt komponentit ja tallenteiden fyysinen lähetys. Helposti saatavilla olevien tietotekniikan yleiskomponenttien suorituskyky on noussut tasaisesti ja tallennuskyky alkaa lähestyä asetettuja haasteita. Jos räätälöidyistä komponenteista voidaan luopua, yleiskomponenteille tehdyt ohjelmistoratkaisut voisivat hyötyä jatkuvasti kehittyvistä komponenttisukupolvista ilman ylimääräistä ohjelmistokehityspanostusta. Tämä diplomityö esittelee VLBI-streamer ohjelman yhtaikaiseen datan tallennukseen ja lähetykseen, sekä Flexbuff komponenttimäärittelyn, jolla ohjelma ajetaan. Lyhyt- ja pitkäkestoisia muisteja käytetään varattavissa olevina resursseina käyttäjätilaan rajatussa, parhaan yrityksen arkkitehtuurissa. Koska fyysiset komponentit tulevat aina vaihtumaan, VLBI-streamer toteutettiin ilman siteitä mihinkään tiettyyn rautakomponenttiin, jolloin säästytään ylimääräiseltä ohjelmistokehitykseltä komponenttien vaihtuessa. VLBI-streamer on ajettavissa melkein millä tahansa Linux-järjestelmällä ja tarvitsee vain muutaman ulkoisen vapaasti saatavilla olevan ohjelmistokomponentin. VLBI-streamer on myös modulaarinen ja tukee erilaisia verkkoprotokollia (TCP/UDP/RDMA) ja kovalevytallennustekniikoita. Tässä diplomityössä Flexbuff ja kehittämäni VLBI-streamer osoitetaan pystyvän nauhoittamaan ja lähettämään yli 10Gb/s tietovirtoja, täten täyttäen nykyisten radiotähtitieteellisten kokeiden tarpeet

    Study on the Performance of TCP over 10Gbps High Speed Networks

    Get PDF
    Internet traffic is expected to grow phenomenally over the next five to ten years. To cope with such large traffic volumes, high-speed networks are expected to scale to capacities of terabits-per-second and beyond. Increasing the role of optics for packet forwarding and transmission inside the high-speed networks seems to be the most promising way to accomplish this capacity scaling. Unfortunately, unlike electronic memory, it remains a formidable challenge to build even a few dozen packets of integrated all-optical buffers. On the other hand, many high-speed networks depend on the TCP/IP protocol for reliability which is typically implemented in software and is sensitive to buffer size. For example, TCP requires a buffer size of bandwidth delay product in switches/routers to maintain nearly 100\% link utilization. Otherwise, the performance will be much downgraded. But such large buffer will challenge hardware design and power consumption, and will generate queuing delay and jitter which again cause problems. Therefore, improve TCP performance over tiny buffered high-speed networks is a top priority. This dissertation studies the TCP performance in 10Gbps high-speed networks. First, a 10Gbps reconfigurable optical networking testbed is developed as a research environment. Second, a 10Gbps traffic sniffing tool is developed for measuring and analyzing TCP performance. New expressions for evaluating TCP loss synchronization are presented by carefully examining the congestion events of TCP. Based on observation, two basic reasons that cause performance problems are studied. We find that minimize TCP loss synchronization and reduce flow burstiness impact are critical keys to improve TCP performance in tiny buffered networks. Finally, we present a new TCP protocol called Multi-Channel TCP and a new congestion control algorithm called Desynchronized Multi-Channel TCP (DMCTCP). Our algorithm implementation takes advantage of a potential parallelism from the Multi-Path TCP in Linux. Over an emulated 10Gbps network ruled by routers with only a few dozen packets of buffers, our experimental results confirm that bottleneck link utilization can be much better improved by DMCTCP than by many other TCP variants. Our study is a new step towards the deployment of optical packet switching/routing networks

    High Performance Network Evaluation and Testing

    Get PDF

    Service-oriented models for audiovisual content storage

    No full text
    What are the important topics to understand if involved with storage services to hold digital audiovisual content? This report takes a look at how content is created and moves into and out of storage; the storage service value networks and architectures found now and expected in the future; what sort of data transfer is expected to and from an audiovisual archive; what transfer protocols to use; and a summary of security and interface issues

    Management, Optimization and Evolution of the LHCb Online Network

    Get PDF
    The LHCb experiment is one of the four large particle detectors running at the Large Hadron Collider (LHC) at CERN. It is a forward single-arm spectrometer dedicated to test the Standard Model through precision measurements of Charge-Parity (CP) violation and rare decays in the b quark sector. The LHCb experiment will operate at a luminosity of 2x10^32cm-2s-1, the proton-proton bunch crossings rate will be approximately 10 MHz. To select the interesting events, a two-level trigger scheme is applied: the rst level trigger (L0) and the high level trigger (HLT). The L0 trigger is implemented in custom hardware, while HLT is implemented in software runs on the CPUs of the Event Filter Farm (EFF). The L0 trigger rate is dened at about 1 MHz, and the event size for each event is about 35 kByte. It is a serious challenge to handle the resulting data rate (35 GByte/s). The Online system is a key part of the LHCb experiment, providing all the IT services. It consists of three major components: the Data Acquisition (DAQ) system, the Timing and Fast Control (TFC) system and the Experiment Control System (ECS). To provide the services, two large dedicated networks based on Gigabit Ethernet are deployed: one for DAQ and another one for ECS, which are referred to Online network in general. A large network needs sophisticated monitoring for its successful operation. Commercial network management systems are quite expensive and dicult to integrate into the LHCb ECS. A custom network monitoring system has been implemented based on a Supervisory Control And Data Acquisition (SCADA) system called PVSS which is used by LHCb ECS. It is a homogeneous part of the LHCb ECS. In this thesis, it is demonstrated how a large scale network can be monitored and managed using tools originally made for industrial supervisory control. The thesis is organized as the follows: Chapter 1 gives a brief introduction to LHC and the B physics on LHC, then describes all sub-detectors and the trigger and DAQ system of LHCb from structure to performance. Chapter 2 first introduces the LHCb Online system and the dataflow, then focuses on the Online network design and its optimization. In Chapter 3, the SCADA system PVSS is introduced briefly, then the architecture and implementation of the network monitoring system are described in detail, including the front-end processes, the data communication and the supervisory layer. Chapter 4 first discusses the packet sampling theory and one of the packet sampling mechanisms: sFlow, then demonstrates the applications of sFlow for the network trouble-shooting, the traffic monitoring and the anomaly detection. In Chapter 5, the upgrade of LHC and LHCb is introduced, the possible architecture of DAQ is discussed, and two candidate internetworking technologies (high speed Ethernet and InfniBand) are compared in different aspects for DAQ. Three schemes based on 10 Gigabit Ethernet are presented and studied. Chapter 6 is a general summary of the thesis
    corecore