    CloudJet4BigData: Streamlining Big Data via an Accelerated Socket Interface

    Big data applications need to feed users with fresh processing results, and cloud platforms can be used to speed them up. This paper describes a new data communication protocol (CloudJet) for long-distance, large-volume big data access operations, designed to alleviate the large latencies encountered when sharing big data resources in the cloud. It encapsulates a dynamic multi-stream/multi-path engine at the socket level, which conforms to the Portable Operating System Interface (POSIX) and can thereby accelerate any POSIX-compatible application across IP-based networks. In real-world tests, CloudJet accelerated typical big data applications such as very large database (VLDB), data mining, media streaming and office applications by up to tenfold.
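To make the multi-stream idea concrete, here is a minimal sketch of striping a payload across several streams and reassembling it by sequence number. It is not CloudJet's engine: the function names `stripe` and `reassemble`, the round-robin assignment, and the fixed chunk size are all assumptions made for illustration, and the "streams" are in-memory lists rather than real sockets.

```python
from typing import Dict, List, Tuple

# A chunk is (sequence number, payload); the sequence number lets the
# receiver restore order no matter which stream delivered the chunk.
Chunk = Tuple[int, bytes]


def stripe(data: bytes, num_streams: int, chunk_size: int) -> Dict[int, List[Chunk]]:
    """Split data into chunks and assign them round-robin to streams (hypothetical helper)."""
    if num_streams < 1 or chunk_size < 1:
        raise ValueError("num_streams and chunk_size must be positive")
    streams: Dict[int, List[Chunk]] = {i: [] for i in range(num_streams)}
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        streams[seq % num_streams].append((seq, data[offset:offset + chunk_size]))
    return streams


def reassemble(streams: Dict[int, List[Chunk]]) -> bytes:
    """Merge chunks from all streams and restore the original byte order."""
    chunks = [chunk for stream in streams.values() for chunk in stream]
    chunks.sort(key=lambda chunk: chunk[0])
    return b"".join(payload for _, payload in chunks)
```

In a real deployment each stream would map to its own connection, possibly over a different path, so that one slow path does not stall the whole transfer.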

    An occam Style Communications System for UNIX Networks

    This document describes the design of a communications system which provides occam-style communications primitives in a Unix environment, using TCP/IP protocols and any number of other protocols deemed suitable as underlying transport layers. The system will integrate with a low-overhead scheduler/kernel without incurring significant costs to the execution of processes within the run-time environment. A survey of relevant occam and occam3 features and related research is followed by a look at the Unix and TCP/IP facilities which determine our working constraints, and a description of the T9000 transputer's Virtual Channel Processor, which was instrumental in our formulation. Drawing on this information, a design for the communications system is then proposed. Finally, we make a preliminary investigation of methods for lightweight access control to shared resources in an environment which does not provide support for critical sections, semaphores, or busy waiting. This is presented with relevance to the mutual exclusion problems which arise within the proposed design. Future directions for the evolution of this project are discussed in conclusion.
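For readers unfamiliar with occam, its channels are point-to-point and synchronous: a send completes only once the matching receive has taken the value. The sketch below imitates that rendezvous with Python threads. It is an illustration only, not the design proposed in the document; the `Channel` class is hypothetical, it assumes exactly one sender and one receiver per channel, and it relies on the very locking primitives the abstract says are unavailable in the target environment.

```python
import queue
import threading


class Channel:
    """Hypothetical one-to-one synchronous channel in the style of occam."""

    def __init__(self) -> None:
        # Capacity 1: at most one value is ever in flight.
        self._slot: "queue.Queue[object]" = queue.Queue(maxsize=1)

    def send(self, value: object) -> None:
        """Hand over a value and block until the receiver has taken it."""
        self._slot.put(value)
        self._slot.join()  # returns once recv() has called task_done()

    def recv(self) -> object:
        """Block until a value arrives, then release the waiting sender."""
        value = self._slot.get()
        self._slot.task_done()
        return value


def producer(channel: Channel, count: int) -> None:
    """Send the integers 0..count-1, one rendezvous at a time."""
    for i in range(count):
        channel.send(i)
```

A receiver running in another thread simply calls `recv()` once per expected value; neither side proceeds past a communication until both have reached it.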

    Performance Optimization and Dynamics Control for Large-scale Data Transfer in Wide-area Networks

    Transport control plays an important role in the performance of large-scale scientific and media streaming applications involving transfer of large data sets, media streaming, online computational steering, interactive visualization, and remote instrument control. In general, these applications have two distinctive classes of transport requirements: large-scale scientific applications require high bandwidths to move bulk data across wide-area networks, while media streaming applications require stable bandwidths to ensure smooth media playback. Unfortunately, the widely deployed Transmission Control Protocol is inadequate for such tasks due to its performance limitations. The purpose of this dissertation is to conduct a rigorous analytical study of the design and performance of transport solutions, and to systematically develop an integrated transport solution that overcomes the limitations of current transport methods. One of the primary challenges is to explore and compose a set of feasible route options with multiple constraints. Another challenge arises from the randomness inherent in wide-area networks, particularly the Internet. This randomness must be explicitly accounted for to achieve both goodput maximization and stabilization over the constructed routes by suitably adjusting the source rate in response to both network and host dynamics. The superior and robust performance of the proposed transport solution is extensively evaluated in a simulated environment and further verified through real-life implementations and deployments over both Internet and dedicated connections under disparate network conditions, in comparison with existing transport methods.

    An integrated transport solution to big data movement in high-performance networks

    Extreme-scale e-Science applications in various domains such as earth science and high energy physics among multiple national institutions within the U.S. are generating colossal amounts of data, now frequently termed “big data”. The big data must be stored, managed and moved to different geographical locations for distributed data processing and analysis. Such big data transfers require stable and high-speed network connections, which are not readily available in traditional shared IP networks such as the Internet. High-performance networking technologies and services featuring high bandwidth and advance reservation are being rapidly developed and deployed across the nation and around the globe to support such scientific applications. However, these networking technologies and services have not been fully utilized, mainly because: i) the use of these technologies and services often requires considerable domain knowledge and many application users are not even aware of their existence; and ii) the end-to-end data transfer performance largely depends on the transport protocol being used on the end hosts. The high-speed network path with reserved bandwidth in high-performance networks has shifted the data transfer bottleneck from network segments in traditional IP networks to end hosts, which most existing transport protocols are not well suited to handle. In this dissertation, an integrated transport solution is proposed in support of data- and network-intensive applications in various science domains. This solution integrates three major components, i.e., i) transport-support workflow optimization, ii) transport profile generation, and iii) transport protocol design, into a unified framework. Firstly, a class of transport-support workflow optimization problems is formulated, where an appropriate set of resources and services is selected to compose the best transport-support workflow to meet a user’s data transfer request in terms of various performance requirements. Secondly, a transport profiler named Transport Profile Generator (TPG) and its extended and accelerated version named FastProf are designed and implemented to characterize and enhance the end-to-end data transfer performance of a selected transport method over an established network path. Finally, several approaches based on rate and error threshold control are proposed to design a suite of data transfer protocols specifically tailored for big data transfer over dedicated connections. The proposed integrated transport solution is implemented and evaluated in: i) a local testbed with a single 10 Gb/s back-to-back connection and dual 10 Gb/s NIC-to-NIC connections; and ii) several wide-area networks with 10 Gb/s long-haul connections at collaborative sites including Oak Ridge National Laboratory, Argonne National Laboratory, and University of Chicago.
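The abstract does not spell out its rate and error threshold control schemes, so the following is only a generic sketch of the idea: raise the sending rate while the observed loss stays under a threshold, and back off when it does not. The function name `next_rate`, the additive-increase and multiplicative-decrease rule, and every default value (including the 10,000 Mb/s cap chosen to mirror a 10 Gb/s link) are assumptions, not the dissertation's protocol.

```python
def next_rate(
    rate_mbps: float,
    loss_rate: float,
    loss_threshold: float = 0.01,    # assumed tolerable loss fraction
    step_mbps: float = 100.0,        # assumed additive increase per interval
    backoff: float = 0.5,            # assumed multiplicative decrease factor
    max_rate_mbps: float = 10_000.0, # assumed link capacity (10 Gb/s)
    min_rate_mbps: float = 1.0,      # assumed floor so the sender never stalls
) -> float:
    """Return the sending rate for the next interval (hypothetical controller)."""
    if loss_rate > loss_threshold:
        # Loss above the threshold: the end host or path is overloaded.
        candidate = rate_mbps * backoff
    else:
        # Loss is acceptable: probe for more bandwidth.
        candidate = rate_mbps + step_mbps
    # Keep the result within the configured bounds.
    return max(min_rate_mbps, min(max_rate_mbps, candidate))
```

On a dedicated connection the loss signal mostly reflects receiver-side overload rather than congestion from competing traffic, which is why a threshold on the error rate, rather than a reaction to every single loss, is a plausible control input.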

    A study of publish/subscribe systems for real-time grid monitoring

    Monitoring and controlling a large number of geographically distributed scientific instruments is a challenging task. Some operations on these instruments require a real-time (or quasi-real-time) response, which makes the task even more difficult. In this paper, we describe the requirements of distributed monitoring for a possible future electrical power grid based on real-time extensions to grid computing. We examine several standards and publish/subscribe middleware candidates, some of which were specially designed and developed for grid monitoring. We analyze their architecture and functionality, and discuss their advantages and disadvantages. We report on a series of tests to measure their real-time performance and scalability.
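As a reminder of what the publish/subscribe pattern looks like in its simplest form, here is a minimal in-process sketch. It does not model any of the middleware candidates the paper evaluates: the `Broker` class, the topic names, and the synchronous delivery are assumptions made for illustration, and a real system would add networking, queuing, and delivery guarantees.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

# A handler receives the topic name and the published message.
Handler = Callable[[str, Any], None]


class Broker:
    """Hypothetical in-process broker that decouples publishers from subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver a message to all current subscribers; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, message)
        return len(handlers)
```

A monitoring client would subscribe to a topic such as `"grid/voltage"` (a made-up name) and receive each reading as it is published, without the sensor needing to know who is listening.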