
    CloudJet4BigData: Streamlining Big Data via an Accelerated Socket Interface

    Big data applications must deliver fresh processing results to users, and cloud platforms can be used to speed them up. This paper describes a new data communication protocol (CloudJet) for long-distance, large-volume big data access operations, designed to alleviate the large latencies encountered when sharing big data resources in the cloud. It encapsulates a dynamic multi-stream/multi-path engine at the socket level, which conforms to the Portable Operating System Interface (POSIX) and can therefore accelerate any POSIX-compatible application across IP-based networks. In real-world tests, CloudJet accelerated typical big data applications such as very large databases (VLDB), data mining, media streaming, and office applications by up to tenfold.

    Distributed Training Large-Scale Deep Architectures

    The scale of data and the scale of computation infrastructure together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this paper, we focus on a systems approach to speeding up large-scale training. Drawing on lessons learned from our routine benchmarking effort, we first identify bottlenecks and overheads that hinder data parallelism. We then devise guidelines that help practitioners configure an effective system and fine-tune parameters to achieve the desired speedup. Specifically, we develop a procedure for setting the minibatch size and choosing computation algorithms. We also derive lemmas for determining the quantity of key components, such as the number of GPUs and parameter servers. Experiments and examples show that these guidelines effectively speed up large-scale deep learning training.

    Performance Analysis of Multiple Virtualized Servers

    Server virtualization is considered one of the most significant changes in IT operations in the past decade, making it possible to manage groups of servers with greater reliability at lower cost. It is driven by the goal of reducing the total number of physical servers in an organization by consolidating multiple applications on shared servers. In this paper we construct several x86_64 servers based on VMware vSphere and analyze their performance using the open source tools Pylot and Curl-loader. The results show that, despite the enormous potential benefits of virtualization, efficiency decreases as the number of virtual machines increases. A trade-off is therefore needed between the number of virtual machines and the expected efficiency of the servers.