
    CloudJet4BigData: Streamlining Big Data via an Accelerated Socket Interface

    Big data applications need to deliver fresh processing results to users, and cloud platforms can be used to speed them up. This paper describes a new data communication protocol (CloudJet) for long-distance, large-volume big data access operations that alleviates the large latencies encountered when sharing big data resources in the clouds. It encapsulates a dynamic multi-stream/multi-path engine at the socket level, which conforms to the Portable Operating System Interface (POSIX) and can thereby accelerate any POSIX-compatible application across IP-based networks. In real-world tests, CloudJet accelerated typical big data applications such as very large databases (VLDB), data mining, media streaming and office applications by up to tenfold.
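    As a rough illustration of the multi-stream idea at the socket level, the following minimal Python sketch splits one logical transfer across several parallel TCP connections. The server address, wire format and chunking scheme are illustrative assumptions, not CloudJet's actual protocol.

    # Minimal sketch of a socket-level multi-stream transfer (illustrative only;
    # host/port and the "GET <offset> <length>" wire format are assumptions).
    import socket
    import threading

    HOST, PORT = "127.0.0.1", 9000   # hypothetical data server
    N_STREAMS = 4                     # number of parallel TCP streams

    def fetch_range(offset, length, out, idx):
        """Fetch one byte range of a remote object over its own TCP connection."""
        with socket.create_connection((HOST, PORT)) as s:
            s.sendall(f"GET {offset} {length}\n".encode())
            buf = bytearray()
            while len(buf) < length:
                chunk = s.recv(min(65536, length - len(buf)))
                if not chunk:
                    break
                buf.extend(chunk)
            out[idx] = bytes(buf)

    def multi_stream_get(total_size):
        """Split one logical transfer across N_STREAMS parallel sockets."""
        part = total_size // N_STREAMS
        parts = [None] * N_STREAMS
        threads = []
        for i in range(N_STREAMS):
            off = i * part
            length = part if i < N_STREAMS - 1 else total_size - off
            t = threading.Thread(target=fetch_range, args=(off, length, parts, i))
            t.start()
            threads.append(t)
        for t in threads:
            t.join()
        return b"".join(parts)

    if __name__ == "__main__":
        data = multi_stream_get(4 * 1024 * 1024)  # fetch 4 MiB over 4 streams
        print(len(data), "bytes received")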

    Integrated Green Cloud Computing Architecture

    Arbitrary usage of cloud computing, whether private or public, can lead to uneconomical energy consumption in data processing, storage and communication. Green cloud computing solutions therefore aim not only to save energy but also to reduce operational costs and the carbon footprint on the environment. In this paper, an Integrated Green Cloud Architecture (IGCA) is proposed that comprises a client-oriented Green Cloud Middleware to help managers oversee and configure their overall access to cloud services in the greenest, most energy-efficient way. The decision of whether to use local machine processing, a private cloud or a public cloud is handled by the middleware using predefined system specifications such as the service level agreement (SLA), quality of service (QoS), equipment specifications and the job description provided by the IT department. An analytical model is used to show the feasibility of achieving efficient energy consumption when choosing between local processing and private or public cloud service providers (CSPs).
    Comment: 6 pages, International Conference on Advanced Computer Science Applications and Technologies, ACSAT 201
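    To make the middleware's role concrete, the sketch below picks the lowest-energy placement that still satisfies SLA/QoS bounds. The scoring rule, fields and numbers are assumptions for illustration only, not the IGCA decision algorithm from the paper.

    # Illustrative placement decision for a green-cloud middleware (assumed model).
    from dataclasses import dataclass

    @dataclass
    class Target:
        name: str                  # "local", "private", "public"
        energy_per_gflop_j: float  # estimated energy cost (joules per GFLOP)
        latency_ms: float          # typical round-trip latency to the target
        price_per_hour: float      # monetary cost (0 for already-owned hardware)

    @dataclass
    class Job:
        gflops: float              # estimated compute demand
        max_latency_ms: float      # QoS bound from the SLA
        budget_per_hour: float     # spending cap from the IT department

    def choose_target(job, targets):
        """Pick the lowest-energy target that still satisfies the SLA/QoS bounds."""
        feasible = [t for t in targets
                    if t.latency_ms <= job.max_latency_ms
                    and t.price_per_hour <= job.budget_per_hour]
        if not feasible:
            raise RuntimeError("no target satisfies the SLA constraints")
        return min(feasible, key=lambda t: t.energy_per_gflop_j * job.gflops)

    if __name__ == "__main__":
        targets = [
            Target("local",   energy_per_gflop_j=0.9, latency_ms=1,  price_per_hour=0.0),
            Target("private", energy_per_gflop_j=0.5, latency_ms=10, price_per_hour=0.2),
            Target("public",  energy_per_gflop_j=0.4, latency_ms=60, price_per_hour=0.8),
        ]
        job = Job(gflops=5000, max_latency_ms=30, budget_per_hour=0.5)
        print(choose_target(job, targets).name)   # -> "private" under these numbers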

    Efficient HTTP based I/O on very large datasets for high performance computing with the libdavix library

    Remote data access for data analysis in high performance computing is commonly done with specialized data access protocols and storage systems. These protocols are highly optimized for high throughput on very large datasets, multi-stream access, high availability, low latency and efficient parallel I/O. The purpose of this paper is to describe how we have adapted a generic protocol, the Hypertext Transfer Protocol (HTTP), to make it a competitive alternative for high performance I/O and data analysis applications in a global computing grid: the Worldwide LHC Computing Grid. In this work, we first analyze the design differences between the HTTP protocol and the most common high performance I/O protocols, pointing out the main performance weaknesses of HTTP. Then, we describe in detail how we solved these issues. Our solutions have been implemented in a toolkit called davix, available through several recent Linux distributions. Finally, we describe the results of our benchmarks, where we compare the performance of davix against an HPC-specific protocol for a data analysis use case.
    Comment: Presented at: Very Large Data Bases (VLDB) 2014, Hangzho
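    The HTTP feature that enables this kind of partial, parallel read on large datasets is the byte-range request. The sketch below shows the idea with Python's standard library only; it is not the davix API (davix itself is a C/C++ toolkit), and the URL is a placeholder assumption.

    # Single HTTP byte-range read, the building block for partial/vectored I/O.
    import http.client
    from urllib.parse import urlparse

    def read_range(url, offset, length):
        """Read `length` bytes starting at `offset` with one HTTP Range GET."""
        parts = urlparse(url)
        conn = (http.client.HTTPSConnection(parts.netloc) if parts.scheme == "https"
                else http.client.HTTPConnection(parts.netloc))
        headers = {"Range": f"bytes={offset}-{offset + length - 1}"}
        conn.request("GET", parts.path or "/", headers=headers)
        resp = conn.getresponse()
        if resp.status not in (200, 206):        # 206 = Partial Content
            raise IOError(f"range read failed: HTTP {resp.status}")
        data = resp.read()
        conn.close()
        return data

    if __name__ == "__main__":
        # Placeholder URL; point this at any server/object that supports Range requests.
        chunk = read_range("https://example.org/dataset.root", offset=0, length=1024)
        print(len(chunk), "bytes")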