3 research outputs found

    A Two-Stage Multi-Objective Optimization of Erasure Coding in Overlay Networks

    Get PDF
    In the recent years, overlay networks have emerged as a crucial platform for deployment of various distributed applications. Many of these applications rely on data redundancy techniques, such as erasure coding, to achieve higher fault tolerance. However, erasure coding applied in large scale overlay networks entails various overheads in terms of storage, latency and data rebuilding costs. These overheads are largely attributed to the selected erasure coding scheme and the encoded chunk placement in the overlay network. This paper explores a multi-objective optimization approach for identifying appropriate erasure coding schemes and encoded chunk placement in overlay networks. The uniqueness of our approach lies in the consideration of multiple erasure coding objectives such as encoding rate and redundancy factor, with overlay network performance characteristics like storage consumption, latency and system reliability. Our approach enables a variety of tradeoff solutions with respect to these objectives to be identified in the form of a Pareto front. To solve this problem, we propose a novel two stage multiobjective evolutionary algorithm, where the first stage determines the optimal set of encoding schemes, while the second stage optimizes placement of the corresponding encoded data chunks in overlay networks of varying sizes. We study the performance of our method by generating and analyzing the Pareto optimal sets of tradeoff solutions. Experimental results demonstrate that the Pareto optimal set produced by our multi-objective approach includes and even dominates the chunk placements delivered by a related state-of-the-art weighted sum method

    Data Placement for Privacy-Aware Applications over Big Data in Hybrid Clouds

    Get PDF
    Nowadays, a large number of groups choose to deploy their applications to cloud platforms, especially for the big data era. Currently, the hybrid cloud is one of the most popular computing paradigms for holding the privacy-aware applications driven by the requirements of privacy protection and cost saving. However, it is still a challenge to realize data placement considering both the energy consumption in private cloud and the cost for renting the public cloud services. In view of this challenge, a cost and energy aware data placement method, named CEDP, for privacy-aware applications over big data in hybrid cloud is proposed. Technically, formalized analysis of cost, access time, and energy consumption is conducted in the hybrid cloud environment. Then a corresponding data placement method is designed to accomplish the cost saving for renting the public cloud services and energy savings for task execution within the private cloud platforms. Experimental evaluations validate the efficiency and effectiveness of our proposed method

    Optimizing Data Placement for Cost Effective and High Available Multi-Cloud Storage

    Get PDF
    With the advent of big data age, data volume has been changed from trillionbyte to petabyte with incredible speed. Owing to the fact that cloud storage offers the vision of a virtually infinite pool of storage resources, data can be stored and accessed with high scalability and availability. But a single cloud-based data storage has risks like vendor lock-in, privacy leakage, and unavailability. Multi-cloud storage can mitigate these risks with geographically located cloud storage providers. In this storage scheme, one important challenge is how to place a user's data cost-effectively with high availability. In this paper, an architecture for multi-cloud storage is presented. Next, a multi-objective optimization problem is defined to minimize total cost and maximize data availability simultaneously, which can be solved by an approach based on the non-dominated sorting genetic algorithm II (NSGA-II) and obtain a set of non-dominated solutions called the Pareto-optimal set. Then, a method is proposed which is based on the entropy method to determine the most suitable solution for users who cannot choose one from the Pareto-optimal set directly. Finally, the performance of the proposed algorithm is validated by extensive experiments based on real-world multiple cloud storage scenarios
    corecore