4 research outputs found

    Unifying data and replica placement for data-intensive services in geographically distributed clouds

    The increased reliance of data management applications on cloud computing technologies has made research on the data placement problem critically important. The objective of the classical data placement problem is to optimally partition the set of data-items across distributed data centers, while also allowing for replication, so as to minimize the overall network communication cost. Despite significant advances in data placement research, replica placement has seldom been studied in unison with data placement. More specifically, most existing solutions employ a two-phase approach: 1) data placement, followed by 2) replication. Replication should, however, be seen as an integral part of data placement and be studied with it as a joint optimization problem. In this paper, we propose a unified paradigm of data placement, called CPR, which treats data placement and replication of data-intensive services in geographically distributed clouds as a joint optimization problem. Underlying CPR is an overlapping correlation clustering algorithm capable of assigning a data-item to multiple data centers, thereby enabling us to jointly solve data placement and replication. Experiments on a real-world trace-based online social network dataset show that CPR is effective and scalable. Empirically, it is approximately 35% better in efficacy on the evaluated metrics, while being up to 8 times faster in execution time than state-of-the-art techniques.
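
    To make the objective concrete, the sketch below counts remote reads for a placement that may assign one data-item to several data centers. It is an illustrative toy, not the CPR algorithm from the paper: the unit cost per remote read, the function name `communication_cost`, and the sample data are all assumptions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical request model: (origin data center, items read together).
Request = Tuple[str, Set[str]]


def communication_cost(requests: List[Request],
                       placement: Dict[str, Set[str]]) -> int:
    """Count remote reads under an assumed unit cost.

    An item costs 1 whenever no replica of it is stored at the
    data center where the request originates.
    """
    cost = 0
    for origin, items in requests:
        for item in items:
            if origin not in placement[item]:
                cost += 1
    return cost


# Made-up workload with two data centers, "eu" and "us".
requests = [("eu", {"a", "b"}), ("us", {"b", "c"}), ("us", {"a"})]

# One copy per item versus an overlapping assignment with replicas.
single_copy = {"a": {"eu"}, "b": {"eu"}, "c": {"us"}}
with_replicas = {"a": {"eu", "us"}, "b": {"eu", "us"}, "c": {"us"}}
```

    Mapping each item to a set of data centers, rather than to a single one, is what lets placement and replication be scored by the same cost function.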

    Adaptive Data Placement for Improving Performance of Online Social Network Services in a Multicloud Environment

    Existing online social network (OSN) services in a multiple-cloud (Multicloud) environment use replication to store user data and improve service performance. However, replication not only generates tremendous synchronization traffic but also stores considerable redundant data, causing large storage costs. In addition, it does not provide dynamic load balancing that takes the resource status of each cloud into account. As a result, it cannot cope with the performance degradation caused by resource contention. We introduce an adaptive data placement algorithm without replication for improving the performance of OSN services in the Multicloud environment. Our approach is designed to avoid server overhead using a data balancing technique, which relocates data from one cloud to another according to the amount of traffic. To provide acceptable latency, it also considers the relationship between users and the distance between user and cloud when transferring data. To validate our approach, we experimented with actual users’ locations and times of use collected from OSN services. Our findings indicate that this approach can reduce resource contention by an average of more than 59%, reduce storage volume to at least 50%, and maintain latency under 50 ms.
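
    The traffic-aware relocation idea can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the `Cloud` fields, the capacity threshold, and the planar distance are assumptions, and the relationship between users that the abstract mentions is left out.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cloud:
    """Hypothetical cloud descriptor used only for this sketch."""
    name: str
    x: float
    y: float
    traffic: float   # current load (assumed unit)
    capacity: float  # load above which the cloud counts as contended


def pick_cloud(user_xy: Tuple[float, float], clouds: List[Cloud]) -> Cloud:
    """Pick the nearest cloud that is not contended.

    Falls back to the nearest cloud overall when every cloud is at or
    above its assumed capacity.
    """
    candidates = [c for c in clouds if c.traffic < c.capacity] or clouds
    return min(candidates, key=lambda c: math.dist(user_xy, (c.x, c.y)))


clouds = [
    Cloud("near", 0.0, 0.0, traffic=100.0, capacity=100.0),  # saturated
    Cloud("far", 10.0, 0.0, traffic=50.0, capacity=100.0),
]
chosen = pick_cloud((1.0, 0.0), clouds)
```

    Here the user's data goes to the more distant cloud because the nearby one is saturated, which reflects the trade-off between contention and distance described in the abstract.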