
    Balancing the Storage in a Deduplication Cluster

    We consider an in-line data deduplication system that backs up data from many clients to a cluster of storage servers. We propose a centralized, synchronous approach, denoted GateD, that orchestrates the deduplication operations. In GateD, the deduplication requests from multiple clients are gathered in a time window and then processed together. This lets the centralized controller explore a larger space of solutions when allocating data to the deduplication nodes, so that it can balance storage occupancy across the nodes, with a beneficial effect on the performance perceived by the clients and without sacrificing deduplication efficiency. We investigate the performance through a detailed simulation model applied to real deduplication traces and show that GateD outperforms other state-of-the-art deduplication schemes.
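
The windowed, centralized allocation described above can be illustrated with a minimal sketch. It is not GateD's actual algorithm, which the abstract does not specify. It assumes a fixed chunk size, requests represented as lists of chunk fingerprints, and a simple greedy rule (largest request first, placed on the node whose occupancy after the write would be lowest). The names `Node`, `new_bytes`, `assign_window`, and `CHUNK_SIZE` are hypothetical.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (hypothetical value)


@dataclass
class Node:
    """A deduplication node, modelled only by the fingerprints it stores."""
    name: str
    chunks: set = field(default_factory=set)

    @property
    def used(self) -> int:
        # Storage occupancy in bytes under the fixed-chunk-size assumption.
        return len(self.chunks) * CHUNK_SIZE


def new_bytes(node: Node, fingerprints) -> int:
    """Bytes the node would have to add: chunks it does not already hold."""
    return len(set(fingerprints) - node.chunks) * CHUNK_SIZE


def assign_window(nodes, requests):
    """Place every request gathered in one time window.

    requests maps a client id to its list of chunk fingerprints.
    Greedy rule (an assumption, not the paper's method): handle the
    largest requests first and send each to the node whose occupancy
    after the write would be smallest; ties are broken by node name.
    """
    placement = {}
    ordered = sorted(requests.items(), key=lambda kv: -len(set(kv[1])))
    for client, fps in ordered:
        best = min(nodes, key=lambda n: (n.used + new_bytes(n, fps), n.name))
        best.chunks.update(fps)
        placement[client] = best.name
    return placement
```

Because the controller sees the whole window before deciding, it can order and place requests jointly. A per-request scheme would have to commit to a node as each request arrives.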