Multi-level Forwarding and Scheduling Recovery Algorithm in Rapidly-changing Network for Erasure-coded Clusters
A key design goal of erasure-coded clusters is to reduce repair time. Existing
erasure-coded data repair schemes fall roughly into two categories: (1) rapid
data repair in a homogeneous environment (e.g., PPR), and (2) bandwidth-aware
data repair in a heterogeneous environment (e.g., PPT). However, these
solutions struggle to cope with the heterogeneous and rapidly changing
networks found in erasure-coded clusters. To
address this problem, a bandwidth-aware multi-level forwarding repair
algorithm, called BMFRepair, is proposed. BMFRepair monitors the network
bandwidth in real time when data is forwarded, and selects idle nodes with
high-bandwidth links to assist in forwarding, which reduces the time
bottleneck caused by slow links. Meanwhile, multi-node repair becomes very
complicated when the bandwidth changes drastically. A multi-node scheduling
repair algorithm, called MSRepair, is proposed for this case; it repairs
multiple failed blocks in parallel by scheduling node resources. The two
algorithms adapt flexibly to the rapidly changing network environment and make
full use of the bandwidth resources of idle nodes. Most importantly, they
continuously adjust the repair plan as the bandwidth changes in a fast,
dynamic network. The
algorithms have been evaluated through both simulations on Mininet and real
experiments on the Aliyun ECS cloud platform. Results show that, compared with
the state-of-the-art repair schemes PPR and PPT, the algorithms significantly
reduce the repair time in rapidly changing networks.
Comment: We have modified the algorithm of this article, and a submission
meeting is required. However, the meeting request cannot be published, so we
apply to withdraw the article.
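
The relay selection that the abstract attributes to BMFRepair can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single relay hop, a static snapshot of measured bandwidths, and hypothetical names (`pick_forwarding_path`, the `bw` link table). It simply compares the direct link against each idle relay and keeps the path whose slowest link is fastest.

```python
from typing import Dict, List, Tuple

# Hypothetical link table: (source, destination) -> measured bandwidth in Mbit/s.
Bandwidth = Dict[Tuple[str, str], float]


def pick_forwarding_path(
    src: str, dst: str, idle_nodes: List[str], bw: Bandwidth
) -> Tuple[List[str], float]:
    """Return (path, bottleneck bandwidth) for the direct link or one relay hop.

    Assumption: a path is only as fast as its slowest link, so each candidate
    is scored by the minimum bandwidth along it. Missing links count as 0.
    """
    best_path = [src, dst]
    best_bw = bw.get((src, dst), 0.0)

    for relay in idle_nodes:
        if relay in (src, dst):
            continue  # a node cannot relay for itself
        bottleneck = min(bw.get((src, relay), 0.0), bw.get((relay, dst), 0.0))
        if bottleneck > best_bw:
            best_path, best_bw = [src, relay, dst], bottleneck

    return best_path, best_bw


# Illustrative snapshot (made-up numbers): the direct A->B link is slow.
links: Bandwidth = {
    ("A", "B"): 10.0,
    ("A", "C"): 100.0,
    ("C", "B"): 80.0,
    ("A", "D"): 50.0,
    ("D", "B"): 200.0,
}
path, bottleneck = pick_forwarding_path("A", "B", ["C", "D"], links)
print(path, bottleneck)
```

In a rapidly changing network, a function like this would have to be re-run whenever the bandwidth measurements are refreshed, which is the continuous plan adjustment the abstract describes.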