Explicit Construction of Minimum Bandwidth Rack-Aware Regenerating Codes
In large data centers, storage nodes are organized in racks, and the
cross-rack communication dominates the system bandwidth. We explicitly
construct codes for exact repair of single node failures that achieve the
optimal tradeoff between the storage redundancy and cross-rack repair bandwidth
at the minimum bandwidth point (i.e., the cross-rack bandwidth equals the
storage size per node). Moreover, we explore node repair when only a small
number of helper racks are connected. Thus, we provide explicit constructions of
codes for rack-aware storage with the minimum cross-rack repair bandwidth,
lowest possible redundancy, and small repair degree (i.e., the number of helper
racks connected for repair).
Comment: 4 pages, 1 figure. arXiv admin note: text overlap with
arXiv:2101.0873
Global repair bandwidth cost optimization of generalized regenerating codes in clustered distributed storage systems
In clustered distributed storage systems (CDSSs), one of the main design goals is minimizing the transmission cost incurred when repairing failed storage nodes. Generalized regenerating codes (GRCs) have been proposed to balance the intra-cluster and inter-cluster repair bandwidths while guaranteeing data availability. The trade-off performance of GRCs shows that they can reduce storage overhead and inter-cluster repair bandwidth simultaneously. However, in practical big data storage scenarios, GRCs do not offer an effective way to handle the heterogeneity of bandwidth costs among different clusters during node failure recovery. This paper proposes an asymmetric bandwidth allocation strategy (ABAS) of GRCs for inter-cluster repair in heterogeneous CDSSs. Furthermore, an upper bound on the achievable capacity of ABAS is derived based on the information flow graph (IFG), and the constraints on storage capacity and intra-cluster repair bandwidth are also elaborated. Then, a metric termed global repair bandwidth cost (GRBC) is defined, which can be minimized with respect to the inter-cluster repair bandwidths by solving a linear programming problem. The numerical results demonstrate that, while maintaining the same data availability and storage overhead, the proposed ABAS of GRCs can effectively reduce the GRBC compared to traditional symmetric bandwidth allocation schemes.
Base Station-Assisted Cooperative Network Coding for Cellular Systems with Link Constraints
We consider a novel distributed data storage/caching scenario in a cellular network, where multiple nodes may fail or depart simultaneously. To meet reliability requirements, we allow cooperative regeneration of lost nodes with the help of base stations allocated in a set of hierarchical layers. Due to this layered structure, a symbol download from each base station has a different cost, while the link capacities between the nodes of the cellular system and the base stations are also constrained. Under this setting, we formulate the fundamental trade-off between repair bandwidth cost and storage space per node with closed-form expressions. In particular, the minimum storage and minimum bandwidth cost points are formulated. Finally, we provide an explicit optimal code construction for the minimum storage regeneration point for a special set of system parameters.
Active Data Replica Recovery for Quality-Assurance Big Data Analysis in IC-IoT
QoS-aware big data analysis is critical in Information-Centric Internet of Things (IC-IoT) systems to support applications such as smart cities, smart grids, smart health, and intelligent transportation systems. The use of non-volatile memory (NVM) in cloud or edge systems provides a good opportunity to improve the quality of data analysis tasks. However, NVM has limited write endurance, so we have to face the data recovery problem caused by NVM failure. In this paper, we investigate the data recovery problem for QoS guarantees and system robustness, and propose a rarity-aware data recovery algorithm. The core idea is to establish a rarity indicator that comprehensively evaluates the replica distribution and service requirements. With this indicator, we assign distinct priorities to the lost replicas and eliminate unnecessary replicas. Then, the data replicas are recovered stage by stage to guarantee QoS and provide system robustness. Our extensive experiments and simulations show that the proposed algorithm significantly improves QoS and robustness over the traditional direct data recovery method. In addition, the algorithm achieves an acceptable data recovery time.