On Minimizing Data-read and Download for Storage-Node Recovery
We consider the problem of efficient recovery of the data stored in any
individual node of a distributed storage system, from the rest of the nodes.
Applications include handling failures and degraded reads. We measure
efficiency in terms of the amount of data-read and the download required. To
minimize the download, we focus on the minimum bandwidth setting of the
'regenerating codes' model for distributed storage. Under this model, the
system has a total of n nodes, and the data stored in any node must be
(efficiently) recoverable from any d of the other (n-1) nodes. Lower bounds on
the two metrics under this model were derived previously; it has also been
shown that these bounds are achievable for the amount of data-read and download
when d=n-1, and for the amount of download alone when d<n-1.
In this paper, we complete this picture by proving the converse result, that
when d<n-1, these lower bounds are strictly loose with respect to the amount of
read required. The proof is information-theoretic, and hence applies to
non-linear codes as well. We also show that under two (practical) relaxations
of the problem setting, these lower bounds can be met for both read and
download simultaneously.
Comment: IEEE Communications Letter
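As a rough illustration of the minimum bandwidth setting mentioned above, the sketch below computes the standard minimum-bandwidth-regenerating (MBR) parameters from the regenerating-codes literature. The abstract does not state these formulas; the parameter k (the number of nodes from which the whole file can be rebuilt), the file size, and the function name `mbr_parameters` are assumptions introduced here for illustration only.

```python
from fractions import Fraction


def mbr_parameters(file_size, k, d):
    """Return (alpha, beta) at the MBR point of the regenerating-codes model.

    alpha: amount stored per node
    beta:  amount downloaded from each of the d helper nodes during repair
    k:     assumed number of nodes sufficient to rebuild the file (not in the abstract)
    """
    if not 1 <= k <= d:
        raise ValueError("need 1 <= k <= d")
    # Standard MBR formulas: beta = 2M / (k(2d - k + 1)), alpha = d * beta.
    beta = Fraction(2 * file_size, k * (2 * d - k + 1))
    alpha = d * beta
    return alpha, beta


alpha, beta = mbr_parameters(file_size=5, k=2, d=3)
total_download = 3 * beta  # d * beta, the total repair download
```

At this operating point the total repair download d * beta equals the node size alpha, so a repair downloads no more than what it restores. The abstract's question is whether the amount read at the helpers can be kept equally small when d < n-1.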
Signal Processing for Caching Networks and Non-volatile Memories
The recent information explosion has created a pressing need for faster and more reliable data storage and transmission schemes. This thesis focuses on two systems: caching networks and non-volatile storage systems. It proposes network protocols to improve the efficiency of information delivery and signal processing schemes to reduce errors at the physical layer. This thesis first investigates caching and delivery strategies for content delivery networks. Caching has been investigated as a useful technique to reduce the network burden by prefetching some contents during off-peak hours. Coded caching [1], proposed by Maddah-Ali and Niesen, is the foundation of our algorithms; it has been shown to reduce peak traffic rates by encoding transmissions so that different users can extract different information from the same packet. Content delivery networks store information distributed across multiple servers, so as to balance the load and avoid unrecoverable losses in case of node or disk failures. On one hand, distributed storage limits the capability of combining content from different servers into a single message, causing performance losses in coded caching schemes. On the other hand, the inherent redundancy existing in distributed storage systems can be used to improve the performance of those schemes through parallelism. This thesis proposes a scheme combining distributed storage of the content in multiple servers and an efficient coded caching algorithm for delivery to the users. This scheme is shown to reduce the peak transmission rate below that of state-of-the-art algorithms
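The idea that different users can extract different information from the same packet can be illustrated with the textbook two-user, two-file coded caching example. This is a minimal sketch of that basic idea only, not the scheme proposed in the thesis; the file contents, cache layout, and the helper `xor_bytes` are hypothetical.

```python
def xor_bytes(x, y):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


# Two files, each split into two equal halves (contents are made up).
A1, A2 = b"AAAA", b"aaaa"
B1, B2 = b"BBBB", b"bbbb"

# Placement phase (off-peak): each user caches one half of every file.
cache1 = {"A1": A1, "B1": B1}
cache2 = {"A2": A2, "B2": B2}

# Delivery phase: user 1 requests file A, user 2 requests file B.
# User 1 is missing A2 and user 2 is missing B1, so one coded packet serves both.
packet = xor_bytes(A2, B1)

# Each user cancels the half it already holds to recover the half it lacks.
file_for_user1 = cache1["A1"] + xor_bytes(packet, cache1["B1"])
file_for_user2 = xor_bytes(packet, cache2["A2"]) + cache2["B2"]
```

A single half-file transmission replaces the two half-file transmissions that uncoded delivery would need. The XOR requires both halves to be available at one sender, which is the constraint that distributed storage across servers puts under pressure.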