15 research outputs found

    Optimal Locally Repairable Codes and Connections to Matroid Theory

    Petabyte-scale distributed storage systems are currently transitioning to erasure codes to achieve higher storage efficiency. Classical codes like Reed-Solomon are highly sub-optimal for distributed environments due to their high overhead in single-failure events. Locally Repairable Codes (LRCs) form a new family of codes that are repair efficient. In particular, LRCs minimize the number of nodes participating in single-node repairs, during which they generate small network traffic. Two large-scale distributed storage systems have already implemented different types of LRCs: Windows Azure Storage and the Hadoop Distributed File System RAID used by Facebook. The fundamental bounds for LRCs, namely the best possible distance for a given code locality, were recently discovered, but few explicit constructions exist. In this work, we present explicit and optimal LRCs that are simple to construct. Our construction is based on grouping Reed-Solomon (RS) coded symbols to obtain RS coded symbols over a larger finite field. We then partition these RS symbols into small groups and re-encode them using a simple local code that offers low repair locality. For the analysis of the optimality of the code, we derive a new result on the matroid represented by the code generator matrix. Comment: Submitted for publication; a shorter version was presented at ISIT 201
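    The grouping-and-local-parity idea above can be illustrated with a toy sketch. The following is a minimal illustration of repair locality only, assuming a single XOR parity per group rather than the paper's RS-based construction over a larger field: a lost symbol is rebuilt from its r group-mates instead of the whole codeword.

```python
from functools import reduce

def encode_with_local_parities(symbols, r):
    """Split symbols into groups of r and append one XOR parity per group."""
    groups = [symbols[i:i + r] for i in range(0, len(symbols), r)]
    return [g + [reduce(lambda a, b: a ^ b, g)] for g in groups]

def repair(group, lost_index):
    """Rebuild an erased symbol by XOR-ing the remaining symbols of its group."""
    return reduce(lambda a, b: a ^ b,
                  (s for i, s in enumerate(group) if i != lost_index))

data = [3, 7, 1, 4, 9, 2]
coded = encode_with_local_parities(data, r=3)
# erase the second symbol of group 0; only its 3 group-mates are contacted
assert repair(coded[0], 1) == 7
```

Here repair locality is r = 3: a single failure touches three nodes, not the whole stripe.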

    A Repair Framework for Scalar MDS Codes

    Several works have developed vector-linear maximum-distance separable (MDS) storage codes that minimize the total communication cost required to repair a single coded symbol after an erasure, referred to as repair bandwidth (BW). Vector codes allow communicating fewer sub-symbols per node, instead of the entire content. This allows non-trivial savings in repair BW. In sharp contrast, classic codes, like Reed-Solomon (RS), used in current storage systems, are deemed to suffer from naive repair, i.e., downloading the entire stored message to repair one failed node. This happens mainly because they are scalar-linear. In this work, we present a simple framework that treats scalar codes as vector-linear. In some cases, this allows significant savings in repair BW. We show that vectorized scalar codes exhibit properties that simplify the design of repair schemes. Our framework can be seen as a finite-field analogue of real interference alignment. Using our simplified framework, we design a scheme that we call clique-repair, which provably identifies the best linear repair strategy for any scalar 2-parity MDS code, under some conditions on the sub-field chosen for vectorization. We specify optimal repair schemes for specific (5,3)- and (6,4)-Reed-Solomon (RS) codes. Further, we present a repair strategy for the RS code currently deployed in the Facebook Analytics Hadoop cluster that leads to 20% repair-BW savings over naive repair, which is the repair scheme currently used for this code. Comment: 10 pages; accepted to IEEE JSAC - Distributed Storage 201
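    A back-of-envelope sketch of the bandwidth accounting behind this idea, under illustrative assumptions (the helper count and per-helper fraction below are hypothetical, not the paper's clique-repair scheme): naive scalar repair downloads k full symbols, while the vectorized view lets each helper ship only some of its sub-symbols.

```python
def naive_repair_bw(k, t):
    """Sub-symbols moved by naive repair: k whole symbols of t sub-symbols each."""
    return k * t

def vectorized_repair_bw(helpers, subsymbols_per_helper):
    """Sub-symbols moved when each helper ships only a fraction of its content."""
    return helpers * subsymbols_per_helper

# Hypothetical numbers: a (6,4) code with t = 2 sub-symbols per node.
# Naive repair moves 4 * 2 = 8 sub-symbols; a scheme fetching one
# sub-symbol from 5 helpers moves only 5.
naive = naive_repair_bw(k=4, t=2)
vector = vectorized_repair_bw(helpers=5, subsymbols_per_helper=1)
assert vector < naive
```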

    Locality and Availability in Distributed Storage

    This paper studies the problem of code symbol availability: a code symbol is said to have (r, t)-availability if it can be reconstructed from t disjoint groups of other symbols, each of size at most r. For example, 3-replication supports (1, 2)-availability, as each symbol can be read from its t = 2 other (disjoint) replicas, i.e., r = 1. However, the rate of replication must vanish like 1/(t+1) as the availability increases. This paper shows that it is possible to construct codes that can support a scaling number of parallel reads while keeping the rate an arbitrarily high constant. It further shows that this is possible with minimum distance arbitrarily close to the Singleton bound. This paper also presents a bound demonstrating a trade-off between minimum distance, availability, and locality. Our codes match the aforementioned bound and their construction relies on combinatorial objects called resolvable designs. From a practical standpoint, our codes seem useful for distributed storage applications involving hot data, i.e., information which is frequently accessed by multiple processes in parallel. Comment: Submitted to ISIT 201
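    A minimal sketch of (r, t)-availability, assuming a 2-D XOR product code rather than the paper's resolvable-design construction: each data symbol belongs to one row parity group and one column parity group, giving t = 2 disjoint repair groups that two readers can use in parallel.

```python
def row_group(grid, row_par, i, j):
    """Recover grid[i][j] from the other symbols of row i plus its row parity."""
    acc = row_par[i]
    for jj, s in enumerate(grid[i]):
        if jj != j:
            acc ^= s
    return acc

def col_group(grid, col_par, i, j):
    """Recover grid[i][j] from the other symbols of column j plus its column parity."""
    acc = col_par[j]
    for ii in range(len(grid)):
        if ii != i:
            acc ^= grid[ii][j]
    return acc

grid = [[5, 2, 7], [1, 8, 3], [6, 4, 9]]
row_par = [5 ^ 2 ^ 7, 1 ^ 8 ^ 3, 6 ^ 4 ^ 9]
col_par = [5 ^ 1 ^ 6, 2 ^ 8 ^ 4, 7 ^ 3 ^ 9]
# Two disjoint groups reconstruct the same symbol, so two processes can
# read it concurrently without touching the same helper nodes.
assert row_group(grid, row_par, 1, 1) == 8
assert col_group(grid, col_par, 1, 1) == 8
```

The two repair groups for grid[1][1] share no symbols, which is exactly the disjointness that availability requires.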

    MCMC methods for integer least-squares problems

    We consider the problem of finding the least-squares solution to a system of linear equations where the unknown vector has integer entries (or, more precisely, has entries belonging to a subset of the integers), yet where the coefficient matrix and given vector are comprised of real numbers. Geometrically, this problem is equivalent to finding the closest lattice point to a given point and is known to be NP-hard. In communication applications, however, the given vector is not arbitrary, but is a lattice point perturbed by some noise vector. Therefore, it is of interest to study the computational complexity of various algorithms as a function of the noise variance or, often more appropriately, the SNR. In this paper, we apply a particular version of the Markov chain Monte Carlo (MCMC) approach to solving this problem, called the "heat bath". We show that there is a trade-off between the mixing time of the Markov chain (how long it takes until the chain reaches its stationary distribution) and how long it takes for the algorithm to find the optimal solution once the chain has mixed. The complexity of the algorithm is essentially the sum of these two times. More specifically, the higher the temperature, the faster the mixing, yet the slower the discovery of the optimal solution in steady state. Conversely, the lower the temperature, the slower the mixing, yet the faster the discovery of the optimal solution once the chain is mixed. We first show that for the probability of error of the maximum-likelihood (ML) solution to go to zero, the SNR must scale at least as 2 ln N + α(N), where N is the ambient problem dimension and α(N) is any sequence that tends to positive infinity. We further obtain the optimal value of the temperature such that the average time required to encounter the optimal solution in steady state is polynomial. Simulations show that, with this choice of the temperature parameter, the optimal solution can be found in reasonable time. This suggests that the Markov chain mixes in polynomial time, though we have not been able to prove this. It seems reasonable to conjecture that for SNR scaling as O((ln N)^(1+ε)), and for an appropriate choice of the temperature parameter, the heat-bath algorithm finds the optimal solution in polynomial time.
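    A minimal heat-bath (Gibbs) sketch for integer least squares, under simplifying assumptions: entries restricted to {-1, +1}, a tiny dense matrix, a fixed temperature alpha, and a noiseless observation for clarity. The paper's analysis concerns exactly how this alpha trades mixing time against discovery time; the values here are illustrative only.

```python
import math
import random

def residual_norm_sq(A, x, y):
    """||y - A x||^2 for a dense matrix A given as a list of rows."""
    return sum((yi - sum(a * xi for a, xi in zip(row, x))) ** 2
               for row, yi in zip(A, y))

def heat_bath_sweep(A, x, y, alpha):
    """One sweep: resample each coordinate from its conditional
    Boltzmann distribution at temperature alpha."""
    for i in range(len(x)):
        energies = []
        for v in (-1, 1):
            x[i] = v
            energies.append(residual_norm_sq(A, x, y))
        # P(x_i = +1) under the conditional heat-bath distribution
        p_plus = 1.0 / (1.0 + math.exp((energies[1] - energies[0]) / alpha))
        x[i] = 1 if random.random() < p_plus else -1
    return x

random.seed(0)
A = [[2.0, 0.2], [0.1, 1.5]]
x_true = [1, -1]
y = [sum(a * xi for a, xi in zip(row, x_true)) for row in A]  # noiseless
x = [random.choice((-1, 1)) for _ in range(2)]
for _ in range(50):
    x = heat_bath_sweep(A, x, y, alpha=0.1)
assert x == x_true  # low temperature: chain settles on the ML solution
```

At this low temperature the conditional probabilities are nearly deterministic, so the sampler locks onto the minimizer quickly; a higher alpha would mix faster but linger less at the optimum, which is the trade-off the abstract describes.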

    Orthogonal NMF through Subspace Exploration

    Orthogonal Nonnegative Matrix Factorization (ONMF) aims to approximate a nonnegative matrix as the product of two k-dimensional nonnegative factors, one of which has orthonormal columns. It yields potentially useful data representations as a superposition of disjoint parts, and it has been shown to work well for clustering tasks where traditional methods underperform. Existing algorithms rely mostly on heuristics, which, despite their good empirical performance, lack provable performance guarantees. We present a new ONMF algorithm with provable approximation guarantees. For any constant dimension k, we obtain an additive EPTAS without any assumptions on the input. Our algorithm relies on a novel approximation to the related Nonnegative Principal Component Analysis (NNPCA) problem: given an arbitrary data matrix, NNPCA seeks k nonnegative components that jointly capture most of the variance. Our NNPCA algorithm is of independent interest and generalizes previous work that could only obtain guarantees for a single component. We evaluate our algorithms on several real and synthetic datasets and show that their performance matches or outperforms the state of the art.
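    The link between ONMF and clustering can be made concrete with a small sketch (an illustration of the factor structure, not the paper's EPTAS algorithm): a factor that is both nonnegative and column-orthonormal must have at most one nonzero per row, i.e., it is a scaled cluster-indicator matrix, so each data row is assigned to exactly one of the k disjoint parts.

```python
import math

def indicator_factor(assign, k):
    """Build W from a hypothetical cluster assignment: one nonzero per row,
    scaled by 1/sqrt(cluster size) so the columns are orthonormal."""
    sizes = [assign.count(c) for c in range(k)]
    return [[1.0 / math.sqrt(sizes[c]) if c == assign[i] else 0.0
             for c in range(k)] for i in range(len(assign))]

assign = [0, 0, 1, 1, 1]   # hypothetical clustering of 5 rows into k = 2 parts
W = indicator_factor(assign, k=2)

# W^T W should be the 2x2 identity: nonnegative AND orthonormal columns
gram = [[sum(W[i][a] * W[i][b] for i in range(len(W))) for b in range(2)]
        for a in range(2)]
assert all(abs(gram[a][b] - (1.0 if a == b else 0.0)) < 1e-9
           for a in range(2) for b in range(2))
```

Because the columns of W have disjoint supports, the second factor H = W^T M collects one part of the data per component, which is the "superposition of disjoint parts" view of ONMF.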