
    Self-repairing Homomorphic Codes for Distributed Storage Systems

    Erasure codes provide a storage-efficient alternative to replication-based redundancy in (networked) storage systems. However, they entail high communication overhead for maintenance when some of the encoded fragments are lost and need to be replenished. Such overheads arise from the fundamental need to first recreate (or keep separately) a copy of the whole object before any individual encoded fragment can be generated and replenished. There has recently been intense interest in exploring alternatives, most prominently regenerating codes (RGC) and hierarchical codes (HC). As an alternative, we propose a new family of codes to improve the maintenance process, which we call self-repairing codes (SRC), with the following salient features: (a) encoded fragments can be repaired directly from other subsets of encoded fragments without having to first reconstruct the original data, ensuring that (b) a fragment is repaired from a fixed number of encoded fragments, a number that depends only on how many encoded blocks are missing and not on which specific blocks are missing. These properties allow not only low communication overhead to recreate a missing fragment, but also independent reconstruction of different missing fragments in parallel, possibly in different parts of the network. We analyze the static resilience of SRCs with respect to traditional erasure codes, and observe that SRCs incur marginally larger storage overhead in order to achieve the aforementioned properties. The salient SRC properties naturally translate to low communication overheads for reconstruction of lost fragments, and allow reconstruction with lower latency by facilitating repairs in parallel. These desirable properties make self-repairing codes a good and practical candidate for networked distributed storage systems.
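The repair property described above can be illustrated with a toy XOR layout (a simplified sketch, not the actual homomorphic SRC construction; the fragment layout below is an assumption made for illustration):

```python
# Toy illustration of the self-repair property: an object is split
# into k = 3 fragments a, b, c and stored on n = 6 nodes as
#   a, b, c, a^b, b^c, a^c   (bitwise XOR over bytes).
# A lost fragment is rebuilt by XORing a small, fixed-size subset of
# the survivors -- no node ever reconstructs the whole object first.

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(x, y))

def encode(a: bytes, b: bytes, c: bytes):
    return [a, b, c, xor(a, b), xor(b, c), xor(a, c)]

a, b, c = b"fragA", b"fragB", b"fragC"
nodes = encode(a, b, c)

# Node 3 (holding a^b) fails.  Instead of gathering k = 3 fragments
# to rebuild the object, we contact only 2 live nodes:
#   (a^c) ^ (b^c) == a^b
repaired = xor(nodes[5], nodes[4])
assert repaired == nodes[3]
```

Here two fragments suffice for one failure; with more failures a (fixed) larger subset would be contacted, mirroring property (b) above.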

    Self-Repairing Codes for Distributed Storage - A Projective Geometric Construction

    Self-Repairing Codes (SRC) are codes designed to suit the needs of coding for distributed networked storage: they not only allow stored data to be recovered even in the presence of node failures, they also provide a repair mechanism where as few as two live nodes can be contacted to regenerate the data of a failed node. In this paper, we propose a new instance of self-repairing codes, based on constructions of spreads coming from projective geometry. We study some of their properties to demonstrate the suitability of these codes for distributed networked storage.

    Homomorphic Self-repairing Codes for Agile Maintenance of Distributed Storage Systems

    Distributed data storage systems are essential to deal with the need to store massive volumes of data. In order to make such a system fault-tolerant, some form of redundancy becomes crucial, incurring various overheads - most prominently in terms of storage space and maintenance bandwidth requirements. Erasure codes, originally designed for communication over lossy channels, provide a storage-efficient alternative to replication-based redundancy, however entailing high communication overhead for maintenance, when some of the encoded fragments need to be replenished with new ones after the failure of some storage devices. As an alternative, we propose a new family of erasure codes called self-repairing codes (SRC), taking into account the peculiarities of distributed storage systems, specifically the maintenance process. SRCs have the following salient features: (a) encoded fragments can be repaired directly from other subsets of encoded fragments by downloading less data than the size of the complete object, ensuring that (b) a fragment is repaired from a fixed number of encoded fragments, a number that depends only on how many encoded blocks are missing and not on which specific blocks are missing. This paper lays the foundations by defining the novel self-repairing codes and elaborating why the defined characteristics are desirable for distributed storage systems. Then homomorphic self-repairing codes (HSRC) are proposed as a concrete instance, whose various aspects and properties are studied and compared - quantitatively or qualitatively - with respect to other codes, including traditional erasure codes as well as other recent codes designed specifically for storage applications.

    Combined Forward-Backward Asymmetry Measurements in Top-Antitop Quark Production at the Tevatron

    The CDF and D0 experiments at the Fermilab Tevatron have measured the asymmetry between yields of forward- and backward-produced top and antitop quarks based on their rapidity difference and the asymmetry between their decay leptons. These measurements use the full data sets collected in proton-antiproton collisions at a center-of-mass energy of $\sqrt{s} = 1.96$ TeV. We report the results of combinations of the inclusive asymmetries and their differential dependencies on relevant kinematic quantities. The combined inclusive asymmetry is $A_{\mathrm{FB}}^{t\bar{t}} = 0.128 \pm 0.025$. The combined inclusive and differential asymmetries are consistent with recent standard model predictions.
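The inclusive asymmetry quoted above follows the standard counting definition, $A_{\mathrm{FB}} = (N_F - N_B)/(N_F + N_B)$, over events classified by the sign of the top-antitop rapidity difference. A minimal sketch (the event counts below are made-up illustrative numbers, not the measured data):

```python
# A_FB = (N(dy > 0) - N(dy < 0)) / (N(dy > 0) + N(dy < 0)),
# where dy = y_top - y_antitop is the rapidity difference.
# The counts below are illustrative only.

def forward_backward_asymmetry(n_forward: int, n_backward: int) -> float:
    return (n_forward - n_backward) / (n_forward + n_backward)

# e.g. 5640 forward vs 4360 backward events give A_FB = 0.128
print(forward_backward_asymmetry(5640, 4360))  # -> 0.128
```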

    Storage codes : managing big data with small overheads

    Erasure coding provides a mechanism to store data redundantly for fault tolerance in a cost-effective manner. Recently, there has been a renewed interest in designing new erasure coding techniques with different desirable properties, including good repairability and degraded read performance, or efficient redundancy generation processes. Very often, these novel techniques exploit the computational resources available ‘in the network’, i.e., they leverage storage units that are not passive entities supporting only read/write of data, but can also carry out some computations. This article accompanies an identically titled tutorial at the IEEE International Symposium on Network Coding (NetCod 2013), and portrays a big picture of some of the important processes within distributed storage systems, where erasure codes designed by explicitly taking into account the nuances of distributed storage systems can provide significant performance boosts.

    Redundantly grouped cross-object coding for repairable storage

    The problem of replenishing redundancy in erasure code based fault-tolerant storage has received a great deal of attention recently, leading to the design of several new coding techniques [3] aiming at better repairability. In this paper, we adopt a different point of view by proposing to code across different, already encoded objects to alleviate the repair problem. We show that the addition of parity pieces - the simplest form of coding - significantly boosts repairability without sacrificing fault tolerance for equivalent storage overhead. The simplicity of our approach, as well as its reliance on time-tested techniques, makes it readily deployable.
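The cross-object idea can be sketched in its simplest form (an assumed layout for illustration, not the paper's exact scheme): take one piece from each of several already-encoded objects and store an XOR parity over the group, so a lost piece is repaired from the parity and the surviving group members without touching the object's own code.

```python
from functools import reduce

# pieces p1, p2, p3 belong to three *different* already-encoded
# objects; a cross-object parity piece p1 ^ p2 ^ p3 is stored
# alongside them (assumed grouping, for illustration).

def parity(pieces):
    return reduce(lambda x, y: bytes(a ^ b for a, b in zip(x, y)), pieces)

group = [b"obj1-p", b"obj2-p", b"obj3-p"]
p = parity(group)

# repair the lost piece group[1] from the parity and the rest,
# without any intra-object decoding
repaired = parity([p, group[0], group[2]])
assert repaired == group[1]
```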

    Concurrency control and consistency over erasure coded data

    For over a decade, erasure codes have been an integral part of large-scale data storage solutions and data centers. However, in commercial systems, they are, so far, used predominantly for static data. In the meanwhile, there has also been almost a decade and a half of research on mutable erasure coded data, looking at various associated issues, including update computation, concurrency control and consistency, which has led to a variety of reasonably mature techniques. In this work we aim at curating and systematizing this knowledge on managing mutable erasure coded data. We believe the time is right, both because of the richness and maturity of the literature itself, and also, given the pervasiveness of erasure codes in data centers, because it is natural to expect a transition to accommodate mutable content using erasure coded redundancy in order to support more diverse and versatile overlying applications, while benefiting from the advantages (particularly, that of significantly lower storage overhead) of erasure codes. The work of Anwitaman Datta was supported by the Ministry of Education (MoE), Singapore, under its Academic Research Fund Tier 1 through the project titled "StorEdge: Data Store Along a Cloud-To-Thing Continuum with Integrity and Availability" under Project 2018-T1-002-076. The work of Frédérique Oggier was supported by a Nanyang Technological University, Singapore, Start-Up Grant.
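One of the update-computation issues mentioned above can be sketched concretely: for any linear code, updating a data block only requires shipping a (coded) delta to each parity node, so parities are patched rather than recomputed over the whole stripe. A minimal sketch with a single XOR parity (an assumed layout, for illustration; concurrency control and consistency are not modeled here):

```python
def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

blocks = [b"aaaa", b"bbbb", b"cccc"]
par = xor(xor(blocks[0], blocks[1]), blocks[2])   # initial stripe parity

# in-place update of block 1: only the delta travels to the parity node
new = b"BBBB"
delta = xor(blocks[1], new)
par = xor(par, delta)         # parity patched, stripe never re-encoded
blocks[1] = new

assert par == xor(xor(blocks[0], blocks[1]), blocks[2])
```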

    A modular framework for centrality and clustering in complex networks

    The structure of many complex networks includes edge directionality and weights on top of their topology. Network analysis techniques that can seamlessly consider combinations of these properties are desirable. In this paper, we study two important such techniques, namely centrality and clustering. An information-flow based model is adopted for clustering, which itself builds upon an information-theoretic measure for computing centrality. Our principal contributions include (1) a generalized model of Markov entropic centrality with the flexibility to tune the importance of node degrees, edge weights and directions, with a closed-form asymptotic analysis, which (2) leads to a novel two-stage graph clustering algorithm. The centrality analysis helps reason about the suitability of our approach to cluster a given graph, and determines 'query' nodes around which to explore local community structures, leading to an agglomerative clustering mechanism. Our clustering algorithm naturally inherits the flexibility to accommodate edge directionality, as well as different interpretations of and interplay between edge weights and node degrees. Extensive benchmarking experiments are provided, using both real-world networks with ground truth and synthetic networks. The work of Frédérique Oggier was supported by Nanyang Technological University (NTU), Singapore, under a Start-Up Grant. The work of Silivanxay Phetsouvanh was supported by a Ph.D. scholarship through NTU, funded by the Ministry of Education, Singapore.
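A basic (non-generalized) variant of entropic centrality can be sketched as follows: the centrality of a node is the Shannon entropy of the probability distribution of a t-step random walk started at that node. The paper's model additionally tunes the roles of node degrees, edge weights and directions; this sketch assumes plain uniform walks on a directed graph.

```python
import math

def step(dist, adj):
    # one random-walk step: spread each node's probability
    # uniformly over its out-neighbors
    out = {}
    for u, p in dist.items():
        for v in adj[u]:
            out[v] = out.get(v, 0.0) + p / len(adj[u])
    return out

def entropic_centrality(adj, start, t=2):
    dist = {start: 1.0}
    for _ in range(t):
        dist = step(dist, adj)
    # Shannon entropy of where the walk can be after t steps
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# directed toy graph: a hub spreads probability more widely,
# hence gets a higher entropic centrality
adj = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
print(entropic_centrality(adj, "hub", t=1))  # -> log2(3) ≈ 1.585
```

Nodes from which the walk disperses widely score high, which is what makes such a measure usable for picking 'query' nodes around which to grow clusters.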

    An ego network analysis of sextortionists

    We consider a particular instance of user interactions in the Bitcoin network: that of interactions among wallet addresses belonging to scammers. Aggregation of multiple inputs and change addresses are common heuristics used to establish relationships among addresses and analyze transaction amounts in the Bitcoin network. We propose a flow-centric approach that complements such heuristics by studying the branching, merging and propagation of Bitcoin flows. We study a recent sextortion campaign by exploring the ego network of known offending wallet addresses. We compare and combine different existing and new heuristics, which allows us to identify (1) Bitcoin addresses of interest (including possible recurrent go-to addresses for the scammers) and (2) relevant Bitcoin flows, from scam Bitcoin addresses to a Binance exchange and to other scam addresses, that suggest connections among prima facie disparate waves of similar scams.
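The multi-input aggregation heuristic mentioned above (commonly called common-input-ownership) can be sketched with union-find: addresses spent together as inputs of one transaction are assumed to belong to the same wallet and are merged into one cluster. The transactions and address names below are made up for illustration.

```python
def find(parent, x):
    # find the cluster root of x, with path halving
    while parent.setdefault(x, x) != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def cluster(transactions):
    # each transaction is a list of its input addresses;
    # co-spent inputs are merged into one cluster
    parent = {}
    for inputs in transactions:
        for addr in inputs[1:]:
            parent[find(parent, inputs[0])] = find(parent, addr)
    return parent

txs = [["addr1", "addr2"], ["addr2", "addr3"], ["addr4"]]
parent = cluster(txs)
assert find(parent, "addr1") == find(parent, "addr3")   # merged via addr2
assert find(parent, "addr1") != find(parent, "addr4")   # never co-spent
```

Flow-centric analysis, as proposed in the paper, complements this address-level clustering by following how amounts branch and merge across transactions.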