25 research outputs found

    Self-repairing Homomorphic Codes for Distributed Storage Systems

    Erasure codes provide a storage-efficient alternative to replication-based redundancy in (networked) storage systems. However, they entail high communication overhead for maintenance when some of the encoded fragments are lost and need to be replenished. Such overheads arise from the fundamental need to first recreate (or keep separately) a copy of the whole object before any individual encoded fragment can be generated and replenished. There has recently been intense interest in exploring alternatives, the most prominent being regenerating codes (RGC) and hierarchical codes (HC). We propose as an alternative a new family of codes to improve the maintenance process, which we call self-repairing codes (SRC), with the following salient features: (a) encoded fragments can be repaired directly from other subsets of encoded fragments without first having to reconstruct the original data, ensuring that (b) a fragment is repaired from a fixed number of encoded fragments, the number depending only on how many encoded blocks are missing and independent of which specific blocks are missing. These properties allow not only for low communication overhead to recreate a missing fragment, but also for independent reconstruction of different missing fragments in parallel, possibly in different parts of the network. We analyze the static resilience of SRCs with respect to traditional erasure codes, and observe that SRCs incur marginally larger storage overhead in order to achieve the aforementioned properties. The salient SRC properties naturally translate to low communication overheads for reconstruction of lost fragments, and allow reconstruction with lower latency by facilitating repairs in parallel. These desirable properties make self-repairing codes a good and practical candidate for networked distributed storage systems.

    Self-Repairing Codes for Distributed Storage - A Projective Geometric Construction

    Self-Repairing Codes (SRC) are codes designed to suit the needs of coding for distributed networked storage: they not only allow stored data to be recovered even in the presence of node failures, but also provide a repair mechanism in which as few as two live nodes can be contacted to regenerate the data of a failed node. In this paper, we propose a new instance of self-repairing codes, based on constructions of spreads coming from projective geometry. We study some of their properties to demonstrate the suitability of these codes for distributed networked storage.
    Comment: 5 pages, 2 figures

    Homomorphic Self-repairing Codes for Agile Maintenance of Distributed Storage Systems

    Distributed data storage systems are essential to deal with the need to store massive volumes of data. In order to make such a system fault-tolerant, some form of redundancy becomes crucial, incurring various overheads, most prominently in terms of storage space and maintenance bandwidth requirements. Erasure codes, originally designed for communication over lossy channels, provide a storage-efficient alternative to replication-based redundancy, but entail high communication overhead for maintenance when some of the encoded fragments need to be replenished with new ones after the failure of some storage devices. We propose as an alternative a new family of erasure codes called self-repairing codes (SRC), which take into account the peculiarities of distributed storage systems, specifically the maintenance process. SRC has the following salient features: (a) encoded fragments can be repaired directly from other subsets of encoded fragments by downloading less data than the size of the complete object, ensuring that (b) a fragment is repaired from a fixed number of encoded fragments, the number depending only on how many encoded blocks are missing and independent of which specific blocks are missing. This paper lays the foundations by defining the novel self-repairing codes and elaborating why the defined characteristics are desirable for distributed storage systems. Homomorphic self-repairing codes (HSRC) are then proposed as a concrete instance, whose various aspects and properties are studied and compared, quantitatively or qualitatively, with other codes, including traditional erasure codes as well as other recent codes designed specifically for storage applications.
    Comment: arXiv admin note: significant text overlap with arXiv:1008.006

    Storage codes : managing big data with small overheads

    Erasure coding provides a mechanism to store data redundantly for fault-tolerance in a cost-effective manner. Recently, there has been renewed interest in designing new erasure coding techniques with different desirable properties, including good repairability and degraded read performance, or efficient redundancy generation processes. Very often, these novel techniques exploit the computational resources available ‘in the network’, i.e., they leverage storage units that are not passive entities supporting only reads and writes of data, but can also carry out some computations. This article accompanies an identically titled tutorial at the IEEE International Symposium on Network Coding (NetCod 2013), and portrays a big picture of some of the important processes within distributed storage systems, where erasure codes designed by explicitly taking into account the nuances of distributed storage systems can provide significant performance boosts.
    Accepted version

    An ego network analysis of sextortionists

    We consider a particular instance of user interactions in the Bitcoin network, that of interactions among wallet addresses belonging to scammers. Aggregation of multiple inputs and change addresses are common heuristics used to establish relationships among addresses and analyze transaction amounts in the Bitcoin network. We propose a flow-centric approach that complements such heuristics by studying the branching, merger and propagation of Bitcoin flows. We study a recent sextortion campaign by exploring the ego network of known offending wallet addresses. We compare and combine different existing and new heuristics, which allows us to identify (1) Bitcoin addresses of interest (including possible recurrent go-to addresses for the scammers) and (2) relevant Bitcoin flows, from scam Bitcoin addresses to a Binance exchange and to other scam addresses, that suggest connections among prima facie disparate waves of similar scams.
    Accepted version

    QLOC : quorums with local reconstruction codes

    In this paper we study the problem of consistency in distributed storage systems relying on erasure coding for storage-efficient fault-tolerance. We propose QLOC, a flexible framework for supporting the storage of warm data, i.e., data which, while not very frequently in use, nevertheless continues to be accessed regularly for reads or writes. QLOC builds upon (1) a generic family of local reconstruction codes with guarantees in terms of fault-tolerance, efficient recovery from failures and degraded mode operations, which can be instantiated with parameters customized to requirements such as storage overhead and reliability dictated by user needs and operational environments, and (2) quorum-based consistency mechanisms with support for read-modify-write operations without any underlying atomic primitives, providing deployment choices that trade off fault-tolerance, consistency and concurrency requirements. We carry out a theoretical analysis of the code properties, and experimentally benchmark the performance of the consistency enforcement mechanisms, demonstrating the practicality of the proposed approach.
    Ministry of Education (MOE)
    Published version
    The work of Anwitaman Datta and Adamas Aqsa Fahreza was supported by the Ministry of Education (MoE), Singapore, through the Academic Research Fund Tier 1 for the project titled ‘StorEdge: Data store along a cloud-to-thing continuum with integrity and availability’ under Project 2018-T1-002-076

    Data insertion and archiving in erasure-coding based large-scale storage systems

    Given the vast volume of data that needs to be stored reliably, many data centers and large-scale file systems have started using erasure codes to achieve reliable storage while keeping the storage overhead low. This has invigorated research on erasure codes tailor-made to achieve different desirable storage system properties, such as efficient redundancy replenishment mechanisms, resilience against data corruption, and degraded reads, to name a few prominent ones. A problem that has mainly been overlooked until recently is how the storage system can be efficiently populated with erasure coded data to start with. In this paper, we look at two distinct but related scenarios: (i) migration to archival, which leverages existing replicated data to create an erasure encoded archive, and (ii) data insertion, where new data is inserted into the system directly in erasure coded format. We elaborate on coding techniques to achieve better throughput for data insertion and migration, and in doing so, explore the connection of these techniques with recently proposed locally repairable codes such as self-repairing codes.
    Accepted version