
    Auditing for Distributed Storage Systems

    Distributed storage codes have recently received a lot of attention in the community. Independently, another body of work has proposed integrity checking schemes for cloud storage, none of which, however, is customized for coding-based storage or can efficiently support repair. In this work, we bridge the gap between these two currently disconnected bodies of work. We propose NC-Audit, a novel cryptography-based remote data integrity checking scheme designed specifically for network coding-based distributed storage systems. NC-Audit combines, for the first time, the following desired properties: (i) efficient checking of data integrity, (ii) efficient support for repairing failed nodes, and (iii) protection against information leakage when checking is performed by a third party. The key ingredient of the design of NC-Audit is a novel combination of SpaceMac, a homomorphic message authentication code (MAC) scheme for network coding, and NCrypt, a novel chosen-plaintext attack (CPA) secure encryption scheme that is compatible with SpaceMac. Our evaluation of a Java implementation of NC-Audit shows that an audit costs the storage node and the auditor a modest amount of computation time and lower bandwidth than prior work.
    Comment: ToN 2014 Submission with Data Dynamic
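    The property that makes a homomorphic MAC suitable for auditing network-coded storage is linearity: the tag of a linear combination of coded blocks equals the same combination of the tags. The toy sketch below illustrates only that linearity with an inner-product tag over a prime field; it is not the actual SpaceMac or NCrypt construction, and the modulus, key, and coefficients are illustrative assumptions.

    ```python
    # Toy homomorphic MAC: tag(v) = <v, key> mod Q. Because the tag is linear
    # in v, the tag of any linear combination of coded blocks equals the same
    # combination of the blocks' tags -- the property an auditor exploits for
    # network-coded storage. NOT the real SpaceMac scheme; purely illustrative.
    import random

    Q = 2**31 - 1  # a prime modulus (illustrative choice)

    def keygen(n, rng):
        return [rng.randrange(Q) for _ in range(n)]

    def tag(vec, key):
        return sum(v * k for v, k in zip(vec, key)) % Q

    rng = random.Random(0)
    key = keygen(4, rng)
    x1 = [rng.randrange(Q) for _ in range(4)]
    x2 = [rng.randrange(Q) for _ in range(4)]
    a, b = 7, 11
    combo = [(a * u + b * v) % Q for u, v in zip(x1, x2)]
    # Homomorphism: tag of the combination == combination of the tags.
    assert tag(combo, key) == (a * tag(x1, key) + b * tag(x2, key)) % Q
    ```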

    Limit theorems for a random walk with memory perturbed by a dynamical system

    We introduce a new random walk with unbounded memory obtained as a mixture of the Elephant Random Walk and the Dynamic Random Walk, which we call the Dynamic Elephant Random Walk (DERW). As a consequence of this mixture, the distribution of the increments of the resulting random process is time dependent. We prove a strong law of large numbers for the DERW and, in a particular case, we provide an explicit expression for its speed. Finally, we give sufficient conditions for the central limit theorem and the law of the iterated logarithm to hold.
    Comment: We corrected a typo in the definition of the ER
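    One of the two ingredients of the DERW mixture, the classical Elephant Random Walk, can be simulated in a few lines: with probability p the walker repeats a uniformly chosen past step, otherwise it reverses it. The sketch below only shows this memory mechanism and the law-of-large-numbers behaviour empirically; the exact DERW mixture rule and the value p = 0.3 used here are not taken from the paper.

    ```python
    # Minimal simulation of the classical Elephant Random Walk, one ingredient
    # of the DERW mixture. With probability p the walker repeats a uniformly
    # chosen past step; otherwise it reverses it. Parameters are illustrative.
    import random

    def erw(n_steps, p, rng):
        steps = [rng.choice([-1, 1])]      # first step is symmetric
        for _ in range(n_steps - 1):
            past = rng.choice(steps)       # recall a uniformly random past step
            steps.append(past if rng.random() < p else -past)
        return sum(steps)

    rng = random.Random(42)
    trials, n = 200, 2000
    positions = [erw(n, 0.3, rng) for _ in range(trials)]
    mean_speed = sum(positions) / (trials * n)
    # In the diffusive regime the strong law gives S_n / n -> 0.
    print(f"empirical S_n/n ≈ {mean_speed:.4f}")
    ```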

    Universal Psychometrics Tasks: difficulty, composition and decomposition

    This note revisits the concepts of task and difficulty. The notion of cognitive task and its use for the evaluation of intelligent systems is still replete with issues. The view of tasks as MDPs in the context of reinforcement learning has been especially useful for the formalisation of learning tasks. However, this mode of interaction does not accommodate well some other tasks that are usual in artificial intelligence and, most especially, in animal and human evaluation. In particular, we want to have a more general account of episodes, rewards and responses, and, most especially, of the computational complexity of the algorithm behind an agent solving a task. This is crucial for the determination of the difficulty of a task as the (logarithm of the) number of computational steps required to acquire an acceptable policy for the task, which includes the exploration of policies and their verification. We introduce a notion of asynchronous-time stochastic tasks. Based on this interpretation, we can see what task difficulty is, what instance difficulty is (relative to a task) and also what task compositions and decompositions are.
    Comment: 30 pages

    Computability Logic: a formal theory of interaction

    Computability logic is a formal theory of (interactive) computability in the same sense as classical logic is a formal theory of truth. This approach was initiated very recently in "Introduction to computability logic" (Annals of Pure and Applied Logic 123 (2003), pp.1-99). The present paper reintroduces computability logic in a more compact and less technical way. It is written in a semitutorial style with a general computer science, logic or mathematics audience in mind. An Internet source on the subject is available at http://www.cis.upenn.edu/~giorgi/cl.html, and additional material at http://www.csc.villanova.edu/~japaridz/CL/gsoll.html

    Isolating Mice and Elephant in Data Centers

    Data center traffic is composed of numerous latency-sensitive "mice" flows, which consist of only a few packets, and a few throughput-sensitive "elephant" flows, which account for more than 80% of the overall load. Generally, the short-lived "mice" flows induce transient congestion and the long-lived "elephant" flows cause persistent congestion. Network congestion is a major performance inhibitor. Conventionally, hop-by-hop and end-to-end flow control mechanisms are employed to relieve transient and persistent congestion, respectively. However, in the face of a mixture of elephants and mice, we find that the hybrid congestion control scheme combining hop-by-hop and end-to-end flow control suffers from serious performance impairments. Going a step further, our in-depth analysis reveals that the hybrid scheme performs poorly on both the latency of mice and the throughput of elephants. Motivated by this understanding, we argue for isolating mice and elephants in different queues, such that hop-by-hop and end-to-end flow control mechanisms are independently imposed on short-lived and long-lived flows, respectively. Our solution is readily deployable, compatible with current commodity network devices, and able to leverage various congestion control mechanisms. Extensive simulations show that our proposal of isolation can simultaneously improve the latency of mice by at least 30% and raise link utilization to almost 100%
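    The core intuition behind isolating the two flow classes can be seen in a toy single-link model: in one shared FIFO queue a 2-packet mouse enqueued behind a 100-packet elephant burst waits for the whole burst, while with per-class queues served round-robin it finishes almost immediately. The slot-based model and the packet counts below are illustrative assumptions, not the paper's simulation setup.

    ```python
    # Toy single-link model: one packet is served per time slot. Compare the
    # finish time of a short "mouse" flow behind a long "elephant" burst under
    # (a) one shared FIFO queue and (b) per-class queues served round-robin.
    from collections import deque

    def drain(queues, policy):
        """Serve one packet per slot; return the finish time of each flow."""
        finish = {}
        t = 0
        if policy == "fifo":
            q = deque(p for flow_q in queues for p in flow_q)
            while q:
                t += 1
                finish[q.popleft()] = t       # keeps the flow's last-packet time
        else:  # "isolated": round-robin across per-class queues
            queues = [deque(q) for q in queues]
            while any(queues):
                for q in queues:
                    if q:
                        t += 1
                        finish[q.popleft()] = t
        return finish

    elephant = ["elephant"] * 100
    mouse = ["mouse"] * 2                      # arrives just after the burst
    fifo = drain([elephant, mouse], "fifo")
    iso = drain([elephant, mouse], "isolated")
    print(fifo["mouse"], iso["mouse"])         # 102 slots vs 4 slots
    ```

    The elephant's completion time is essentially unchanged in both policies, which is why isolation improves mouse latency without hurting elephant throughput in this toy setting.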

    Joint Latency and Cost Optimization for Erasure-coded Data Center Storage

    Modern distributed storage systems offer large capacity to satisfy the exponentially increasing need of storage space. They often use erasure codes to protect against disk and node failures to increase reliability, while trying to meet the latency requirements of the applications and clients. This paper provides an insightful upper bound on the average service delay of such erasure-coded storage with arbitrary service time distribution and consisting of multiple heterogeneous files. Not only does the result supersede known delay bounds that only work for a single file or homogeneous files, it also enables a novel problem of joint latency and storage cost minimization over three dimensions: selecting the erasure code, placement of encoded chunks, and optimizing scheduling policy. The problem is efficiently solved via the computation of a sequence of convex approximations with provable convergence. We further prototype our solution in an open-source, cloud storage deployment over three geographically distributed data centers. Experimental results validate our theoretical delay analysis and show significant latency reduction, providing valuable insights into the proposed latency-cost tradeoff in erasure-coded storage.
    Comment: 14 pages, presented in part at IFIP Performance, Oct 201
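    A quick way to see where latency comes from in erasure-coded reads: an (n, k) MDS read completes when the fastest k of n chunk requests return, i.e., at the k-th order statistic of the chunk service times. The Monte Carlo sketch below compares this against a single full-object read. Exponential service times, with chunk rate scaled by k because each chunk is 1/k of the object, are an assumption for illustration only; the paper's bound holds for arbitrary service-time distributions.

    ```python
    # Monte Carlo sketch: an (n, k) MDS read finishes at the k-th order
    # statistic of n chunk service times. Chunks are 1/k of the object, so
    # chunk service is modeled as Exp(rate = k); a single-replica read of the
    # whole object is Exp(rate = 1). Assumed model, not the paper's analysis.
    import random

    def read_latency(n, k, rng):
        times = sorted(rng.expovariate(k) for _ in range(n))
        return times[k - 1]            # need the fastest k chunks

    rng = random.Random(1)
    trials = 20000
    coded = sum(read_latency(14, 10, rng) for _ in range(trials)) / trials
    single = sum(rng.expovariate(1.0) for _ in range(trials)) / trials
    print(f"(14,10) coded read: {coded:.3f}, single replica: {single:.3f}")
    ```

    With these assumptions the coded read is roughly an order of magnitude faster on average, and the gap widens in the tail, which is the regime the joint latency-cost optimization targets.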

    An Improved Cooperative Repair Scheme for Reed-Solomon Codes

    Dau et al. recently extended Guruswami and Wootters' scheme (STOC 2016) to cooperatively repair two or three erasures in Reed-Solomon (RS) codes. However, their scheme is restricted to either the case that the characteristic of F divides the extension degree [F:B] or some special failure patterns, where F is the base field of the RS code and B is the subfield of the repair symbols. In this paper, we derive an improved cooperative repair scheme that removes all these restrictions. That is, our scheme applies to any characteristic of F and can repair all failure patterns of two or three erasures.
    Comment: submitted to ISIT201

    Power-law of Aggregate-size Spectra in Natural Systems

    Patterns of animate and inanimate systems show remarkable similarities in their aggregation. One similarity is the double-Pareto distribution of the aggregate size of system components. Different models have been developed to predict aggregates of system components. However, few models have been developed to describe probabilistically the aggregate-size distribution of any system regardless of the intrinsic and extrinsic drivers of the aggregation process. Here we consider natural animate systems, from one of the greatest mammals, the African elephant (Loxodonta africana), to the Escherichia coli bacterium, and natural inanimate systems in river basins. Considering aggregates as islands and their perimeter as a curve mirroring the sculpting network of the system, the probability of exceedance of the drainage area and Hack's law are shown to correspond to Korčak's law and the perimeter-area relationship for river basins. The perimeter-area relationship and the probability of exceedance of the aggregate size provide a meaningful estimate of the same fractal dimension. Systems aggregate because of the influence exerted by a physical or process network within the system domain. The aggregate-size distribution is accurately derived using the null method of box-counting on the occurrences of system components. The importance of the aggregate-size spectrum lies in its ability to reveal system form, function, and dynamics, also as a function of other coupled systems. Variations of the fractal dimension and of the aggregate-size distribution are related to changes of systems that are meaningful to monitor because they are potentially critical for these systems.
    Comment: ICST Transactions on Complex System
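    The box-counting estimate mentioned above covers the set of occurrence points with boxes of shrinking size eps and fits log N(eps) against log(1/eps); the slope is the fractal dimension. The sketch below runs the mechanics on a synthetic diagonal line (true dimension 1); the point set and box sizes are assumptions for illustration, not data from the paper.

    ```python
    # Box-counting sketch: count occupied boxes of side eps and estimate the
    # fractal dimension as the slope of log N(eps) vs log(1/eps). The point
    # set is a synthetic line with known dimension 1, purely to show the
    # mechanics of the null method of box-counting.
    import math

    points = [(i / 1000.0, i / 1000.0) for i in range(1000)]  # a diagonal line

    def box_count(pts, eps):
        return len({(int(x / eps), int(y / eps)) for x, y in pts})

    sizes = [0.1, 0.05, 0.025, 0.0125]
    xs = [math.log(1 / e) for e in sizes]
    ys = [math.log(box_count(points, e)) for e in sizes]

    # least-squares slope = fractal dimension estimate
    m = len(xs)
    slope = (m * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / (
        m * sum(x * x for x in xs) - sum(xs) ** 2)
    print(f"estimated dimension ≈ {slope:.2f}")
    ```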

    Binary Linear Locally Repairable Codes

    Locally repairable codes (LRCs) are a class of codes designed for the local correction of erasures. They have received considerable attention in recent years due to their applications in distributed storage. Most existing results on LRCs do not explicitly take into consideration the field size q, i.e., the size of the code alphabet. In particular, for the binary case, only a few results are known. In this work, we present an upper bound on the minimum distance d of linear LRCs with availability, based on the work of Cadambe and Mazumdar. The bound takes into account the code length n, dimension k, locality r, availability t, and field size q. Then, we study binary linear LRCs in three aspects. First, we focus on analyzing the locality of some classical codes, i.e., cyclic codes and Reed-Muller codes, and their modified versions, which are obtained by applying the operations of extend, shorten, expurgate, augment, and lengthen. Next, we construct LRCs using phantom parity-check symbols and multi-level tensor product structure, respectively. Compared to other previous constructions of binary LRCs with fixed locality or minimum distance, our construction is much more flexible in terms of code parameters, and gives various families of high-rate LRCs, some of which are shown to be optimal with respect to their minimum distance. Finally, availability of LRCs is studied. We investigate the locality and availability properties of several classes of one-step majority-logic decodable codes, including cyclic simplex codes, cyclic difference-set codes, and 4-cycle free regular low-density parity-check (LDPC) codes. We also show the construction of a long LRC with availability from a short one-step majority-logic decodable code
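    Locality r means every erased symbol can be recomputed from at most r other symbols. A minimal binary sketch, assuming a toy code of my own (four data bits split into two local groups, each closed by one XOR parity, giving r = 2), shows the repair mechanics; it is not one of the constructions from the paper.

    ```python
    # Minimal sketch of locality r = 2 in a binary code: 4 data bits form two
    # local groups of 2, each protected by one XOR parity. Any single erased
    # symbol is recovered by XORing the 2 other symbols of its group. A toy
    # code for illustration, not a construction from the paper.
    def encode(data):
        # data: 4 information bits -> 6-symbol codeword [d0, d1, p1, d2, d3, p2]
        g1, g2 = data[:2], data[2:]
        return g1 + [g1[0] ^ g1[1]] + g2 + [g2[0] ^ g2[1]]

    def repair(codeword, erased):
        # XOR the other r = 2 symbols of the erased symbol's local group
        group = [0, 1, 2] if erased < 3 else [3, 4, 5]
        val = 0
        for j in group:
            if j != erased:
                val ^= codeword[j]
        return val

    cw = encode([1, 0, 1, 1])
    for i in range(6):
        assert repair(cw, i) == cw[i]   # each symbol recoverable from 2 others
    ```

    Availability t generalizes this idea: each symbol would belong to t disjoint repair groups, so t different subsets can each recover it independently.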