1,518 research outputs found

    Functional broadcast repair of multiple partial failures in wireless distributed storage systems

    We consider a distributed storage system with n nodes, where a user can recover the stored file from any k nodes, and study the problem of repairing r partially failed nodes. We consider broadcast repair, that is, d surviving nodes transmit broadcast messages on an error-free wireless channel to the r nodes being repaired, which are then used, together with the surviving data in the local memories of the failed nodes, to recover the lost content. First, we derive the trade-off between the storage capacity and the repair bandwidth for partial repair of multiple failed nodes, based on the cut-set bound for information flow graphs. It is shown that utilizing the broadcast nature of the wireless medium and the surviving contents at the partially failed nodes reduces the repair bandwidth per node significantly. Then, we list a set of invariant conditions that are sufficient for a functional repair code to be feasible. We further propose a scheme for functional repair of multiple failed nodes that satisfies the invariant conditions with high probability, and its extension to the repair of partial failures. The performance of the proposed scheme meets the cut-set bound on all the points on the trade-off curve for all admissible parameters when k is divisible by r, while employing linear subpacketization, which is an important practical consideration in the design of distributed storage codes. Unlike random linear codes, which are conventionally used for functional repair of failed nodes, the proposed repair scheme has lower overhead, lower input-output cost, and lower computational complexity during repair.

    RAID Organizations for Improved Reliability and Performance: A Not Entirely Unbiased Tutorial (1st revision)

    The original RAID proposal advocated replacing large disks with arrays of PC disks, but as the capacity of small disks increased 100-fold in the 1990s, the production of large disks was discontinued. Storage dependability is increased via replication or erasure coding. Cloud storage providers store multiple copies of data, obviating the need for further redundancy. Variations of RAID based on local recovery codes and partial MDS codes reduce recovery cost. NAND flash Solid State Disks (SSDs) have low latency and high bandwidth, are more reliable, consume less power, and have a lower TCO than Hard Disk Drives, which are more viable for hyperscalers. Comment: Submitted to ACM Computing Surveys. arXiv admin note: substantial text overlap with arXiv:2306.0876

    Modelling and performability evaluation of Wireless Sensor Networks

    This thesis presents generic analytical models of homogeneous clustered Wireless Sensor Networks (WSNs) with a centrally located Cluster Head (CH) coordinating cluster communication with the sink directly or through other intermediate nodes. The focus is to integrate performance and availability studies of WSNs in the presence of sensor node and channel failures and repair/replacement. The main purpose is to improve WSN Quality of Service (QoS). Other research considered in this thesis includes modelling of the packet arrival distribution at the CH and intermediate nodes, and modelling of energy consumption at the sensor nodes. An investigation and critical analysis of wireless sensor network architectures, energy conservation techniques and QoS requirements are performed in order to improve performance and availability of the network. Existing techniques used for performance evaluation of single and multi-server systems with several operative states are investigated and analysed in detail. To begin with, existing approaches for independent (pure) performance modelling are critically analysed, highlighting their merits and drawbacks. Similarly, pure availability modelling approaches are also analysed. Considering that pure performance models tend to be too optimistic and pure availability models too conservative, performability, which is the integration of performance and availability studies, is used for the evaluation of the WSN models developed in this study. Two-dimensional Markov state space representations of the systems are used for performability modelling. Following critical analysis of the existing solution techniques, the spectral expansion method and a system of simultaneous linear equations are developed and used to solve the proposed models. To validate the results obtained with the two techniques, a discrete event simulation tool is explored.
In this research, open queuing networks are used to model the behaviour of the CH when subjected to streams of traffic from cluster nodes, in addition to the dynamics of operating in the various states. The research begins with a model of a CH with an infinite queue capacity subject to failures and repair/replacement. The model is developed progressively to consider bounded queue capacity systems, channel failures and sleep scheduling mechanisms for performability evaluation of WSNs. Using the developed models, various performance measures of the considered system, including mean queue length, throughput, response time and blocking probability, are evaluated. Finally, energy models considering mean power consumption in each of the possible operative states are developed. The resulting models are in turn employed for the evaluation of energy saving for the proposed case study model. Numerical solutions and discussions are presented for all the queuing models developed. Simulation is also performed in order to validate the accuracy of the results obtained. In order to address issues of performance and availability of WSNs, current research presents independent performance and availability studies. The concerns resulting from such studies have therefore remained unresolved over the years, hence the persistence of poor system performance. The novelty of this research is a proposed integrated performance and availability modelling approach for WSNs meant to address the challenges of independent studies. In addition, a novel methodology for modelling and evaluation of power consumption is also offered. Proposed model results provide remarkable improvement in system performance and availability, in addition to providing tools for further optimisation studies. A significant power saving is also observed from the proposed model results. In order to improve QoS for WSNs, it is possible to improve the proposed models by incorporating priority queuing in a mixed traffic environment.
A model of a multi-server system is also appropriate for addressing traffic routing. It is also possible to extend the proposed energy model to consider sleep scheduling mechanisms other than the On-demand mechanism proposed herein. Analysis and classification of possible arrival distributions of WSN packets for various application environments would be valuable for enabling robust scientific research.
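    As a rough illustration of the kind of two-dimensional Markov model this abstract describes, the sketch below sets up a single server with a bounded queue that alternates between an operative and a failed state, and solves the balance equations as a system of simultaneous linear equations. It is not the thesis's model: the rates lam (arrivals), mu (service), xi (failure) and eta (repair), the capacity K, and all function names are assumptions made for this example, and the spectral expansion method is not shown.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def steady_state(lam, mu, xi, eta, K):
    """Steady-state probabilities of states (i, j): i = 1 operative / 0 failed,
    j = number of packets in a queue of capacity K (all rates hypothetical)."""
    states = [(i, j) for i in (0, 1) for j in range(K + 1)]
    index = {s: n for n, s in enumerate(states)}
    N = len(states)
    Q = [[0.0] * N for _ in range(N)]  # generator matrix

    def add(src, dst, rate):
        Q[index[src]][index[dst]] += rate
        Q[index[src]][index[src]] -= rate

    for i, j in states:
        if j < K:
            add((i, j), (i, j + 1), lam)   # arrival (lost when the queue is full)
        if i == 1 and j > 0:
            add((i, j), (i, j - 1), mu)    # service only while operative
        if i == 1:
            add((1, j), (0, j), xi)        # failure
        else:
            add((0, j), (1, j), eta)       # repair/replacement

    # pi Q = 0 written as Q^T pi = 0, with one equation replaced by sum(pi) = 1.
    A = [[Q[r][c] for r in range(N)] for c in range(N)]
    A[-1] = [1.0] * N
    b = [0.0] * (N - 1) + [1.0]
    return dict(zip(states, solve(A, b)))


def measures(pi, mu, K):
    """Mean queue length, blocking probability and throughput."""
    mean_queue = sum(j * p for (_, j), p in pi.items())
    blocking = sum(p for (_, j), p in pi.items() if j == K)
    throughput = mu * sum(p for (i, j), p in pi.items() if i == 1 and j > 0)
    return mean_queue, blocking, throughput
```

    Calling steady_state(2.0, 3.0, 0.1, 1.0, 10) and passing the result to measures gives the three quantities for one assumed parameter set. In steady state the throughput should equal lam times one minus the blocking probability, which is a convenient sanity check on any such model.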

    On the design and optimization of heterogeneous distributed storage systems

    Over the last decade, users’ storage demands have been growing exponentially year over year. Besides demanding more storage capacity and more data reliability, users today also demand the ability to access their data from any location and from any device. These new needs encourage users to move their personal data (e.g., e-mails, documents, pictures) to online storage services such as Gmail, Facebook, Flickr or Dropbox. Unfortunately, these online storage services are built upon expensive large datacenters that only a few big enterprises can afford. To reduce the costs of these large datacenters, a new wave of online storage services has recently emerged, integrating storage resources from different small datacenters, or even integrating user storage resources into the provider’s storage infrastructure. However, the storage resources that compose these new storage infrastructures are highly heterogeneous, which poses a challenging problem to storage system designers: how can reliable and efficient distributed storage systems be designed over heterogeneous storage infrastructures?
This thesis provides an analysis of the main problems that arise when one aims to answer this question. In addition, it provides different tools to optimize the design of heterogeneous distributed storage systems. The contribution of this thesis is threefold: First, we provide a novel framework to analyze the effects that data redundancy has on the storage and communication costs of distributed storage systems. Given a generic redundancy scheme, the presented framework can predict the average storage costs and the average communication costs of a storage system deployed over a specific storage infrastructure. Second, we analyze the impact that data redundancy has on data availability and retrieval times. For a given redundancy and a heterogeneous storage infrastructure, we provide a set of algorithms that make it possible to determine the expected data availability and expected retrieval times. Third, we design different data assignment policies for different storage scenarios. We differentiate between scenarios where the entire storage infrastructure is managed by the same organization, and scenarios where different parties contribute their storage resources. The aims of our assignment policies are: (i) to minimize the required redundancy, (ii) to guarantee fairness among all parties, and (iii) to encourage different parties to contribute their local storage resources to the system.
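    To make the link between redundancy and data availability in a heterogeneous system concrete, here is a minimal sketch, not taken from the thesis. It assumes an object is stored as n blocks on n independent nodes, any k of which suffice to retrieve it, and that node i is online with probability node_avail[i]. The function name and the independence assumption are illustrative only.

```python
def data_availability(node_avail, k):
    """Probability that at least k of the nodes are online.

    node_avail: per-node online probabilities (may all differ, i.e. a
    heterogeneous system); nodes are assumed independent.
    """
    # dist[m] = probability that exactly m of the nodes seen so far are online
    dist = [1.0]
    for p in node_avail:
        nxt = [0.0] * (len(dist) + 1)
        for m, q in enumerate(dist):
            nxt[m] += q * (1.0 - p)   # this node is offline
            nxt[m + 1] += q * p       # this node is online
        dist = nxt
    return sum(dist[k:])
```

    For instance, data_availability([0.99, 0.9, 0.5, 0.5], 2) evaluates a hypothetical 4-block object that needs any 2 blocks, spread over two reliable and two unreliable nodes. Comparing such values across placements is one simple way to see why assignment policies matter when node availabilities differ.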