    Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments

    Data centres that use consumer-grade disk drives and distributed peer-to-peer systems are unreliable environments for archiving data unless enough redundancy is provided. Most redundancy schemes are not completely effective at providing high availability, durability and integrity in the long term. We propose alpha entanglement codes, a mechanism that creates a virtual layer of highly interconnected storage devices to propagate redundant information across a large-scale storage system. Our motivation is to design flexible and practical erasure codes with high fault tolerance to improve data durability and availability even in catastrophic scenarios. By flexible and practical, we mean code settings that can be adapted to future requirements and practical implementations with reasonable trade-offs between security, resource usage and performance. The codes have three parameters. Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. Two other parameters increase fault tolerance even further without requiring additional storage. As a result, an entangled storage system can provide high availability and durability and offer additional integrity: it is more difficult to modify data undetectably. We evaluate how several redundancy schemes perform in unreliable environments and show that alpha entanglement codes are flexible and practical codes. Remarkably, they excel at code locality; hence, they reduce repair costs and become less dependent on storage locations with poor availability. Our solution outperforms Reed-Solomon codes in many disaster recovery scenarios. Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)

    Network of excellence in internet science: D13.2.1 Internet science – going forward: internet science roadmap (preliminary version)

    InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services

    Cloud computing providers have set up several data centers at different geographical locations over the Internet in order to optimally serve the needs of their customers around the world. However, existing systems do not support mechanisms and policies for dynamically coordinating load distribution among different Cloud-based data centers in order to determine the optimal location for hosting application services to achieve reasonable QoS levels. Further, Cloud computing providers are unable to predict the geographic distribution of users consuming their services; hence, load coordination must happen automatically, and the distribution of services must change in response to changes in the load. To counter this problem, we advocate the creation of a federated Cloud computing environment (InterCloud) that facilitates just-in-time, opportunistic, and scalable provisioning of application services, consistently achieving QoS targets under variable workload, resource and network conditions. The overall goal is to create a computing environment that supports dynamic expansion or contraction of capabilities (VMs, services, storage, and database) for handling sudden variations in service demands. This paper presents the vision, challenges, and architectural elements of InterCloud for utility-oriented federation of Cloud computing environments. The proposed InterCloud environment supports scaling of applications across multiple vendor clouds. We have validated our approach by conducting a rigorous performance evaluation study using the CloudSim toolkit. The results demonstrate that the federated Cloud computing model has immense potential, as it offers significant performance gains with regard to response time and cost saving under dynamic workload scenarios. Comment: 20 pages, 4 figures, 3 tables, conference paper

    On the design and optimization of heterogeneous distributed storage systems

    Over the last decade, users' storage demands have been growing exponentially year over year. Besides demanding more storage capacity and more data reliability, users today also demand the ability to access their data from any location and from any device. These new needs encourage users to move their personal data (e.g., e-mails, documents, pictures) to online storage services such as Gmail, Facebook, Flickr or Dropbox. Unfortunately, these online storage services are built upon expensive large datacenters that only a few big enterprises can afford. To reduce the costs of these large datacenters, a new wave of online storage services has recently emerged that integrates storage resources from different small datacenters, or even integrates user storage resources into the provider's storage infrastructure. However, the storage resources that compose these new storage infrastructures are highly heterogeneous, which poses a challenging problem for storage system designers: how can reliable and efficient distributed storage systems be designed over heterogeneous storage infrastructures? This thesis provides an analysis of the main problems that arise when one aims to answer this question. In addition, it provides different tools to optimize the design of heterogeneous distributed storage systems. The contribution of this thesis is threefold. First, we provide a novel framework to analyze the effects that data redundancy has on the storage and communication costs of distributed storage systems. Given a generic redundancy scheme, the presented framework can predict the average storage costs and the average communication costs of a storage system deployed over a specific storage infrastructure. Second, we analyze the impact that data redundancy has on data availability and retrieval times. For a given redundancy and a heterogeneous storage infrastructure, we provide a set of algorithms that determine the expected data availability and the expected retrieval times. Third, we design different data assignment policies for different storage scenarios. We differentiate between scenarios where the entire storage infrastructure is managed by the same organization, and scenarios where different parties contribute their storage resources. The aims of our assignment policies are: (i) to minimize the required redundancy, (ii) to guarantee fairness among all parties, and (iii) to encourage different parties to contribute their local storage resources to the system.
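
    As a rough illustration of what "expected data availability" over a heterogeneous infrastructure can mean, the following minimal sketch computes the probability that at least k of n stored blocks are reachable when each node has its own online probability. It is not the thesis's algorithm: the function name `availability`, the (n, k) "any k blocks suffice" redundancy model, and the assumption that nodes fail independently are all illustrative assumptions.

```python
def availability(node_avail, k):
    """Probability that at least k nodes are online.

    Assumptions (hypothetical, for illustration only):
    - node_avail holds one independent online probability per node,
      each node storing exactly one block;
    - the object is retrievable when any k blocks are reachable.
    """
    # dist[j] = probability that exactly j of the nodes seen so far are online
    dist = [1.0]
    for p in node_avail:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this node is offline
            nxt[j + 1] += q * p       # this node is online
        dist = nxt
    # Sum the tail: k or more nodes online
    return sum(dist[k:])


if __name__ == "__main__":
    # Three heterogeneous nodes; any two blocks recover the object.
    print(availability([0.99, 0.80, 0.60], k=2))
```

    Under these assumptions, lowering k (more redundancy for the same n) or placing blocks on nodes with higher online probability raises the result, which is the trade-off the assignment policies in the abstract aim to balance.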

    Optical Networks and Interconnects

    The rapid evolution of communication technologies such as 5G and beyond relies on optical networks to support challenging and ambitious requirements for both capacity and reliability. This chapter begins by giving an overview of the evolution of optical access networks, focusing on Passive Optical Networks (PONs). The development of the different PON standards and the requirements aiming at longer reach, higher client count and higher delivered bandwidth are presented. PON virtualization is also introduced as the flexibility enabler. Triggered by the increase in bandwidth supported by access and aggregation network segments, core networks have also evolved, as presented in the second part of the chapter. Scaling the physical infrastructure requires high investment; hence, operators are considering alternatives to optimize the use of the existing capacity. This chapter introduces different planning problems, such as Routing and Spectrum Assignment problems and placement problems for regenerators and wavelength converters, and discusses how to offer resilience to different failures. An overview of control and management is also provided. Moreover, motivated by the increasing importance of data storage and data processing, this chapter also addresses different aspects of optical data center interconnects. Data centers have become critical infrastructure for operating any service. They are also forced to take advantage of optical technology in order to keep up with growing capacity demand and power consumption. This chapter gives an overview of different optical data center network architectures as well as some expected directions to improve resource utilization and increase network capacity.
