Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disk drives and distributed
peer-to-peer systems are unreliable environments in which to archive data without enough
redundancy. Most redundancy schemes are not completely effective at providing
high availability, durability and integrity in the long term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need for
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality; hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
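The core idea of the abstract, redundant information propagated through entangled parities, can be illustrated with a minimal toy sketch. This is an illustrative simplification, not the authors' alpha entanglement implementation: a single XOR chain in which every parity entangles all earlier data, so a lost data block can be rebuilt from its two neighbouring parities alone.

```python
# Toy sketch of chain entanglement (illustrative only, not the
# authors' alpha entanglement codes): each data block is XORed with
# the previous parity, so every parity "entangles" the whole history.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def entangle(data_blocks):
    """Return the parity chain p_i = d_i XOR p_{i-1} (p_{-1} = zeros)."""
    parity = bytes(len(data_blocks[0]))
    chain = []
    for d in data_blocks:
        parity = xor(d, parity)
        chain.append(parity)
    return chain

def recover(i, chain):
    """Rebuild lost data block d_i from parities p_{i-1} and p_i."""
    prev = chain[i - 1] if i > 0 else bytes(len(chain[0]))
    return xor(prev, chain[i])

blocks = [b"AAAA", b"BBBB", b"CCCC"]
chain = entangle(blocks)
assert recover(1, chain) == b"BBBB"  # d_1 rebuilt without reading d_1
```

In the actual codes, the alpha parameter multiplies the number of such chains crossing each block, which is what makes the number of distinct recovery paths grow while storage overhead grows only linearly.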
InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services
Cloud computing providers have set up several data centers at different
geographical locations over the Internet in order to optimally serve the needs of
their customers around the world. However, existing systems do not support
mechanisms and policies for dynamically coordinating load distribution among
different Cloud-based data centers in order to determine optimal location for
hosting application services to achieve reasonable QoS levels. Further,
Cloud computing providers are unable to predict the geographic distribution of
users consuming their services; hence, load coordination must happen
automatically, and the distribution of services must change in response to changes
in the load. To counter this problem, we advocate the creation of a federated Cloud
computing environment (InterCloud) that facilitates just-in-time,
opportunistic, and scalable provisioning of application services, consistently
achieving QoS targets under variable workload, resource and network conditions.
The overall goal is to create a computing environment that supports dynamic
expansion or contraction of capabilities (VMs, services, storage, and database)
for handling sudden variations in service demands.
This paper presents the vision, challenges, and architectural elements of
InterCloud for utility-oriented federation of Cloud computing environments. The
proposed InterCloud environment supports scaling of applications across
multiple vendor clouds. We have validated our approach by conducting a
rigorous performance evaluation study using the CloudSim toolkit. The results
demonstrate that the federated Cloud computing model has immense potential, as it
offers significant performance gains in response time and cost
savings under dynamic workload scenarios.
Comment: 20 pages, 4 figures, 3 tables, conference paper
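The load-coordination idea described above can be sketched as a simple federation broker that places each service request on whichever data center currently predicts the best response time while still meeting the request's QoS target. All class and function names below are illustrative assumptions, not the InterCloud or CloudSim API, and the response-time model is deliberately naive.

```python
# Hypothetical sketch of a federation broker (names and the latency
# model are assumptions for illustration, not the InterCloud API).
from dataclasses import dataclass

@dataclass
class DataCenter:
    name: str
    base_latency_ms: float   # network latency to the user region
    load: int                # currently hosted service instances
    capacity: int            # available VM slots

    def predicted_response_ms(self) -> float:
        # Naive model: response time degrades as utilization rises.
        utilization = self.load / self.capacity
        return self.base_latency_ms * (1.0 + 4.0 * utilization)

def place(request_qos_ms, centers):
    """Pick the data center with the best predicted response time that
    still meets the QoS target; None signals that federation (leasing
    capacity from another cloud) is needed."""
    candidates = [c for c in centers
                  if c.load < c.capacity
                  and c.predicted_response_ms() <= request_qos_ms]
    if not candidates:
        return None
    best = min(candidates, key=lambda c: c.predicted_response_ms())
    best.load += 1
    return best

eu = DataCenter("eu-west", base_latency_ms=20.0, load=8, capacity=10)
us = DataCenter("us-east", base_latency_ms=50.0, load=1, capacity=10)
chosen = place(100.0, [eu, us])   # lightly loaded us-east wins
```

The design choice this illustrates is the one the abstract argues for: placement decisions react to observed load rather than to a static assignment, so service distribution shifts automatically as demand moves.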
On the design and optimization of heterogeneous distributed storage systems
Over the last decade, users' storage demands have been growing exponentially year over year. Besides demanding more storage capacity and more data reliability, users today also demand the ability to access their data from any location and from any device. These new needs encourage users to move their personal data (e.g., e-mails, documents, pictures, etc.) to online storage services such as Gmail, Facebook, Flickr or Dropbox. Unfortunately, these online storage services are built upon expensive large datacenters that only a few big enterprises can afford.
To reduce the costs of these large datacenters, a new wave of online storage services has recently emerged, integrating storage resources from different small datacenters, or even integrating users' storage resources into the provider's storage infrastructure. However, the storage resources that compose these new storage infrastructures are highly heterogeneous, which poses a challenging problem to storage system designers: how can one design reliable and efficient distributed storage systems over heterogeneous storage infrastructures?
This thesis provides an analysis of the main problems that arise when one aims to answer this question. In addition, it provides different tools to optimize the design of heterogeneous distributed storage systems. The contribution of this thesis is threefold:
First, we provide a novel framework to analyze the effects that data redundancy has on the storage and communication costs of distributed storage systems. Given a generic redundancy scheme, the presented framework can predict the average storage costs and the average communication costs of a storage system deployed over a specific storage infrastructure.
Second, we analyze the impact that data redundancy has on data availability and retrieval times. For a given redundancy scheme and a given heterogeneous storage infrastructure, we provide a set of algorithms to determine the expected data availability and the expected retrieval times.
Third, we design different data assignment policies for different storage scenarios. We differentiate between scenarios where the entire storage infrastructure is managed by a single organization, and scenarios where different self-managed parties contribute their storage resources. The aims of our assignment policies are: (i) to minimize the required redundancy, (ii) to guarantee fairness among all parties, and (iii) to encourage different parties to contribute their local storage resources to the system.
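The availability analysis described in the second contribution can be sketched for one common special case (assumed here for illustration; the thesis' actual algorithms are more general): an object encoded with an (n, k) erasure code whose n blocks reside on nodes with heterogeneous availabilities p_1..p_n. A dynamic program over the nodes computes the probability that at least k blocks are reachable.

```python
# Illustrative sketch (not the thesis' algorithm): expected data
# availability of an (n, k)-erasure-coded object on heterogeneous nodes.

def data_availability(node_avail, k):
    """P(at least k of the n blocks are reachable), where node_avail[i]
    is the availability of the node holding block i."""
    # dist[j] = probability that exactly j blocks are reachable so far
    dist = [1.0]
    for p in node_avail:
        new = [0.0] * (len(dist) + 1)
        for j, prob in enumerate(dist):
            new[j] += prob * (1.0 - p)   # this node is down
            new[j + 1] += prob * p       # this node is up
        dist = new
    return sum(dist[k:])

# Homogeneous sanity check: 3-of-5 with p = 0.9 on every node
# matches the closed-form binomial tail, about 0.99144.
a = data_availability([0.9] * 5, 3)
```

The same dynamic program accepts any mix of per-node availabilities, which is exactly the heterogeneous setting the thesis targets; a closed-form binomial expression only covers the homogeneous case.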
Optical Networks and Interconnects
The rapid evolution of communication technologies such as 5G and beyond relies
on optical networks to support challenging and ambitious requirements that
include both capacity and reliability. This chapter begins by giving an
overview of the evolution of optical access networks, focusing on Passive
Optical Networks (PONs). The development of the different PON standards and
requirements aiming at longer reach, higher client count and delivered
bandwidth are presented. PON virtualization is also introduced as a
flexibility enabler. Triggered by the increase of bandwidth supported by access
and aggregation network segments, core networks have also evolved, as presented
in the second part of the chapter. Scaling the physical infrastructure requires
high investment and hence, operators are considering alternatives to optimize
the use of the existing capacity. This chapter introduces different planning
problems, such as the Routing and Spectrum Assignment (RSA) problem, placement problems
for regenerators and wavelength converters, and how to offer resilience to
different failures. An overview of control and management is also provided.
Moreover, motivated by the increasing importance of data storage and data
processing, this chapter also addresses different aspects of optical data
center interconnects. Data centers have become critical infrastructure to
operate any service. They are also forced to take advantage of optical
technology in order to keep up with the growing capacity demand and power
consumption. This chapter gives an overview of different optical data center
network architectures as well as some expected directions to improve the
resource utilization and increase the network capacity.
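The Routing and Spectrum Assignment planning problem mentioned above can be sketched with the simplest classical heuristic, first-fit spectrum assignment. This is an illustrative sketch, not taken from the chapter; it enforces the two standard constraints of flexible-grid networks: continuity (the same slots on every link of the path) and contiguity (adjacent slots).

```python
# Illustrative first-fit spectrum assignment sketch (not from the
# chapter): spectrum[link][s] is True when frequency slot s is busy.

NUM_SLOTS = 16

def first_fit(path_links, demand_slots, spectrum):
    """Allocate the lowest-index contiguous window of demand_slots
    slots that is free on every link of the path; return its start
    index, or None when the request is blocked."""
    for start in range(NUM_SLOTS - demand_slots + 1):
        window = range(start, start + demand_slots)
        if all(not spectrum[l][s] for l in path_links for s in window):
            for l in path_links:
                for s in window:
                    spectrum[l][s] = True   # reserve the slots
            return start
    return None   # no common contiguous window: request blocked

spectrum = {l: [False] * NUM_SLOTS for l in ("A-B", "B-C")}
first_fit(["A-B"], 4, spectrum)                  # takes slots 0-3 on A-B
start = first_fit(["A-B", "B-C"], 2, spectrum)   # continuity forces start 4
```

Even this toy shows why RSA is a planning problem: the second demand cannot use slots 0-3 on link B-C, although they are free there, because the continuity constraint ties its assignment to the already-fragmented link A-B.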