Search CORE

97 research outputs found

Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments

Author: Estrada-Galiñanes Vero
Felber Pascal
Miller Ethan
Pâris Jehan-François
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/10/2018

Data centres that use consumer-grade disk drives and distributed peer-to-peer systems are unreliable environments to archive data without enough redundancy. Most redundancy schemes are not completely effective for providing high availability, durability and integrity in the long term. We propose alpha entanglement codes, a mechanism that creates a virtual layer of highly interconnected storage devices to propagate redundant information across a large-scale storage system. Our motivation is to design flexible and practical erasure codes with high fault-tolerance to improve data durability and availability even in catastrophic scenarios. By flexible and practical, we mean code settings that can be adapted to future requirements and practical implementations with reasonable trade-offs between security, resource usage and performance. The codes have three parameters. Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. Two other parameters increase fault-tolerance even further without the need for additional storage. As a result, an entangled storage system can provide high availability and durability and offer additional integrity: it is more difficult to modify data undetectably. We evaluate how several redundancy schemes perform in unreliable environments and show that alpha entanglement codes are flexible and practical codes. Remarkably, they excel at code locality; hence, they reduce repair costs and become less dependent on storage locations with poor availability. Our solution outperforms Reed-Solomon codes in many disaster recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN

arXiv.org e-Print Archive

Crossref

A Survey on Array Storage, Query Languages, and Systems

Author: Cheng Yu
Rusu Florin
Publication date: 19/02/2013

Since scientific investigation is one of the most important providers of massive amounts of ordered data, there is a renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to array partitioning into chunks. The identification of a reduced set of array operators to form the foundation for an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete, though. We greatly appreciate pointers towards any work we might have forgotten to mention.
Comment: 44 page

arXiv.org e-Print Archive

CiteSeerX

Data partitioning and load balancing in parallel disk systems

Author: Scheuermann Peter
Weikum Gerhard
Zabback Peter
Publication venue: Sonstige Einrichtungen. Sonstige Einrichtungen
Publication date: 01/01/1996

Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper we discuss the main issues in performance tuning of such systems, namely striping and load balancing, and show their relationship to response time and throughput. We outline the main components of an intelligent file system that optimizes striping by taking into account the requirements of the applications, and performs load balancing by judicious file allocation and dynamic redistribution of the data when access patterns change. Our system uses simple but effective heuristics that incur little overhead. We present performance experiments based on synthetic workloads and real-life traces.

NASA Technical Reports Server

Introduction to Multiprocessor I/O Architecture

Author: Kotz David
Publication venue: Dartmouth Digital Commons
Publication date: 01/01/1996

The computational performance of multiprocessors continues to improve by leaps and bounds, fueled in part by rapid improvements in processor and interconnection technology. I/O performance thus becomes ever more critical if it is not to become the bottleneck of system performance. In this paper we provide an introduction to I/O architectural issues in multiprocessors, with a focus on disk subsystems. While we discuss examples from actual architectures and provide pointers to interesting research in the literature, we do not attempt to provide a comprehensive survey. We concentrate on a study of the architectural design issues, and the effects of different design alternatives.

Dartmouth Digital Commons (Dartmouth College)

Dynamic load balancing in parallel database systems

Author: D.J. DeWitt
E. Rahm
G. Graefe
H. Lu
J.L. Wolf
K.A. Hua
M.J. Carey
P. Valduriez
P.M. Chen
P.S. Yu
S. Ghandeharizadeh
W. Kim
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Parallel replication for distributed video-on-demand systems.

Author: Lie Wai-Kwok Peter
Publication date: 01/01/1997

Lie, Wai-Kwok Peter. Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. Includes bibliographical references (leaves 79-83).

Abstract
Acknowledgments
Chapter 1 --- Introduction
Chapter 2 --- Background & Related Work
Chapter 2.1 --- Early Work on Multimedia Servers
Chapter 2.2 --- Compression of Multimedia Data
Chapter 2.3 --- Multimedia File Systems
Chapter 2.4 --- Scheduling Support for Multimedia Systems
Chapter 2.5 --- Inter-media Synchronization
Chapter 2.6 --- Related Work on Replication in VOD Systems
Chapter 3 --- System Model
Chapter 4 --- Replication Methodology
Chapter 4.1 --- Replication Triggering Policy
Chapter 4.2 --- Source & Target Nodes Selection Policies
Chapter 4.3 --- Replication Policies
Chapter 4.3.1 --- Policy 1: Injected Sequential Replication
Chapter 4.3.2 --- Policy 2: Piggybacked Sequential Replication
Chapter 4.3.3 --- Policy 3: Injected Parallel Replication
Chapter 4.3.4 --- Policy 4: Piggybacked Parallel Replication
Chapter 4.3.5 --- Policy 5: Injected & Piggybacked Parallel Replication
Chapter 4.3.6 --- Policy 6: Multi-Source Injected & Piggybacked Parallel Replication
Chapter 4.4 --- Dereplication Policy
Chapter 5 --- Distributed Architecture for VOD Server
Chapter 5.1 --- Server Node
Chapter 5.2 --- Movie Manager
Chapter 5.3 --- Metadata Manager
Chapter 5.4 --- Protocols for Distributed VOD Architecture
Chapter 5.4.1 --- Protocol for Servicing New Customers
Chapter 5.4.2 --- Protocol for Servicing Existing Customers
Chapter 5.4.3 --- Protocol for Single/Multi-Source Injected & Parallel Replication
Chapter 5.4.4 --- Protocol for Dereplication
Chapter 5.5 --- Failure Handling
Chapter 5.5.1 --- Handling of Server Node Failures
Chapter 5.5.2 --- Handling of Movie Manager Failures
Chapter 6 --- Results
Chapter 6.1 --- Performance Metric
Chapter 6.2 --- Simulation Environment
Chapter 6.3 --- Results of Experiments without Dereplication
Chapter 6.3.1 --- Comparison of Different Replication Policies
Chapter 6.3.2 --- Effect of Early Acceptance/Migration
Chapter 6.3.3 --- Answer to the Resources Consumption Tradeoff Issue
Chapter 6.3.4 --- Effect of Varying Movie Popularity Skewness
Chapter 6.3.5 --- Effect of Varying Replication Threshold
Chapter 6.3.6 --- Comparison of Different Target Node Selection Policies
Chapter 6.4 --- Overall Impact of Dynamic Replication
Chapter 7 --- Comparison with BSR-based Policy
Chapter 8 --- Conclusions
Chapter 8.1 --- Summary
Chapter 8.2 --- Future Research Directions
Bibliography

CUHK Digital Repository

Scalable Storage for Digital Libraries

Author: Mather Paul
Publication date: 01/10/2002

I propose a storage system optimised for digital libraries. Its key features are its heterogeneous scalability; its integration and exploitation of rich semantic metadata associated with digital objects; its use of a name space; and its aggressive performance optimisation in the digital library domain.

Computer Science Technical Reports @Virginia Tech

Data partitioning and load balancing in parallel disk systems

Author: Scheuermann Peter
Weikum Gerhard
Zabback Peter
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 03/04/2014

Universaar

Acronym

Analysis and Comparison of Replicated Declustering Schemes

Author: Ali Saman Tosun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref