LDPC Codes with Local and Global Decoding
This paper presents a theoretical study of a new type of LDPC code motivated
by practical storage applications. LDPCL codes (suffix L represents locality)
are LDPC codes that can be decoded either as usual over the full code block, or
locally when a smaller sub-block is accessed (to reduce latency). LDPCL codes
are designed to maximize the error-correction performance vs. rate in the usual
(global) mode, while at the same time providing a certain performance in the
local mode. We develop a theoretical framework for the design of LDPCL codes.
Our results include a design tool to construct an LDPC code with two
data-protection levels: local and global. We derive theoretical results
supporting this tool and we show how to achieve capacity with it. A trade-off
between the gap to capacity and the number of full-block accesses is studied,
and a finite-length analysis of ML decoding is performed to exemplify a
trade-off between the locality capability and the full-block error-correcting
capability.
Comment: 41 page
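The two decoding modes described in this abstract can be illustrated with a toy erasure example. This is only a sketch under stated assumptions, not the paper's LDPCL construction: the 10-bit layout, the two sub-blocks with one local parity check each, the two global checks, and the function names `encode`, `peel`, `decode_local`, and `decode_global` are all hypothetical. Local decoding reads one sub-block and uses only its own check; global decoding reads the full block and uses every check.

```python
# Toy two-level parity-check code (hypothetical layout, not from the paper).
# Positions 0-3: sub-block 0 (3 data bits + local parity)
# Positions 4-7: sub-block 1 (3 data bits + local parity)
# Positions 8-9: global parity bits tying the two sub-blocks together
LOCAL_CHECKS = [[0, 1, 2, 3], [4, 5, 6, 7]]
GLOBAL_CHECKS = [[0, 4, 8], [1, 5, 9]]


def encode(data):
    """Map 6 data bits to a 10-bit codeword satisfying every check."""
    d = list(data)
    word = d[0:3] + [sum(d[0:3]) % 2] + d[3:6] + [sum(d[3:6]) % 2]
    word.append((word[0] + word[4]) % 2)
    word.append((word[1] + word[5]) % 2)
    return word


def peel(word, checks):
    """Erasure decoding: fill any check that has exactly one erased bit."""
    word = list(word)
    progress = True
    while progress:
        progress = False
        for check in checks:
            erased = [i for i in check if word[i] is None]
            if len(erased) == 1:
                target = erased[0]
                word[target] = sum(word[i] for i in check if i != target) % 2
                progress = True
    return word


def decode_local(word, block):
    """Use only the accessed sub-block's own check (low latency)."""
    positions = LOCAL_CHECKS[block]
    result = peel(word, [positions])
    return [result[i] for i in positions]


def decode_global(word):
    """Use all checks over the full block (stronger protection)."""
    return peel(word, LOCAL_CHECKS + GLOBAL_CHECKS)
```

In this toy layout, a single erasure inside a sub-block is repaired locally, while two erasures at positions 0 and 1 leave the local decoder stuck and are repaired only after a full-block access. `None` marks an erased bit.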
Coding for Fast Content Download
We study the fundamental trade-off between storage and content download time.
We show that the download time can be significantly reduced by dividing the
content into chunks, encoding it to add redundancy and then distributing it
across multiple disks. We determine the download time for two content access
models: the fountain model, which involves simultaneous content access, and the
fork-join model, which involves individual access from enqueued user requests. For the
fountain model we explicitly characterize the download time, while in the
fork-join model we derive upper and lower bounds. Our results show that
coding reduces download time, through the diversity of distributing the data
across more disks, even for the same total storage used.
Comment: 8 pages, 6 figures, conferenc
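The diversity gain mentioned in this abstract can be sketched numerically. The model below is an assumption made for illustration and is not taken from the abstract: content is split into k chunks and encoded onto n disks so that any k chunks suffice, all n disks are read simultaneously with no queueing, and each chunk read time is i.i.d. exponential with rate `k * mu` (a chunk is 1/k of the content). The download then finishes at the k-th fastest of n reads. The function names are hypothetical.

```python
import random


def harmonic(m):
    """m-th harmonic number, with harmonic(0) == 0."""
    return sum(1.0 / i for i in range(1, m + 1))


def expected_download_time(n, k, mu=1.0):
    """Mean of the k-th smallest of n i.i.d. Exp(k * mu) read times."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return (harmonic(n) - harmonic(n - k)) / (k * mu)


def simulate_download_time(n, k, mu=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = sorted(rng.expovariate(k * mu) for _ in range(n))
        total += times[k - 1]  # done once any k chunks have arrived
    return total / trials
```

Under these assumptions, `expected_download_time(4, 2)` is smaller than `expected_download_time(2, 2)`, because the download no longer waits for the slowest disk.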
ElfStore: A Resilient Data Storage Service for Federated Edge and Fog Resources
Edge and fog computing have grown popular as IoT deployments become
widespread. While application composition and scheduling on such resources are
being explored, a distributed data storage service for the edge and fog layers
is still missing, and applications depend solely on the cloud for data persistence.
Such a service should reliably store and manage data on fog and edge devices,
even in the presence of failures, and offer transparent discovery and access to
data for use by edge computing applications. Here, we present ElfStore, a
first-of-its-kind edge-local federated store for streams of data blocks. It
uses reliable fog devices as a super-peer overlay to monitor the edge
resources, offers federated metadata indexing using Bloom filters, locates data
within 2 hops, and maintains approximate global statistics about the
reliability and storage capacity of edges. Edges host the actual data blocks,
and we use a unique differential replication scheme to select edges on which to
replicate blocks, to guarantee a minimum reliability and to balance storage
utilization. Our experiments on two IoT virtual deployments with 20 and 272
devices show that ElfStore has low overheads, is bound only by the network
bandwidth, has scalable performance, and offers tunable resilience.
Comment: 24 pages, 14 figures, To appear in IEEE International Conference on
Web Services (ICWS), Milan, Italy, 201
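The idea of choosing replicas to meet a minimum reliability while balancing storage can be sketched as follows. This is not ElfStore's actual differential replication scheme, which the abstract does not specify. It is a simple greedy stand-in that assumes edge failures are independent, so a block survives unless every replica fails. The `Edge` type and the functions `block_reliability` and `choose_replicas` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    name: str
    reliability: float  # probability the device is available, in [0, 1]
    free_bytes: int


def block_reliability(edges):
    """1 - P(all replicas fail), assuming independent failures."""
    p_all_fail = 1.0
    for edge in edges:
        p_all_fail *= 1.0 - edge.reliability
    return 1.0 - p_all_fail


def choose_replicas(edges, target, block_bytes):
    """Greedy sketch: prefer emptier edges, stop once the target is met."""
    candidates = sorted(
        (e for e in edges if e.free_bytes >= block_bytes),
        key=lambda e: e.free_bytes,
        reverse=True,
    )
    chosen = []
    for edge in candidates:
        if chosen and block_reliability(chosen) >= target:
            break
        chosen.append(edge)
    if block_reliability(chosen) < target:
        raise ValueError("target reliability not reachable")
    return chosen
```

Preferring edges with more free space spreads storage load, and the replica count then varies per block: blocks that land on less reliable edges receive more copies.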
GPUs as Storage System Accelerators
Massively multicore processors, such as Graphics Processing Units (GPUs),
provide, at a comparable price, an order of magnitude higher peak
performance than traditional CPUs. This drop in the cost of computation, like any
order-of-magnitude drop in the cost per unit of performance for a class of
system components, creates an opportunity to redesign systems and to explore
new ways of engineering them that recalibrate the cost-to-performance relation. This
project explores the feasibility of harnessing GPUs' computational power to
improve the performance, reliability, or security of distributed storage
systems. In this context, we present the design of a storage system prototype
that uses GPU offloading to accelerate a number of computationally intensive
primitives based on hashing, and introduce techniques to efficiently leverage
the processing power of GPUs. We evaluate the performance of this prototype
under two configurations: as a content addressable storage system that
facilitates online similarity detection between successive versions of the same
file and as a traditional system that uses hashing to preserve data integrity.
Further, we evaluate the impact of offloading to the GPU on competing
applications' performance. Our results show that this technique can bring
tangible performance gains without negatively impacting the performance of
concurrently running applications.
Comment: IEEE Transactions on Parallel and Distributed Systems, 201
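The hashing primitives behind the two configurations in this abstract can be sketched on the CPU. The sketch makes assumptions that are not in the abstract: fixed-size chunks, SHA-256 as the hash, and the hypothetical function names `chunk_hashes`, `similarity`, and `verify`. It does not show GPU offloading, only the kind of work that would be offloaded.

```python
import hashlib


def chunk_hashes(data, chunk_size=4096):
    """Hash each fixed-size chunk; the digests act as content addresses."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def similarity(old, new, chunk_size=4096):
    """Fraction of chunks in `new` that already exist in `old`."""
    new_hashes = chunk_hashes(new, chunk_size)
    if not new_hashes:
        return 1.0
    known = set(chunk_hashes(old, chunk_size))
    return sum(h in known for h in new_hashes) / len(new_hashes)


def verify(data, expected_digest):
    """Integrity check: recompute the hash and compare with the stored one."""
    return hashlib.sha256(data).hexdigest() == expected_digest
```

Chunks of a new file version whose digests are already known need not be stored or transferred again. Computing these digests is the computationally intensive step.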