Coding for Fast Content Download
We study the fundamental trade-off between storage and content download time.
We show that the download time can be significantly reduced by dividing the
content into chunks, encoding it to add redundancy and then distributing it
across multiple disks. We determine the download time for two content access
models: the fountain model, which involves simultaneous content access, and
the fork-join model, which involves individual access from enqueued user
requests. For the fountain model we explicitly characterize the download time,
while for the fork-join model we derive upper and lower bounds. Our results
show that coding reduces download time, through the diversity of distributing
the data across more disks, even for the same total storage used.
Comment: 8 pages, 6 figures, conferenc
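The claim that spreading coded chunks over more disks helps at equal total storage can be illustrated with a small calculation. The sketch below is not the paper's model. It assumes a simplified simultaneous-access picture in which the content is split into k chunks, encoded into n chunks on n disks, each chunk download time is i.i.d. exponential with rate k*mu (so a chunk of size 1/k takes 1/(k*mu) on average), and the download completes when any k of the n chunks have arrived. The function names and the parameter mu are hypothetical.

```python
from fractions import Fraction


def harmonic(m: int) -> Fraction:
    """Return the m-th harmonic number H_m = 1 + 1/2 + ... + 1/m (H_0 = 0)."""
    return sum((Fraction(1, i) for i in range(1, m + 1)), Fraction(0))


def expected_download_time(n: int, k: int, mu: float = 1.0) -> float:
    """Mean time until k of n parallel chunk downloads finish.

    Assumption (not from the abstract): chunk download times are i.i.d.
    exponential with rate k*mu. The k-th smallest of n such variables has
    mean (H_n - H_{n-k}) / (k*mu).
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return float(harmonic(n) - harmonic(n - k)) / (k * mu)


if __name__ == "__main__":
    # All three configurations use the same total storage, n/k = 2.
    for n, k in [(2, 1), (4, 2), (8, 4)]:
        print(f"(n={n}, k={k}): {expected_download_time(n, k):.4f}")
```

Under these assumptions the mean download time falls as (n, k) grows from (2, 1) to (8, 4), although the storage expansion n/k stays at 2.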
When Queueing Meets Coding: Optimal-Latency Data Retrieving Scheme in Storage Clouds
In this paper, we study the problem of reducing the delay of downloading data
from cloud storage systems by leveraging multiple parallel threads, assuming
that the data has been encoded and stored in the clouds using fixed rate
forward error correction (FEC) codes with parameters (n, k). That is, each file
is divided into k equal-sized chunks, which are then expanded into n chunks
such that any k chunks out of the n are sufficient to successfully restore the
original file. The model can be depicted as a multiple-server queue with
arrivals of data retrieving requests and a server corresponding to a thread.
However, this is not a typical queueing model because a server can terminate
its operation, depending on when other servers complete their service (due to
the redundancy that is spread across the threads). Hence, to the best of our
knowledge, the analysis of this queueing model remains quite uncharted.
Recent traces from Amazon S3 show that the time to retrieve a fixed size
chunk is random and can be approximated as a constant delay plus an i.i.d.
exponentially distributed random variable. For the tractability of the
theoretical analysis, we assume that the chunk downloading time is i.i.d.
exponentially distributed. Under this assumption, we show that any
work-conserving scheme is delay-optimal among all on-line scheduling schemes
when k = 1. When k > 1, we find that a simple greedy scheme, which allocates
all available threads to the head of line request, is delay optimal among all
on-line scheduling schemes. We also provide some numerical results that point
to the limitations of the exponential assumption, and suggest further research
directions.
Comment: Original accepted by IEEE Infocom 2014, 9 pages. Some statements in
the Infocom paper are correcte
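The greedy scheme described above, in which every available thread serves the head-of-line request, can be sketched as a small simulation. This is an illustration under stated assumptions, not the paper's analysis: requests arrive as a Poisson process, chunk download times are i.i.d. exponential with rate mu, and n >= num_threads + k - 1 so that all threads can always work on distinct chunks of the same request. With all threads on one request, chunks complete at rate num_threads*mu, so each request's service time is a sum of k exponentials and the system behaves like a single-server queue. All function and parameter names are hypothetical.

```python
import random


def simulate_greedy(arrival_rate: float, num_threads: int, k: int, mu: float,
                    num_requests: int, seed: int = 0) -> float:
    """Estimate the mean request delay (waiting plus download) under the greedy scheme."""
    rng = random.Random(seed)
    arrival = 0.0          # arrival time of the current request
    free_at = 0.0          # time at which the previous request finishes
    total_delay = 0.0
    for _ in range(num_requests):
        arrival += rng.expovariate(arrival_rate)
        start = max(arrival, free_at)
        # k chunk completions, each arriving at aggregate rate num_threads * mu.
        service = sum(rng.expovariate(num_threads * mu) for _ in range(k))
        free_at = start + service
        total_delay += free_at - arrival
    return total_delay / num_requests


def mg1_mean_delay(arrival_rate: float, num_threads: int, k: int, mu: float) -> float:
    """Pollaczek-Khinchine mean delay for the equivalent M/G/1 queue."""
    rate = num_threads * mu
    mean_s = k / rate                      # E[S] for a sum of k exponentials
    second_moment = k * (k + 1) / rate**2  # E[S^2]
    rho = arrival_rate * mean_s
    if rho >= 1:
        raise ValueError("unstable: utilization must be below 1")
    return mean_s + arrival_rate * second_moment / (2 * (1 - rho))


if __name__ == "__main__":
    args = dict(arrival_rate=1.0, num_threads=4, k=2, mu=1.0)
    print("simulated:", simulate_greedy(num_requests=200_000, **args))
    print("analytic :", mg1_mean_delay(**args))
```

The closed-form check relies on the memoryless property of the exponential distribution. It would not carry over to the constant-plus-exponential fit mentioned for the Amazon S3 traces.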
SEARS: Space Efficient And Reliable Storage System in the Cloud
Today's cloud storage services must offer storage reliability and fast data
retrieval for large amounts of data without sacrificing storage cost. We present
SEARS, a cloud-based storage system which integrates erasure coding and data
deduplication to support efficient and reliable data storage with fast user
response time. With proper association of data to storage server clusters,
SEARS provides flexible mixing of different configurations, suitable for
real-time and archival applications.
Our prototype implementation of SEARS over Amazon EC2 shows that it
outperforms existing storage systems in storage efficiency and file retrieval
time. For 3 MB files, SEARS delivers retrieval time of s compared to
s with existing systems.
Comment: 4 pages, IEEE LCN 201
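How deduplication and erasure coding combine in storage accounting can be sketched as follows. This is a generic illustration, not the SEARS design: the fixed-size chunking, the SHA-256 fingerprints, and the (n, k) code parameters are all assumptions, and the function names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable


def dedup_chunks(files: Iterable[bytes], chunk_size: int) -> Dict[str, bytes]:
    """Split each file into fixed-size chunks and keep one copy per distinct chunk."""
    store: Dict[str, bytes] = {}
    for data in files:
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            # The SHA-256 digest serves as the chunk fingerprint (an assumption).
            store.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    return store


def stored_bytes(store: Dict[str, bytes], n: int, k: int) -> float:
    """Bytes on disk after applying an (n, k) erasure code to the unique chunks."""
    unique = sum(len(chunk) for chunk in store.values())
    return unique * n / k


if __name__ == "__main__":
    files = [b"a" * 8 + b"b" * 8, b"a" * 8 + b"c" * 8]
    store = dedup_chunks(files, chunk_size=8)
    print(len(store), "unique chunks,", stored_bytes(store, n=6, k=4), "bytes stored")
```

In this toy input the two files share one chunk, so 32 raw bytes shrink to 24 unique bytes before the n/k coding overhead is applied.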