Towards Data Optimization in Storages and Networks
Title from PDF of title page, viewed on August 7, 2015
Dissertation advisors: Sejun Song and Baek-Young Choi
Vita
Includes bibliographic references (pages 132-140)
Thesis (Ph.D.)--School of Computing and Engineering. University of Missouri--Kansas City, 2015
We are encountering an explosion of data volume: one study estimates that data
will amount to 40 zettabytes by the end of 2020. This data explosion places a significant
burden not only on data storage space but also on access latency, manageability, and processing
and network bandwidth. However, large portions of the huge data volume contain
massive redundancies that are created by users, applications, systems, and communication
models. Deduplication is a technique to reduce data volume by removing redundancies.
Reliability can even be improved when data is replicated after deduplication.
Many deduplication studies such as storage data deduplication and network redundancy
elimination have been proposed to reduce storage consumption and network
bandwidth consumption. However, existing solutions are not efficient enough to optimize
the data delivery path from clients to servers through the network. Hence, we propose a holistic
deduplication framework to optimize data along this path. Our deduplication framework
consists of three components: data sources (clients), networks, and servers. The
client component removes local redundancies in clients, the network component removes
redundant transfers coming from different clients, and the server component removes redundancies
coming from different networks.
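As a generic illustration of what "removing redundancies" means in practice, the following minimal sketch stores each distinct chunk of data once, keyed by its content hash. It is not the dissertation's implementation: the `ChunkStore` class, the fixed 4096-byte chunk size, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class ChunkStore:
    """Hypothetical store that keeps each unique chunk once, keyed by SHA-256."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put(self, data):
        """Split data into fixed-size chunks; return the ordered digest list."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this content has not been seen before.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its digest list."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes physically held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
payload = b"A" * 8192 + b"B" * 4096   # 12288 logical bytes, with a repeated chunk
first = store.put(payload)
second = store.put(payload)           # a second copy adds no new chunks
```

Here the same payload is written twice, yet the store holds only the distinct chunks, and `store.get(first)` reconstructs the original bytes.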
We designed and developed components for the proposed deduplication framework.
For the server component, we developed the Hybrid Email Deduplication System (HEDS),
which balances space savings against overhead for email systems. For the client
component, we developed Structure-Aware File and Email Deduplication for Cloud-based
Storage Systems (SAFE), which is very fast and achieves good space savings by using
structure-based granularity. For the network component, we developed a system called
Software-defined Deduplication as a Network and Storage Service (SoftDance), which performs
in-network deduplication and chains storage data deduplication and network redundancy elimination
functions by using Software-Defined Networking to achieve both storage space and network
bandwidth savings with low processing time and memory usage. We also discuss mobile
deduplication for image and video files on mobile devices. Through system implementations
and experiments, we show that the proposed framework effectively and efficiently
optimizes data volume in a holistic manner encompassing the entire data path of clients,
networks and storage servers.
Introduction -- Deduplication technology -- Existing deduplication approaches -- HEDS: Hybrid Email Deduplication System -- SAFE: Structure-aware File and Email Deduplication for cloud-based storage systems -- SoftDance: Software-defined Deduplication as a Network and Storage Service -- Mobile de-duplication -- Conclusion