A PARTIAL REPLICATION LOAD BALANCING TECHNIQUE FOR DISTRIBUTED DATA AS A SERVICE ON THE CLOUD

Author: Al Nuaimi Klaithem Saeed
Publication venue: Scholarworks@UAEU
Publication date: 01/05/2015
Data as a service (DaaS) is an important model on the Cloud: it provides clients with large files and data sets of many types in fields such as finance, science, health, geography, and astronomy. These files range in size from a few kilobytes to hundreds of terabytes. DaaS can be implemented and provided using multiple data centers at different locations, usually connected via the Internet; when data is provided this way, the service is referred to as distributed DaaS. DaaS providers must ensure that their services are fast, reliable, and efficient. However, these requirements must be met while considering the associated cost, which will be carried by the DaaS provider and most likely by the users as well. One traditional approach to supporting a large number of clients is to replicate the services on different servers. However, this requires full replication of all stored data sets, which consumes a huge amount of storage and therefore increases costs. The aim of this research is therefore to provide fast, efficient distributed DaaS for clients while reducing storage consumption on the Cloud servers used by DaaS providers. The method I use for fast distributed DaaS is the collaborative dual-direction download of the partitions of a file or data set from multiple servers to the client, which significantly speeds up the download process. Moreover, I partially replicate the file partitions among Cloud servers using the download experience previously obtained for each partition. As a result, I generate partial sections of the data sets that are collectively smaller than the total size needed if a full replica is stored on each server. My method is self-managed and operates only when more storage is needed.
I evaluated my approach against existing approaches and demonstrated that it improves on them in both download performance and storage consumption. I also developed and analyzed the mathematical model supporting my approach and validated its accuracy.
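
To make the idea of dual-direction download with partial replication concrete, the following is a minimal toy simulation in Python. It is not the thesis's algorithm or its mathematical model, which the abstract does not specify. It assumes two servers, a file split into equal-sized blocks, and a fixed integer transfer time per block for each server; the function names and parameters are hypothetical. Server A sends blocks from the start of the file forward, server B sends blocks from the end backward, and the download stops when the two meet. Each server then keeps only the range of blocks it has had to serve in past downloads.

```python
def dual_direction_download(num_blocks, cost_a, cost_b):
    """Simulate one download served by two servers from opposite ends.

    cost_a and cost_b are hypothetical integer times needed by server A
    and server B to send a single block. Returns the block indices that
    each server delivered, in delivery order.
    """
    low, high = 0, num_blocks - 1      # next block for A, next block for B
    time_a = time_b = 0                # time at which each server is free again
    served_a, served_b = [], []
    while low <= high:
        finish_a = time_a + cost_a     # when A would finish its next block
        finish_b = time_b + cost_b     # when B would finish its next block
        if finish_a <= finish_b:       # A completes first (ties go to A)
            served_a.append(low)
            low += 1
            time_a = finish_a
        else:                          # B completes first
            served_b.append(high)
            high -= 1
            time_b = finish_b
    return served_a, served_b


def partial_replica_sizes(history, num_blocks):
    """Return how many blocks each server keeps after past downloads.

    history is a list of (served_a, served_b) pairs. Server A keeps the
    prefix up to the furthest block it ever served; server B keeps the
    suffix starting at the earliest block it ever served.
    """
    max_a = max((max(a) for a, _ in history if a), default=-1)
    min_b = min((min(b) for _, b in history if b), default=num_blocks)
    return max_a + 1, num_blocks - min_b


if __name__ == "__main__":
    n = 10
    runs = [dual_direction_download(n, 1, 3),   # A is the faster server
            dual_direction_download(n, 2, 1)]   # B is the faster server
    keep_a, keep_b = partial_replica_sizes(runs, n)
    print("blocks kept on A:", keep_a)
    print("blocks kept on B:", keep_b)
    print("partial total:", keep_a + keep_b, "vs full replication:", 2 * n)
```

In this sketch the two partial replicas overlap in the middle of the file, so either server can still cover the other's slower runs seen so far, while the combined storage stays below that of two full copies.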