    A PARTIAL REPLICATION LOAD BALANCING TECHNIQUE FOR DISTRIBUTED DATA AS A SERVICE ON THE CLOUD

    Data as a service (DaaS) is an important model on the Cloud, as DaaS provides clients with different types of large files and data sets in fields like finance, science, health, geography, astronomy, and many others. This includes all types of files, with sizes varying from a few kilobytes to hundreds of terabytes. DaaS can be implemented and provided using multiple data centers at different locations, usually connected via the Internet. When data is provided using multiple data centers, it is referred to as distributed DaaS. DaaS providers must ensure that their services are fast, reliable, and efficient. However, these requirements must be met while considering the associated cost, which will be borne by the DaaS provider and most likely by the users as well. One traditional approach to supporting a large number of clients is to replicate the services on different servers. However, this requires full replication of all stored data sets, which consumes a huge amount of storage and therefore increases costs. The aim of this research is therefore to provide fast, efficient distributed DaaS for clients while reducing the storage consumed on the Cloud servers used by DaaS providers. The method I use for fast distributed DaaS is the collaborative dual-direction download of file or data set partitions from multiple servers to the client, which significantly speeds up the download process. Moreover, I partially replicate the file partitions among Cloud servers using the download experience previously recorded for each partition. As a result, I generate partial sections of the data sets that are collectively smaller than the total size needed if full replicas were stored on each server. My method is self-managed and operates only when more storage is needed.
I evaluated my approach against existing approaches and demonstrated that it offers an important improvement over them in both download performance and storage consumption. I also developed and analyzed the mathematical model supporting my approach and validated its accuracy.
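
The dual-direction download named in this abstract can be illustrated with a small simulation. This is only a sketch under an assumption the abstract does not spell out: that two servers holding the same partition send its blocks from opposite ends until they meet. The function name, the block counts, and the per-round rates are hypothetical and chosen purely for illustration; they are not taken from the thesis.

```python
def dual_direction_download(num_blocks, rate_a, rate_b):
    """Simulate two servers filling one partition from opposite ends.

    Assumed scheme (not confirmed by the abstract): server A sends blocks
    0, 1, 2, ... while server B sends blocks num_blocks-1, num_blocks-2, ...
    In each round A delivers up to rate_a blocks and B up to rate_b blocks.
    Returns (blocks_from_a, blocks_from_b, rounds).
    """
    if num_blocks < 0 or rate_a <= 0 or rate_b <= 0:
        raise ValueError("num_blocks must be >= 0 and rates must be > 0")

    left, right = 0, num_blocks - 1  # next block index for A and for B
    from_a = from_b = rounds = 0
    while left <= right:
        rounds += 1
        # A takes blocks from the front, never past B's position.
        take_a = min(rate_a, right - left + 1)
        left += take_a
        from_a += take_a
        # B takes blocks from the back, never past A's position.
        take_b = min(rate_b, right - left + 1)
        right -= take_b
        from_b += take_b
    return from_a, from_b, rounds


if __name__ == "__main__":
    print(dual_direction_download(100, 3, 1))
```

In this sketch the faster server ends up supplying more blocks without any split being computed in advance, because both servers simply keep sending until their positions cross.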

    Efficient data reliability management of cloud storage systems for big data applications

    Cloud service providers consistently strive to provide efficient and reliable service for their clients' Big Data storage needs. Replication is a simple and flexible method to ensure the reliability and availability of data. However, it is not an efficient solution for Big Data, since such data scales to terabytes and petabytes. Hence erasure coding is gaining traction despite its shortcomings. Deploying erasure coding in cloud storage confronts several challenges, such as encoding/decoding complexity, load balancing, exponential resource consumption due to data repair, and read latency. This thesis addresses several of these challenges. Even though data durability and availability should not be compromised for any reason, clients' requirements on read performance (access latency) may vary with the nature of the data and its access pattern. Access latency is one of the important metrics, and the acceptable latency range can be recorded in the client's SLA. Several proactive recovery methods for erasure codes are proposed in this research to reduce the resource consumption caused by recovery. In addition, a novel cache-based solution is proposed to mitigate the access latency issue of erasure coding.
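
The trade-off between replication and erasure coding described in this abstract can be made concrete with a little arithmetic. The sketch below assumes a classic maximum-distance-separable code such as Reed-Solomon with k data fragments and m parity fragments; the function names and the example parameters (3 copies, a (10, 4) code) are illustrative assumptions, not values from the thesis.

```python
def replication_overhead(copies):
    """Raw bytes stored per byte of user data with `copies` full replicas."""
    if copies < 1:
        raise ValueError("copies must be >= 1")
    return float(copies)


def erasure_overhead(k, m):
    """Raw bytes stored per byte of user data with k data + m parity fragments."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return (k + m) / k


def repair_reads(k):
    """Fragments read to rebuild one lost fragment of a classic MDS code."""
    return k


if __name__ == "__main__":
    # 3-way replication: tolerates 2 lost copies, repair reads 1 copy.
    print("replication x3:", replication_overhead(3))
    # (10, 4) code: tolerates 4 lost fragments, repair reads 10 fragments.
    print("erasure (10, 4):", erasure_overhead(10, 4), "reads:", repair_reads(10))
```

Under these assumptions the erasure code stores far less raw data than three replicas, but rebuilding a single lost fragment touches k surviving fragments, which is the repair cost that recovery schemes for erasure codes try to reduce.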

    Managing Population and Workload Imbalance in Structured Overlays

    Every day the amount of data produced by networked devices increases. The current paradigm is to offload the data produced to data centers to be processed. However, as more and more devices offload their data to cloud centers, accessing data becomes increasingly challenging. To combat this problem, systems are bringing data closer to the consumer and distributing network responsibilities among the end devices. We are witnessing a change in networking paradigm, where data storage and computation that were once handled only in the cloud are being processed by Internet of Things (IoT) and mobile devices, thanks to the ever-increasing technological capabilities of these devices. One approach organizes devices into a structured overlay network. Structured overlays are a common approach to the organization and distribution of data in peer-to-peer distributed systems. Due to their nature, indexing and searching for elements of the system becomes trivial, so structured overlays are ideal building blocks for resource-location applications. Such overlays assume that the data is distributed evenly over the peers, and that the popularity of those data items is also evenly balanced. However, in many systems, due to factors outside the system domain, popularity may behave rather randomly, so that some nodes spend more resources than others on the popular items. In this work we exploit the properties of cluster-based structured overlays to address this problem, equipping a structured overlay with mechanisms to manage population and workload imbalance and achieve a more uniform use of resources. Our approach focuses on implementing a Group-Based Distributed Hash Table (DHT) capable of dynamically changing its groups to accommodate churn in the network.
With the conclusion of our work we believe that we have created a network capable of withstanding high levels of churn, while ensuring fairness to all members of the network.

    High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)

    Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlapping drivers -- 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document, and are presented along with introductory material. Comment: 72 pages

    Linear Scalability of Distributed Applications

    The explosion of social applications such as Facebook, LinkedIn and Twitter, of electronic commerce with companies like Amazon.com and Ebay.com, and of Internet search has created the need for new technologies and appropriate systems to manage effectively a considerable amount of data and users. These applications must run continuously every day of the year and must be capable of surviving sudden and abrupt load increases as well as all kinds of software, hardware, human and organizational failures. Increasing (or decreasing) the allocated resources of a distributed application in an elastic and scalable manner, while satisfying requirements on availability and performance in a cost-effective way, is essential for the commercial viability but it poses great challenges in today's infrastructures. Indeed, Cloud Computing can provide resources on demand: it now becomes easy to start dozens of servers in parallel (computational resources) or to store a huge amount of data (storage resources), even for a very limited period, paying only for the resources consumed. However, these complex infrastructures consisting of heterogeneous and low-cost resources are failure-prone. Also, although cloud resources are deemed to be virtually unlimited, only adequate resource management and demand multiplexing can meet customer requirements and avoid performance deteriorations. In this thesis, we deal with adaptive management of cloud resources under specific application requirements. First, in the intra-cloud environment, we address the problem of cloud storage resource management with availability guarantees and find the optimal resource allocation in a decentralized way by means of a virtual economy. Data replicas migrate, replicate or delete themselves according to their economic fitness. Our approach responds effectively to sudden load increases or failures and makes best use of the geographical distance between nodes to improve application-specific data availability. 
We then propose a decentralized approach for adaptive management of computational resources for applications requiring high availability and performance guarantees under load spikes, sudden failures or cloud resource updates. Our approach involves a virtual economy among service components (similar to the one among data replicas) and an innovative cascading scheme for setting up the performance goals of individual components so as to meet the overall application requirements. Our approach manages to meet application requirements with the minimum resources, by allocating new ones or releasing redundant ones. Finally, as cloud storage vendors offer online services at different rates, which can vary widely due to second-degree price discrimination, we present an inter-cloud storage resource allocation method that aggregates resources from different storage vendors and provides the user with a system that guarantees the best rate to host and serve its data, while satisfying the user requirements on availability, durability, latency, etc. Our system continuously optimizes the placement of data according to its type and usage pattern, and minimizes migration costs from one provider to another, thereby avoiding vendor lock-in.

    Resolution strategies for serverless computing in information centric networking

    Named Function Networking (NFN) offers to compute and deliver results of computations in the context of Information Centric Networking (ICN). While ICN offers data delivery without specifying the location where these data are stored, NFN offers the production of results without specifying where the actual computation is executed. In NFN, computation workflows are encoded in (ICN style) Interest Messages using the lambda calculus and, based on these workflows, the network will distribute computations and find execution locations. Depending on the use case of the actual network, the decision where to execute a computation can be different: a resolution strategy running on each node decides if a computation should be forwarded, split into subcomputations or executed locally. This work focuses on the design of resolution strategies for selected scenarios and the online derivation of "execution plans" based on network status and history. Starting with a simple resolution strategy suitable for data centers, we focus on improving load distribution within the data center or even between multiple data centers. We have designed resolution strategies that consider the size of input data and the load on nodes, leading to priced execution plans from which one can select the ones with the least costs. Moreover, we use these plans to create execution templates: templates can be used to create a resolution strategy by simulating the execution using the planning system, tailored to the specific use case at hand. Finally, we designed a resolution strategy for edge computing which is able to handle mobile scenarios typical for vehicular networking. This “mobile edge computing resolution strategy” handles the problem of frequent handovers to a sequence of road-side units without creating additional overhead for the non-mobile use case.
All these resolution strategies were evaluated using a simulation system and were compared to the state-of-the-art behavior of data center execution environments and/or cloud configurations. In the case of the vehicular networking strategy, we enhanced existing road-side units and implemented our NFN-based system and plan derivation such that we were able to run and validate our solution in real-world tests for mobile edge computing.

    Service based virtual RAN architecture for next generation cellular systems

    Service based architecture (SBA) is a paradigm shift from Service-Oriented Architecture (SOA) to microservices, combining their principles. Network virtualization enables the application of SBA in cellular systems. To better guide the software design of this virtualized cellular system with SBA, this paper presents a software perspective and a positional approach to using fundamental development principles for adapting SBA in virtualized Radio Access Networks (vRANs). First, we present the motivation for using an SBA in cellular radio systems. Then, we explore the critical requirements, key principles, and components for the software to provide radio services in SBA. We also explore the potential of applying SBA-based Radio Access Network (RAN) by comparing the functional split requirements of 5G RAN with existing open-source software and accelerated hardware implementations of service bus, and discuss the limitations of SBA. Finally, we present some discussions, future directions, and a roadmap of applying such a high-level design perspective of SBA to next-generation RAN infrastructure. This work was supported in part by the European Union (EU) H2020 5GROWTH Project under Grant 856709, in part by the Generalitat de Catalunya under Grant 2017 SGR 1195, and in part by the National Program on Equipment and Scientific and Technical Infrastructure under the European Regional Development Fund (FEDER) under Grant EQC2018-005257-P.

    Towards Energy-Efficient, Fault-Tolerant, and Load-Balanced Mobile Cloud

    Recent advances in mobile technologies have enabled a new computing paradigm in which large amounts of data are generated and accessed from mobile devices. However, running resource-intensive applications (e.g., video/image storage and processing or map-reduce type) on a single mobile device still remains out of reach, since it requires large computation and storage capabilities. Computer scientists overcome this issue by exploiting the abundant computation and storage resources of the traditional cloud to enhance the capabilities of end-user mobile devices. Nevertheless, designs that rely on remote cloud services sometimes overlook the resources available on mobile devices (e.g., storage, communication, and processing). In particular, when the remote cloud services are unavailable (due to service provider or network issues), these smart devices become unusable. For mobile devices deployed in an infrastructureless network where nodes can move, join, or leave the network dynamically, the challenges of energy efficiency, reliability, and load balance are still largely unexplored. This research investigates the challenges and proposes solutions for deploying mobile applications in such environments. In particular, we focus on a distributed data storage and data processing framework for mobile cloud. The proposed mobile cloud computing (MCC) framework provides data storage and data processing services to MCC applications such as video storage and processing or map-reduce type. These services ensure the mobile cloud is energy-efficient, fault-tolerant, and load-balanced by intelligently allocating and managing the stored data and processing tasks, accounting for the limited resources on mobile devices. When considering load balance, the framework also incorporates the heterogeneous characteristics of mobile cloud, in which nodes may have various energy, communication, and processing capabilities.
All the designs are built on the k-out-of-n computing theoretical foundation. The novel formulations produce a reliability-compliant, energy-efficient data storage solution and a deadline-compliant, energy-efficient job scheduler. Given the promising outcomes of this research, a future where mobile cloud offers real-time computation capabilities in complex environments such as disaster relief or war zones is certainly not far off.
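
The k-out-of-n foundation mentioned in this abstract can be illustrated with the textbook reliability formula for a system that works as long as at least k of its n nodes are available. The sketch below assumes independent nodes that share the same availability p, which is a simplification and not necessarily the formulation used in the thesis; the function name and the example values are hypothetical.

```python
from math import comb


def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n independent nodes are available.

    Assumes every node is available with the same probability p and that
    failures are independent (an illustrative simplification).
    """
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Sum the binomial probabilities of exactly i available nodes, i = k..n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Data split so that any 2 of 3 nodes can serve it, each node 90% available.
    print(k_out_of_n_reliability(2, 3, 0.9))
```

Lowering k for a fixed n raises reliability, at the price of storing or computing more per node, which is the kind of trade-off a k-out-of-n design has to balance against energy use.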