4,489 research outputs found

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Repor

    Exploiting Traffic Balancing and Multicast Efficiency in Distributed Video-on-Demand Architectures

    Distributed Video-on-Demand (DVoD) systems are proposed as a solution to the limited streaming capacity and lack of scalability of centralized systems. In a previous work, we proposed a fully distributed large-scale VoD architecture, called Double P-Tree, which has shown itself to be a good approach to the design of flexible and scalable DVoD systems. In this paper, we present relevant design aspects related to video mapping and traffic balancing in order to improve the performance of the Double P-Tree architecture. Our simulation results demonstrate that these techniques yield a more efficient system and considerably increase its streaming capacity. The results also show the crucial importance of topology connectivity in improving multicasting performance in DVoD systems. Finally, a comparison among several DVoD architectures was performed using simulation, and the results show that the Double P-Tree architecture incorporating mapping and load balancing policies outperforms similar DVoD architectures. This work was supported by the MCyT-Spain under contract TIC 2001-2592 and partially supported by the Generalitat de Catalunya, Grup de Recerca Consolidat 2001SGR-00218.

    Web Replica Hosting Systems

    Impact of QoS on Replica Placement in Tree Networks

    This paper discusses and compares several policies to place replicas in tree networks, subject to server capacity and QoS constraints. The client requests are known beforehand, while the number and location of the servers are to be determined. We study three strategies. The first two strategies assign each client to a unique server, while the third allows the requests of a client to be processed by multiple servers. The main contribution of this paper is to assess the impact of QoS constraints on the total replication cost. We establish the NP-completeness of the problem on homogeneous networks when the requests of a given client can be processed by multiple servers. We provide several efficient polynomial heuristic algorithms for NP-complete instances of the problem on heterogeneous platforms. These heuristics are compared to the optimal solution obtained by formulating the problem as an integer linear program.

    Revisiting core traffic growth in the presence of expanding CDNs

    Traffic growth forecasts announce a dramatic future for core networks, which are struggling to keep pace with traffic growth. Internet traffic growth primarily stems from the proliferation of cloud services and the massive amounts of data distributed by the content delivery networks (CDNs) hosting these services. In this paper, we investigate the evolution of core traffic in the presence of growing CDNs. Expanding the capacities of existing data centers (DCs) directly translates the forecasted compound-annual-growth-rate (CAGR) of user traffic to the CAGR of carried core link traffic. On the other hand, expanding CDNs by building new geographically dispersed DCs can significantly reduce the predicted core traffic growth rates by placing content closer to the users. However, reducing DC-to-user traffic by building new DCs comes at the cost of increased inter-DC content synchronization traffic. Thus, the resulting overall core traffic growth will depend on the types of services supported and their associated synchronization requirements. We present a long-term evolution study to assess the implications of different CDN expansion strategies on core network traffic growth, considering a mix of services in proportions and growth rates corresponding to well-known traffic forecasts. Our simulations indicate that CDNs may have significant incentive to build more DCs, depending on the service types they offer, and that current alarming traffic predictions may be somewhat overestimated in core networks in the presence of expanding CDNs. (C) 2019 The Authors. Published by Elsevier B.V. The research leading to these results has received funding from the European Commission for the H2020-ICT-2016-2 METRO-HAUL project (G.A. 761727) and it has been partially funded by the Spanish national project ONOFRE-2 (TEC2017-84423-C3-1-P, MINECO/AEI/FEDER, UE).

    Scalable download protocols

    Scalable on-demand content delivery systems, designed to effectively handle increasing request rates, typically use service aggregation or content replication techniques. Service aggregation relies on one-to-many communication techniques, such as multicast, to efficiently deliver content from a single sender to multiple receivers. With replication, multiple geographically distributed replicas of the service or content share the load of processing client requests and enable delivery from a nearby server. Previous scalable protocols for downloading large, popular files from a single server include batching and cyclic multicast. Analytic lower bounds developed in this thesis show that neither of these protocols consistently yields performance close to optimal. New hybrid protocols are proposed that achieve within 20% of the optimal delay in homogeneous systems, as well as within 25% of the optimal maximum client delay in all heterogeneous scenarios considered. In systems utilizing both service aggregation and replication, well-designed policies determining which replica serves each request must balance the objectives of achieving high locality of service, and high efficiency of service aggregation. By comparing classes of policies, using both analysis and simulations, this thesis shows that there are significant performance advantages in using current system state information (rather than only proximities and average loads) and in deferring selection decisions when possible. Most of these performance gains can be achieved using only “local” (rather than global) request information. Finally, this thesis proposes adaptations of already proposed peer-assisted download techniques to support a streaming (rather than download) service, enabling playback to begin well before the entire media file is received. These protocols split each file into pieces, which can be downloaded from multiple sources, including other clients downloading the same file. Using simulations, a candidate protocol is presented and evaluated. The protocol includes both a piece selection technique that effectively mediates the conflict between achieving high piece diversity and the in-order requirements of media file playback, as well as a simple on-line rule for deciding when playback can safely commence.