    Scalable consistency maintenance in content distribution networks using cooperative leases

    Policy-Based Dynamic Proxy Framework: An Application Level Infrastructure For Active Service Creation And Contents Delivery

    This thesis focuses on a dynamic proxy framework named the Chek Proxy Framework (CPF). The main objectives are to scale the existing Internet architecture by conserving backbone bandwidth, reducing server load, and improving overall network performance, particularly the client receiving rate. These objectives were achieved by deploying application-level proxy services within the network to accelerate and customise the delivery of content. The CPF is based on a 3-tier distributed computing architecture, with the client and server residing at the ends of their respective networks. A dynamically appointed middle-tier system, the Dynamic Application Proxy Server (DAPS), is created on demand and resides in the client-side network, based on the designed clustering policy. The uniqueness of the CPF lies in its use of voluntary client machines, instead of static and dedicated machines, to host DAPS services created at runtime.

    Latency-driven replication for globally distributed systems

    Steen, M.R. van [Promotor]; Pierre, G.E.O. [Copromotor]

    Hierarchical network topographical routing

    Within the last 10 years, the content consumption model that underlies many of the assumptions about traffic aggregation within the Internet has changed: the previous pattern of short burst transfers followed by longer periods of inactivity, which allowed for statistical aggregation of traffic, has been increasingly replaced by continuous data transfer models. Approaching this issue from a clean-slate perspective, this work looks at the design of a network routing structure and supporting protocols for assisting in the delivery of large-scale content services. Rather than approaching a content support model through existing IP models, the work takes a fresh look at Internet routing through a hierarchical model in order to highlight the benefits that can be gained with a new structural Internet or through similar modifications to the existing IP model. The work is divided into three major sections: an investigation of the existing UK-based Internet structure as compared to the traditional Autonomous System (AS) Internet structural model; a localised hierarchical network topographical routing model; and intelligent distributed localised service models. The work begins by looking at the United Kingdom (UK) Internet structure as an example of a current-generation technical and economic model with shared access to last-mile connectivity and a large-scale wholesale network between Internet Service Providers (ISPs) and the end user. This model, combined with Internet Protocol (IP) address allocation and the transparency of the wholesale network, results in an enforced inefficiency within the overall network that restricts the ability of ISPs to collaborate. From this model, a core/edge separation hierarchical virtual-tree-based routing protocol based on the physical network topography (layers 2 and 3) is developed to remove this enforced inefficiency by allowing direct management and control at the lowest levels of the network. This model acts as the base layer for further distributed intelligent services, such as management and content delivery, to enable both ISPs and third parties to actively collaborate and provide content from the most efficient source.

    A data-oriented network architecture

    In the 25 years since becoming commercially available, the Internet has grown into a global communication infrastructure connecting a significant part of mankind and has become an important part of modern society. Its impressive growth has been fostered by innovative applications, many of which were completely unforeseen by the Internet's inventors. While fully acknowledging the ingenuity and creativity of application designers, it is equally impressive how little the core architecture of the Internet has evolved during this time. However, the ever-evolving applications and growing importance of the Internet have resulted in increasing discordance between the Internet's current use and its original design. In this thesis, we focus on four sources of discomfort caused by this divergence. First, the Internet was developed around host-to-host applications, such as telnet and ftp, but the vast majority of its current usage is service access and data retrieval. Second, while the freedom to connect from any host to any other host was a major factor behind the success of the Internet, it provides little protection for connected hosts today. As a result, distributed denial-of-service attacks against Internet services have become a common nuisance and are difficult to resolve within the current architecture. Third, Internet connectivity is becoming nearly ubiquitous and increasingly reaches mobile devices. Moreover, connectivity is expected to extend its reach to even the most extreme places. Hence, applications' view of the network has changed radically: it is commonplace that they are offered intermittent connectivity at best and are required to be smart enough to use heterogeneous network technologies. Finally, modern networks deploy so-called middleboxes both to improve performance and to provide protection. However, when doing so, the middleboxes have to impose themselves between the communication end-points, which is against the design principles of the original Internet and a source of complications both for the management of networks and for the design of application protocols. In this thesis, we design a clean-slate network architecture that is a better fit for the current use of the Internet. We present a name resolution system based on name-based routing. It matches the service-access and data-retrieval oriented usage of the Internet, and takes the network-imposed middleboxes properly into account. We then propose modest addressing-related changes to the network layer as a remedy for denial-of-service attacks. Finally, we take steps towards a data-oriented communications API that provides better decoupling for applications from the network stack than the original Sockets API does. The improved decoupling both simplifies applications and allows them to be unaffected by evolving network technologies: in this architecture, coping with intermittent connectivity and heterogeneous network technologies is a burden of the network stack.

    Cooperative caching for object storage

    Data is increasingly stored in data lakes, vast immutable object stores that can be accessed from anywhere in the data center. By providing low-cost and scalable storage, immutable object-storage based data lakes are today used by a wide range of applications with diverse access patterns. Unfortunately, performance can suffer for applications that do not match the access patterns for which the data lake was designed. Moreover, in many of today's (non-hyperscale) data centers, limited bisectional bandwidth will limit data lake performance. Today many computer clusters integrate caches both to address the mismatch between application performance requirements and the capabilities of the shared data lake, and to reduce the demand on the data center network. However, per-cluster caching i) means the expensive cache resources cannot be shifted between clusters based on demand, ii) makes sharing expensive because data accessed by multiple clusters is independently cached by each of them, and iii) makes it difficult for clusters to grow and shrink if their servers are being used to cache storage. In this dissertation, we present two novel data-center wide cooperative cache architectures, Datacenter-Data-Delivery Network (D3N) and Directory-Based Datacenter-Data-Delivery Network (D4N), that are designed to be part of the data lake itself rather than part of the computer clusters that use it. D3N and D4N distribute caches across the data center to enable data sharing and elasticity of cache resources, with requests transparently directed to nearby cache nodes. They dynamically adapt to changes in access patterns and accelerate workloads while providing the same consistency, trust, availability, and resilience guarantees as the underlying data lake. We find that exploiting the immutability of object stores significantly reduces complexity and provides opportunities for cache management strategies that were not feasible for previous cooperative cache systems for file- or block-based storage. D3N is a multi-layer cooperative cache that targets workloads with large read-only datasets, such as big data analytics. It is designed to be easily integrated into existing data lakes, with only limited support for write caching of intermediate data, and avoids any global state by, for example, using consistent hashing for locating blocks and making all caching decisions based purely on local information. Our prototype is performant enough to fully exploit the (5 GB/s read) SSDs and (40 Gbit/s) NICs in our system and improves the runtime of realistic workloads by up to 3x. The simplicity of D3N has enabled us, in collaboration with industry partners, to upstream the two-layer version of D3N into the existing code base of the Ceph object store as a new experimental feature, making it available to the many data lakes around the world based on Ceph. D4N is a directory-based cooperative cache that provides a reliable write tier and a distributed directory that maintains global state. It explores the use of global state to implement more sophisticated cache management policies and enables application-specific tuning of caching policies to support a wider range of applications than D3N. In contrast to previous cache systems that implement their own mechanism for maintaining dirty data redundantly, D4N re-uses the existing data lake (Ceph) software for implementing a write tier and exploits the semantics of immutable objects to move aged objects to the shared data lake. This design greatly reduces the barrier to adoption and enables D4N to take advantage of sophisticated data lake features such as erasure coding. We demonstrate that D4N is performant enough to saturate the bandwidth of the SSDs, that it automatically adapts replication to the working set of the demands, and that it outperforms the state-of-the-art cluster cache Alluxio. While it will be substantially more complicated to integrate the D4N prototype into production-quality code that can be adopted by the community, these results are compelling enough that our partners are starting that effort. D3N and D4N demonstrate that cooperative caching techniques, originally designed for file systems, can be employed to integrate caching into today's immutable object-based data lakes. We find that the properties of immutable object storage greatly simplify the adoption of these techniques and enable integration of caching in a fashion that allows re-use of existing battle-tested software, greatly reducing the barrier to adoption. By integrating the caching in the data lake, and not the compute cluster, this research opens the door to efficient data-center wide sharing of data and resources.
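
    The abstract above mentions consistent hashing as the way D3N locates blocks without global state. The following is a minimal, generic sketch of that technique, not the D3N or Ceph implementation: the class name `HashRing`, the `vnodes` parameter, the node names, and the block-key format are all hypothetical choices made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (first 64 bits of its SHA-256 digest)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring (illustrative; names are hypothetical)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes   # virtual points per node, smooths the load
        self._points = []       # sorted ring positions
        self._owner = {}        # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` virtual points for this cache node on the ring."""
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        """Drop every virtual point that belongs to this cache node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def locate(self, block_key):
        """Return the node owning the first ring point after the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(block_key))
        return self._owner[self._points[idx % len(self._points)]]
```

    Every client that knows the same node list computes the same owner for a block key using only local information, and removing a node remaps only the keys that node owned. For example, `HashRing(["cache-a", "cache-b", "cache-c"]).locate("bucket/object:0")` returns one of the three assumed node names.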

    Video-on-Demand over Internet: a survey of existing systems and solutions

    Video-on-Demand is a service where movies are delivered to distributed users with low delay and free interactivity. The traditional client/server architecture experiences scalability issues when providing video streaming services, so there have been many proposals for systems, mostly based on a peer-to-peer or hybrid server/peer-to-peer solution, to solve this issue. This work presents a survey of currently existing or proposed systems and solutions, based upon a subset of representative systems, and defines selection criteria for classifying these systems. These criteria are based on common questions such as: is it video-on-demand or live streaming; is the architecture based on a content delivery network, peer-to-peer, or both; is the delivery overlay tree-based or mesh-based; is the system push-based or pull-based, single-stream or multi-stream; does it use data coding; and how do the clients choose their peers. Representative systems are briefly described to give a summarized overview of the proposed solutions, and four are analyzed in detail. Finally, an attempt is made to evaluate the most promising solutions for future experiments. Résumé (translated from French): Video-on-demand is a service in which films are delivered remotely to users with ...