13 research outputs found

    Supporting Practical Content-Addressable Caching with CZIP Compression

    Content-based naming (CBN) enables content sharing across similar files by breaking files into position-independent chunks and naming these chunks using hashes of their contents. While a number of research systems have recently used custom CBN approaches internally to good effect, there has not yet been any mechanism to use CBN in a general-purpose way. In this paper, we demonstrate a practical approach to applying CBN without requiring disruptive changes to end systems. We develop CZIP, a CBN compression scheme which reduces data sizes by eliminating redundant chunks, compresses chunks using existing schemes, and facilitates sharing within files, across files, and across machines by explicitly exposing CBN chunk hashes. CZIP-aware caching systems can exploit the CBN information to reduce storage space, reduce bandwidth consumption, and increase performance, while content providers and middleboxes can selectively encode their most suitable content. We show that CZIP compares well to standalone compression schemes, that a CBN cache for CZIP is easily implemented, and that a CZIP-aware CDN produces significant benefits.
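
The chunk-and-name idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the CZIP format or the authors' implementation. The window size, the boundary mask, SHA-256 for chunk names, zlib as the "existing scheme", and the helper names `split_chunks`, `encode`, and `decode` are all assumptions made for illustration. A production system would normally use a rolling fingerprint such as Rabin's instead of rehashing each window.

```python
import hashlib
import zlib

WINDOW = 16   # bytes examined when testing for a boundary (assumed value)
MASK = 0x3F   # boundary when the low 6 bits are zero (assumed value)


def split_chunks(data: bytes) -> list[bytes]:
    """Cut data where the last WINDOW bytes hash to a marker value.

    A boundary depends only on nearby content, never on the absolute
    offset, so inserting bytes earlier in the file shifts the later
    boundaries along with the content instead of moving them.
    """
    chunks = []
    start = 0
    for i in range(WINDOW - 1, len(data)):
        window = data[i + 1 - WINDOW:i + 1]
        fingerprint = int.from_bytes(hashlib.sha256(window).digest()[:4], "big")
        if fingerprint & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def encode(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Return the chunk names for data, storing each new chunk compressed."""
    names = []
    for chunk in split_chunks(data):
        name = hashlib.sha256(chunk).hexdigest()  # the content-based name
        if name not in store:                     # redundant chunks are skipped
            store[name] = zlib.compress(chunk)
        names.append(name)
    return names


def decode(names: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the original bytes from chunk names and a chunk store."""
    return b"".join(zlib.decompress(store[name]) for name in names)
```

Encoding a file and then a copy with a few bytes inserted at the front into the same `store` adds only the chunks near the insertion. Every chunk after the first boundary keeps its name, which is the kind of sharing a cache keyed on chunk hashes can exploit.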

    Towards understanding modern web traffic

    As Web sites move from relatively static displays of simple pages to rich media applications with heavy client-side interaction, the nature of the resulting Web traffic changes as well. Understanding this change is necessary in order to improve response time, evaluate caching effectiveness, and design intermediary systems, such as firewalls, security analyzers, and reporting/management systems. Unfortunately, we have little understanding of the underlying nature of today’s Web traffic. In this paper, we analyze five years (2006-2010) of real Web traffic from a globally-distributed proxy system, which captures the browsing behavior of over 70,000 daily users from 187 countries. Using this data set, we examine major changes in Web traffic characteristics that occurred during this period. We also present a new Web page analysis algorithm that is better suited for modern Web page interactions by grouping requests into streams and exploiting the structure of the pages. Using this algorithm, we analyze various aspects of page-level changes, and characterize modern Web pages. Finally, we investigate the redundancy of this traffic, using both traditional object-level caching and content-based approaches.

    Towards understanding developing world traffic

    While many projects aim to provide network access to the developing world or improve existing network access, relatively little data exists on the characteristics of traffic in these environments. In this paper, we provide a first glimpse into the traffic gathered by a worldwide proxy network, and try to observe differences between first-world and developing-world traffic characteristics. What sets this work apart from similar research is the scope and level of detail – we capture more than 3TB of content representing one week’s browsing by 348K users across 190 countries. Capturing the content, rather than just access logs, also allows us to perform similarity analysis at the content level.

    Wide-area Network Acceleration for the Developing World

    Wide-area network (WAN) accelerators operate by compressing redundant network traffic from point-to-point communications, enabling higher effective bandwidth. Unfortunately, while network bandwidth is scarce and expensive in the developing world, current WAN accelerators are designed for enterprise use, and are a poor fit in these environments. We present Wanax, a WAN accelerator designed for developing-world deployments. It uses a novel multi-resolution chunking (MRC) scheme that provides high compression rates and high disk performance for a variety of content, while using much less memory than existing approaches. Wanax exploits the design of MRC to perform intelligent load shedding to maximize throughput when running on resource-limited shared platforms. Finally, Wanax exploits the mesh network environments being deployed in the developing world, instead of just the star topologies common in enterprise branch offices.