2,181 research outputs found

    On the Intrinsic Locality Properties of Web Reference Streams

    There has been considerable work done in the study of Web reference streams: sequences of requests for Web objects. In particular, many studies have looked at the locality properties of such streams, because of the impact of locality on the design and performance of caching and prefetching systems. However, a general framework for understanding why reference streams exhibit given locality properties has not yet emerged. In this work we take a first step in this direction, based on viewing the Web as a set of reference streams that are transformed by Web components (clients, servers, and intermediaries). We propose a graph-based framework for describing this collection of streams and components. We identify three basic stream transformations that occur at nodes of the graph: aggregation, disaggregation and filtering, and we show how these transformations can be used to abstract the effects of different Web components on their associated reference streams. This view allows a structured approach to the analysis of why reference streams show given properties at different points in the Web. Applying this approach to the study of locality requires good metrics for locality. These metrics must meet three criteria: 1) they must accurately capture temporal locality; 2) they must be independent of trace artifacts such as trace length; and 3) they must not involve manual procedures or model-based assumptions. We describe two metrics meeting these criteria that each capture a different kind of temporal locality in reference streams. The popularity component of temporal locality is captured by entropy, while the correlation component is captured by interreference coefficient of variation. We argue that these metrics are more natural and more useful than previously proposed metrics for temporal locality. We use this framework to analyze a diverse set of Web reference traces. 
    We find that this framework can shed light on how and why locality properties vary across different locations in the Web topology. For example, we find that filtering and aggregation have opposing effects on the popularity component of the temporal locality, which helps to explain why multilevel caching can be effective in the Web. Furthermore, we find that all transformations tend to diminish the correlation component of temporal locality, which has implications for the utility of different cache replacement policies at different points in the Web. National Science Foundation (ANI-9986397, ANI-0095988); CNPq-Brazi
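
The two metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes a reference stream is simply a sequence of object identifiers and that inter-reference distance is measured in request positions rather than wall-clock time. The function names, the per-object averaging of the coefficient of variation, and the requirement of at least two gaps per object are choices made for this illustration, not definitions taken from the paper.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev


def popularity_entropy(stream):
    """Shannon entropy (bits) of the empirical object popularity distribution."""
    counts = Counter(stream)
    total = len(stream)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def interreference_cov(stream):
    """Mean per-object coefficient of variation of inter-reference gaps.

    Assumption: gaps are counted in request positions, and only objects
    with at least two gaps contribute to the average.
    """
    last_seen = {}
    gaps = defaultdict(list)
    for position, obj in enumerate(stream):
        if obj in last_seen:
            gaps[obj].append(position - last_seen[obj])
        last_seen[obj] = position
    covs = [pstdev(g) / mean(g) for g in gaps.values() if len(g) >= 2]
    return mean(covs) if covs else float("nan")


if __name__ == "__main__":
    periodic = ["a", "b"] * 4                      # perfectly regular gaps
    bursty = ["a", "a", "a", "b", "b", "b", "a"]   # clustered references
    print(popularity_entropy(periodic), interreference_cov(periodic))
    print(popularity_entropy(bursty), interreference_cov(bursty))
```

Under these assumptions, lower entropy indicates a more skewed popularity distribution, while a larger coefficient of variation indicates burstier (more correlated) references to the same object.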

    GreedyDual-Join: Locality-Aware Buffer Management for Approximate Join Processing Over Data Streams

    We investigate adaptive buffer management techniques for approximate evaluation of sliding window joins over multiple data streams. In many applications, data stream processing systems have limited memory or have to deal with very high speed data streams. In both cases, computing the exact results of joins between these streams may not be feasible, mainly because the buffers used to compute the joins contain far fewer tuples than the sliding windows do. A stream buffer management policy is therefore needed. We show that the buffer replacement policy is an important determinant of the quality of the produced results. To that end, we propose GreedyDual-Join (GDJ), an adaptive and locality-aware buffering technique for managing these buffers. GDJ exploits the temporal correlations (at both long and short time scales), which we found to be prevalent in many real data streams. We note that our algorithm is readily applicable to multiple data streams and multiple joins and requires almost no additional system resources. We report results of an experimental study using both synthetic and real-world data sets. Our results demonstrate the superiority and flexibility of our approach when contrasted to other recently proposed techniques.

    NDN content store and caching policies: performance evaluation

    Among various factors contributing to performance of named data networking (NDN), the organization of caching is a key factor and has benefited from intense study by the networking research community. The performed studies aimed at (1) finding the best strategy to adopt for content caching; (2) specifying the best location, and number of content stores (CS) in the network; and (3) defining the best cache replacement policy. Assessing and comparing the performance of the proposed solutions is as essential as the development of the proposals themselves. The present work aims at evaluating and comparing the behavior of four caching policies (i.e., random, least recently used (LRU), least frequently used (LFU), and first in first out (FIFO)) applied to NDN. Several network scenarios are used for simulation (2 topologies, varying the percentage of nodes of the content stores (5–100), 1 and 10 producers, 32 and 41 consumers). Five metrics are considered for the performance evaluation: cache hit ratio (CHR), network traffic, retrieval delay, interest re-transmissions, and the number of upstream hops. The content request follows the Zipf–Mandelbrot distribution (with skewness factor α=1.1 and α=0.75). LFU presents better performance in all considered metrics, except on the NDN testbed, with 41 consumers, 1 producer and a content request rate of 100 packets/s. For the level of content store from 50% to 100%, LRU presents a notably higher performance. Although the network behavior is similar for both skewness factors, when α=0.75, the CHR is significantly reduced, as expected. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020
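
As a rough illustration of the kind of comparison this abstract describes, the sketch below measures the cache hit ratio of the four replacement policies on a single cache fed by Zipf–Mandelbrot requests. It is not the paper's ndnSIM setup: there is no network topology, the function names are hypothetical, and the catalogue size, cache capacity, request count, and shift parameter `q` are assumptions chosen for the example. The LFU variant here only counts hits while an item is cached.

```python
import random
from collections import OrderedDict

POLICIES = ("random", "lru", "lfu", "fifo")


def zipf_mandelbrot_weights(n_items, alpha, q=0.0):
    """Unnormalised weights 1 / (rank + q)^alpha for ranks 1..n_items."""
    return [1.0 / (rank + q) ** alpha for rank in range(1, n_items + 1)]


def hit_ratio(requests, capacity, policy, rng):
    """Fraction of requests served from a single cache of the given capacity."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    # Maps item -> hit count. Order is insertion order, except under LRU,
    # where a hit moves the item to the most-recent end.
    cache = OrderedDict()
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache[item] += 1
            if policy == "lru":
                cache.move_to_end(item)
            continue
        if len(cache) >= capacity:
            if policy in ("fifo", "lru"):
                cache.popitem(last=False)          # evict oldest / least recent
            elif policy == "lfu":
                del cache[min(cache, key=cache.get)]  # evict least frequently hit
            else:
                del cache[rng.choice(list(cache))]    # evict a random item
        cache[item] = 1
    return hits / len(requests)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for alpha in (1.1, 0.75):  # skewness factors mentioned in the abstract
        weights = zipf_mandelbrot_weights(1000, alpha)
        requests = rng.choices(range(1000), weights=weights, k=20000)
        for policy in POLICIES:
            print(alpha, policy, round(hit_ratio(requests, 50, policy, rng), 3))
```

A single-cache toy like this cannot reproduce metrics such as upstream hops or interest re-transmissions; it only shows how the replacement policy and the skewness factor interact in the hit ratio.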

    Distributed Selfish Caching

    Although cooperation generally increases the amount of resources available to a community of nodes, thus improving individual and collective performance, it also allows for the appearance of potential mistreatment problems through the exposition of one node's resources to others. We study such concerns by considering a group of independent, rational, self-aware nodes that cooperate using on-line caching algorithms, where the exposed resource is the storage at each node. Motivated by content networking applications -- including web caching, CDNs, and P2P -- this paper extends our previous work on the on-line version of the problem, which was conducted under a game-theoretic framework, and limited to object replication. We identify and investigate two causes of mistreatment: (1) cache state interactions (due to the cooperative servicing of requests) and (2) the adoption of a common scheme for cache management policies. Using analytic models, numerical solutions of these models, as well as simulation experiments, we show that on-line cooperation schemes using caching are fairly robust to mistreatment caused by state interactions. To appear in a substantial manner, the interaction through the exchange of miss-streams has to be very intense, making it feasible for the mistreated nodes to detect and react to exploitation. This robustness ceases to exist when nodes fetch and store objects in response to remote requests, i.e., when they operate as Level-2 caches (or proxies) for other nodes. Regarding mistreatment due to a common scheme, we show that this can easily take place when the "outlier" characteristics of some of the nodes get overlooked. This finding underscores the importance of allowing cooperative caching nodes the flexibility of choosing from a diverse set of schemes to fit the peculiarities of individual nodes. To that end, we outline an emulation-based framework for the development of mistreatment-resilient distributed selfish caching schemes. 
    Our framework utilizes a simple control-theoretic approach to dynamically parameterize the cache management scheme. We show performance evaluation results that quantify the benefits from instantiating such a framework, which could be substantial under skewed demand profiles. National Science Foundation (CNS Cybertrust 0524477, CNS NeTS 0520166, CNS ITR 0205294, EIA RI 0202067); EU IST (CASCADAS and E-NEXT); Marie Curie Outgoing International Fellowship of the EU (MOIF-CT-2005-007230

    Performance Evaluation of Caching Policies in NDN - an ICN Architecture

    Information Centric Networking (ICN) advocates the philosophy of accessing content independent of its location. Owing to this location independence, routers en route can be enabled to cache content and serve future requests for the same content locally. Several ICN architectures have been proposed in the literature, along with various algorithms for caching and cache replacement at the routers en route. The aim of this paper is to critically evaluate various caching policies using Named Data Networking (NDN), an ICN architecture proposed in the literature. We present a performance comparison of different caching policies, namely First In First Out (FIFO), Least Recently Used (LRU), and Universal Caching (UC), in two network models: the Watts-Strogatz (WS) model (suitable for dense short link networks such as sensor networks) and the Sprint topology (better suited for large Internet Service Provider (ISP) networks), using ndnSIM, an ns3 based discrete event simulator for the NDN architecture. Our results indicate that UC outperforms the other caching policies, LRU and FIFO, which makes UC a better alternative for both sensor networks and ISP networks.