On the Convergence of the TTL Approximation for an LRU Cache under Independent Stationary Request Processes
The modeling and analysis of an LRU cache is extremely challenging, as exact results for the main performance metrics (e.g., hit rate) are either lacking or cannot be used because of their high computational complexity for large caches. Recently, a TTL-based approximation has been developed for requests described by various workload models and numerically demonstrated to be accurate. The theory for such an approximation, however, is not yet fully developed. In this paper we provide theoretical justification for the approximation in the case where distinct contents are described by independent stationary and ergodic processes. We show that this approximation is exact as the cache size and the number of contents go to infinity. This extends earlier results for the independent reference model. Moreover, we establish results not only for the aggregate cache hit probability but also for every individual content. Last, we obtain bounds on the rate of convergence.
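To make the TTL approximation concrete, here is a minimal sketch for the simplest setting, the independent reference model with Poisson requests, which is narrower than the stationary ergodic setting the abstract covers. In that setting the approximation replaces the LRU cache of size C by a TTL cache whose timer T (the characteristic time) solves sum_i (1 - exp(-rate_i * T)) = C, and content i then has hit probability 1 - exp(-rate_i * T). The function names and the Zipf example workload are illustrative assumptions, not taken from the paper.

```python
import math


def characteristic_time(rates, cache_size, tol=1e-10):
    """Solve sum_i (1 - exp(-rate_i * T)) = cache_size for T by bisection.

    Assumes Poisson requests with strictly positive rates (an assumption of
    this sketch) and a cache strictly smaller than the catalog.
    """
    if any(r <= 0 for r in rates):
        raise ValueError("all request rates must be positive")
    if not 0 < cache_size < len(rates):
        raise ValueError("cache_size must lie strictly between 0 and the catalog size")

    def expected_occupancy(t):
        # Expected number of contents whose TTL timer is still running.
        return sum(1.0 - math.exp(-r * t) for r in rates)

    lo, hi = 0.0, 1.0
    while expected_occupancy(hi) < cache_size:  # grow the bracket until it contains T
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if expected_occupancy(mid) < cache_size:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def hit_probabilities(rates, cache_size):
    """Return (T, per-content hit probabilities, aggregate hit probability)."""
    t = characteristic_time(rates, cache_size)
    per_content = [1.0 - math.exp(-r * t) for r in rates]
    aggregate = sum(r * h for r, h in zip(rates, per_content)) / sum(rates)
    return t, per_content, aggregate


# Hypothetical workload: 1000 contents with Zipf(0.8) popularity, cache of 100.
rates = [1.0 / (i ** 0.8) for i in range(1, 1001)]
t, per_content, aggregate = hit_probabilities(rates, cache_size=100)
print(f"characteristic time: {t:.3f}, aggregate hit probability: {aggregate:.3f}")
```

Bisection is used because the expected occupancy is increasing in T, so the root is unique and easy to bracket.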
Adaptive TTL-Based Caching for Content Delivery
Content Delivery Networks (CDNs) deliver a majority of the user-requested
content on the Internet, including web pages, videos, and software downloads. A
CDN server caches and serves the content requested by users. Designing caching
algorithms that automatically adapt to the heterogeneity, burstiness, and
non-stationary nature of real-world content requests is a major challenge and
is the focus of our work. While there is much work on caching algorithms for
stationary request traffic, the work on non-stationary request traffic is very
limited. Consequently, most prior models are inaccurate for production CDN
traffic that is non-stationary.
We propose two TTL-based caching algorithms and provide provable guarantees
for content request traffic that is bursty and non-stationary. The first
algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic
approximation approach. Given a feasible target hit rate, we show that the hit
rate of d-TTL converges to its target value for a general class of bursty
traffic that allows Markov dependence over time and non-stationary arrivals.
The second algorithm called f-TTL uses two caches, each with its own TTL. The
first-level cache adaptively filters out non-stationary traffic, while the
second-level cache stores frequently-accessed stationary traffic. Given
feasible targets for both the hit rate and the expected cache size, f-TTL
asymptotically achieves both targets. We implement d-TTL and f-TTL and evaluate
both algorithms using an extensive nine-day trace consisting of 500 million
requests from a production CDN server. We show that both d-TTL and f-TTL
converge to their hit rate targets with an error of about 1.3%. However, f-TTL
requires a significantly smaller cache size than d-TTL to achieve the same hit
rate, since it effectively filters out the non-stationary traffic for
rarely-accessed objects.
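The idea of adapting a TTL by stochastic approximation can be illustrated with a short sketch. This is not the authors' d-TTL algorithm: it assumes a single shared TTL, a constant step size, and a timer that is reset on every request, and the class and parameter names are hypothetical. The update raises the TTL after a miss and lowers it after a hit, so its expected drift is zero when the observed hit rate equals the target.

```python
class AdaptiveTTLCache:
    """Simplified TTL cache whose timer tracks a target hit rate (illustrative only)."""

    def __init__(self, target_hit_rate, step=1.0, initial_ttl=1.0, max_ttl=1e6):
        if not 0.0 < target_hit_rate < 1.0:
            raise ValueError("target_hit_rate must be in (0, 1)")
        self.target = target_hit_rate
        self.step = step          # constant step size (an assumption of this sketch)
        self.ttl = initial_ttl
        self.max_ttl = max_ttl
        self.expiry = {}          # key -> time at which the cached copy expires

    def request(self, key, now):
        """Process a request at time `now`; return True on a hit, False on a miss."""
        hit = self.expiry.get(key, float("-inf")) > now
        # Stochastic-approximation step: +step*target on a miss,
        # -step*(1 - target) on a hit, then project onto [0, max_ttl].
        self.ttl += self.step * (self.target - (1.0 if hit else 0.0))
        self.ttl = min(max(self.ttl, 0.0), self.max_ttl)
        # Reset the timer on every request. Expired entries are simply
        # overwritten later; a real cache would also evict them.
        self.expiry[key] = now + self.ttl
        return hit


cache = AdaptiveTTLCache(target_hit_rate=0.5)
first = cache.request("a", now=0.0)    # miss: the cache is empty
second = cache.request("a", now=1.0)   # hit: the copy expires at time 1.5
third = cache.request("a", now=5.0)    # miss: the copy expired at time 2.0
```

A decreasing step size is the usual choice when convergence guarantees are wanted; the constant step here only keeps the example short.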
Cache Miss Estimation for Non-Stationary Request Processes
The aim of the paper is to evaluate the miss probability of a Least Recently
Used (LRU) cache, when it is offered a non-stationary request process given by
a Poisson cluster point process. First, we construct a probability space using
Palm theory, describing how to consider a tagged document with respect to the
rest of the request process. This framework allows us to derive a general
integral formula for the expected number of misses of the tagged document.
Then, we consider the limit when the cache size and the arrival rate go to
infinity proportionally, and use the integral formula to derive an asymptotic
expansion of the miss probability in powers of the inverse of the cache size.
This enables us to quantify and improve the accuracy of the so-called Che
approximation.
Exact Analysis of TTL Cache Networks: The Case of Caching Policies driven by Stopping Times
TTL caching models have recently regained significant research interest,
largely due to their ability to fit popular caching policies such as LRU. This
paper advances the state-of-the-art analysis of TTL-based cache networks by
developing two exact methods with orthogonal generality and computational
complexity. The first method generalizes existing results for line networks
under renewal requests to the broad class of caching policies whereby evictions
are driven by stopping times. The obtained results are further generalized,
using the second method, to feedforward networks with Markov arrival processes
(MAP) requests. MAPs are particularly suitable for non-line networks because
they are closed not only under superposition and splitting, as known, but also
under input-output caching operations as proven herein for phase-type TTL
distributions. The crucial benefit of the two closure properties is that they
jointly enable the first exact analysis of feedforward networks of TTL caches
in great generality.
Jointly Optimal Routing and Caching for Arbitrary Network Topologies
We study a problem of fundamental importance to ICNs, namely, minimizing
routing costs by jointly optimizing caching and routing decisions over an
arbitrary network topology. We consider both source routing and hop-by-hop
routing settings. The respective offline problems are NP-hard. Nevertheless, we
show that there exist polynomial time approximation algorithms producing
solutions within a constant approximation from the optimal. We also produce
distributed, adaptive algorithms with the same approximation guarantees. We
simulate our adaptive algorithms over a broad array of different topologies.
Our algorithms reduce routing costs by several orders of magnitude compared to
prior art, including algorithms optimizing caching under fixed routing.
Comment: This is the extended version of the paper "Jointly Optimal Routing
and Caching for Arbitrary Network Topologies", appearing in the 4th ACM
Conference on Information-Centric Networking (ICN 2017), Berlin, Sep. 26-28,
201
An approximate analysis of heterogeneous and general cache networks
In this paper, we propose approximate models to assess the performance of a cache network with arbitrary topology where nodes run the Least Recently Used (LRU), First-In First-Out (FIFO), or Random (RND) replacement policies on caches of arbitrary size. Our model takes advantage of the notions of cache characteristic time and Time-To-Live (TTL)-based cache to develop a unified framework for approximating metrics of interest of interconnected caches. Our approach is validated through event-driven simulations and, when possible, compared to the existing a-NET model [23].
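As a rough illustration of how one characteristic-time framework can cover several replacement policies, the sketch below treats a single cache under Poisson requests, not the network model of the paper. It uses the standard approximations from the literature: 1 - exp(-rate * T) for LRU (timer reset on each hit) and rate * T / (1 + rate * T) for FIFO and RND (timer not reset). Function names and the workload are assumptions made for this example.

```python
import math


def lru_hit(rate, t):
    # TTL cache with timer reset on every request (models LRU).
    return 1.0 - math.exp(-rate * t)


def fifo_or_random_hit(rate, t):
    # TTL cache whose timer is not reset on hits (models FIFO and RND).
    x = rate * t
    return x / (1.0 + x)


def solve_characteristic_time(rates, cache_size, hit_fn, iterations=200):
    """Find T such that sum_i hit_fn(rate_i, T) equals cache_size, by bisection."""
    if any(r <= 0 for r in rates) or not 0 < cache_size < len(rates):
        raise ValueError("need positive rates and 0 < cache_size < catalog size")

    def occupancy(t):
        return sum(hit_fn(r, t) for r in rates)

    lo, hi = 0.0, 1.0
    while occupancy(hi) < cache_size:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if occupancy(mid) < cache_size:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def aggregate_hit(rates, cache_size, hit_fn):
    """Request-weighted hit probability under the chosen policy model."""
    t = solve_characteristic_time(rates, cache_size, hit_fn)
    return sum(r * hit_fn(r, t) for r in rates) / sum(rates)


# Hypothetical workload: 500 contents with Zipf(1.0) popularity, cache of 50.
rates = [1.0 / i for i in range(1, 501)]
lru = aggregate_hit(rates, 50, lru_hit)
fifo = aggregate_hit(rates, 50, fifo_or_random_hit)
print(f"LRU: {lru:.3f}  FIFO/RND: {fifo:.3f}")
```

Passing the per-content hit function as a parameter keeps the solver identical across policies, which mirrors the unified treatment the abstract describes.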
Similarity Caching: Theory and Algorithms
This paper focuses on similarity caching systems, in which a user request for an object o that is not in the cache can be (partially) satisfied by a similar stored object o′, at the cost of a loss of user utility. Similarity caching systems can be effectively employed in several application areas, like multimedia retrieval, recommender systems, genome study, and machine learning training/serving. However, despite their relevance, the behavior of such systems is far from being well understood. In this paper, we provide a first comprehensive analysis of similarity caching in the offline, adversarial, and stochastic settings. We show that similarity caching raises significant new challenges, for which we propose the first dynamic policies with some optimality guarantees. We evaluate the performance of our schemes under both synthetic and real request traces.
Parallel Simulation of Very Large-Scale General Cache Networks
In this paper we propose a methodology for the study of general cache networks, which is intrinsically scalable and amenable to parallel execution. We contrast two techniques: one that slices the network, and another that slices the content catalog. In the former, each core simulates requests for the whole catalog on a subgraph of the original topology, whereas in the latter each core simulates requests for a portion of the original catalog on a replica of the whole network. Interestingly, we find that with network slicing, as the number of cores increases (and with it the split ratio of the network topology), the overhead of the message passing required to keep consistency among nodes actually offsets any benefit from the parallelization. This is strictly due to the correlation among neighboring caches: requests arriving at a cache allocated on one core may depend on the status of one or more caches allocated on different cores. Even more interestingly, we find that the newly proposed catalog slicing, on the contrary, achieves an ideal speedup in the number of cores. Overall, our system, which we make available as open source software, enables performance assessment of large-scale general cache networks, i.e., comprising hundreds of nodes, trillions of contents, and complex routing and caching algorithms, in minutes of CPU time and with very little memory.