2 research outputs found
On Resource Pooling and Separation for LRU Caching
Caching systems using the Least Recently Used (LRU) principle have now become
ubiquitous. A fundamental question for these systems is whether the cache space
should be pooled together or divided to serve multiple flows of data item
requests in order to minimize the miss probabilities. In this paper, we show
that there is no straight yes or no answer to this question, depending on
complex combinations of critical factors, including, e.g., request rates,
overlapped data items across different request flows, data item popularities
and their sizes. Specifically, we characterize the asymptotic miss
probabilities for multiple competing request flows under resource pooling and
separation for LRU caching when the cache size is large.
Analytically, we show that it is asymptotically optimal to jointly serve
multiple flows if their data item sizes and popularity distributions are
similar and their arrival rates do not differ significantly; the
self-organizing property of LRU caching automatically optimizes the resource
allocation among them asymptotically. Otherwise, separating these flows could
be better, e.g., when data sizes vary significantly. We also quantify critical
points beyond which resource pooling is better than separation for each of the
flows when the overlapped data items exceed certain levels. Technically, we
generalize existing results on the asymptotic miss probability of LRU caching
for a broad class of heavy-tailed distributions and extend them to multiple
competing flows with varying data item sizes, which also validates the Che
approximation under certain conditions. These results provide new insights on
improving the performance of caching systems
Adaptive TTL-Based Caching for Content Delivery
Content Delivery Networks (CDNs) deliver a majority of the user-requested
content on the Internet, including web pages, videos, and software downloads. A
CDN server caches and serves the content requested by users. Designing caching
algorithms that automatically adapt to the heterogeneity, burstiness, and
non-stationary nature of real-world content requests is a major challenge and
is the focus of our work. While there is much work on caching algorithms for
stationary request traffic, the work on non-stationary request traffic is very
limited. Consequently, most prior models are inaccurate for production CDN
traffic that is non-stationary.
We propose two TTL-based caching algorithms and provide provable guarantees
for content request traffic that is bursty and non-stationary. The first
algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic
approximation approach. Given a feasible target hit rate, we show that the hit
rate of d-TTL converges to its target value for a general class of bursty
traffic that allows Markov dependence over time and non-stationary arrivals.
The second algorithm called f-TTL uses two caches, each with its own TTL. The
first-level cache adaptively filters out non-stationary traffic, while the
second-level cache stores frequently-accessed stationary traffic. Given
feasible targets for both the hit rate and the expected cache size, f-TTL
asymptotically achieves both targets. We implement d-TTL and f-TTL and evaluate
both algorithms using an extensive nine-day trace consisting of 500 million
requests from a production CDN server. We show that both d-TTL and f-TTL
converge to their hit rate targets with an error of about 1.3%. But, f-TTL
requires a significantly smaller cache size than d-TTL to achieve the same hit
rate, since it effectively filters out the non-stationary traffic for
rarely-accessed objects