41 research outputs found
Approximating k-Forest with Resource Augmentation: A Primal-Dual Approach
In this paper, we study the -forest problem in the model of resource
augmentation. In the -forest problem, given an edge-weighted graph ,
a parameter , and a set of demand pairs , the
objective is to construct a minimum-cost subgraph that connects at least
demands. The problem is hard to approximate---the best-known approximation
ratio is . Furthermore, -forest is as hard to
approximate as the notoriously-hard densest -subgraph problem.
While the -forest problem is hard to approximate in the worst-case, we
show that with the use of resource augmentation, we can efficiently approximate
it up to a constant factor.
First, we restate the problem in terms of the number of demands that are {\em
not} connected. In particular, the objective of the -forest problem can be
viewed as to remove at most demands and find a minimum-cost subgraph that
connects the remaining demands. We use this perspective of the problem to
explain the performance of our algorithm (in terms of the augmentation) in a
more intuitive way.
Specifically, we present a polynomial-time algorithm for the -forest
problem that, for every , removes at most demands and has
cost no more than times the cost of an optimal algorithm
that removes at most demands
On-Line File Caching
In the on-line file-caching problem problem, the input is a sequence of
requests for files, given on-line (one at a time). Each file has a non-negative
size and a non-negative retrieval cost. The problem is to decide which files to
keep in a fixed-size cache so as to minimize the sum of the retrieval costs for
files that are not in the cache when requested. The problem arises in web
caching by browsers and by proxies. This paper describes a natural
generalization of LRU called Landlord and gives an analysis showing that it has
an optimal performance guarantee (among deterministic on-line algorithms).
The paper also gives an analysis of the algorithm in a so-called ``loosely''
competitive model, showing that on a ``typical'' cache size, either the
performance guarantee is O(1) or the total retrieval cost is insignificant.Comment: ACM-SIAM Symposium on Discrete Algorithms (1998
The K-Server Dual and Loose Competitiveness for Paging
This paper has two results. The first is based on the surprising observation
that the well-known ``least-recently-used'' paging algorithm and the
``balance'' algorithm for weighted caching are linear-programming primal-dual
algorithms. This observation leads to a strategy (called ``Greedy-Dual'') that
generalizes them both and has an optimal performance guarantee for weighted
caching.
For the second result, the paper presents empirical studies of paging
algorithms, documenting that in practice, on ``typical'' cache sizes and
sequences, the performance of paging strategies are much better than their
worst-case analyses in the standard model suggest. The paper then presents
theoretical results that support and explain this. For example: on any input
sequence, with almost all cache sizes, either the performance guarantee of
least-recently-used is O(log k) or the fault rate (in an absolute sense) is
insignificant.
Both of these results are strengthened and generalized in``On-line File
Caching'' (1998).Comment: conference version: "On-Line Caching as Cache Size Varies", SODA
(1991
Probabilistic alternatives for competitive analysis
In the last 20 years competitive analysis has become the main tool for analyzing the quality of online algorithms. Despite of this, competitive analysis has also been criticized: it sometimes cannot discriminate between algorithms that exhibit significantly different empirical behavior or it even favors an algorithm that is worse from an empirical point of view. Therefore, there have been several approaches to circumvent these drawbacks. In this survey, we discuss probabilistic alternatives for competitive analysis.operations research and management science;
Online Coded Caching
We consider a basic content distribution scenario consisting of a single
origin server connected through a shared bottleneck link to a number of users
each equipped with a cache of finite memory. The users issue a sequence of
content requests from a set of popular files, and the goal is to operate the
caches as well as the server such that these requests are satisfied with the
minimum number of bits sent over the shared link. Assuming a basic Markov model
for renewing the set of popular files, we characterize approximately the
optimal long-term average rate of the shared link. We further prove that the
optimal online scheme has approximately the same performance as the optimal
offline scheme, in which the cache contents can be updated based on the entire
set of popular files before each new request. To support these theoretical
results, we propose an online coded caching scheme termed coded least-recently
sent (LRS) and simulate it for a demand time series derived from the dataset
made available by Netflix for the Netflix Prize. For this time series, we show
that the proposed coded LRS algorithm significantly outperforms the popular
least-recently used (LRU) caching algorithm.Comment: 15 page
Online Multi-Coloring with Advice
We consider the problem of online graph multi-coloring with advice.
Multi-coloring is often used to model frequency allocation in cellular
networks. We give several nearly tight upper and lower bounds for the most
standard topologies of cellular networks, paths and hexagonal graphs. For the
path, negative results trivially carry over to bipartite graphs, and our
positive results are also valid for bipartite graphs. The advice given
represents information that is likely to be available, studying for instance
the data from earlier similar periods of time.Comment: IMADA-preprint-c
On Resource Pooling and Separation for LRU Caching
Caching systems using the Least Recently Used (LRU) principle have now become
ubiquitous. A fundamental question for these systems is whether the cache space
should be pooled together or divided to serve multiple flows of data item
requests in order to minimize the miss probabilities. In this paper, we show
that there is no straight yes or no answer to this question, depending on
complex combinations of critical factors, including, e.g., request rates,
overlapped data items across different request flows, data item popularities
and their sizes. Specifically, we characterize the asymptotic miss
probabilities for multiple competing request flows under resource pooling and
separation for LRU caching when the cache size is large.
Analytically, we show that it is asymptotically optimal to jointly serve
multiple flows if their data item sizes and popularity distributions are
similar and their arrival rates do not differ significantly; the
self-organizing property of LRU caching automatically optimizes the resource
allocation among them asymptotically. Otherwise, separating these flows could
be better, e.g., when data sizes vary significantly. We also quantify critical
points beyond which resource pooling is better than separation for each of the
flows when the overlapped data items exceed certain levels. Technically, we
generalize existing results on the asymptotic miss probability of LRU caching
for a broad class of heavy-tailed distributions and extend them to multiple
competing flows with varying data item sizes, which also validates the Che
approximation under certain conditions. These results provide new insights on
improving the performance of caching systems