On-Line File Caching
In the on-line file-caching problem, the input is a sequence of
requests for files, given on-line (one at a time). Each file has a non-negative
size and a non-negative retrieval cost. The problem is to decide which files to
keep in a fixed-size cache so as to minimize the sum of the retrieval costs for
files that are not in the cache when requested. The problem arises in web
caching by browsers and by proxies. This paper describes a natural
generalization of LRU called Landlord and gives an analysis showing that it has
an optimal performance guarantee (among deterministic on-line algorithms).
The paper also gives an analysis of the algorithm in a so-called ``loosely''
competitive model, showing that on a ``typical'' cache size, either the
performance guarantee is O(1) or the total retrieval cost is insignificant.
Comment: ACM-SIAM Symposium on Discrete Algorithms (1998).
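The credit-and-rent scheme that the abstract describes can be sketched in a few lines. This is an illustrative reconstruction from the abstract's description of Landlord, not the paper's code; the function and variable names are ours. On a hit the file's credit is reset (resetting to the full cost generalizes LRU); on a miss, cached files pay rent proportional to their size until enough zero-credit files can be evicted.

```python
def landlord(requests, cache_size):
    """requests: iterable of (name, size, cost) triples.
    Returns the total retrieval cost paid for misses."""
    credit = {}   # cached file -> remaining credit
    sizes = {}    # cached file -> size
    used = 0
    total_cost = 0
    for name, size, cost in requests:
        if name in credit:
            # Hit: Landlord may reset credit to anything up to cost;
            # resetting to the full cost generalizes LRU.
            credit[name] = cost
            continue
        total_cost += cost  # miss: pay the retrieval cost
        if size > cache_size:
            continue  # too large to ever cache; retrieve without caching
        while used + size > cache_size:
            # All cached files pay "rent" proportional to size until at
            # least one file's credit reaches zero and can be evicted.
            delta = min(credit[g] / sizes[g] for g in credit)
            for g in list(credit):
                credit[g] -= delta * sizes[g]
                if credit[g] <= 1e-12:
                    used -= sizes[g]
                    del credit[g]
                    del sizes[g]
        credit[name] = cost
        sizes[name] = size
        used += size
    return total_cost
```

With unit sizes and unit costs the rent step charges every file equally, which recovers LRU-like behavior.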
The K-Server Dual and Loose Competitiveness for Paging
This paper has two results. The first is based on the surprising observation
that the well-known ``least-recently-used'' paging algorithm and the
``balance'' algorithm for weighted caching are linear-programming primal-dual
algorithms. This observation leads to a strategy (called ``Greedy-Dual'') that
generalizes them both and has an optimal performance guarantee for weighted
caching.
For the second result, the paper presents empirical studies of paging
algorithms, documenting that in practice, on ``typical'' cache sizes and
sequences, the performance of paging strategies is much better than their
worst-case analyses in the standard model suggest. The paper then presents
theoretical results that support and explain this. For example: on any input
sequence, with almost all cache sizes, either the performance guarantee of
least-recently-used is O(log k) or the fault rate (in an absolute sense) is
insignificant.
Both of these results are strengthened and generalized in ``On-line File
Caching'' (1998).
Comment: conference version: "On-Line Caching as Cache Size Varies", SODA (1991).
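The fault-rate measurements behind the second result can be reproduced with a short simulator. The sketch below counts LRU faults for a given cache size k; it is an illustrative aid, not the paper's experimental code.

```python
from collections import OrderedDict

def lru_faults(sequence, k):
    """Count page faults of LRU with k page slots on a request sequence."""
    cache = OrderedDict()  # keys kept in recency order, most recent last
    faults = 0
    for page in sequence:
        if page in cache:
            cache.move_to_end(page)
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = None
    return faults
```

Sweeping k over a trace and plotting faults / len(sequence) yields the kind of fault-rate curves the empirical study examines.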
On Resource Pooling and Separation for LRU Caching
Caching systems using the Least Recently Used (LRU) principle have now become
ubiquitous. A fundamental question for these systems is whether the cache space
should be pooled together or divided to serve multiple flows of data item
requests in order to minimize the miss probabilities. In this paper, we show
that there is no simple yes-or-no answer: the outcome depends on complex
combinations of critical factors, including request rates, the overlap of
data items across different request flows, and data item popularities and
sizes. Specifically, we characterize the asymptotic miss
probabilities for multiple competing request flows under resource pooling and
separation for LRU caching when the cache size is large.
Analytically, we show that it is asymptotically optimal to jointly serve
multiple flows if their data item sizes and popularity distributions are
similar and their arrival rates do not differ significantly; the
self-organizing property of LRU caching automatically optimizes the resource
allocation among them asymptotically. Otherwise, separating these flows could
be better, e.g., when data sizes vary significantly. We also quantify critical
points beyond which resource pooling is better than separation for each of the
flows when the overlapped data items exceed certain levels. Technically, we
generalize existing results on the asymptotic miss probability of LRU caching
for a broad class of heavy-tailed distributions and extend them to multiple
competing flows with varying data item sizes, which also validates the Che
approximation under certain conditions. These results provide new insights
into improving the performance of caching systems.
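A quick way to explore the pooling-versus-separation question is to simulate LRU on synthetic flows. The sketch below interleaves two flows with different Zipf-like popularity skews and compares one pooled cache against two separated halves. All parameters here (skews, cache sizes, seed) are arbitrary illustrative choices, not values from the paper.

```python
import random
from collections import OrderedDict

def lru_misses(seq, k):
    """Count LRU misses on a request sequence with k slots."""
    cache = OrderedDict()
    misses = 0
    for x in seq:
        if x in cache:
            cache.move_to_end(x)
        else:
            misses += 1
            if len(cache) >= k:
                cache.popitem(last=False)
            cache[x] = None
    return misses

def zipf_flow(prefix, n_items, n_reqs, alpha, rng):
    """Generate n_reqs requests over n_items items with Zipf(alpha) popularity."""
    weights = [1.0 / (i + 1) ** alpha for i in range(n_items)]
    picks = rng.choices(range(n_items), weights=weights, k=n_reqs)
    return [prefix + str(i) for i in picks]

rng = random.Random(42)
flow_a = zipf_flow("a", 100, 2000, 1.2, rng)  # skewed popularity
flow_b = zipf_flow("b", 100, 2000, 0.4, rng)  # much flatter popularity
mixed = [x for pair in zip(flow_a, flow_b) for x in pair]

pooled = lru_misses(mixed, 40)                               # one shared cache
separated = lru_misses(flow_a, 20) + lru_misses(flow_b, 20)  # split caches
```

Varying the skews and rates of the two flows shows how the winner between `pooled` and `separated` shifts, which is the phenomenon the paper characterizes asymptotically.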
Truly Online Paging with Locality of Reference
Competitive analysis fails to model locality of reference in the online
paging problem. To address this, Borodin et al. introduced the access graph
model, which attempts to capture the locality of reference. However, the access
graph model has a number of troubling aspects. The access graph has to be known
in advance to the paging algorithm and the memory required to represent the
access graph itself may be very large.
In this paper we present truly online strongly competitive paging algorithms
in the access graph model that do not have any prior information on the access
sequence. We present both deterministic and randomized algorithms. The
algorithms need only O(k log n) bits of memory, where k is the number of page
slots available and n is the size of the virtual address space. That is, they
use asymptotically no more memory than is needed to store the virtual address
translation table.
We also observe that our algorithms adapt themselves to temporal changes in
the locality of reference. We model temporal changes in the locality of
reference by extending the access graph model to the so called extended access
graph model, in which many vertices of the graph can correspond to the same
virtual page. We define a measure for the rate of change in the locality of
reference in G, denoted Delta(G). We then show that our algorithms remain
strongly competitive as long as Delta(G) >= (1+epsilon)k, and that no truly
online algorithm can be strongly competitive on a class of extended access
graphs that includes all graphs G with Delta(G) >= k - o(k).
Comment: 37 pages. Preliminary version appeared in FOCS '9
Learning-Augmented Weighted Paging
We consider a natural semi-online model for weighted paging, where at any
time the algorithm is given predictions, possibly with errors, about the next
arrival of each page. The model is inspired by Belady's classic optimal offline
algorithm for unweighted paging, and extends the recently studied model for
learning-augmented paging (Lykouris and Vassilvitskii, 2018) to the weighted
setting.
For the case of perfect predictions, we provide an O(ℓ)-competitive
deterministic and an O(log ℓ)-competitive randomized algorithm, where ℓ is
the number of distinct weight classes. Both these bounds are tight, and
imply an O(log W)- and an O(log log W)-competitive ratio, respectively, when
the page weights lie between 1 and W. Previously, it was not known how to
use these predictions in the weighted setting, and only bounds of O(k) and
O(log k) were known, where k is the cache size. Our results also
generalize to the interleaved paging setting and to the case of imperfect
predictions, with the competitive ratios degrading smoothly from O(ℓ) and
O(log ℓ) to O(k) and O(log k), respectively, as the prediction error
increases.
Our results are based on several insights on structural properties of
Belady's algorithm and the sequence of page arrival predictions, and novel
potential functions that incorporate these predictions. For the case of
unweighted paging, the results imply a very simple potential function based
proof of the optimality of Belady's algorithm, which may be of independent
interest.
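Belady's offline rule referenced above, evict the page whose next request lies furthest in the future, can be stated compactly. This is a standard textbook implementation for illustration, not code from the paper.

```python
def belady_faults(sequence, k):
    """Belady's offline rule: on a miss with a full cache, evict the page
    whose next request lies furthest in the future (never again = infinity)."""
    occurrences = {}
    for i, p in enumerate(sequence):
        occurrences.setdefault(p, []).append(i)
    pointer = {p: 0 for p in occurrences}  # index of each page's next occurrence
    cache = set()
    faults = 0
    for p in sequence:
        pointer[p] += 1  # consume the current occurrence
        if p in cache:
            continue
        faults += 1
        if len(cache) >= k:
            def next_request(q):
                occ, j = occurrences[q], pointer[q]
                return occ[j] if j < len(occ) else float("inf")
            cache.remove(max(cache, key=next_request))
        cache.add(p)
    return faults
```

On the classic sequence 1,2,3,4,1,2,5,1,2,3,4,5 with k = 4, this rule faults 6 times, while LRU faults 8 times.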
On-line algorithms for the K-server problem and its variants.
by Chi-ming Wat. Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. Includes bibliographical references (leaves 77-82).
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Performance analysis of on-line algorithms --- p.2
Chapter 1.2 --- Randomized algorithms --- p.4
Chapter 1.3 --- Types of adversaries --- p.5
Chapter 1.4 --- Overview of the results --- p.6
Chapter 2 --- The k-server problem --- p.8
Chapter 2.1 --- Introduction --- p.8
Chapter 2.2 --- Related Work --- p.9
Chapter 2.3 --- The Evolution of Work Function Algorithm --- p.12
Chapter 2.4 --- Definitions --- p.16
Chapter 2.5 --- The Work Function Algorithm --- p.18
Chapter 2.6 --- The Competitive Analysis --- p.20
Chapter 3 --- The weighted k-server problem --- p.27
Chapter 3.1 --- Introduction --- p.27
Chapter 3.2 --- Related Work --- p.29
Chapter 3.3 --- Fiat and Ricklin's Algorithm --- p.29
Chapter 3.4 --- The Work Function Algorithm --- p.32
Chapter 3.5 --- The Competitive Analysis --- p.35
Chapter 4 --- The Influence of Lookahead --- p.41
Chapter 4.1 --- Introduction --- p.41
Chapter 4.2 --- Related Work --- p.42
Chapter 4.3 --- The Role of l-lookahead --- p.43
Chapter 4.4 --- The LRU Algorithm with l-lookahead --- p.45
Chapter 4.5 --- The Competitive Analysis --- p.45
Chapter 5 --- Space Complexity --- p.57
Chapter 5.1 --- Introduction --- p.57
Chapter 5.2 --- Related Work --- p.59
Chapter 5.3 --- Preliminaries --- p.59
Chapter 5.4 --- The TWO Algorithm --- p.60
Chapter 5.5 --- Competitive Analysis --- p.61
Chapter 5.6 --- Remarks --- p.69
Chapter 6 --- Conclusions --- p.70
Chapter 6.1 --- Summary of Our Results --- p.70
Chapter 6.2 --- Recent Results --- p.71
Chapter 6.2.1 --- The Adversary Models --- p.71
Chapter 6.2.2 --- On-line Performance-Improvement Algorithms --- p.73
Chapter A --- Proof of Lemma 1 --- p.75
Bibliography --- p.7
Design of competitive paging algorithms with good behaviour in practice
Paging is one of the most prominent problems in the field of online algorithms. We have to serve a sequence of page requests using a cache that can hold up to k pages. If the currently requested page is in cache we have a cache hit, otherwise we say that a cache miss occurs, and the requested page needs to be loaded into the cache. The goal is to minimize the number of cache misses by providing a good page-replacement strategy. This problem is part of memory-management when data is stored in a two-level memory hierarchy, more precisely a small and fast memory (cache) and a slow but large memory (disk). The most important application area is the virtual memory management of operating systems. Accessed pages are either already in the RAM or need to be loaded from the hard disk into the RAM using expensive I/O. The time needed to access the RAM is insignificant compared to an I/O operation which takes several milliseconds.
The traditional evaluation framework for online algorithms is competitive analysis, where the online algorithm is compared to the optimal offline solution. A shortcoming of competitive analysis is its overly pessimistic worst-case guarantees. For example, LRU has a theoretical competitive ratio of k, but in practice this ratio rarely exceeds 4.
Reducing the gap between theory and practice has been a hot research topic in recent years. More recent evaluation models have been used to prove that LRU is an optimal online algorithm, or part of a class of optimal algorithms, motivated by the assumption that LRU is one of the best algorithms in practice. Most of the newer models make LRU-friendly assumptions regarding the input, thus not leaving much room for new algorithms.
Only a few works in the field of online paging have introduced new algorithms that can compete with LRU in terms of the number of cache misses.
In the first part of this thesis we study strongly competitive randomized paging algorithms, i.e. algorithms with optimal competitive guarantees. Although the tight bound for the competitive ratio has been known for decades, current algorithms matching this bound are complex and have high running times and memory requirements. We propose the algorithm OnlineMin which processes a page request in O(log k/log log k) time in the worst case. The best previously known solution requires O(k^2) time.
Usually the memory requirement of a paging algorithm is measured by the maximum number of pages that the algorithm keeps track of. Any algorithm stores information about the k pages in the cache. In addition it can also store information about pages not in cache, denoted bookmarks. We answer the open question of Bein et al. '07 whether strongly competitive randomized paging algorithms using only o(k) bookmarks exist or not. To do so we modify the Partition algorithm of McGeoch and Sleator '85 which has an unbounded bookmark complexity, and obtain Partition2 which uses O(k/log k) bookmarks.
In the second part we extract ideas from theoretical analysis of randomized paging algorithms in order to design deterministic algorithms that perform well in practice. We refine competitive analysis by introducing the attack rate
parameter r, which ranges between 1 and k. We show that r is a tight bound on the competitive ratio of deterministic algorithms.
We give empirical evidence that r is usually much smaller than k and thus r-competitive algorithms have a reasonable performance on real-world traces. By introducing the r-competitive priority-based algorithm class OnOPT we obtain a collection of promising algorithms to beat the LRU-standard. We single out the new algorithm RDM and show that it outperforms LRU and some of its variants on a wide range of real-world traces.
Since RDM is more complex than LRU, one might suspect at first sight that the gain from fewer cache misses is cancelled out by the higher runtime for processing pages. We engineer a fast implementation of RDM and compare it
to LRU and the very fast FIFO algorithm in an overall evaluation scheme, where we measure the runtime of the algorithms and add penalties for each cache miss.
Experimental results show that for realistic penalties RDM still outperforms these two algorithms, even if we grant the competitors an idealistic runtime of 0.
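The overall evaluation scheme described above, measured runtime plus a fixed penalty per cache miss, can be sketched generically. The `fifo_misses` policy below and the penalty value are illustrative stand-ins; RDM itself is not reproduced here.

```python
import time
from collections import deque

def fifo_misses(trace, k):
    """A simple FIFO replacement policy; returns the number of cache misses."""
    cache, order = set(), deque()
    misses = 0
    for p in trace:
        if p not in cache:
            misses += 1
            if len(cache) >= k:
                cache.remove(order.popleft())
            cache.add(p)
            order.append(p)
    return misses

def evaluate(policy, trace, k, miss_penalty):
    """Total cost = measured processing time + a fixed penalty per miss.
    Any function (trace, k) -> miss count can be plugged in as `policy`."""
    start = time.perf_counter()
    misses = policy(trace, k)
    runtime = time.perf_counter() - start
    return runtime + misses * miss_penalty
```

With a realistic miss penalty (milliseconds per I/O versus nanoseconds per request), the miss term dominates, which is why a slower but miss-frugal policy can win overall.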
Alternative Measures for the Analysis of Online Algorithms
In this thesis we introduce and evaluate several new models for the analysis of online algorithms. In an online problem, the algorithm does not know the entire input from the beginning; the input is revealed in a sequence of steps. At each step the algorithm should make its decisions based on the past and without any knowledge about the future. Many important real-life problems such as paging and routing are intrinsically online and thus the design and analysis of
online algorithms is one of the main research areas in theoretical computer science.
Competitive analysis is the standard measure for analysis of online algorithms. It has been applied to many online problems in diverse areas ranging from robot navigation, to network routing, to scheduling, to online graph coloring. While in several instances competitive analysis gives satisfactory results, for certain problems it results in unrealistically pessimistic ratios and/or
fails to distinguish between algorithms that have vastly differing performance under any practical characterization. Addressing these shortcomings has been the subject of intense research by many of the best minds in the field. In this thesis, building upon recent advances of others, we introduce several new models for the analysis of online algorithms, namely Bijective Analysis, Average Analysis,
Parameterized Analysis, and Relative Interval Analysis. We show that they lead to good results when applied to paging and list update algorithms. Paging and list update are two well known online problems. Paging is one of the main examples of poor behavior of competitive analysis. We show that LRU is the unique optimal online paging algorithm according to Average Analysis on sequences with locality of reference. Recall that in practice input sequences for paging have
high locality of reference. It has long been empirically established that LRU is the best paging algorithm. Yet, Average Analysis is the first model that gives a strict separation of LRU from all other online paging algorithms, thus solving a long-standing open problem. We prove a similar
result for the optimality of MTF for list update on sequences with locality of reference.
A technique for the analysis of online algorithms has to be effective to be useful in day-to-day analysis of algorithms. While Bijective and Average Analysis succeed at providing fine separation, their application can be, at times, cumbersome. Thus we apply a parameterized or adaptive analysis framework to online algorithms. We show that this framework is effective, can be applied more easily to a larger family of problems and leads to finer analysis than the competitive ratio. The conceptual innovation of parameterizing the performance of an algorithm by something other than the input size was first introduced over three decades ago [124, 125]. By now it has been extensively studied and understood in the context of adaptive analysis (for problems in P) and parameterized algorithms (for NP-hard problems), yet to our knowledge
this thesis is the first systematic application of this technique to the study of online algorithms. Interestingly, competitive analysis can be recast as a particular form of parameterized analysis in
which the performance of opt is the parameter. In general, for each problem we can choose the parameter/measure that best reflects the difficulty of the input. We show that in many instances the performance of opt on a sequence is a coarse approximation of the difficulty or complexity
of a given input sequence. Using a finer, more natural measure we can separate paging and list update algorithms which were otherwise indistinguishable under the classical model. This creates a performance hierarchy of algorithms which better reflects the intuitive relative strengths between them. Lastly, we show that, surprisingly, certain randomized algorithms which are superior to MTF in the classical model are not so in the parameterized case, which matches experimental results. We test list update algorithms in the context of a data compression problem known to have locality of reference. Our experiments show MTF outperforms other list update algorithms
in practice after BWT (the Burrows-Wheeler transform). This is consistent with the intuition that BWT increases locality of reference.
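The list-update step used in that compression pipeline is the classic move-to-front transform: after BWT, runs of equal symbols become runs of zeros, which compress well. A minimal encoder/decoder pair as an illustration:

```python
def mtf_encode(text, alphabet):
    """Move-to-front: emit each symbol's current list position, then move
    that symbol to the front. Runs of equal symbols become runs of zeros."""
    table = list(alphabet)
    out = []
    for ch in text:
        i = table.index(ch)
        out.append(i)
        table.insert(0, table.pop(i))
    return out

def mtf_decode(codes, alphabet):
    """Inverse transform: look up each position, then move the symbol to front."""
    table = list(alphabet)
    out = []
    for i in codes:
        ch = table[i]
        out.append(ch)
        table.insert(0, table.pop(i))
    return "".join(out)
```

For example, the run-heavy input "aaabbb" over alphabet "ab" encodes to [0, 0, 0, 1, 0, 0], mostly zeros, exactly the low-entropy output a following entropy coder exploits.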
Zur Analyse von randomisierten Suchheuristiken und Online-Heuristiken (On the Analysis of Randomized Search Heuristics and Online Heuristics)
This dissertation is concerned with the theoretical analysis of heuristics.
The first and second parts study randomized heuristics for optimization in
the black-box scenario. The third part deals with the paging problem, which
typically has to be solved in the online scenario. The black-box and online
scenarios have in common that the problem instance becomes known only over
time, so algorithms can only proceed heuristically.
The first part studies the running time of two simple evolutionary
algorithms, the (1+1) EA and randomized local search, on the example of the
matching problem in the black-box scenario. It is shown that both heuristics
can efficiently approximate maximum matchings and that, at least for simple
graph classes, they find maximum matchings in expected polynomial time. On
the other hand, there are graphs for which the running time of both
heuristics is exponential with overwhelming probability.
The second part investigates a basic evolutionary algorithm for
multi-objective optimization problems in the black-box scenario. In
particular, its expected worst-case running time is determined, and its
running time is analyzed for various bi-objective problems.
The third part deals with the analysis of heuristics for the paging problem.
Two models are introduced that directly enable the analysis of the fault
rate with respect to locality of reference. In these models, bounds on the
fault rates of various deterministic paging strategies are proven.