The Cost of Address Translation
Modern computers are not random access machines (RAMs). They have a memory
hierarchy, multiple cores, and virtual memory. In this paper, we address the
computational cost of address translation in virtual memory. The starting point for
our work is the observation that the analysis of some simple algorithms (random
scan of an array, binary search, heapsort) in either the RAM model or the EM
model (external memory model) does not correctly predict growth rates of actual
running times. We propose the VAT model (virtual address translation) to
account for the cost of address translations and analyze the algorithms
mentioned above and others in the model. The predictions agree with the
measurements. We also analyze the VAT-cost of cache-oblivious algorithms.
Comment: An extended abstract of this paper was published in the proceedings of
ALENEX13, New Orleans, US
Study and optimization of the memory management in Memcached
Over the years the Internet has become more popular than ever, and web applications
like Facebook and Twitter keep gaining users. As a result, users generate more and
more data, which has to be managed efficiently because access speed is an
important factor: a user will not wait more than three seconds for a web
page to load before abandoning the site. In-memory key-value stores like Memcached
and Redis speed up web applications by reducing the number of accesses to slower
data storage. The first implementation
of Memcached, on the LiveJournal website, showed that 28 instances of Memcached
on ten unique hosts, caching the most popular 30GB of data, can achieve a hit rate
of around 92%, reducing the number of accesses to the database and reducing the response
time considerably.
Not all objects in the cache take the same time to recompute, so this research
studies and presents a new cost-aware memory management scheme that is easy to integrate into a
key-value store, with the approach implemented in Memcached. The new memory
management and cache give priority to key-value pairs that take longer to
recompute. Instead of replacing Memcached’s replacement structure and its policy, we
simply add a new segment to each structure that is capable of storing the more costly
key-value pairs. In addition to this new segment in each replacement structure, we created
a new dynamic cost-aware rebalancing policy in Memcached that gives more memory to
the more costly key-value pairs.
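The idea of a protected segment for costly key-value pairs can be illustrated with a minimal sketch. This is not the thesis's Memcached implementation: the class name `CostAwareLRU`, the parameters `costly_slots` and `cost_threshold`, and the rule of demoting a displaced costly item into the normal segment are all assumptions made for illustration.

```python
from collections import OrderedDict


class CostAwareLRU:
    """Toy cache: a normal LRU segment plus a protected segment for costly items.

    Hypothetical sketch only; sizes are counted in items, not bytes.
    """

    def __init__(self, capacity, costly_slots, cost_threshold):
        self.capacity = capacity              # total number of items allowed
        self.costly_slots = costly_slots      # max items in the protected segment
        self.cost_threshold = cost_threshold  # recompute cost that counts as "costly"
        self.normal = OrderedDict()           # key -> (value, cost), oldest first
        self.costly = OrderedDict()           # key -> (value, cost), oldest first

    def get(self, key):
        # A hit refreshes the item's recency inside its own segment.
        for segment in (self.costly, self.normal):
            if key in segment:
                segment.move_to_end(key)
                return segment[key][0]
        return None

    def set(self, key, value, cost):
        # Drop any stale copy so the key lives in exactly one segment.
        self.normal.pop(key, None)
        self.costly.pop(key, None)
        if cost >= self.cost_threshold and self.costly_slots > 0:
            self.costly[key] = (value, cost)
            if len(self.costly) > self.costly_slots:
                # Protected segment is full: demote its oldest item.
                old_key, old_item = self.costly.popitem(last=False)
                self.normal[old_key] = old_item
        else:
            self.normal[key] = (value, cost)
        # Evict from the normal segment first; touch costly items only as a last resort.
        while len(self.normal) + len(self.costly) > self.capacity:
            if self.normal:
                self.normal.popitem(last=False)
            else:
                self.costly.popitem(last=False)


cache = CostAwareLRU(capacity=3, costly_slots=1, cost_threshold=10)
cache.set("a", "A", cost=100)  # expensive to recompute, goes to the protected segment
cache.set("b", "B", cost=1)
cache.set("c", "C", cost=1)
cache.set("d", "D", cost=1)    # cache is full: "b" is evicted, not the older "a"
```

A plain LRU of the same size would have evicted "a" here, since it is the least recently used key; the protected segment keeps it because recomputing it is assumed to be expensive.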
With the implementation of our approaches, we were able to offer a prototype that
can be used to research the impact of cost on caching system performance. In addition, we
were able to improve, in certain scenarios, the access latency for the user and the total
recomputation cost of the key-value pairs stored in the system.
C-MOS array design techniques: SUMC multiprocessor system study
The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units.
Jointly Optimal Routing and Caching for Arbitrary Network Topologies
We study a problem of fundamental importance to ICNs, namely, minimizing
routing costs by jointly optimizing caching and routing decisions over an
arbitrary network topology. We consider both source routing and hop-by-hop
routing settings. The respective offline problems are NP-hard. Nevertheless, we
show that there exist polynomial time approximation algorithms producing
solutions within a constant approximation from the optimal. We also produce
distributed, adaptive algorithms with the same approximation guarantees. We
simulate our adaptive algorithms over a broad array of different topologies.
Our algorithms reduce routing costs by several orders of magnitude compared to
prior art, including algorithms optimizing caching under fixed routing.
Comment: This is the extended version of the paper "Jointly Optimal Routing
and Caching for Arbitrary Network Topologies", appearing in the 4th ACM
Conference on Information-Centric Networking (ICN 2017), Berlin, Sep. 26-28,
201
Design of competitive paging algorithms with good behaviour in practice
Paging is one of the most prominent problems in the field of online algorithms. We have to serve a sequence of page requests using a cache that can hold up to k pages. If the currently requested page is in cache we have a cache hit, otherwise we say that a cache miss occurs, and the requested page needs to be loaded into the cache. The goal is to minimize the number of cache misses by providing a good page-replacement strategy. This problem is part of memory-management when data is stored in a two-level memory hierarchy, more precisely a small and fast memory (cache) and a slow but large memory (disk). The most important application area is the virtual memory management of operating systems. Accessed pages are either already in the RAM or need to be loaded from the hard disk into the RAM using expensive I/O. The time needed to access the RAM is insignificant compared to an I/O operation which takes several milliseconds.
The traditional evaluation framework for online algorithms is competitive analysis, where the online algorithm is compared to the optimal offline solution. A shortcoming of competitive analysis is that its worst-case guarantees are too pessimistic. For example, LRU has a theoretical competitive ratio of k, but in practice this ratio rarely exceeds 4.
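To make the comparison between an online algorithm and the offline optimum concrete, the following sketch counts cache misses for LRU and for the farthest-in-future eviction rule (Belady's rule), which is a standard way to compute the offline optimum for paging. The function names and the request sequence are illustrative assumptions, not taken from the thesis.

```python
from collections import OrderedDict


def lru_misses(requests, k):
    """Count cache misses of LRU with a cache of k pages."""
    cache = OrderedDict()  # pages ordered from least to most recently used
    misses = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return misses


def opt_misses(requests, k):
    """Count misses of the offline rule: evict the page used farthest in the future."""
    cache = set()
    misses = 0
    for i, page in enumerate(requests):
        if page in cache:
            continue
        misses += 1
        if len(cache) >= k:
            def next_use(p):
                # Index of the next request for p, or infinity if never requested again.
                for j in range(i + 1, len(requests)):
                    if requests[j] == p:
                        return j
                return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(page)
    return misses


# Adversarial input for LRU: cycle through k + 1 distinct pages.
k = 3
requests = [0, 1, 2, 3] * 6
lru = lru_misses(requests, k)  # every request misses
opt = opt_misses(requests, k)  # far fewer misses
```

On this cyclic sequence LRU misses on every request, while the offline rule misses only about once every k requests after the warm-up. Sequences of this kind are what drive the worst-case ratio toward k, even though real traces rarely look like this.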
Reducing the gap between theory and practice has been an active research topic in recent years. More recent evaluation models have been used to prove that LRU is an optimal online algorithm or part of a class of optimal algorithms, motivated by the assumption that LRU is one of the best algorithms in practice. Most of the newer models make LRU-friendly assumptions regarding the input, thus not leaving much room for new algorithms.
Only a few works in the field of online paging have introduced new algorithms that can compete with LRU in terms of the number of cache misses.
In the first part of this thesis we study strongly competitive randomized paging algorithms, i.e. algorithms with optimal competitive guarantees. Although the tight bound for the competitive ratio has been known for decades, current algorithms matching this bound are complex and have high running times and memory requirements. We propose the algorithm OnlineMin which processes a page request in O(log k/log log k) time in the worst case. The best previously known solution requires O(k^2) time.
Usually the memory requirement of a paging algorithm is measured by the maximum number of pages that the algorithm keeps track of. Any algorithm stores information about the k pages in the cache. In addition it can also store information about pages not in cache, denoted bookmarks. We answer the open question of Bein et al. '07 whether strongly competitive randomized paging algorithms using only o(k) bookmarks exist or not. To do so we modify the Partition algorithm of McGeoch and Sleator '85 which has an unbounded bookmark complexity, and obtain Partition2 which uses O(k/log k) bookmarks.
In the second part we extract ideas from theoretical analysis of randomized paging algorithms in order to design deterministic algorithms that perform well in practice. We refine competitive analysis by introducing the attack rate parameter r, which ranges between 1 and k. We show that r is a tight bound on the competitive ratio of deterministic algorithms.
We give empirical evidence that r is usually much smaller than k and thus r-competitive algorithms have a reasonable performance on real-world traces. By introducing the r-competitive priority-based algorithm class OnOPT we obtain a collection of promising algorithms to beat the LRU-standard. We single out the new algorithm RDM and show that it outperforms LRU and some of its variants on a wide range of real-world traces.
Since RDM is more complex than LRU, one may think at first sight that the gain from fewer cache misses is cancelled out by a high runtime for processing pages. We engineer a fast implementation of RDM and compare it to LRU and the very fast FIFO algorithm in an overall evaluation scheme, where we measure the runtime of the algorithms and add penalties for each cache miss.
Experimental results show that for realistic penalties RDM still outperforms these two algorithms even if we grant the competitors an idealized runtime of 0.
Truly Online Paging with Locality of Reference
Competitive analysis fails to model locality of reference in the online
paging problem. To deal with it, Borodin et al. introduced the access graph
model, which attempts to capture the locality of reference. However, the access
graph model has a number of troubling aspects. The access graph has to be known
in advance to the paging algorithm and the memory required to represent the
access graph itself may be very large.
In this paper we present truly online strongly competitive paging algorithms
in the access graph model that do not have any prior information on the access
sequence. We present both deterministic and randomized algorithms. The
algorithms need only O(k log n) bits of memory, where k is the number of page
slots available and n is the size of the virtual address space, i.e.,
asymptotically no more memory than is needed to store the virtual address
translation table.
We also observe that our algorithms adapt themselves to temporal changes in
the locality of reference. We model temporal changes in the locality of
reference by extending the access graph model to the so called extended access
graph model, in which many vertices of the graph can correspond to the same
virtual page. We define a measure for the rate of change in the locality of
reference in G denoted by Delta(G). We then show our algorithms remain strongly
competitive as long as Delta(G) >= (1+ epsilon)k, and no truly online algorithm
can be strongly competitive on a class of extended access graphs that includes
all graphs G with Delta(G) >= k - o(k).
Comment: 37 pages. Preliminary version appeared in FOCS '9