344 research outputs found
Competitive Parallel Disk Prefetching and Buffer Management
We provide a competitive analysis framework for online prefetching and buffer management
algorithms in parallel I/O systems, using a read-once model of block references. This has
widespread applicability to key I/O-bound applications such as external merging and concurrent
playback of multiple video streams. Two realistic lookahead models, global lookahead and local
lookahead, are defined. Algorithms NOM and GREED based on these two forms of lookahead
are analyzed for shared buffer and distributed buffer configurations, both of which occur frequently
in existing systems. An important aspect of our work is that we show how to implement
both the models of lookahead in practice using the simple techniques of forecasting and flushing.
Given a
-disk parallel I/O system and a globally shared I/O buffer that can hold upto
disk
blocks, we derive a lower bound of
on the competitive ratio of any deterministic online
prefetching algorithm with
lookahead. NOM is shown to match the lower bound using
global
-block lookahead. In contrast, using only local lookahead results in an
competitive
ratio. When the buffer is distributed into
portions of
blocks each, the algorithm
GREED based on local lookahead is shown to be optimal, and NOM is within a constant factor
of optimal. Thus we provide a theoretical basis for the intuition that global lookahead is more
valuable for prefetching in the case of a shared buffer configuration whereas it is enough to provide
local lookahead in case of the distributed configuration. Finally, we analyze the performance
of these algorithms for reference strings generated by a uniformly-random stochastic process and
we show that they achieve the minimal expected number of I/Os. These results also give bounds
on the worst-case expected performance of algorithms which employ randomization in the data
layout
Adaptive Prefetching for Device-Independent File I/O
Device independent I/O has been a holy grail to operating system designers since the early days of UNIX. Unfortunately, existing operating systems fall short of this goal for multimedia applications. Techniques such as caching and sequential read-ahead can help mask I/O latency in some cases, but in others they increase latency and add substantial jitter. Multimedia applications, such as video players, are sensitive to vagaries in performance since I/O latency and jitter affect the quality of presentation. Our solution uses adaptive prefetching to reduce both latency and jitter. Applications submit file access plans to the prefetcher, which then generates I/O requests to the operating system and manages the buffer cache to isolate the application from variations in device performance. Our experiments show device independence can be achieved: an MPEG video player sees the same latency when reading from a local disk or an NFS server. Moreover, our approach reduces jitter substantially
Prefetching and clustering techniques for network based storage.
The usage of network-based applications is increasing, as network speeds increase, and the use of streaming applications, e.g BBC iPlayer, YouTube etc., running over network infrastructure is becoming commonplace. These
applications access data sequentially. However, as processor speeds and the amount of memory available increase, the rate at which streaming applications access data is now faster than the rate at which the blocks can be
fetched consecutively from network storage. In addition to sequential access, the system also needs to promptly satisfy demand misses in order for applications to continue their execution.
This thesis proposes a design to provide Quality-Of-Service (QoS) for streaming applications (sequential accesses) and demand misses, such that, streaming applications can run without jitter (once they are started) and demand misses can be satisfied in reasonable time using network storage. To implement the proposed design in real time, the thesis presents an analytical
model to estimate the average time taken to service a demand miss.
Further, it defines and explores the operational space where the proposed QoS could be provided. Using database techniques, this region is then encapsulated into an autonomous algorithm which is verified using simulation.
Finally, a prototype Experimental File System (EFS) is designed and implemented to test the algorithm on a real test-bed
Prefetching techniques for client server object-oriented database systems
The performance of many object-oriented database applications suffers from the page fetch latency which is determined by the expense of disk access. In this work we suggest several prefetching techniques to avoid, or at least to reduce, page fetch latency. In practice no prediction technique is perfect and no prefetching technique can entirely eliminate delay due to page fetch latency. Therefore we are interested in the trade-off between the level of accuracy required for obtaining good results in terms of elapsed time reduction and the processing overhead needed to achieve this level of accuracy. If prefetching accuracy is high then the total elapsed time of an application can be reduced significantly otherwise if the prefetching accuracy is low, many incorrect pages are prefetched and the extra load on the client, network, server and disks decreases the whole system performance. Access pattern of object-oriented databases are often complex and usually hard to predict accurately. The ..
Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)
Database management systems (DBMSs) carefully optimize complex multi-join
queries to avoid expensive disk I/O. As servers today feature tens or hundreds
of gigabytes of RAM, a significant fraction of many analytic databases becomes
memory-resident. Even after careful tuning for an in-memory environment, a
linear disk I/O model such as the one implemented in PostgreSQL may make query
response time predictions that are up to 2X slower than the optimal multi-join
query plan over memory-resident data. This paper introduces a memory I/O cost
model to identify good evaluation strategies for complex query plans with
multiple hash-based equi-joins over memory-resident data. The proposed cost
model is carefully validated for accuracy using three different systems,
including an Amazon EC2 instance, to control for hardware-specific differences.
Prior work in parallel query evaluation has advocated right-deep and bushy
trees for multi-join queries due to their greater parallelization and
pipelining potential. A surprising finding is that the conventional wisdom from
shared-nothing disk-based systems does not directly apply to the modern
shared-everything memory hierarchy. As corroborated by our model, the
performance gap between the optimal left-deep and right-deep query plan can
grow to about 10X as the number of joins in the query increases.Comment: 15 pages, 8 figures, extended version of the paper to appear in
SoCC'1
High-Performance In-Memory OLTP via Coroutine-to-Transaction
Data stalls are a major overhead in main-memory database engines due to the use of pointer-rich data structures. Lightweight coroutines ease the implementation of software prefetching to hide data stalls by overlapping computation and asynchronous data prefetching. Prior solutions, however, mainly focused on (1) individual components and operations and (2) intra-transaction batching that requires interface changes, breaking backward compatibility. It was not clear how they apply to a full database engine and how much end-to-end benefit they bring under various workloads. This thesis presents CoroBase, a main-memory database engine that tackles these challenges with a new coroutine-to-transaction paradigm. Coroutine-to-transaction models transactions as coroutines and thus enables inter-transaction batching, avoiding application changes but retaining the benefits of prefetching. We show that on a 48-core server, CoroBase can perform close to 2× better for read-intensive workloads and remain competitive for workloads that inherently do not benefit from software prefetching
BlobCR: Virtual Disk Based Checkpoint-Restart for HPC Applications on IaaS Clouds
International audienceInfrastructure-as-a-Service (IaaS) cloud computing is gaining significant interest in industry and academia as an alternative platform for running HPC applications. Given the need to provide fault tolerance, support for suspend-resume and offline migration, an efficient Checkpoint-Restart mechanism becomes paramount in this context. We propose BlobCR, a dedicated checkpoint repository that is able to take live incremental snapshots of the whole disk attached to the virtual machine (VM) instances. BlobCR aims to minimize the performance overhead of checkpointing by persisting VM disk snapshots asynchronously in the background using a low overhead technique we call selective copy-on-write. It includes support for both application-level and process-level checkpointing, as well as support to roll back file system changes. Experiments at large scale demonstrate the benefits of our proposal both in synthetic settings and for a real-life HPC application
- …