Run Generation Revisited: What Goes Up May or May Not Come Down
In this paper, we revisit the classic problem of run generation. Run
generation is the first phase of external-memory sorting, where the objective
is to scan through the data, reorder elements using a small buffer of size M,
and output runs (contiguously sorted chunks of elements) that are as long as
possible.
We develop algorithms for minimizing the total number of runs (or
equivalently, maximizing the average run length) when the runs are allowed to
be sorted or reverse sorted. We study the problem in the online setting, both
with and without resource augmentation, and in the offline setting.
(1) We analyze alternating-up-down replacement selection (runs alternate
between sorted and reverse sorted), which was studied by Knuth as far back as
1963. We show that this simple policy is asymptotically optimal. Specifically,
we show that alternating-up-down replacement selection is 2-competitive and no
deterministic online algorithm can perform better.
(2) We give online algorithms having smaller competitive ratios with resource
augmentation. Specifically, we exhibit a deterministic algorithm that, when
given a buffer of size 4M, is able to match or beat any optimal algorithm
having a buffer of size M. Furthermore, we present a randomized online
algorithm which is 7/4-competitive when given a buffer twice that of the
optimal.
(3) We demonstrate that performance can also be improved with a small amount
of foresight. We give an algorithm, which is 3/2-competitive, with
foreknowledge of the next 3M elements of the input stream. For the extreme case
where all future elements are known, we design a PTAS for computing the optimal
strategy a run generation algorithm must follow.
(4) Finally, we present algorithms tailored for nearly sorted inputs which
are guaranteed to have optimal solutions with sufficiently long runs.
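As a point of reference for the run-generation setting described above, here is a minimal sketch of classic replacement selection, which produces ascending runs only. It is not the alternating-up-down policy or any of the algorithms the abstract analyzes; the function name `replacement_selection` and its parameters are illustrative assumptions.

```python
import heapq

_END = object()  # sentinel marking the end of the input stream


def replacement_selection(stream, M):
    """Split `stream` into ascending runs using a buffer of at most M elements.

    Illustrative sketch of classic (up-only) replacement selection; assumes M >= 1.
    """
    it = iter(stream)
    heap = []
    for x in it:  # fill the buffer with the first M elements
        heap.append(x)
        if len(heap) == M:
            break
    heapq.heapify(heap)

    runs, current, frozen = [], [], []
    while heap:
        smallest = heapq.heappop(heap)
        current.append(smallest)  # emit the smallest eligible element
        nxt = next(it, _END)
        if nxt is not _END:
            if nxt >= smallest:
                heapq.heappush(heap, nxt)  # can still join the current run
            else:
                frozen.append(nxt)  # must wait for the next run
        if not heap:  # current run is finished
            runs.append(current)
            current = []
            heap, frozen = frozen, []
            heapq.heapify(heap)
    return runs
```

For example, `replacement_selection([5, 1, 4, 2, 3, 9, 8, 7, 6], 3)` yields two runs, the first of which is much longer than the buffer; a strictly decreasing input is the bad case for this up-only variant, which is the motivation for allowing reverse-sorted runs.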
A Bicriteria Approximation for the Reordering Buffer Problem
In the reordering buffer problem (RBP), a server is asked to process a
sequence of requests lying in a metric space. To process a request the server
must move to the corresponding point in the metric. The requests can be
processed slightly out of order; in particular, the server has a buffer of
capacity k which can store up to k requests as it reads in the sequence. The
goal is to reorder the requests in such a manner that the buffer constraint is
satisfied and the total travel cost of the server is minimized. The RBP arises
in many applications that require scheduling with a limited buffer capacity,
such as scheduling a disk arm in storage systems, switching colors in paint
shops of a car manufacturing plant, and rendering 3D images in computer
graphics.
We study the offline version of RBP and develop bicriteria approximations.
When the underlying metric is a tree, we obtain a solution of cost no more than
9OPT using a buffer of capacity 4k + 1 where OPT is the cost of an optimal
solution with buffer capacity k. Constant factor approximations were known
previously only for the uniform metric (Avigdor-Elgrabli et al., 2012). Via
randomized tree embeddings, this implies an O(log n) approximation to cost and
O(1) approximation to buffer size for general metrics. Previously the best
known algorithm for arbitrary metrics by Englert et al. (2007) provided an
O(log^2 k log n) approximation without violating the buffer constraint.
Comment: 13 pages
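To make the problem statement concrete, the following is a small simulation of the reordering buffer problem on a line metric, using a naive nearest-request greedy rule. It is only an illustration of the model (buffer of capacity k, cost equal to total travel distance), not the bicriteria algorithm from the abstract; the function name and the `start` parameter are assumptions.

```python
def greedy_reordering_buffer(requests, k, start=0):
    """Serve 1-D point requests with a buffer of capacity k (k >= 1).

    Naive heuristic: always move to the buffered request nearest the
    server's current position. Returns (service order, total travel cost).
    """
    buffer, order = [], []
    cost, pos, i = 0, start, 0
    n = len(requests)
    while i < n or buffer:
        # Read requests in arrival order until the buffer is full.
        while i < n and len(buffer) < k:
            buffer.append(requests[i])
            i += 1
        # Pick the buffered request closest to the current position.
        j = min(range(len(buffer)), key=lambda idx: abs(buffer[idx] - pos))
        target = buffer.pop(j)
        cost += abs(target - pos)
        pos = target
        order.append(target)
    return order, cost
```

On the alternating sequence `[0, 10, 0, 10, 0, 10]`, a buffer of capacity 1 forces the server to follow the input order at cost 50, while capacity 3 lets this heuristic group the requests and pay only 10.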
Online Permutation Routing in Partitioned Optical Passive Star Networks
This paper establishes the state of the art in both deterministic and
randomized online permutation routing in the POPS network. Indeed, we show that
any permutation can be routed online on a POPS network either with
deterministic slots, or, with high probability, with
randomized slots, where constant
. When , that we claim to be the
"interesting" case, the randomized algorithm is exponentially faster than any
other algorithm in the literature, both deterministic and randomized ones. This
is true in practice as well. Indeed, experiments show that it outperforms its
rivals even starting from as small a network as a POPS(2,2), and the gap grows
exponentially with the size of the network. We can also show that, under proper
hypotheses, no deterministic algorithm can asymptotically match its
performance.
RAM-Efficient External Memory Sorting
In recent years a large number of problems have been considered in external
memory models of computation, where the complexity measure is the number of
blocks of data that are moved between slow external memory and fast internal
memory (also called I/Os). In practice, however, internal memory time often
dominates the total running time once I/O-efficiency has been obtained. In this
paper we study algorithms for fundamental problems that are simultaneously
I/O-efficient and internal memory efficient in the RAM model of computation.
Comment: To appear in Proceedings of ISAAC 2013, getting the Best Paper Award
Scheduling Packets with Values and Deadlines in Size-bounded Buffers
Motivated by providing quality-of-service differentiated services in the
Internet, we consider buffer management algorithms for network switches. We
study a multi-buffer model. A network switch consists of multiple size-bounded
buffers such that at any time, the number of packets residing in each
individual buffer cannot exceed its capacity. Packets arrive at the network
switch over time; they have values, deadlines, and designated buffers. In each
time step, at most one pending packet is allowed to be sent and this packet can
be from any buffer. The objective is to maximize the total value of the packets
sent by their respective deadlines. A 9.82-competitive online algorithm has
been provided for this model (Azar and Levy, SWAT 2006), but no offline
algorithms were previously known. In this paper, we study the offline setting of
the multi-buffer model. Our contributions include a few optimal offline
algorithms for some variants of the model. Each variant has its unique and
interesting algorithmic feature. These offline algorithms help us understand
the model better when designing online algorithms.
Comment: 7 pages
Instant restore after a media failure
Media failures usually leave database systems unavailable for several hours
until recovery is complete, especially in applications with large devices and
high transaction volume. Previous work introduced a technique called
single-pass restore, which increases restore bandwidth and thus substantially
decreases time to repair. Instant restore goes further as it permits read/write
access to any data on a device undergoing restore--even data not yet
restored--by restoring individual data segments on demand. Thus, the restore
process is guided primarily by the needs of applications, and the observed mean
time to repair is effectively reduced from several hours to a few seconds.
This paper presents an implementation and evaluation of instant restore. The
technique is incrementally implemented on a system starting with the
traditional ARIES design for logging and recovery. Experiments show that the
transaction latency perceived after a media failure can be cut down to less
than a second and that the overhead imposed by the technique on normal
processing is minimal. The net effect is that a few "nines" of availability are
added to the system using simple and low-overhead software techniques.