
    Run Generation Revisited: What Goes Up May or May Not Come Down

    In this paper, we revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M, and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs (or, equivalently, maximizing the average run length) when the runs are allowed to be sorted or reverse sorted. We study the problem in the online setting, both with and without resource augmentation, and in the offline setting. (1) We analyze alternating-up-down replacement selection (runs alternate between sorted and reverse sorted), which was studied by Knuth as far back as 1963. We show that this simple policy is asymptotically optimal: alternating-up-down replacement selection is 2-competitive, and no deterministic online algorithm can perform better. (2) We give online algorithms having smaller competitive ratios with resource augmentation. Specifically, we exhibit a deterministic algorithm that, when given a buffer of size 4M, is able to match or beat any optimal algorithm having a buffer of size M. Furthermore, we present a randomized online algorithm that is 7/4-competitive when given a buffer twice the size of the optimal algorithm's. (3) We demonstrate that performance can also be improved with a small amount of foresight: we give a 3/2-competitive algorithm that has foreknowledge of the next 3M elements of the input stream. For the extreme case where all future elements are known, we design a PTAS for computing the optimal strategy a run generation algorithm must follow. (4) Finally, we present algorithms tailored for nearly sorted inputs, which are guaranteed to have optimal solutions with sufficiently long runs.
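
    As a rough illustration of the policy analyzed in this abstract, the following Python sketch generates alternating ascending/descending runs from a buffer of M elements. It uses a linear scan over the buffer rather than the heap-based bookkeeping of practical replacement selection, and all names are illustrative, not taken from the paper.

    def alternating_updown_runs(stream, M):
        # Sketch of alternating-up-down replacement selection: runs alternate
        # between ascending and descending. Linear scans stand in for the
        # heap-based selection used in practice.
        buf = list(stream[:M])              # fill the buffer with the first M elements
        rest = iter(stream[M:])
        runs, current, ascending, last = [], [], True, None
        while buf:
            if ascending:
                candidates = [x for x in buf if last is None or x >= last]
                pick = min(candidates) if candidates else None
            else:
                candidates = [x for x in buf if last is None or x <= last]
                pick = max(candidates) if candidates else None
            if pick is None:                # nothing extends the run: close it, flip direction
                runs.append(current)
                current, last, ascending = [], None, not ascending
                continue
            buf.remove(pick)
            current.append(pick)
            last = pick
            nxt = next(rest, None)          # refill the buffer from the input stream
            if nxt is not None:
                buf.append(nxt)
        if current:
            runs.append(current)
        return runs

    For example, alternating_updown_runs([3, 1, 4, 1, 5, 9, 2, 6], M=3) produces two runs: a long ascending run followed by a short descending one.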

    A Bicriteria Approximation for the Reordering Buffer Problem

    In the reordering buffer problem (RBP), a server is asked to process a sequence of requests lying in a metric space. To process a request, the server must move to the corresponding point in the metric. The requests can be processed slightly out of order; in particular, the server has a buffer of capacity k which can store up to k requests as it reads in the sequence. The goal is to reorder the requests in such a manner that the buffer constraint is satisfied and the total travel cost of the server is minimized. The RBP arises in many applications that require scheduling with a limited buffer capacity, such as scheduling a disk arm in storage systems, switching colors in paint shops of a car manufacturing plant, and rendering 3D images in computer graphics. We study the offline version of RBP and develop bicriteria approximations. When the underlying metric is a tree, we obtain a solution of cost no more than 9 OPT using a buffer of capacity 4k + 1, where OPT is the cost of an optimal solution with buffer capacity k. Constant-factor approximations were known previously only for the uniform metric (Avigdor-Elgrabli et al., 2012). Via randomized tree embeddings, this implies an O(log n) approximation to cost and an O(1) approximation to buffer size for general metrics. Previously, the best known algorithm for arbitrary metrics, by Englert et al. (2007), provided an O(log^2 k log n) approximation without violating the buffer constraint. Comment: 13 pages.
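
    To make the model concrete, the toy sketch below simulates a buffer of capacity k on the line metric and serves requests with a simple nearest-first rule. This greedy rule only illustrates the problem; it is not the paper's bicriteria algorithm, and all names are hypothetical.

    def nearest_first_rbp(requests, k, start=0.0):
        # Toy model of the reordering buffer problem on the line metric:
        # hold up to k pending requests, always serve the one closest to the
        # server's current position.
        pending, pos, cost = [], start, 0.0
        def serve_one():
            nonlocal pos, cost
            nearest = min(pending, key=lambda p: abs(p - pos))
            pending.remove(nearest)
            cost += abs(nearest - pos)      # travel cost to the served request
            pos = nearest
        for r in requests:
            pending.append(r)
            if len(pending) == k:           # buffer full: must serve before reading on
                serve_one()
        while pending:                      # drain whatever is left in the buffer
            serve_one()
        return cost

    For instance, nearest_first_rbp([5, 1, 6, 2, 7], k=2) batches nearby points where the buffer allows it and returns the total distance traveled by the server.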

    Online Permutation Routing in Partitioned Optical Passive Star Networks

    This paper establishes the state of the art in both deterministic and randomized online permutation routing in the POPS network. Indeed, we show that any permutation can be routed online on a POPS network either with $O(\frac{d}{g}\log g)$ deterministic slots or, with high probability, with $5c\lceil d/g\rceil + o(d/g) + O(\log\log g)$ randomized slots, where the constant $c = \exp(1 + e^{-1}) \approx 3.927$. When $d = \Theta(g)$, which we claim to be the "interesting" case, the randomized algorithm is exponentially faster than any other algorithm in the literature, both deterministic and randomized. This is true in practice as well: experiments show that it outperforms its rivals even starting from as small a network as a POPS(2,2), and the gap grows exponentially with the size of the network. We can also show that, under proper hypotheses, no deterministic algorithm can asymptotically match its performance.

    RAM-Efficient External Memory Sorting

    In recent years a large number of problems have been considered in external memory models of computation, where the complexity measure is the number of blocks of data that are moved between slow external memory and fast internal memory (also called I/Os). In practice, however, internal memory time often dominates the total running time once I/O-efficiency has been obtained. In this paper we study algorithms for fundamental problems that are simultaneously I/O-efficient and internal memory efficient in the RAM model of computation. Comment: To appear in Proceedings of ISAAC 2013, receiving the Best Paper Award.
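
    The block-transfer cost this abstract refers to shows up most directly in the merge phase of external-memory sorting. The sketch below merges sorted runs one block at a time with a small in-memory heap; block fetches stand in for the I/Os counted in the model, and the helper names are illustrative, not taken from the paper.

    import heapq

    def kway_merge(runs, block_size):
        # Merge already-sorted runs while keeping only one block per run in
        # internal memory; each block fetch corresponds to one I/O in the model.
        def blocks(run):
            for i in range(0, len(run), block_size):
                yield run[i:i + block_size]
        readers = [blocks(r) for r in runs]
        buffers = [next(rd, []) for rd in readers]      # one resident block per run
        heap = [(buf[0], i, 0) for i, buf in enumerate(buffers) if buf]
        heapq.heapify(heap)
        out = []
        while heap:
            val, i, j = heapq.heappop(heap)
            out.append(val)
            j += 1
            if j == len(buffers[i]):                    # block exhausted: fetch the next one
                buffers[i] = next(readers[i], [])
                j = 0
            if buffers[i]:
                heapq.heappush(heap, (buffers[i][j], i, j))
        return out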

    Scheduling Packets with Values and Deadlines in Size-bounded Buffers

    Motivated by providing quality-of-service differentiated services in the Internet, we consider buffer management algorithms for network switches. We study a multi-buffer model. A network switch consists of multiple size-bounded buffers such that, at any time, the number of packets residing in each individual buffer cannot exceed its capacity. Packets arrive at the network switch over time; they have values, deadlines, and designated buffers. In each time step, at most one pending packet is allowed to be sent, and this packet can be from any buffer. The objective is to maximize the total value of the packets sent by their respective deadlines. A 9.82-competitive online algorithm has been provided for this model (Azar and Levy, SWAT 2006), but no offline algorithms have been known yet. In this paper, we study the offline setting of the multi-buffer model. Our contributions include a few optimal offline algorithms for some variants of the model. Each variant has its own unique and interesting algorithmic features. These offline algorithms help us better understand the model when designing online algorithms. Comment: 7 pages.
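
    The toy sketch below illustrates the objective of this model: in each time step, at most one pending packet is sent. It uses a naive highest-value-first rule, omits the per-buffer capacity constraints, and is not one of the paper's optimal offline algorithms; all names are hypothetical.

    def greedy_schedule(packets, horizon):
        # Packets are tuples (arrival, deadline, value, buffer_id). Each step,
        # send the highest-value pending packet whose deadline has not passed.
        total, sent = 0, set()
        for t in range(horizon):
            pending = [(value, i) for i, (arrival, deadline, value, _) in enumerate(packets)
                       if arrival <= t <= deadline and i not in sent]
            if pending:
                value, i = max(pending)     # at most one packet is sent per step
                sent.add(i)
                total += value
        return total

    On the input [(0, 0, 3, 1), (0, 2, 5, 0)] with horizon 3, this greedy rule sends only the value-5 packet (total 5), while an optimal schedule sends both (total 8), which is the kind of gap that makes the offline problem interesting.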

    Instant restore after a media failure

    Media failures usually leave database systems unavailable for several hours until recovery is complete, especially in applications with large devices and high transaction volume. Previous work introduced a technique called single-pass restore, which increases restore bandwidth and thus substantially decreases time to repair. Instant restore goes further: it permits read/write access to any data on a device undergoing restore, even data not yet restored, by restoring individual data segments on demand. Thus, the restore process is guided primarily by the needs of applications, and the observed mean time to repair is effectively reduced from several hours to a few seconds. This paper presents an implementation and evaluation of instant restore. The technique is incrementally implemented on a system starting with the traditional ARIES design for logging and recovery. Experiments show that the transaction latency perceived after a media failure can be cut down to less than a second and that the overhead imposed by the technique on normal processing is minimal. The net effect is that a few "nines" of availability are added to the system using simple and low-overhead software techniques.
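
    A minimal sketch of the on-demand idea described above: a page read on the failed device first rebuilds the segment that contains it (from the backup image plus log replay) and only then serves the page. The class and its fields are illustrative assumptions, not the paper's actual data structures.

    class InstantRestoreSketch:
        # On-demand segment restore: pages become readable as soon as their
        # segment has been rebuilt, instead of waiting for the whole device.
        def __init__(self, backup_segments, replay_log, segment_size):
            self.backup = backup_segments   # segment id -> dict of page id -> page image
            self.replay = replay_log        # function(segment_id, pages) -> recovered pages
            self.segment_size = segment_size
            self.restored = {}              # segments rebuilt so far
        def read_page(self, page_id):
            seg = page_id // self.segment_size
            if seg not in self.restored:    # rebuild just this segment, on demand
                pages = dict(self.backup.get(seg, {}))
                self.restored[seg] = self.replay(seg, pages)
            return self.restored[seg].get(page_id)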