
    Run Generation Revisited: What Goes Up May or May Not Come Down

    In this paper, we revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M, and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs (or equivalently, maximizing the average run length) when the runs are allowed to be sorted or reverse sorted. We study the problem in the online setting, both with and without resource augmentation, and in the offline setting. (1) We analyze alternating-up-down replacement selection (runs alternate between sorted and reverse sorted), which was studied by Knuth as far back as 1963. We show that this simple policy is asymptotically optimal. Specifically, we show that alternating-up-down replacement selection is 2-competitive and that no deterministic online algorithm can perform better. (2) We give online algorithms having smaller competitive ratios with resource augmentation. Specifically, we exhibit a deterministic algorithm that, when given a buffer of size 4M, is able to match or beat any optimal algorithm having a buffer of size M. Furthermore, we present a randomized online algorithm which is 7/4-competitive when given a buffer twice the size of the optimal algorithm's. (3) We demonstrate that performance can also be improved with a small amount of foresight. We give a 3/2-competitive algorithm with foreknowledge of the next 3M elements of the input stream. For the extreme case where all future elements are known, we design a PTAS for computing the optimal strategy a run generation algorithm must follow. (4) Finally, we present algorithms tailored for nearly sorted inputs which are guaranteed to have optimal solutions with sufficiently long runs.
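
    As a concrete illustration of the policy analyzed in (1), here is a minimal Python sketch of alternating-up-down replacement selection, reconstructed from the description above rather than taken from the paper; the function name, the run representation, and the generator interface are illustrative assumptions.

    ```python
    import heapq

    def alternating_up_down_runs(stream, M):
        """Emit runs that alternate between sorted and reverse-sorted,
        holding at most M elements in memory (replacement selection).
        A sketch, not the paper's pseudocode."""
        it = iter(stream)
        buf = []                      # elements deferred to the next run
        for x in it:                  # fill the buffer with the first M items
            buf.append(x)
            if len(buf) == M:
                break
        ascending = True
        while buf:
            sign = 1 if ascending else -1          # min-heap on sign * value
            heap = [sign * x for x in buf]
            heapq.heapify(heap)
            buf, run = [], []
            while heap:
                last = sign * heapq.heappop(heap)  # next element of the run
                run.append(last)
                nxt = next(it, None)
                if nxt is None:
                    continue                       # input exhausted: drain heap
                # nxt joins the current run iff it does not precede the element
                # just written, in the run's current direction.
                if (nxt >= last) if ascending else (nxt <= last):
                    heapq.heappush(heap, sign * nxt)
                else:
                    buf.append(nxt)                # save for the opposite run
            yield run
            ascending = not ascending
    ```

    For example, `list(alternating_up_down_runs([3, 1, 4, 1, 5, 9, 2, 6], 3))` produces runs that alternate between ascending and descending order; the heap plus the deferred buffer together never hold more than M elements.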

    Parallel Batch-Dynamic Graph Connectivity

    In this paper, we study batch parallel algorithms for the dynamic connectivity problem, a fundamental problem that has received considerable attention in the sequential setting. The most well-known sequential algorithm for dynamic connectivity is the elegant level-set algorithm of Holm, de Lichtenberg and Thorup (HDT), which achieves $O(\log^2 n)$ amortized time per edge insertion or deletion, and $O(\log n / \log\log n)$ time per query. We design a parallel batch-dynamic connectivity algorithm that is work-efficient with respect to the HDT algorithm for small batch sizes, and is asymptotically faster when the average batch size is sufficiently large. Given a sequence of batched updates, where $\Delta$ is the average batch size of all deletions, our algorithm achieves $O(\log n \log(1 + n/\Delta))$ expected amortized work per edge insertion and deletion and $O(\log^3 n)$ depth w.h.p. Our algorithm answers a batch of $k$ connectivity queries in $O(k \log(1 + n/k))$ expected work and $O(\log n)$ depth w.h.p. To the best of our knowledge, our algorithm is the first parallel batch-dynamic algorithm for connectivity. Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201
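
    The paper's HDT-style batch-dynamic structure is too involved to reproduce here. As a simple point of reference only, the following sequential Python sketch handles insert-only batches with union-find and answers a batch of connectivity queries; it does not support deletions, which are the hard case above, and all names are illustrative.

    ```python
    class UnionFind:
        """Plain union-find with path halving; a sequential baseline,
        not the paper's parallel batch-dynamic structure."""
        def __init__(self, n):
            self.parent = list(range(n))

        def find(self, x):
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]  # path halving
                x = self.parent[x]
            return x

        def union(self, a, b):
            ra, rb = self.find(a), self.find(b)
            if ra != rb:
                self.parent[ra] = rb

    def process_batches(n, insert_batches, query_batch):
        """Apply batches of edge insertions, then answer a batch of
        connectivity queries. In the parallel setting these unions would
        be performed with concurrent linking; here they run sequentially."""
        uf = UnionFind(n)
        for batch in insert_batches:
            for u, v in batch:
                uf.union(u, v)
        return [uf.find(u) == uf.find(v) for u, v in query_batch]
    ```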

    Truth and Regret in Online Scheduling

    We consider a scheduling problem where a cloud service provider has multiple units of a resource available over time. Selfish clients submit jobs, each with an arrival time, deadline, length, and value. The service provider's goal is to implement a truthful online mechanism for scheduling jobs so as to maximize the social welfare of the schedule. Recent work shows that under a stochastic assumption on job arrivals, there is a single-parameter family of mechanisms that achieves near-optimal social welfare. We show that given any such family of near-optimal online mechanisms, there exists an online mechanism that in the worst case performs nearly as well as the best of the given mechanisms. Our mechanism is truthful whenever the mechanisms in the given family are truthful and prompt, and achieves optimal (within constant factors) regret. We model the problem of competing against a family of online scheduling mechanisms as one of learning from expert advice. A primary challenge is that any scheduling decisions we make affect not only the payoff at the current step, but also the resource availability and payoffs in future steps. Furthermore, switching from one algorithm (a.k.a. expert) to another in an online fashion is challenging, both because it requires synchronizing with the state of the latter algorithm and because it affects the incentive structure of the algorithms. We further show how to adapt our algorithm to a non-clairvoyant setting where job lengths are unknown until jobs are run to completion. Once again, in this setting, we obtain truthfulness along with asymptotically optimal regret (within poly-logarithmic factors).
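
    The learning-from-expert-advice framing above can be illustrated with a plain Hedge (multiplicative-weights) loop. This sketch deliberately ignores the paper's two key complications, state carried across steps and incentive effects of switching; `payoff`, `eta`, and the `[0, 1]` reward range are assumptions, not the paper's mechanism.

    ```python
    import math
    import random

    def hedge_over_mechanisms(experts, rounds, payoff, eta=0.5):
        """experts: candidate mechanisms (the 'experts');
        payoff(e, t) -> welfare in [0, 1] of mechanism e at step t.
        Returns total welfare collected by the randomized meta-algorithm."""
        weights = [1.0] * len(experts)
        total = 0.0
        for t in range(rounds):
            z = sum(weights)
            probs = [w / z for w in weights]
            # Follow one mechanism this round, drawn by current weight.
            i = random.choices(range(len(experts)), weights=probs)[0]
            total += payoff(experts[i], t)
            # Exponential reweighting by each expert's observed reward.
            weights = [w * math.exp(eta * payoff(e, t))
                       for w, e in zip(weights, experts)]
        return total
    ```

    The paper's contribution lies precisely in making such a scheme work when experts are stateful mechanisms: switching requires synchronizing with the target mechanism's state while preserving truthfulness, which the vanilla loop above does not address.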

    A Randomized Greedy Algorithm for Near-Optimal Sensor Scheduling in Large-Scale Sensor Networks

    We study the problem of scheduling sensors in a resource-constrained linear dynamical system, where the objective is to select a small subset of sensors from a large network to perform the state estimation task. We formulate this problem as the maximization of a monotone set function under a matroid constraint. We propose a randomized greedy algorithm that is significantly faster than state-of-the-art methods. By introducing the notion of curvature, which quantifies how close a function is to being submodular, we analyze the performance of the proposed algorithm and find a bound on the expected mean square error (MSE) of the estimator that uses the selected sensors in terms of the optimal MSE. Moreover, we derive a probabilistic bound on the curvature for the scenario where the measurements are i.i.d. random vectors with bounded $\ell_2$ norm. Simulation results demonstrate the efficacy of the randomized greedy algorithm in comparison with greedy and semidefinite programming relaxation methods.
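
    For intuition, here is a sketch of stochastic ("randomized") greedy selection under a cardinality constraint, a special case of a matroid constraint, in the spirit of the approach above but not the paper's exact procedure. The utility `f`, the sampling rate following the usual (n/k)·log(1/eps) rule, and all names are assumptions.

    ```python
    import math
    import random

    def randomized_greedy(candidates, k, f, eps=0.1):
        """Select k sensors maximizing a monotone set function f.
        Assumes len(candidates) >= k. Each round scores only a random
        sample of the remaining candidates, which is what makes this
        faster than full greedy."""
        selected = set()
        sample_size = max(1, int(len(candidates) / k * math.log(1 / eps)))
        for _ in range(k):
            remaining = [c for c in candidates if c not in selected]
            pool = random.sample(remaining, min(sample_size, len(remaining)))
            base = f(selected)
            # Add the sampled candidate with the largest marginal gain.
            best = max(pool, key=lambda c: f(selected | {c}) - base)
            selected.add(best)
        return selected
    ```

    Full greedy evaluates f on every remaining candidate each round; sampling trades a (1 - eps) factor in the approximation guarantee for a near-linear number of function evaluations overall.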

    Security Games with Information Leakage: Modeling and Computation

    Most models of Stackelberg security games assume that the attacker only knows the defender's mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy -- an issue we refer to as information leakage -- is a significant concern in practical applications. While previous research on patrolling games has considered the attacker's real-time surveillance, our setting, and therefore our models and techniques, are fundamentally different. More specifically, after describing the information leakage model, we start with an LP formulation to compute the defender's optimal strategy in the presence of leakage. Perhaps surprisingly, we show that a key subproblem in solving this LP (more precisely, the defender oracle) is NP-hard even for the simplest of security game models. We then approach the problem from three directions: efficient algorithms for restricted cases, approximation algorithms, and heuristic sampling algorithms that improve upon the status quo. Our experiments confirm the necessity of handling information leakage and the advantage of our algorithms.
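
    For background on the LP formulations mentioned above, here is a hedged sketch of a baseline coverage LP for a security game without leakage (the paper's leakage-aware LP and defender oracle are substantially harder). It computes marginal coverage probabilities maximizing the defender's worst-case utility; `U_cov`/`U_unc` (defender utility when target t is covered/uncovered), `m` (number of resources), and the maximin objective are all illustrative assumptions.

    ```python
    import numpy as np
    from scipy.optimize import linprog

    def maximin_coverage(U_cov, U_unc, m):
        """Maximize v subject to, for every target t:
           v <= c_t * U_cov[t] + (1 - c_t) * U_unc[t],
        with sum(c) <= m and 0 <= c_t <= 1.
        Variables are x = [v, c_1, ..., c_n]; linprog minimizes, so we
        minimize -v."""
        n = len(U_cov)
        obj = np.zeros(n + 1)
        obj[0] = -1.0                                 # minimize -v
        A_ub = np.zeros((n + 1, n + 1))
        b_ub = np.zeros(n + 1)
        for t in range(n):
            A_ub[t, 0] = 1.0                          # + v
            A_ub[t, t + 1] = -(U_cov[t] - U_unc[t])   # - c_t * (utility gap)
            b_ub[t] = U_unc[t]
        A_ub[n, 1:] = 1.0                             # sum of coverage <= m
        b_ub[n] = m
        bounds = [(None, None)] + [(0.0, 1.0)] * n    # v free, c_t in [0, 1]
        res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        return res.x[0], res.x[1:]                    # value, coverage vector
    ```

    With leakage, the attacker conditions on partial observations of the deployed pure strategy, so marginal coverage no longer summarizes the defender's strategy and the corresponding oracle becomes NP-hard, as the abstract notes.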