11,440 research outputs found
Run Generation Revisited: What Goes Up May or May Not Come Down
In this paper, we revisit the classic problem of run generation. Run
generation is the first phase of external-memory sorting, where the objective
is to scan through the data, reorder elements using a small buffer of size M ,
and output runs (contiguously sorted chunks of elements) that are as long as
possible.
We develop algorithms for minimizing the total number of runs (or
equivalently, maximizing the average run length) when the runs are allowed to
be sorted or reverse sorted. We study the problem in the online setting, both
with and without resource augmentation, and in the offline setting.
(1) We analyze alternating-up-down replacement selection (runs alternate
between sorted and reverse sorted), which was studied by Knuth as far back as
1963. We show that this simple policy is asymptotically optimal. Specifically,
we show that alternating-up-down replacement selection is 2-competitive and no
deterministic online algorithm can perform better.
(2) We give online algorithms having smaller competitive ratios with resource
augmentation. Specifically, we exhibit a deterministic algorithm that, when
given a buffer of size 4M , is able to match or beat any optimal algorithm
having a buffer of size M . Furthermore, we present a randomized online
algorithm which is 7/4-competitive when given a buffer twice that of the
optimal.
(3) We demonstrate that performance can also be improved with a small amount
of foresight. We give an algorithm, which is 3/2-competitive, with
foreknowledge of the next 3M elements of the input stream. For the extreme case
where all future elements are known, we design a PTAS for computing the optimal
strategy a run generation algorithm must follow.
(4) Finally, we present algorithms tailored for nearly sorted inputs which
are guaranteed to have optimal solutions with sufficiently long runs
Parallel Batch-Dynamic Graph Connectivity
In this paper, we study batch parallel algorithms for the dynamic
connectivity problem, a fundamental problem that has received considerable
attention in the sequential setting. The most well known sequential algorithm
for dynamic connectivity is the elegant level-set algorithm of Holm, de
Lichtenberg and Thorup (HDT), which achieves amortized time per
edge insertion or deletion, and time per query. We
design a parallel batch-dynamic connectivity algorithm that is work-efficient
with respect to the HDT algorithm for small batch sizes, and is asymptotically
faster when the average batch size is sufficiently large. Given a sequence of
batched updates, where is the average batch size of all deletions, our
algorithm achieves expected amortized work per
edge insertion and deletion and depth w.h.p. Our algorithm
answers a batch of connectivity queries in expected
work and depth w.h.p. To the best of our knowledge, our algorithm
is the first parallel batch-dynamic algorithm for connectivity.Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
Truth and Regret in Online Scheduling
We consider a scheduling problem where a cloud service provider has multiple
units of a resource available over time. Selfish clients submit jobs, each with
an arrival time, deadline, length, and value. The service provider's goal is to
implement a truthful online mechanism for scheduling jobs so as to maximize the
social welfare of the schedule. Recent work shows that under a stochastic
assumption on job arrivals, there is a single-parameter family of mechanisms
that achieves near-optimal social welfare. We show that given any such family
of near-optimal online mechanisms, there exists an online mechanism that in the
worst case performs nearly as well as the best of the given mechanisms. Our
mechanism is truthful whenever the mechanisms in the given family are truthful
and prompt, and achieves optimal (within constant factors) regret.
We model the problem of competing against a family of online scheduling
mechanisms as one of learning from expert advice. A primary challenge is that
any scheduling decisions we make affect not only the payoff at the current
step, but also the resource availability and payoffs in future steps.
Furthermore, switching from one algorithm (a.k.a. expert) to another in an
online fashion is challenging both because it requires synchronization with the
state of the latter algorithm as well as because it affects the incentive
structure of the algorithms. We further show how to adapt our algorithm to a
non-clairvoyant setting where job lengths are unknown until jobs are run to
completion. Once again, in this setting, we obtain truthfulness along with
asymptotically optimal regret (within poly-logarithmic factors)
A Randomized Greedy Algorithm for Near-Optimal Sensor Scheduling in Large-Scale Sensor Networks
We study the problem of scheduling sensors in a resource-constrained linear
dynamical system, where the objective is to select a small subset of sensors
from a large network to perform the state estimation task. We formulate this
problem as the maximization of a monotone set function under a matroid
constraint. We propose a randomized greedy algorithm that is significantly
faster than state-of-the-art methods. By introducing the notion of curvature
which quantifies how close a function is to being submodular, we analyze the
performance of the proposed algorithm and find a bound on the expected mean
square error (MSE) of the estimator that uses the selected sensors in terms of
the optimal MSE. Moreover, we derive a probabilistic bound on the curvature for
the scenario where{\color{black}{ the measurements are i.i.d. random vectors
with bounded norm.}} Simulation results demonstrate efficacy of the
randomized greedy algorithm in a comparison with greedy and semidefinite
programming relaxation methods
A Randomized Greedy Algorithm for Near-Optimal Sensor Scheduling in Large-Scale Sensor Networks
We study the problem of scheduling sensors in a resource-constrained linear
dynamical system, where the objective is to select a small subset of sensors
from a large network to perform the state estimation task. We formulate this
problem as the maximization of a monotone set function under a matroid
constraint. We propose a randomized greedy algorithm that is significantly
faster than state-of-the-art methods. By introducing the notion of curvature
which quantifies how close a function is to being submodular, we analyze the
performance of the proposed algorithm and find a bound on the expected mean
square error (MSE) of the estimator that uses the selected sensors in terms of
the optimal MSE. Moreover, we derive a probabilistic bound on the curvature for
the scenario where{\color{black}{ the measurements are i.i.d. random vectors
with bounded norm.}} Simulation results demonstrate efficacy of the
randomized greedy algorithm in a comparison with greedy and semidefinite
programming relaxation methods
Security Games with Information Leakage: Modeling and Computation
Most models of Stackelberg security games assume that the attacker only knows
the defender's mixed strategy, but is not able to observe (even partially) the
instantiated pure strategy. Such partial observation of the deployed pure
strategy -- an issue we refer to as information leakage -- is a significant
concern in practical applications. While previous research on patrolling games
has considered the attacker's real-time surveillance, our settings, therefore
models and techniques, are fundamentally different. More specifically, after
describing the information leakage model, we start with an LP formulation to
compute the defender's optimal strategy in the presence of leakage. Perhaps
surprisingly, we show that a key subproblem to solve this LP (more precisely,
the defender oracle) is NP-hard even for the simplest of security game models.
We then approach the problem from three possible directions: efficient
algorithms for restricted cases, approximation algorithms, and heuristic
algorithms for sampling that improves upon the status quo. Our experiments
confirm the necessity of handling information leakage and the advantage of our
algorithms
- …