Configurable Strategies for Work-stealing
Work-stealing systems are typically oblivious to the nature of the tasks they
are scheduling. For instance, they do not know or take into account how long a
task will take to execute or how many subtasks it will spawn. Moreover, the
actual task execution order is typically determined by the underlying task
storage data structure, and cannot be changed. There are thus possibilities for
optimizing task parallel executions by providing information on specific tasks
and their preferred execution order to the scheduling system.
We introduce scheduling strategies to enable applications to dynamically
provide hints to the task-scheduling system on the nature of specific tasks.
Scheduling strategies can be used to independently control both local task
execution order and steal order. In contrast to conventional scheduling
policies that are normally global in scope, strategies allow the scheduler to
apply optimizations on individual tasks. This flexibility greatly improves
composability as it allows the scheduler to apply different, specific
scheduling choices for different parts of applications simultaneously. We
present a number of benchmarks that highlight diverse, beneficial effects that
can be achieved with scheduling strategies. Some benchmarks (branch-and-bound,
single-source shortest path) show that prioritization of tasks can reduce the
total amount of work compared to standard work-stealing execution order. For
other benchmarks (triangle strip generation) qualitatively better results can
be achieved in shorter time. Other optimizations, such as dynamic merging of
tasks or stealing of half the work, instead of half the tasks, are also shown
to improve performance. Composability is demonstrated by examples that combine
different strategies, both within the same kernel (prefix sum) and when
scheduling multiple kernels (prefix sum and unbalanced tree search).
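The difference between stealing half the tasks and stealing half the work can be illustrated with a minimal sketch. It assumes each task carries a weight hint estimating its cost; the `Task` class, the `weight` field, and both function names are hypothetical and are not taken from the paper's implementation.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    weight: int  # hypothetical hint: estimated amount of work in this task


def steal_half_tasks(victim: deque) -> list:
    """Conventional policy: take half of the victim's tasks from the steal end."""
    count = len(victim) // 2
    return [victim.popleft() for _ in range(count)]


def steal_half_work(victim: deque) -> list:
    """Weight-aware policy: take tasks from the steal end while the stolen
    weight stays at or below half of the victim's total weight.

    Note: if the first task alone exceeds half the total, nothing is stolen.
    """
    total = sum(task.weight for task in victim)
    stolen, acquired = [], 0
    # Always leave at least one task with the victim.
    while len(victim) > 1 and 2 * (acquired + victim[0].weight) <= total:
        task = victim.popleft()
        stolen.append(task)
        acquired += task.weight
    return stolen
```

With one heavy task at the steal end followed by many light ones, `steal_half_tasks` moves most of the work to the thief, while `steal_half_work` moves only the heavy task and leaves the two sides roughly balanced.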
Taming Numbers and Durations in the Model Checking Integrated Planning System
The Model Checking Integrated Planning System (MIPS) is a temporal least
commitment heuristic search planner based on a flexible object-oriented
workbench architecture. Its design clearly separates explicit and symbolic
directed exploration algorithms from the set of on-line and off-line computed
estimates and associated data structures. MIPS has shown distinguished
performance in the last two international planning competitions. In the last
event the description language was extended from pure propositional planning to
include numerical state variables, action durations, and plan quality objective
functions. Plans were no longer sequences of actions but time-stamped
schedules. As a participant of the fully automated track of the competition,
MIPS has proven to be a general system; in each track and every benchmark
domain it efficiently computed plans of remarkable quality. This article
introduces and analyzes the most important algorithmic novelties that were
necessary to tackle the new layers of expressiveness in the benchmark problems
and to achieve a high level of performance. The extensions include critical
path analysis of sequentially generated plans to generate corresponding optimal
parallel plans. The linear time algorithm to compute the parallel plan bypasses
known NP hardness results for partial ordering by scheduling plans with respect
to the set of actions and the imposed precedence relations. The efficiency of
this algorithm also allows us to improve the exploration guidance: for each
encountered planning state the corresponding approximate sequential plan is
scheduled. One major strength of MIPS is its static analysis phase that grounds
and simplifies parameterized predicates, functions and operators, that infers
knowledge to minimize the state description length, and that detects domain
object symmetries. The latter aspect is analyzed in detail. MIPS has been
developed to serve as a complete and optimal state space planner, with
admissible estimates, exploration engines and branching cuts. In the
competition version, however, certain performance compromises had to be made,
including floating point arithmetic, weighted heuristic search exploration
according to an inadmissible estimate and parameterized optimization
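The critical path idea mentioned in this abstract, scheduling a sequentially generated plan against a set of precedence relations, can be sketched as a single pass over the plan. This is a generic illustration and not the MIPS code: it assumes the precedence relation is given explicitly as a dictionary, and the function name `schedule_parallel` and the example actions are hypothetical.

```python
def schedule_parallel(plan, durations, precedes):
    """Assign earliest start times to the actions of a sequential plan.

    plan:      action names in their sequential order
    durations: maps each action name to its duration
    precedes:  maps an action name to the earlier actions that must
               finish before it may start (assumed input, see lead-in)

    Runs in time linear in the number of actions plus precedence pairs.
    """
    start = {}
    makespan = 0
    for action in plan:
        # Predecessors occur earlier in the sequential plan, so their
        # start times have already been computed.
        earliest = max(
            (start[p] + durations[p] for p in precedes.get(action, ())),
            default=0,
        )
        start[action] = earliest
        makespan = max(makespan, earliest + durations[action])
    return start, makespan


if __name__ == "__main__":
    plan = ["load_a", "load_b", "drive", "unload_a"]
    durations = {"load_a": 2, "load_b": 3, "drive": 5, "unload_a": 1}
    precedes = {"drive": ["load_a", "load_b"], "unload_a": ["drive"]}
    print(schedule_parallel(plan, durations, precedes))
```

In this example the two load actions have no mutual precedence and so start together, which shortens the schedule relative to executing the four actions one after another.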
Multi-threading a state-of-the-art maximum clique algorithm
We present a threaded parallel adaptation of a state-of-the-art maximum clique
algorithm for dense, computationally challenging graphs. We show that near-linear speedups
are achievable in practice and that superlinear speedups are common. We include results for
several previously unsolved benchmark problems.
Distributed Queuing in Dynamic Networks
We consider the problem of forming a distributed queue in the adversarial
dynamic network model of Kuhn, Lynch, and Oshman (STOC 2010) in which the
network topology changes from round to round but the network stays connected.
This is a synchronous model in which network nodes are assumed to be fixed, the
communication links for each round are chosen by an adversary, and nodes do not
know who their neighbors are for the current round before they broadcast their
messages. Queue requests may arrive over rounds at arbitrary nodes and the goal
is to eventually enqueue them in a distributed queue. We present two algorithms
that give a total distributed ordering of queue requests in this model. We
measure the performance of our algorithms through round complexity, which is
the total number of rounds needed to solve the distributed queuing problem. We
show that in 1-interval connected graphs, where the communication links change
arbitrarily between every round, it is possible to solve the distributed
queuing problem in O(nk) rounds using O(log n) size messages, where n is the
number of nodes in the network and k <= n is the number of queue requests.
Further, we show that for more stable graphs, e.g. T-interval connected graphs
where the communication links change in every T rounds, the distributed queuing
problem can be solved in O(n + nk/min(alpha, T)) rounds using the same O(log n)
size messages, where alpha > 0 is the concurrency level parameter that captures
the minimum number of active queue requests in the system in any round. These
results hold for any arbitrary (sequential, one-shot concurrent, or dynamic)
arrival of k queue requests in the system. Moreover, our algorithms ensure
correctness in the sense that each queue request is eventually enqueued in the
distributed queue after it is issued and each queue request is enqueued exactly
once. We also provide an impossibility result for this distributed queuing
problem in this model. To the best of our knowledge, these are the first
solutions to the distributed queuing problem in adversarial dynamic networks.
Comment: In Proceedings FOMC 2013, arXiv:1310.459
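To get a feel for how the two round bounds in this abstract compare, the expressions inside the O(...) terms can be evaluated directly. This is only a sketch: it drops all hidden constants, so the values are growth terms and not actual round counts, and the function names are hypothetical.

```python
def rounds_1_interval(n: int, k: int) -> int:
    """Growth term n*k of the O(nk) bound for 1-interval connected graphs."""
    if not 0 <= k <= n:
        raise ValueError("the abstract assumes k <= n")
    return n * k


def rounds_t_interval(n: int, k: int, alpha: int, T: int) -> float:
    """Growth term n + n*k/min(alpha, T) for T-interval connected graphs."""
    if not 0 <= k <= n:
        raise ValueError("the abstract assumes k <= n")
    if alpha <= 0 or T < 1:
        raise ValueError("requires alpha > 0 and T >= 1")
    return n + n * k / min(alpha, T)


if __name__ == "__main__":
    n, k = 100, 50
    print(rounds_1_interval(n, k))
    print(rounds_t_interval(n, k, alpha=10, T=5))
```

The second term shrinks as min(alpha, T) grows, so both higher concurrency and a more stable topology are needed to gain over the 1-interval bound; with T = 1 the expression is n + nk, which is still O(nk).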