
    Configurable Strategies for Work-stealing

    Work-stealing systems are typically oblivious to the nature of the tasks they are scheduling. For instance, they do not know or take into account how long a task will take to execute or how many subtasks it will spawn. Moreover, the actual task execution order is typically determined by the underlying task storage data structure and cannot be changed. There are thus opportunities for optimizing task-parallel executions by providing information on specific tasks and their preferred execution order to the scheduling system. We introduce scheduling strategies to enable applications to dynamically provide hints to the task-scheduling system on the nature of specific tasks. Scheduling strategies can be used to independently control both local task execution order and steal order. In contrast to conventional scheduling policies, which are normally global in scope, strategies allow the scheduler to apply optimizations to individual tasks. This flexibility greatly improves composability, as it allows the scheduler to apply different, specific scheduling choices to different parts of an application simultaneously. We present a number of benchmarks that highlight diverse, beneficial effects that can be achieved with scheduling strategies. Some benchmarks (branch-and-bound, single-source shortest path) show that prioritization of tasks can reduce the total amount of work compared to the standard work-stealing execution order. For other benchmarks (triangle strip generation), qualitatively better results can be achieved in a shorter time. Other optimizations, such as dynamic merging of tasks or stealing half the work instead of half the tasks, are also shown to improve performance. Composability is demonstrated by examples that combine different strategies, both within the same kernel (prefix sum) and when scheduling multiple kernels (prefix sum and unbalanced tree search).
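
    The paper's actual strategy API is not reproduced in this listing. The following is a minimal sketch of the underlying idea that a per-task hint can control local execution order inside one worker's task pool; steal-order control and the work-stealing deques themselves are omitted, and all names (Task, Worker, local_priority) are illustrative assumptions rather than the paper's interface.

```python
# Minimal sketch (not the paper's API): per-task priority hints that control
# the order in which one worker executes its locally spawned tasks.
import heapq

class Task:
    def __init__(self, fn, local_priority=0):
        self.fn = fn                           # the work to run
        self.local_priority = local_priority   # hint: smaller value runs earlier locally

class Worker:
    def __init__(self):
        self._pool = []      # local task pool ordered by the hint
        self._counter = 0    # tie-breaker keeps FIFO order among equal priorities

    def spawn(self, task):
        heapq.heappush(self._pool, (task.local_priority, self._counter, task))
        self._counter += 1

    def run_all(self):
        while self._pool:
            _, _, task = heapq.heappop(self._pool)
            task.fn()

# Usage: tasks believed to be on the critical path get a stronger hint.
w = Worker()
w.spawn(Task(lambda: print("background work"), local_priority=10))
w.spawn(Task(lambda: print("critical-path work"), local_priority=0))
w.run_all()
```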

    Taming Numbers and Durations in the Model Checking Integrated Planning System

    The Model Checking Integrated Planning System (MIPS) is a temporal least-commitment heuristic search planner based on a flexible object-oriented workbench architecture. Its design clearly separates explicit and symbolic directed exploration algorithms from the set of on-line and off-line computed estimates and associated data structures. MIPS has shown distinguished performance in the last two international planning competitions. In the last event, the description language was extended from pure propositional planning to include numerical state variables, action durations, and plan quality objective functions. Plans were no longer sequences of actions but time-stamped schedules. As a participant in the fully automated track of the competition, MIPS has proven to be a general system; in each track and every benchmark domain it efficiently computed plans of remarkable quality. This article introduces and analyzes the most important algorithmic novelties that were necessary to tackle the new layers of expressiveness in the benchmark problems and to achieve a high level of performance. The extensions include critical path analysis of sequentially generated plans to produce corresponding optimal parallel plans. The linear-time algorithm to compute the parallel plan bypasses known NP-hardness results for partial ordering by scheduling plans with respect to the set of actions and the imposed precedence relations. The efficiency of this algorithm also allows us to improve the exploration guidance: for each encountered planning state, the corresponding approximate sequential plan is scheduled. One major strength of MIPS is its static analysis phase, which grounds and simplifies parameterized predicates, functions, and operators, infers knowledge to minimize the state description length, and detects domain object symmetries. The latter aspect is analyzed in detail. MIPS has been developed to serve as a complete and optimal state-space planner, with admissible estimates, exploration engines, and branching cuts. In the competition version, however, certain performance compromises had to be made, including floating-point arithmetic, weighted heuristic search exploration according to an inadmissible estimate, and parameterized optimization.
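
    As a rough illustration of the critical-path idea described above: given a sequential plan, action durations, and precedence relations, earliest start times for a time-stamped parallel plan can be computed in one linear pass over the plan. The sketch below is a simplification under these assumptions, not MIPS's implementation; the function and argument names are invented for the example.

```python
# Hedged sketch of critical-path scheduling: turn a sequential plan (actions in
# execution order) into start times that respect only the precedence relations.
def schedule(plan, durations, preconditions):
    """
    plan:          actions in sequential order (a valid topological order)
    durations:     action -> duration
    preconditions: action -> set of earlier actions it must wait for
    returns:       action -> start time in the time-stamped parallel plan
    """
    start = {}
    for a in plan:
        # An action may begin once every action it depends on has finished.
        start[a] = max((start[p] + durations[p] for p in preconditions.get(a, ())),
                       default=0.0)
    return start

# Tiny example: b and c both depend on a, d depends on b and c.
plan = ["a", "b", "c", "d"]
dur = {"a": 2.0, "b": 1.0, "c": 3.0, "d": 1.0}
pre = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(schedule(plan, dur, pre))   # b and c overlap; makespan is 2 + 3 + 1 = 6
```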

    Multi-threading a state-of-the-art maximum clique algorithm

    We present a threaded parallel adaptation of a state-of-the-art maximum clique algorithm for dense, computationally challenging graphs. We show that near-linear speedups are achievable in practice and that superlinear speedups are common. We include results for several previously unsolved benchmark problems.
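
    The abstract does not spell out the parallelisation scheme; the sketch below shows one common way to parallelise a branch-and-bound maximum clique search, by splitting the top-level branches across workers and taking the best result. It deliberately omits the shared incumbent bound and colouring-based bound that a state-of-the-art solver would use, and all names are illustrative, not the paper's.

```python
# Hedged sketch: split top-level maximum-clique branches across worker processes.
from concurrent.futures import ProcessPoolExecutor

def expand(graph, clique, candidates, best):
    # Plain branch and bound: prune when even taking every remaining candidate
    # cannot beat the best clique found so far in this subproblem.
    if len(clique) + len(candidates) <= len(best):
        return best
    if not candidates:
        return clique if len(clique) > len(best) else best
    for v in list(candidates):
        candidates = candidates - {v}
        best = expand(graph, clique | {v}, candidates & graph[v], best)
    return best

def solve_branch(args):
    graph, v, candidates = args
    return expand(graph, {v}, candidates & graph[v], set())

def max_clique_parallel(graph, workers=4):
    vertices = sorted(graph)
    # One top-level subproblem per vertex, restricted to later vertices so the
    # same clique is never explored twice.
    jobs = [(graph, v, set(vertices[i + 1:])) for i, v in enumerate(vertices)]
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return max(ex.map(solve_branch, jobs), key=len)

if __name__ == "__main__":
    # 5-cycle with a chord 0-2: the maximum clique is {0, 1, 2}.
    g = {0: {1, 2, 4}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {0, 3}}
    print(max_clique_parallel(g))
```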

    Distributed Queuing in Dynamic Networks

    We consider the problem of forming a distributed queue in the adversarial dynamic network model of Kuhn, Lynch, and Oshman (STOC 2010), in which the network topology changes from round to round but the network stays connected. This is a synchronous model in which the network nodes are assumed to be fixed, the communication links for each round are chosen by an adversary, and nodes do not know who their neighbors are for the current round before they broadcast their messages. Queue requests may arrive over rounds at arbitrary nodes, and the goal is to eventually enqueue them in a distributed queue. We present two algorithms that give a total distributed ordering of queue requests in this model. We measure the performance of our algorithms through round complexity, which is the total number of rounds needed to solve the distributed queuing problem. We show that in 1-interval connected graphs, where the communication links change arbitrarily from round to round, it is possible to solve the distributed queuing problem in O(nk) rounds using O(log n)-size messages, where n is the number of nodes in the network and k <= n is the number of queue requests. Further, we show that for more stable graphs, e.g. T-interval connected graphs, where the communication links change only every T rounds, the distributed queuing problem can be solved in O(n + nk/min(alpha, T)) rounds using the same O(log n)-size messages, where alpha > 0 is the concurrency level parameter that captures the minimum number of active queue requests in the system in any round. These results hold for any arbitrary (sequential, one-shot concurrent, or dynamic) arrival of the k queue requests in the system. Moreover, our algorithms ensure correctness in the sense that each queue request is eventually enqueued in the distributed queue after it is issued, and each queue request is enqueued exactly once. We also provide an impossibility result for this distributed queuing problem in this model. To the best of our knowledge, these are the first solutions to the distributed queuing problem in adversarial dynamic networks.
    Comment: In Proceedings FOMC 2013, arXiv:1310.459
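
    For readers unfamiliar with the setting, the sketch below only simulates the synchronous dynamic-network model described above: nodes are fixed, an adversary re-chooses a connected set of links every round, and nodes broadcast before learning who their neighbors are. The flooding "protocol" shown is a placeholder to exercise the model, not one of the paper's queuing algorithms, and all names are invented for the example.

```python
# Hedged sketch of the 1-interval connected, synchronous round model.
import random

def adversary_links(nodes, rng):
    # The adversary may pick any connected graph each round; here, a random
    # spanning chain plus a few extra random edges.
    order = list(nodes)
    rng.shuffle(order)
    links = {frozenset(e) for e in zip(order, order[1:])}
    links |= {frozenset(rng.sample(nodes, 2)) for _ in range(len(nodes))}
    return links

def run(nodes, requests, rounds=50, seed=0):
    rng = random.Random(seed)
    known = {v: set() for v in nodes}            # requests each node has heard of
    for origin, req in requests:
        known[origin].add(req)
    for _ in range(rounds):
        links = adversary_links(nodes, rng)
        outgoing = {v: set(known[v]) for v in nodes}   # broadcasts are fixed before links are revealed
        for e in links:
            u, v = tuple(e)
            known[u] |= outgoing[v]
            known[v] |= outgoing[u]
    return known

nodes = list(range(8))
requests = [(3, "r1"), (6, "r2"), (0, "r3")]
print(run(nodes, requests)[0])   # after enough rounds, node 0 has heard every request
```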