Data Structures for Task-based Priority Scheduling

Author: Cederman Daniel
Träff Jesper Larsson
Tsigas Philippas
Versaci Francesco
Wimmer Martin
Publication date: 09/12/2013

Many task-parallel applications can benefit from attempting to execute tasks in a specific order, as for instance indicated by priorities associated with the tasks. We present three lock-free data structures for priority scheduling with different trade-offs on scalability and ordering guarantees. First we propose a basic extension to work-stealing that provides good scalability, but cannot provide any guarantees for task-ordering between threads. Next, we present a centralized priority data structure based on $k$-fifo queues, which provides strong (but still relaxed with regard to a sequential specification) guarantees. The parameter $k$ allows the trade-off between scalability and the required ordering guarantee to be configured dynamically. Third, and finally, we combine both data structures into a hybrid, $k$-priority data structure, which provides scalability similar to the work-stealing based approach for larger $k$, while giving strong ordering guarantees for smaller $k$. We argue for using the hybrid data structure as the best compromise for generic, priority-based task-scheduling. We analyze the behavior and trade-offs of our data structures in the context of a simple parallelization of Dijkstra's single-source shortest path algorithm. Our theoretical analysis and simulations show that both the centralized and the hybrid $k$-priority based data structures can give strong guarantees on the useful work performed by the parallel Dijkstra algorithm. We support our results with experimental evidence on an 80-core Intel Xeon system.
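The notion of "useful work" in a Dijkstra-style computation over a relaxed priority structure can be illustrated with a small sequential sketch. This is not the authors' lock-free data structure or their parallelization: it is a single-threaded toy model in which each pop may return any of the k smallest pending entries. The function name `relaxed_dijkstra`, the random choice among the k smallest, and the definitions of "useful" and "wasted" expansions are all assumptions made for illustration.

```python
import heapq
import random

def relaxed_dijkstra(graph, source, k, seed=0):
    """Label-correcting shortest paths with a k-relaxed pop (toy model).

    graph maps node -> list of (neighbor, nonnegative weight).
    Returns (dist, useful, wasted): final distances, the number of
    expansions made at a node's final distance, and the number of
    premature expansions that later had to be redone.
    Stale entries (superseded by a shorter path) are skipped, not counted.
    """
    rng = random.Random(seed)
    dist = {source: 0}
    heap = [(0, source)]
    expansions = []  # (node, tentative distance when it was expanded)
    while heap:
        # Assumed relaxation model: any of the k smallest entries may be returned.
        candidates = heapq.nsmallest(min(k, len(heap)), heap)
        entry = rng.choice(candidates)
        heap.remove(entry)
        heapq.heapify(heap)
        d, u = entry
        if d > dist[u]:
            continue  # stale entry, a shorter path to u is already known
        expansions.append((u, d))
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    useful = sum(1 for u, d in expansions if d == dist[u])
    wasted = len(expansions) - useful
    return dist, useful, wasted
```

With k = 1 the sketch degenerates to ordinary Dijkstra and performs no premature expansions; larger k can only add wasted expansions, while the final distances stay the same for nonnegative weights.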

arXiv.org e-Print Archive

Crossref

Chalmers Research

The Lock-free $k$-LSM Relaxed Priority Queue

Author: Gruber Jakob
Träff Jesper Larsson
Tsigas Philippas
Wimmer Martin
Publication date: 01/01/2015

Priority queues are data structures which store keys in an ordered fashion to allow efficient access to the minimal (maximal) key. Priority queues are essential for many applications, e.g., Dijkstra's single-source shortest path algorithm, branch-and-bound algorithms, and prioritized schedulers. Efficient multiprocessor computing requires implementations of basic data structures that can be used concurrently and scale to large numbers of threads and cores. Lock-free data structures promise superior scalability by avoiding blocking synchronization primitives, but the delete-min operation is an inherent scalability bottleneck in concurrent priority queues. Recent work has focused on alleviating this obstacle either by batching operations, or by relaxing the requirements to the delete-min operation. We present a new, lock-free priority queue that relaxes the delete-min operation so that it is allowed to delete any of the $\rho+1$ smallest keys, where $\rho$ is a runtime configurable parameter. Additionally, the behavior is identical to a non-relaxed priority queue for items added and removed by the same thread. The priority queue is built from a logarithmic number of sorted arrays in a way similar to log-structured merge-trees. We experimentally compare our priority queue to recent state-of-the-art lock-free priority queues, both with relaxed and non-relaxed semantics, showing high performance and good scalability of our approach. Comment: Short version as ACM PPoPP'15 poster
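The relaxed delete-min semantics described in this abstract can be made concrete with a sequential toy model. The sketch below is not the $k$-LSM and is neither concurrent nor lock-free; it only illustrates the ordering contract that a removal may return any of the $\rho+1$ smallest keys. The class name `RelaxedMinQueue`, the sorted-list storage, and the uniformly random choice are assumptions, and the per-thread guarantee mentioned in the abstract is not modeled.

```python
import bisect
import random

class RelaxedMinQueue:
    """Sequential toy model of a rho-relaxed priority queue (hypothetical)."""

    def __init__(self, rho, seed=0):
        if rho < 0:
            raise ValueError("rho must be non-negative")
        self.rho = rho
        self._keys = []  # kept sorted in ascending order
        self._rng = random.Random(seed)

    def insert(self, key):
        # Keep the list sorted so the rho+1 smallest keys are a prefix.
        bisect.insort(self._keys, key)

    def delete_min(self):
        if not self._keys:
            raise IndexError("delete_min from an empty queue")
        # Any of the rho+1 smallest keys is an admissible result.
        window = min(self.rho + 1, len(self._keys))
        return self._keys.pop(self._rng.randrange(window))
```

Setting `rho = 0` recovers an exact priority queue; a larger `rho` widens the set of admissible results, which is the freedom a concurrent implementation can exploit to reduce contention.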

arXiv.org e-Print Archive

Crossref

Chalmers Research

Library abstraction for C/C++ concurrency

Author: Alexey Gotsman
Berdine Wehrman
Manson J.
Mark Batty
Mike Dodds
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2013

When constructing complex concurrent systems, abstraction is vital: programmers should be able to reason about concurrent libraries in terms of abstract specifications that hide the implementation details. Relaxed memory models present substantial challenges in this respect, as libraries need not provide sequentially consistent abstractions: to avoid unnecessary synchronisation, they may allow clients to observe relaxed memory effects, and library specifications must capture these. In this paper, we propose a criterion for sound library abstraction in the new C11 and C++11 concurrency model, generalising the standard sequentially consistent notion of linearizability. We prove that our criterion soundly captures all client-library interactions, both through call and return values, and through the subtle synchronisation effects arising from the memory model. To illustrate our approach, we verify implementations against specifications for the lock-free Treiber stack and a producer-consumer queue. Ours is the first approach to compositional reasoning for concurrent C11/C++11 programs.

CiteSeerX

Crossref

Kent Academic Repository

The Power of Choice in Priority Scheduling

Author: Alistarh Dan
Kopinsky Justin
Li Jerry
Nadiradze Giorgi
Publication date: 01/01/2017

Consider the following random process: we are given $n$ queues, into which elements of increasing labels are inserted uniformly at random. To remove an element, we pick two queues at random, and remove the element of lower label (higher priority) among the two. The cost of a removal is the rank of the label removed, among labels still present in any of the queues, that is, the distance from the optimal choice at each step. Variants of this strategy are prevalent in state-of-the-art concurrent priority queue implementations. Nonetheless, it is not known whether such implementations provide any rank guarantees, even in a sequential model. We answer this question, showing that this strategy provides surprisingly strong guarantees: Although the single-choice process, where we always insert and remove from a single randomly chosen queue, has degrading cost, going to infinity as we increase the number of steps, in the two choice process, the expected rank of a removed element is $O(n)$ while the expected worst-case cost is $O(n \log n)$. These bounds are tight, and hold irrespective of the number of steps for which we run the process. The argument is based on a new technical connection between "heavily loaded" balls-into-bins processes and priority scheduling. Our analytic results inspire a new concurrent priority queue implementation, which improves upon the state of the art in terms of practical performance.
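The random process described in this abstract is simple enough to simulate sequentially. The sketch below is a minimal model, not the authors' concurrent implementation: the function name `simulate_ranks`, the prefill of 10 labels per queue, drawing the queues without replacement, and redrawing when all chosen queues are empty are assumptions added for illustration. Setting `choices=1` gives the single-choice variant for comparison.

```python
import random
from collections import deque

def simulate_ranks(n, steps, choices=2, prefill=None, seed=0):
    """Simulate the multi-queue process and return the rank of each removal.

    Rank 1 means the globally smallest label present was removed.
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(n)]
    next_label = 0
    if prefill is None:
        prefill = 10 * n  # assumed warm-up so queues are rarely empty
    for _ in range(prefill):
        queues[rng.randrange(n)].append(next_label)
        next_label += 1
    ranks = []
    for _ in range(steps):
        # Insert the next (larger) label into a uniformly random queue.
        queues[rng.randrange(n)].append(next_label)
        next_label += 1
        # Pick `choices` distinct queues; redraw if all of them are empty.
        while True:
            picked = [queues[i] for i in rng.sample(range(n), choices) if queues[i]]
            if picked:
                break
        # Labels arrive in increasing order, so each head is its queue's minimum.
        best = min(picked, key=lambda q: q[0])
        label = best.popleft()
        # Rank = 1 + number of smaller labels still present anywhere.
        ranks.append(1 + sum(1 for q in queues for x in q if x < label))
    return ranks
```

Comparing `sum(ranks) / len(ranks)` for `choices=2` and `choices=1` over an increasing number of steps is one way to observe the qualitative difference the abstract describes, although a simulation of this kind only illustrates the stated bounds and does not verify them.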

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

IST Austria: PubRep (Institute of Science and Technology)