
    Lock-free Concurrent Data Structures

    Concurrent data structures are the data-sharing side of parallel programming. Data structures give a program the means to store data, and also provide the operations through which the program accesses and manipulates these data. These operations are implemented through algorithms that have to be efficient. In the sequential setting, data structures are crucially important for the performance of the respective computation. In the parallel programming setting, their importance becomes even greater because of the increased use of data and resource sharing to exploit parallelism. The first and main goal of this chapter is to provide sufficient background and intuition to help the interested reader navigate the complex research area of lock-free data structures. The second goal is to give the programmer enough familiarity with the subject to use truly concurrent methods.

    Comment: To appear in "Programming Multi-core and Many-core Computing Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computing.
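
    The chapter itself is a survey, but the flavor of lock-free programming is easy to illustrate. The following is a minimal sketch (not taken from the chapter; all names are illustrative) of the classic Treiber stack in C++, in which every update is a Compare&Swap retry loop rather than a lock acquisition:

        #include <atomic>
        #include <utility>

        template <typename T>
        class LockFreeStack {
            struct Node {
                T value;
                Node* next;
            };
            std::atomic<Node*> head{nullptr};

        public:
            void push(T v) {
                Node* n = new Node{std::move(v), nullptr};
                n->next = head.load(std::memory_order_relaxed);
                // Retry until no other thread has changed head in the meantime.
                while (!head.compare_exchange_weak(n->next, n,
                                                   std::memory_order_release,
                                                   std::memory_order_relaxed)) {
                }
            }

            bool pop(T& out) {
                Node* n = head.load(std::memory_order_acquire);
                while (n != nullptr &&
                       !head.compare_exchange_weak(n, n->next,
                                                   std::memory_order_acquire,
                                                   std::memory_order_relaxed)) {
                }
                if (n == nullptr) return false;
                out = std::move(n->value);
                // Deleting n here is only safe without concurrent poppers; real
                // implementations need safe memory reclamation (hazard pointers,
                // epochs), which this sketch elides.
                delete n;
                return true;
            }
        };

    The retry loop is the hallmark of lock-freedom: some thread always makes progress, even when individual threads are delayed arbitrarily.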

    Non-Blocking Doubly-Linked Lists with Good Amortized Complexity

    We present a new non-blocking doubly-linked list implementation for an asynchronous shared-memory system. It is the first such implementation for which an upper bound on amortized time complexity has been proved. Operations access the list via cursors; each cursor is located at an item in the list and is local to a process. Cursors can be used to traverse and update the list, even as concurrent operations modify it. The implementation supports two update operations, insertBefore and delete, and two move operations, moveRight and moveLeft. An insertBefore(c, x) operation inserts an item x into the list immediately before the cursor c's location. A delete(c) operation removes the item at the cursor c's location and sets the cursor to the next item in the list. The move operations move the cursor one position to the right or left. Update operations use single-word Compare&Swap instructions. Move operations only read shared memory and never change the state of the data structure. If all update operations modify different parts of the list, they run completely concurrently. A cursor is active if it has been initialized but not yet removed from the process's set of cursors. Let c(op) be the maximum number of active cursors at any one time during the operation op. The amortized step complexity is O(c(op)) for each update op and O(1) for each move. We provide a detailed correctness proof and amortized analysis of our implementation.
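
    As a rough illustration of the interface the abstract describes (the paper's actual Compare&Swap coordination is far more involved and is not reproduced here), the operations might be declared along these lines in C++; all names other than those in the abstract are assumptions:

        // Hypothetical C++ declarations mirroring the operations named in the
        // abstract. The bodies, which the paper builds from single-word
        // Compare&Swap steps, are deliberately omitted.
        template <typename T>
        struct Cursor;  // process-local handle positioned at one list item

        template <typename T>
        class NonBlockingDList {
        public:
            // Insert item x immediately before c's current location.
            void insertBefore(Cursor<T>& c, const T& x);
            // Remove the item at c's location; c moves to the next item.
            // (Called "delete" in the paper; renamed here only because
            // delete is a C++ keyword.)
            void remove(Cursor<T>& c);
            // Move operations only read shared memory and never modify the list.
            bool moveRight(Cursor<T>& c);
            bool moveLeft(Cursor<T>& c);
        };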

    Some Optimally Adaptive Parallel Graph Algorithms on EREW PRAM Model

    The study of graph algorithms is an important area of research in computer science, since graphs offer useful tools to model many real-world situations. The commercial availability of parallel computers has led to the development of efficient parallel graph algorithms. Using an exclusive-read, exclusive-write (EREW) parallel random access machine (PRAM) with a fixed number of processors as the computation model, we design and analyze parallel algorithms for seven undirected graph problems: connected components, spanning forest, fundamental cycle set, bridges, bipartiteness, assignment problems, and approximate vertex coloring. For all but the last two problems, the input data structure is an unordered list of edges, and divide-and-conquer is the paradigm for designing the algorithms. One of the algorithms for the assignment problem uses an appropriate variant of the dynamic programming strategy. An elegant data structure, called the adjacency list matrix, used in a vertex-coloring algorithm avoids the sequential nature of linked adjacency lists. Each of the proposed algorithms achieves optimal speedup by choosing an optimal granularity (thus exploiting maximum parallelism) that depends on the density or the number of vertices of the given graph. The processor-(time)^2 product has been identified as a useful parameter to measure the cost-effectiveness of a parallel algorithm; we derive a lower bound on this measure for each of our algorithms.
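
    The paper's EREW PRAM algorithms themselves are not reproduced here, but the input formulation is easy to demonstrate. The sketch below (sequential, for illustration only; names are assumptions) computes connected components from the same unordered-edge-list representation, using union-find:

        #include <cstdio>
        #include <numeric>
        #include <utility>
        #include <vector>

        struct DSU {
            std::vector<int> parent;
            explicit DSU(int n) : parent(n) {
                std::iota(parent.begin(), parent.end(), 0);  // each vertex alone
            }
            int find(int x) {
                return parent[x] == x ? x : parent[x] = find(parent[x]);
            }
            bool unite(int a, int b) {
                a = find(a);
                b = find(b);
                if (a == b) return false;   // already in the same component
                parent[b] = a;
                return true;
            }
        };

        int main() {
            int n = 5;  // vertices 0..4
            std::vector<std::pair<int, int>> edges = {{0, 1}, {1, 2}, {3, 4}};
            DSU dsu(n);
            int components = n;
            for (auto [u, v] : edges)
                if (dsu.unite(u, v)) --components;  // each union merges two components
            std::printf("connected components: %d\n", components);  // prints 2
        }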

    Extending the Finite Domain Solver of GNU Prolog

    This paper describes three significant extensions to the Finite Domain solver of GNU Prolog. First, the solver now supports negative integers. Second, the solver detects and prevents integer overflows. Third, the internal representation of sparse domains has been redesigned to overcome its previous limitations. A preliminary performance evaluation shows a limited slowdown with respect to the initial solver, a factor that is largely counterbalanced by the new possibilities and the robustness of the solver. These results are preliminary, and we propose some directions for limiting the overhead.
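
    None of the solver's internals appear in this abstract, so the C++ fragment below is only a loose illustration of two of the ideas it names: a sparse domain stored as disjoint intervals (which accommodates negative integers naturally) and overflow-checked arithmetic on domain bounds. All names are assumptions; this is not GNU Prolog's representation.

        #include <climits>
        #include <vector>

        struct Interval { long lo, hi; };        // inclusive bounds, may be negative

        struct SparseDomain {
            std::vector<Interval> parts;         // sorted, disjoint intervals

            bool contains(long v) const {
                for (const auto& p : parts)
                    if (p.lo <= v && v <= p.hi) return true;
                return false;
            }
        };

        // Detect overflow instead of silently wrapping, in the spirit of the
        // solver's overflow prevention.
        bool checkedAdd(long a, long b, long& out) {
            if ((b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b))
                return false;                    // would overflow: reject
            out = a + b;
            return true;
        }

        int main() {
            SparseDomain d{{{-5, -1}, {3, 7}}};  // the set {-5..-1} union {3..7}
            long shifted;
            bool ok = checkedAdd(d.parts.back().hi, 2, shifted);
            return (d.contains(-3) && ok && shifted == 9) ? 0 : 1;
        }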

    Simple, safe, and efficient memory management using linear pointers

    Efficient and safe memory management is a hard problem. Garbage collection promises automatic memory management, but comes at the cost of an increased memory footprint, reduced parallelism in multi-threaded programs, unpredictable pause times, and intricate tuning parameters that balance the program's workload against its designated memory usage in order for an application to perform reasonably well. Existing research mitigates these problems to some extent, but programmer error can still cause memory leaks by erroneously keeping references to memory that is no longer needed. We need a methodology that lets programmers become resource-aware, so that efficient, scalable, predictable, high-performance programs can be written without fear of resource leaks.

    Linear logic has been recognized as the formalism of choice for resource tracking. It requires explicit introduction and elimination of resources and guarantees that a resource cannot be implicitly shared or abandoned, hence must be linear. Early languages based on linear logic focused on the Curry-Howard correspondence: they began by limiting the expressive power of the language and then reintroduced it by allowing the controlled sharing necessary for recursive functions. Only by deviating from the Curry-Howard correspondence, however, could later developments actually address programming errors in resource usage.

    The contribution of this dissertation is a simple, safe, and efficient approach that introduces linear resource ownership semantics into C++ (still a widely used language more than 30 years after its inception) through the linear pointer, a smart pointer inspired by linear logic. By implementing various linear data structures and a parallel, multi-threaded memory allocator based on them, this work shows that the linear pointer is practical and efficient in the real world, and that it is possible to build a memory management stack that is entirely leak-free. The dissertation closes with remarks on the difficulties a formal system would encounter when reasoning about a concurrent linear data structure algorithm, and on what might be done to solve these problems.
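
    The abstract gives the idea but not the code. The sketch below is a minimal move-only owning pointer in the spirit of the described linear pointer (an assumed design; the dissertation's actual one is certainly richer): ownership transfers exactly once per move and can never be implicitly shared, and this toy version checks the "never abandoned" half of linearity at run time with assertions rather than statically.

        #include <cassert>
        #include <cstdio>
        #include <utility>

        template <typename T>
        class LinearPtr {
            T* p;
        public:
            explicit LinearPtr(T* raw) : p(raw) {}
            LinearPtr(const LinearPtr&) = delete;             // no implicit sharing
            LinearPtr& operator=(const LinearPtr&) = delete;
            LinearPtr(LinearPtr&& o) noexcept : p(o.p) { o.p = nullptr; }
            LinearPtr& operator=(LinearPtr&&) = delete;       // keep the sketch strict
            // Linearity demands explicit elimination: the resource must be
            // consumed before its owner dies (checked at run time here, and
            // only when assertions are enabled).
            ~LinearPtr() { assert(p == nullptr && "resource abandoned"); }
            void destroy() { delete p; p = nullptr; }         // explicit elimination
            T& operator*() const { return *p; }
        };

        int main() {
            LinearPtr<int> a(new int(42));
            LinearPtr<int> b = std::move(a);  // ownership moves; a is now empty
            std::printf("%d\n", *b);
            b.destroy();                      // consume before b's destructor runs
        }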

    Novel Non-Blocking Approach for a Concurrent Heap

    We present a non-blocking algorithm for a concurrent heap on asynchronous shared-memory multiprocessors. Processors in these machines often execute instructions at varying speeds and are subject to arbitrarily long delays. Our implementation supports non-blocking Insert, Delete-Min, and Find-Min operations on a heap. Non-blocking techniques avoid the drawbacks associated with mutual exclusion and also admit improved parallelism. Insert and Delete-Min operations on a heap take more than one atomic instruction to complete, so a new operation may be started before the previous one completes, leaving the heap inconsistent between operations. We represent the heap as an array of pointers; any modification to the heap is done using single-word Compare&Swap instructions. If all update operations modify different parts of the heap, they run completely concurrently. Our approach guarantees lock-freedom for all concurrent threads, and wait-freedom if threads execute operations on different nodes of the heap.
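
    The full algorithm is not reproduced here; the fragment below only shows the basic building block the abstract describes: the heap as an array of pointers whose slots are modified with single-word Compare&Swap, so a delayed thread never blocks updates to other slots. Names and the fixed capacity are illustrative assumptions.

        #include <array>
        #include <atomic>
        #include <cstddef>
        #include <cstdio>

        struct Node { int key; };

        constexpr std::size_t kCapacity = 1024;                // assumed fixed size
        // Static storage: the atomic slots start out holding nullptr.
        std::array<std::atomic<Node*>, kCapacity> heapSlots;

        // Try to swing slot i from an expected value to a new node. On failure
        // (another thread changed the slot first) the caller re-reads and
        // retries instead of waiting on a lock.
        bool casSlot(std::size_t i, Node* expected, Node* desired) {
            return heapSlots[i].compare_exchange_strong(expected, desired);
        }

        int main() {
            static Node n{7};
            // Publish n into slot 0 only if it is still empty.
            std::printf("published: %s\n", casSlot(0, nullptr, &n) ? "yes" : "no");
        }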