
    Lock-free Concurrent Data Structures

    Concurrent data structures are the data sharing side of parallel programming. Data structures give the program the means to store data, and also provide operations to access and manipulate these data. These operations are implemented through algorithms that have to be efficient. In the sequential setting, data structures are crucially important for the performance of the respective computation. In the parallel programming setting, their importance becomes even greater because of the increased use of data and resource sharing for utilizing parallelism. The first and main goal of this chapter is to provide sufficient background and intuition to help the interested reader navigate the complex research area of lock-free data structures. The second goal is to give the programmer enough familiarity with the subject to allow her to use truly concurrent methods. Comment: To appear in "Programming Multi-core and Many-core Computing Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computin

    A Concurrency and Time Centered Framework for Certification of Autonomous Space Systems

    Future space missions, such as Mars Science Laboratory, suggest the engineering of some of the most complex man-rated autonomous software systems. The present process-oriented certification methodologies are becoming prohibitively expensive and do not reach the level of detail of providing guidelines for the development and validation of concurrent software. Time and concurrency are the most critical notions in an autonomous space system. In this work we present the design and implementation of the first concurrency and time centered framework for product-oriented software certification of autonomous space systems. To achieve fast and reliable concurrent interactions, we define and apply the notion of Semantically Enhanced Containers (SEC). SECs are data structures that are designed to provide the flexibility and usability of the popular ISO C++ STL containers, while at the same time they are hand-crafted to guarantee domain-specific policies, such as conformance to a given concurrency model. The application of nonblocking programming techniques is critical to the implementation of our SEC containers. Lock-free algorithms help avoid the hazards of deadlock, livelock, and priority inversion, and at the same time deliver fast and scalable performance. Practical lock-free algorithms are notoriously difficult to design and implement and pose a number of hard problems such as ABA avoidance, high complexity, portability, and meeting the linearizability correctness requirements. This dissertation presents the design of the first lock-free dynamically resizable array. Our approach offers a set of practical, portable, lock-free, and linearizable STL vector operations and a fast and space efficient implementation when compared to the alternative lock- and STM-based techniques.
Currently, the literature does not offer an explicit analysis of the ABA problem, its relation to the most commonly applied nonblocking programming techniques, and the possibilities for its detection and avoidance. Eliminating the hazards of ABA is left to the ingenuity of the software designer. We present a generic and practical solution to the fundamental ABA problem for lock-free descriptor-based designs. To enable our SEC container with the property of validating domain-specific invariants, we present Basic Query, our expression template-based library for statically extracting semantic information from C++ source code. The use of static analysis allows for a far more efficient implementation of our nonblocking containers than would have been otherwise possible when relying on the traditional run-time based techniques. Shared data in a real-time cyber-physical system can often be polymorphic (as is the case with a number of components part of the Mission Data System's Data Management Services). The use of dynamic cast is important in the design of autonomous real-time systems since the operation allows for a direct representation of the management and behavior of polymorphic data. To allow for the application of dynamic cast in mission critical code, we validate and improve a methodology for constant-time dynamic cast that shifts the complexity of the operation to the compiler's static checker. In a case study that demonstrates the applicability of the programming and validation techniques of our certification framework, we show the process of verification and semantic parallelization of the Mission Data System's (MDS) Goal Networks. MDS provides an experimental platform for testing and development of autonomous real-time flight applications
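For readers unfamiliar with the ABA problem named in the abstract above, the following is a minimal single-threaded sketch of the hazard and of one common mitigation, a version tag that is bumped on every write. It is not the descriptor-based solution the dissertation presents. The `Cell` class and its `compare_and_swap` method are hypothetical stand-ins for an atomic word, and the "threads" are simulated by interleaving the steps by hand.

```python
class Cell:
    """Hypothetical stand-in for an atomic word (not a real atomic)."""

    def __init__(self, value):
        self.value = value

    def compare_and_swap(self, expected, new):
        # Succeeds only if the current content equals what the caller saw.
        if self.value == expected:
            self.value = new
            return True
        return False


# Plain CAS: "thread 1" reads A and is then delayed.
plain = Cell("A")
seen = plain.value
# Meanwhile "thread 2" changes A -> B -> A.
plain.compare_and_swap("A", "B")
plain.compare_and_swap("B", "A")
# Thread 1 resumes. Its CAS succeeds even though the cell changed twice.
plain_result = plain.compare_and_swap(seen, "C")

# Tagged CAS: store (value, version) and bump the version on every write.
tagged = Cell(("A", 0))
seen_tagged = tagged.value
value, version = tagged.value
tagged.compare_and_swap((value, version), ("B", version + 1))
value, version = tagged.value
tagged.compare_and_swap((value, version), ("A", version + 1))
# Thread 1 resumes. The stale version makes its CAS fail, exposing the change.
tagged_result = tagged.compare_and_swap(seen_tagged, ("C", seen_tagged[1] + 1))
```

In the plain case the delayed update goes through unnoticed. In the tagged case it is rejected because the version no longer matches, so the caller would have to re-read and retry. A real implementation would need hardware support for comparing the value and tag together atomically, which this sketch does not model.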

    On Design and Applications of Practical Concurrent Data Structures

    The proliferation of multicore processors is having an enormous impact on software design and development. In order to exploit parallelism available in multicores, there is a need to design and implement abstractions that programmers can use for general purpose applications development. A common abstraction for coordinated access to memory is a concurrent data structure. Concurrent data structures are challenging to design and implement as they are required to be correct, scalable, and practical under various application constraints. In this thesis, we contribute to the design of efficient concurrent data structures, propose new design techniques and improvements to existing implementations. Additionally, we explore the utilization of concurrent data structures in demanding application contexts such as data stream processing. In the first part of the thesis, we focus on data structures that are difficult to parallelize due to inherent sequential bottlenecks. We present a lock-free vector design that efficiently addresses synchronization bottlenecks by utilizing the combining technique. Typical combining techniques are blocking. Our design introduces combining without sacrificing non-blocking progress guarantees. We extend the vector to present a concurrent lock-free unbounded binary heap that implements a priority queue with mutable priorities. In the second part of the thesis, we shift our focus to concurrent search data structures. In order to offer strong progress guarantee, typical implementations of non-blocking search data structures employ a "helping" mechanism. However, helping may result in performance degradation. We propose help-optimality, which expresses optimization in amortized step complexity of concurrent operations. To describe the concept, we revisit the lock-free designs of a linked-list and a binary search tree and present improved algorithms.
We design the algorithms without using any language/platform specific constructs; we do not use bit-stealing or runtime type introspection of objects. Thus, our algorithms are portable. We further delve into multi-dimensional data and similarity search. We present the first lock-free multi-dimensional data structure and linearizable nearest neighbor search algorithm. Our algorithm for nearest neighbor search is generic and can be adapted to other data structures. In the last part of the thesis, we explore the utilization of concurrent data structures for deterministic stream processing. We propose solutions to two challenges prevalent in data stream processing: (1) efficient processing on cloud as well as edge devices and (2) deterministic data-parallel processing at high-throughput and low-latency. As a first step, we present a methodology for customization of streaming aggregation on low-power multicore embedded platforms. Then we introduce Viper, a communication module that can be integrated into stream processing engines for the coordination of threads analyzing data in parallel

    Allocating memory in a lock-free manner

    The potential of multiprocessor systems is often not fully realized by their system services. Certain synchronization methods, such as lock-based ones, may limit the parallelism. It is important to examine the impact of wait-free and lock-free synchronization design in key services for multiprocessor systems, such as the memory allocation service. Efficient, scalable memory allocators for multithreaded applications on multiprocessors are a significant goal of recent research projects. We propose a lock-free memory allocator to enhance the parallelism in the system. Its architecture is inspired by Hoard, a successful concurrent memory allocator, with a modular design that preserves scalability and helps avoid false-sharing and heap blowup. As part of our effort to design appropriate lock-free algorithms for this system, we propose a new non-blocking data structure called flat-sets, supporting conventional “internal” operations as well as “inter-object” operations for moving items between flat-sets. We implemented the memory allocator on a set of multiprocessor systems (UMA Sun Enterprise 450 and ccNUMA Origin 3800) and studied its behaviour. The results show that the good properties of Hoard w.r.t. false-sharing and heap-blowup are preserved, while the scalability properties are enhanced even further with the help of lock-free synchronization

    Eight Biennial Report : April 2005 – March 2007


    NBMALLOC: Allocating Memory in a Lock-Free Manner

    Efficient, scalable memory allocation for multithreaded applications on multiprocessors is a significant goal of recent research. In the distributed computing literature it has been emphasized that lock-based synchronization and concurrency-control may limit the parallelism in multiprocessor systems. Thus, system services that employ such methods can hinder reaching the full potential of these systems. A natural research question is the pertinence and the impact of lock-free concurrency control in key services for multiprocessors, such as in the memory allocation service, which is the theme of this work. We show the design and implementation of NBMALLOC, a lock-free memory allocator designed to enhance the parallelism in the system. The architecture of NBMALLOC is inspired by Hoard, a well-known concurrent memory allocator, with a modular design that preserves scalability and helps avoid false-sharing and heap-blowup. Within our effort to design appropriate lock-free algorithms for NBMALLOC, we propose and show a lock-free implementation of a new data structure, flat-set, supporting conventional “internal” set operations as well as “inter-object” operations for moving items between flat-sets. The design of NBMALLOC also involved a series of other algorithmic problems, which are discussed in the paper. Further, we present the implementation of NBMALLOC and