178 research outputs found

    QuakeTM: Parallelizing a complex serial application using transactional memory

    'Is transactional memory useful?' is the question that cannot be answered until we provide substantial applications that can evaluate its capabilities. While existing TM applications can partially answer the above question, and are useful in the sense that they provide a first-order TM experimentation framework, they serve only as a proof of concept and fail to make a conclusive case for wide adoption by the general computing community. This work presents QuakeTM, a multiplayer game server; a complex real-life TM application that was parallelized from the serial version with TM-specific considerations in mind. QuakeTM consists of 27,600 lines of code spread among 49 files and exhibits irregular parallelism and coarse-grain transactions with large read and write sets. In spite of its complexity, we show that QuakeTM does scale; however, more effort is needed to decrease the overhead and the abort rate of current software transactional memory systems. We give insights into development challenges, suggest techniques to solve them and provide extensive analysis of the transactional behavior of QuakeTM, with an emphasis on the TM promise of making parallel programming easy. Postprint (published version)

    Extensible Scheduling in a Haskell-based Operating System

    This thesis presents Lighthouse, an experimental branch of the Haskell-based House operating system which integrates Li et al.'s Lightweight Concurrency framework. First and foremost, it improves House's viability as a real operating system by providing a new extensible scheduler framework which makes it easy to experiment with different scheduling policies. In particular, Lighthouse extends Concurrent Haskell with thread priority and implements a priority-based scheduler which significantly improves system responsiveness when compared with GHC's normal round-robin scheduler. Even while doing this, it improves on House's claim of being written in Haskell by moving a whole subsystem out of the complex C-based runtime system and into Haskell itself. In addition, Lighthouse also includes an alternate, simpler implementation of Lightweight Concurrency which takes advantage of House's unique setting (running directly on uniprocessor x86 hardware). This experience sheds light on areas that need further attention before the system can truly be viable, primarily interactions between blackholing and interrupt handling. In particular, this thesis uncovers a potential case of self-deadlock and suggests potential solutions. Finally, this work offers further insight into the viability of using high-level languages such as Haskell for systems programming. Although laziness and blackholing present unique problems, many parts of the system are still much easier to express in Haskell than in traditional languages such as C.

    C4: Verified Transactional Objects

    A framework for Verified Transactional Objects in Coq.
    - Formalization of concurrent objects, linearizability, strict serializability, and associated proof techniques.
    - Verified linearizable concurrent hash map
    - Verified strictly serializable TML
    - Verified strictly serializable transaction-predicated ma

    The 9th Conference of PhD Students in Computer Science


    Automatic skeleton-driven performance optimizations for transactional memory

    The recent shift toward multi-core chips has pushed the burden of extracting performance to the programmer. In fact, programmers now have to be able to uncover more coarse-grain parallelism with every new generation of processors, or the performance of their applications will remain roughly the same or even degrade. Unfortunately, parallel programming is still hard and error prone. This has driven the development of many new parallel programming models that aim to make this process efficient. This thesis first combines the skeleton-based and transactional memory programming models in a new framework, called OpenSkel, in order to improve performance and programmability of parallel applications. This framework provides a single skeleton that allows the implementation of transactional worklist applications. Skeleton or pattern-based programming allows parallel programs to be expressed as specialized instances of generic communication and computation patterns. This leaves the programmer with only the implementation of the particular operations required to solve the problem at hand. Thus, this programming approach simplifies parallel programming by eliminating some of its major challenges, namely thread communication, scheduling and orchestration. However, the application programmer still has to correctly synchronize threads on data races. This commonly requires the use of locks to guarantee atomic access to shared data. In particular, lock programming is vulnerable to deadlocks and also limits coarse-grain parallelism by blocking threads that could potentially be executed in parallel. Transactional Memory (TM) thus emerges as an attractive alternative model to simplify parallel programming by removing this burden of handling data races explicitly. This model allows programmers to write parallel code as transactions, which are then guaranteed by the runtime system to execute atomically and in isolation regardless of eventual data races. TM programming thus frees the application from deadlocks and enables the exploitation of coarse-grain parallelism when transactions do not conflict very often. Nevertheless, thread management and orchestration are left for the application programmer. Fortunately, this can be naturally handled by a skeleton framework. This fact makes the combination of skeleton-based and transactional programming a natural step to improve programmability since these models complement each other. In fact, this combination releases the application programmer from dealing with thread management and data races, and also inherits the performance improvements of both models. In addition, a skeleton framework is also amenable to skeleton-driven performance optimizations that exploit the application pattern and system information. This thesis thus also presents a set of pattern-oriented optimizations that are automatically selected and applied in a significant subset of transactional memory applications that share a common pattern called worklist. These optimizations exploit the knowledge about the worklist pattern and the TM nature of the applications to avoid transaction conflicts, to prefetch data, to reduce contention, etc. Using a novel autotuning mechanism, OpenSkel dynamically selects the most suitable set of these pattern-oriented performance optimizations for each application and adjusts them accordingly. Experimental results on a subset of five applications from the STAMP benchmark suite show that the proposed autotuning mechanism can achieve performance improvements within 2%, on average, of a static oracle for a 16-core UMA (Uniform Memory Access) platform and surpasses it by 7% on average for a 32-core NUMA (Non-Uniform Memory Access) platform. Finally, this thesis also investigates skeleton-driven system-oriented performance optimizations such as thread mapping and memory page allocation. To do so, the OpenSkel system and the autotuning mechanism are extended to accommodate these optimizations. Experimental results on a subset of five applications from the STAMP benchmark show that the OpenSkel framework with the extended autotuning mechanism driving both pattern- and system-oriented optimizations can achieve performance improvements of up to 88%, with an average of 46%, over a baseline version for a 16-core UMA platform and up to 162%, with an average of 91%, for a 32-core NUMA platform.

    Executable Denotational Semantics With Interaction Trees

    Interaction trees are a representation of effectful and reactive systems designed to be implemented in a proof assistant such as Coq. They are equipped with a rich algebra of combinators to construct recursive and effectful computations and to reason about them equationally. Interaction trees are also an executable structure, notably via extraction, which enables testing and directly developing executable programs in Coq. To demonstrate the usefulness of interaction trees, two applications are presented. First, I develop a novel approach to verify a compiler from a simple imperative language to assembly, by proving a semantic preservation theorem which is termination-sensitive, using an equational proof. Second, I present a framework of concurrent objects, inheriting the modularity, compositionality, and executability of interaction trees. Leveraging that framework, I formally prove the correctness of a transactionally predicated map, using a novel approach to reason about objects combining the notions of linearizability and strict serializability, two well-known correctness conditions for concurrent objects.

    Correctness and Progress Verification of Non-Blocking Programs

    The progression of multi-core processors has inspired the development of concurrency libraries that guarantee safety and liveness properties of multiprocessor applications. The difficulty of reasoning about safety and liveness properties in a concurrent environment has led to the development of tools to verify that a concurrent data structure meets a correctness condition or progress guarantee. However, these tools possess shortcomings regarding the ability to verify a composition of data structure operations. Additionally, verification techniques for transactional memory evaluate correctness based on low-level read/write histories, which is not applicable to transactional data structures that use a high-level semantic conflict detection. In my dissertation, I present tools for checking the correctness of multiprocessor programs that overcome the limitations of previous correctness verification techniques. Correctness Condition Specification (CCSpec) is the first tool that automatically checks the correctness of a composition of concurrent multi-container operations performed in a non-atomic manner. Transactional Correctness tool for Abstract Data Types (TxC-ADT) is the first tool that can check the correctness of transactional data structures. TxC-ADT elevates the standard definitions of transactional correctness to be in terms of an abstract data type, an essential aspect for checking correctness of transactions that synchronize only for high-level semantic conflicts. Many practical concurrent data structures, transactional data structures, and algorithms to facilitate non-blocking programming all incorporate helping schemes to ensure that an operation comprising multiple atomic steps is completed according to the progress guarantee. The helping scheme introduces additional interference by the active threads in the system to achieve the designed progress guarantee. 
Previous progress verification techniques do not accommodate loops whose termination is dependent on complex behaviors of the interfering threads, making these approaches unsuitable. My dissertation presents the first progress verification technique for non-blocking algorithms that are dependent on descriptor-based helping mechanisms.

    Reasoning about Locks and Transactions in Concurrent Programs

    The aim of this thesis is to present novel techniques for reasoning about the dynamic and static semantics of concurrent programs that use locks and transactions to isolate accesses to shared memory. We use moverness to characterise the observational semantics of reads issued by locks and transactions under the simpler semantics of free, left, right and both movers. The second contribution is guaranteed transactions, which are a safer alternative to locks and the privatisation/publication idioms for specific scenarios. Guaranteed transactions facilitate a simpler pessimistic coordination semantics than locks, but offer most of the conveniences that have made transactions appealing. Finally, we present a static analysis for reasoning about the isolation of a program that uses locks and transactions. If our isolation algorithm determines that all the accesses issued by a program are isolated, then the program is declared data-race-free.