243 research outputs found

    Efficient means of Achieving Composability using Transactional Memory

    Get PDF
    The major focus of software transaction memory systems (STMs) has been to facilitate the multiprocessor programming and provide parallel programmers with an abstraction for fast development of the concurrent and parallel applications. Thus, STMs allow the parallel programmers to focus on the logic of parallel programs rather than worrying about synchronization. Heart of such applications is the underlying concurrent data-structure. The design of the underlying concurrent data-structure is the deciding factor whether the software application would be efficient, scalable and composable. However, achieving composition in concurrent data structures such that they are efficient as well as easy to program poses many consistency and design challenges. We say a concurrent data structure compose when multiple operations from same or different object instances of the concurrent data structure can be glued together such that the new operation also behaves atomically. For example, assume we have a linked-list as the concurrent data structure with lookup, insert and delete as the atomic operations. Now, we want to implement the new move operation, which would delete a node from one position of the list and would insert into the another or same list. Such a move operation may not be atomic(transactional) as it may result in an execution where another process may access the inconsistent state of the linked-list where the node is deleted but not yet inserted into the list. Thus, this inability of composition in the concurrent data structures may hinder their practical use. In this context, the property of compositionality provided by the transactions in STMs can be handy. STMs provide easy to program and compose transactional interface which can be used to develop concurrent data structures thus the parallel software applications. However, whether this can be achieved efficiently is a question we would try to answer in this thesis. Most of the STMs proposed in the literature are based on read/write primitive operations(or methods) on memory buffers and hence denoted RWSTMs. These lower level read/write primitive operations do not provide any other useful information except that a write operation always needs to be ordered with any other read or write. Thus limiting the number of possible concurrent executions. In this thesis, we consider Object-based STMs or OSTMs which operate on higher level objects rather than read/write operations on memory locations. The main advantage of considering OSTMs is that with the greater semantic information provided by the methods of the object, the conflicts among the transactions can be reduced and as a result, the number of aborts will also be less. This allows for larger number of permissive concurrent executions leading to more concurrency. Hence, OSTMs could be an efficient means of achieving composability of higher-level operations in the software applications using the concurrent data structures. This would allow parallel programmers to leverage underlying multi-core architecture. To design the OSTM, we have adopted the transactional tree model developed for databases. We extend the traditional notion of conflicts and legality to higher level operations in STMs which allows efficient composability. Using these notions we define the standard STM correctness notion of Conflict-Opacity. The OSTM model can be easily extended to implement concurrent lists, sets, queues or other concurrent data structures. We use the proposed theoretical OSTM model to design HT-OSTM - an OSTM with underlying hash table object. We noticed that major concurrency hot-spot is the chaining data structure within the hash table. So, we have used Lazyskip-list approach which is time efficient compared to normal lists in terms of traversal overhead. At the transactional level, we use timestamp ordering protocol to ensure that the executions are conflict-opaque. We provide a detailed handcrafted proof of correctness starting from operational level to the transactional level. At the operational level we show that HT-OSTM generates legal sequential history. At vi transactional level we show that every such sequential history would be opaque thus co-opaque. The HT-OSTM exports STM insert, STM lookup and STM delete methods to the programmer along-with STM begin and STM trycommit. Using these higher level operations user may easily and efficiently program any parallel software application involving concurrent hash table. To demonstrate the efficiency of composition we build a test application which executes the number of hash-tab methods (generated with a given probability) atomically in a transaction. Finally, we evaluate HT-OSTM against ESTM based hash table of synchrobench and the hash-table designed for RWSTM based on basic time stamp ordering protocol. We observe that HT-OSTM outperforms ESTM by the average magnitude of 106 transactions per second(throughput) for both lookup intensive and update intensive work load. HT-OSTM outperforms RWSTM by 3% & 3.4% update intensive and lookup intensive workload respectivel

    Efficient means of Achieving Composability using Transactional Memory

    Get PDF
    A major focus of software transaction memory systems (STMs) has been to felicitate the multiprocessor programming and provide parallel programmers an abstraction for speedy and efficient development of parallel applications. To this end, different models for incorporating object/higher level semantics into STM have recently been proposed in transactional boosting, transactional data structure library, open nested transactions and abstract nested transactions. We build an alternative object model STM (OSTM) by adopting the transactional tree model of Weikum et al. originally given for databases and extend the current work by proposing comprehensive legality definitions and conflict notion which allows efficient composability of \otm{}. We first time show the proposed OSTM to be co-opacity. We build OSTM using chained hash table data structure. Lazyskip-list is used to implement chaining using lazy approach. We notice that major concurrency hotspot is the chaining data structure within the hash table. Lazyskip-list is time efficient compared to lists in terms of traversal overhead by average case O(log(n)). We optimise lookups as they are validated at the instant they execute and they have not validated again in commit phase. This allows lookup dominated transactions to be more efficient and at the same time co-opacity

    Achieving Starvation-Freedom with Greater Concurrency in Multi-Version Object-based Transactional Memory Systems

    Full text link
    To utilize the multi-core processors properly concurrent programming is needed. Concurrency control is the main challenge while designing a correct and efficient concurrent program. Software Transactional Memory Systems (STMs) provides ease of multithreading to the programmer without worrying about concurrency issues such as deadlock, livelock, priority inversion, etc. Most of the STMs works on read-write operations known as RWSTMs. Some STMs work at high-level operations and ensure greater concurrency than RWSTMs. Such STMs are known as Object-Based STMs (OSTMs). The transactions of OSTMs can return commit or abort. Aborted OSTMs transactions retry. But in the current setting of OSTMs, transactions may starve. So, we proposed a Starvation-Free OSTM (SF-OSTM) which ensures starvation-freedom in object based STM systems while satisfying the correctness criteria as co-opacity. Databases, RWSTMs and OSTMs say that maintaining multiple versions corresponding to each key of transaction reduces the number of aborts and improves the throughput. So, to achieve greater concurrency, we proposed Starvation-Free Multi-Version OSTM (SF-MVOSTM) which ensures starvation-freedom while storing multiple versions corresponding to each key and satisfies the correctness criteria such as local opacity. To show the performance benefits, We implemented three variants of SF-MVOSTM (SF-MVOSTM, SF-MVOSTM-GC and SF-KOSTM) and compared it with state-of-the-art STMs.Comment: 68 pages, 24 figures. arXiv admin note: text overlap with arXiv:1709.0103

    Performance Optimizations for Software Transactional Memory

    Get PDF
    The transition from single-core processors to multi-core processors demands a change from sequential programming to concurrent programming for mainstream programmers. However, concurrent programming has long been widely recognized as being notoriously difficult. A major reason for its difficulty is that existing concurrent programming constructs provide low-level programming abstractions. Using these constructs forces programmers to consider many low level details. Locks, the dominant programming construct for mutual exclusion, suffer several well known problems, such as deadlock, priority inversion, and convoying, and are directly related to the difficulty of concurrent programming. The alternative to locks, i.e. non-blocking programming, not only is extremely error-prone, but also does not produce consistently good performance. Better programming constructs are critical to reduce the complexity of concurrent programming, increase productivity, and expose the computing power in multi-core processors. Transactional memory has emerged recently as one promising programming construct for supporting atomic operations on shared data. By eliminating the need to consider a huge number of possible interactions among concurrent transactions, Transactional memory greatly reduces the complexity of concurrent programming and vastly improves programming productivity. Software transactional memory systems implement a transactional memory abstraction in software. Unfortunately, existing designs of Software Transactional Memory systems incur significant performance overhead that could potentially prevent it from being widely used. Reducing STM's overhead will be critical for mainstream programmers to improve productivity while not suffering performance degradation. My thesis is that the performance of STM can be significantly improved by intelligently designing validation and commit protocols, by designing the time base, and by incorporating application-specific knowledge. I present four novel techniques for improving performance of STM systems to support my thesis. First, I propose a time-based STM system based on a runtime tuning strategy that is able to deliver performance equal to or better than existing strategies. Second, I present several novel commit phase designs and evaluate their performance. Then I propose a new STM programming interface extension that enables transaction optimizations using fast shared memory reads while maintaining transaction composability. Next, I present a distributed time base design that outperforms existing time base designs for certain types of STM applications. Finally, I propose a novel programming construct to support multi-place isolation. Experimental results show the techniques presented here can significantly improve the STM performance. We expect these techniques to help STM be accepted by more programmers

    Obtaining Progress Guarantee and GreaterConcurrency in Multi-Version Object Semantics

    Get PDF
    Software Transactional Memory Systems (STMs) provides ease of multithreading to the programmer withoutworrying about concurrency issues such as deadlock, livelock, priority inversion, etc. Most of the STMs workson read-write operations known as RWSTMs. Some STMs work at high-level operations and ensure greaterconcurrency than RWSTMs. Such STMs are known as Object-Based STMs (OSTMs). The transactions of OSTMscan return commit or abort. Aborted OSTMs transactions retry. But in the current setting of OSTMs, transactionsmay starve. So, we proposed a Starvation-Free OSTM (SF-OSTM) which ensures starvation-freedom whilesatisfying the correctness criteria as opacity.Databases, RWSTMs and OSTMs say that maintaining multiple versions corresponding to each key reduces thenumber of aborts and improves the throughput. So, to achieve the greater concurrency, we proposed Starvation-Free Multi-Version OSTM (SF-MVOSTM) which ensures starvation-freedom while storing multiple versioncorresponding to each key and satisfies the correctness criteria as local opacity. To show the performance benefits,We implemented three variants of SF-MVOSTM and compare its performance with state-of-the-art STM

    Overcoming extreme-scale reproducibility challenges through a unified, targeted, and multilevel toolset

    Get PDF
    pre-printReproducibility, the ability to repeat program executions with the same numerical result or code behavior, is crucial for computational science and engineering applications. However, non-determinism in concurrency scheduling often hampers achieving this ability on high performance computing (HPC) systems. To aid in managing the adverse effects of non-determinism, prior work has provided techniques to achieve bit-precise reproducibility, but most of them focus only on small-scale parallelism. While scalable techniques recently emerged, they are disparate and target special purposes, e.g., single-schedule domains. On current systems with O(106) compute cores and future ones with O(109), any technique that does not embrace a unied, targeted, and multilevel approach will fall short of providing reproducibility. In this paper, we argue for a common toolset that embodies this approach, where programmers select and compose complementary tools and can effectively, yet scalably, analyze, control, and eliminate sources of non-determinism at scale. This allows users to gain reproducibility only to the levels demanded by specific code development needs. We present our research agenda and ongoing work toward this goal

    Overcoming extreme-scale reproducibility challenges through a unified, targeted, and multilevel toolset

    Full text link
    Abstract not provide

    Improving the Scalability of DPWS-Based Networked Infrastructures

    Full text link
    The Devices Profile for Web Services (DPWS) specification enables seamless discovery, configuration, and interoperability of networked devices in various settings, ranging from home automation and multimedia to manufacturing equipment and data centers. Unfortunately, the sheer simplicity of event notification mechanisms that makes it fit for resource-constrained devices, makes it hard to scale to large infrastructures with more stringent dependability requirements, ironically, where self-configuration would be most useful. In this report, we address this challenge with a proposal to integrate gossip-based dissemination in DPWS, thus maintaining compatibility with original assumptions of the specification, and avoiding a centralized configuration server or custom black-box middleware components. In detail, we show how our approach provides an evolutionary and non-intrusive solution to the scalability limitations of DPWS and experimentally evaluate it with an implementation based on the the Web Services for Devices (WS4D) Java Multi Edition DPWS Stack (JMEDS).Comment: 28 pages, Technical Repor
    corecore