10 research outputs found

    Safe Open-Nested Transactions Through Ownership

    Get PDF
    Researchers in transactional memory (TM) have proposed open nesting asa methodology for increasing the concurrency of a program. The ideais to ignore certain "low-level" memory operations of anopen-nested transaction when detecting conflicts for its parenttransaction, and instead perform abstract concurrency control for the"high-level" operation that nested transaction represents. Tosupport this methodology, TM systems use an open-nested commitmechanism that commits all changes performed by an open-nestedtransaction directly to memory, thereby avoiding low-levelconflicts. Unfortunately, because the TM runtime is unaware of thedifferent levels of memory, an unconstrained use of open-nestedcommits can lead to anomalous program behavior.In this paper, we describe a framework of ownership-awaretransactional memory which incorporates the notion of modules into theTM system and requires that transactions and data be associated withspecific transactional modules or Xmodules. We propose a newownership-aware commit mechanism, a hybrid between anopen-nested and closed-nested commit which commits a piece of datadifferently depending on whether the current Xmodule owns the data ornot. Moreover, we give a set of precise constraints on interactionsand sharing of data among the Xmodules based on familiar notions ofabstraction. We prove that ownership-aware TM has has cleanmemory-level semantics and can guarantee serializability bymodules, which is an adaptation of multilevel serializability fromdatabases to TM. In addition, we describe how a programmer canspecify Xmodules and ownership in a Java-like language. Our typesystem can enforce most of the constraints required by ownership-awareTM statically, and can enforce the remaining constraints dynamically.Finally, we prove that if transactions in the process of aborting obeyrestrictions on their memory footprint, the OAT model is free fromsemantic deadlock

    Safe open-nested transactions through ownership

    Full text link

    Optimistic parallelism requires abstractions

    Full text link

    Transactions with isolation and cooperation

    Full text link

    Memory abstractions for parallel programming

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 156-163).A memory abstraction is an abstraction layer between the program execution and the memory that provides a different "view" of a memory location depending on the execution context in which the memory access is made. Properly designed memory abstractions help ease the task of parallel programming by mitigating the complexity of synchronization or admitting more efficient use of resources. This dissertation describes five memory abstractions for parallel programming: (i) cactus stacks that interoperate with linear stacks, (ii) efficient reducers, (iii) reducer arrays, (iv) ownershipaware transactions, and (v) location-based memory fences. To demonstrate the utility of memory abstractions, my collaborators and I developed Cilk-M, a dynamically multithreaded concurrency platform which embodies the first three memory abstractions. Many dynamic multithreaded concurrency platforms incorporate cactus stacks to support multiple stack views for all the active children simultaneously. The use of cactus stacks, albeit essential, forces concurrency platforms to trade off between performance, memory consumption, and interoperability with serial code due to its incompatibility with linear stacks. This dissertation proposes a new strategy to build a cactus stack using thread-local memory mapping (or TLMM), which enables Cilk-M to satisfy all three criteria simultaneously. A reducer hyperobject allows different branches of a dynamic multithreaded program to maintain coordinated local views of the same nonlocal variable. With reducers, one can use nonlocal variables in a parallel computation without restructuring the code or introducing races. This dissertation introduces memory-mapped reducers, which admits a much more efficient access compared to existing implementations. When used in large quantity, reducers incur unnecessarily high overhead in execution time and space consumption. This dissertation describes support for reducer arrays, which offers the same functionality as an array of reducers with significantly less overhead. Transactional memory is a high-level synchronization mechanism, designed to be easier to use and more composable than fine-grain locking. This dissertation presents ownership-aware transactions, the first transactional memory design that provides provable safety guarantees for "opennested" transactions. On architectures that implement memory models weaker than sequential consistency, programs communicating via shared memory must employ memory-fences to ensure correct execution. This dissertation examines the concept of location-based memoryfences, which unlike traditional memory fences, incurs latency only when synchronization is necessary.by I-Ting Angelina Lee.Ph.D

    Scheduling and synchronization for multicore concurrency platforms

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.Cataloged from PDF version of thesis.Includes bibliographical references (p. 217-230).Developing correct and efficient parallel programs is difficult since programmers often have to manage low-level details like scheduling and synchronization explicitly. Recently, however, many hardware vendors have been shifting towards building multicore computers. This trend creates an enormous pressure to create concurrency platforms - platforms that provide an easier interface for parallel programming and enable ordinary programmers to write scalable, portable and efficient parallel programs. This thesis provides some provably-good practical solutions to problems that arise in the implementation of concurrency platforms, particularly in the domain of scheduling and synchronization. The first part of this thesis describes work on scheduling of parallel programs written in dynamic multithreaded languages (such as Cilk, Hood etc.). These languages allow the programmer to express parallelism of their code in a natural manner, while an automatic scheduler in the concurrency platform is responsible for scheduling the program on the underlying parallel hardware. This thesis presents designs to increase the functionality of these concurrency platforms. The second part of the thesis presents work on transactional memory semantics and design. Transactional memory (TM), has been recently proposed as an alternative to locks. TM provides a transactional interface to memory. The programmers can specify their critical sections inside a transaction, and the TM concurrency platform guarantees that the region executes atomically. One of the purported advantages of TM over locks is that transactional code is composable.(cont.) Most of the current TM concurrency platforms do not support full composability, however. This thesis addresses two of the composability problems in existing TM concurrency platforms.by Kunal Agrawal.Ph.D

    Composable abstractions for synchronization in dynamic threading platforms

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.Cataloged from PDF version of thesis.Includes bibliographical references (p. 259-269).High-level abstractions for parallel programming simplify the development of efficient parallel applications. In particular, composable abstractions allow programmers to construct a complex parallel application out of multiple components, where each component itself may be designed to exploit parallelism. This dissertation presents the design of three composable abstractions for synchronization in dynamic-threading platforms, based on ideas of task-graph execution, helper locks, and transactional memory. These designs demonstrate provably efficient runtime scheduling for programs with synchronization. For applications that use task-graph synchronization, I demonstrate provably efficient execution of task graphs with arbitrary dependencies as a library in a fork-join platform. Conventional wisdom suggests that a fork-join platform can execute an arbitrary task graph only with special runtime support or by converting the graph into a series-parallel computation which has less parallelism. By implementing Nabbit, a Cilk++ library for arbitrary task-graph execution, I show that one can in fact avoid introducing runtime modifications or additional constraints on parallelism. Nabbit achieves an asymptotically optimal completion-time bound for task graphs with constant degree. For applications that use lock-based synchronization, I introduce helper locks, a new synchronization abstraction that enables programmers to exploit asynchronous task parallelism inside locked critical regions. When a processor fails to acquire a helper lock, it can help complete the parallel critical region protected by the lock instead of simply waiting for the lock to be released. I also present HELPER, a runtime for supporting helper locks, and prove theoretical performance bounds which imply that HELPER achieves linear speedup on programs with a small number of highly parallel critical regions. For applications that use transaction-based synchronization, I present CWSTM, the first design of a transactional memory (TM) system that supports transactions with nested parallelism and nested parallel transactions of unbounded nesting depth. CWSTM demonstrates that one can provide theoretical bounds on the overhead of transaction conflict detection which are independent of nesting depth. I also introduce the concept of ownership-aware TM, the idea of using information about which memory locations a software module owns to provide provable guarantees of safety and correctness for open-nested transactions.by Jim Sukha.Ph.D

    Transactional collection classes

    No full text
    While parallel programmers find it easier to reason about large atomic regions, the conventional mutual exclusion-based primitives for synchronization force them to interleave many small operations to achieve performance. Transactional memory promises that programmers can use large atomic regions while achieving similar performance. However, these large transactions can conflict when operating on shared data structures, even for logically independent operations. Transactional collection classes address this problem by allowing long-running transactions to operate on shared data while eliminating unnecessary conflicts. Transactional collection classes wrap existing data structures, without the need for custom implementations or knowledge of data structure internals. Without transactional collection classes, access to shared data from within long-running transactions can suffer from data dependency conflicts that are logically unnecessary, but are artifacts o
    corecore