Transaction Activation Scheduling Support for Transactional Memory
Transactional Memory (TM) is considered one of the most promising paradigms for developing concurrent applications. TM has been shown to scale well on multiple cores when the data access pattern behaves “well,” i.e., when few conflicts are induced. In contrast, data patterns with frequent write sharing, with long transactions, or with many threads contending for a smaller number of cores produce numerous aborts. These problems are traditionally addressed by application-level contention managers, but these suffer from a lack of precision and provide unpredictable benefits on many workloads. In this paper, we propose a system approach in which the scheduler tries to avoid aborts by preventing conflicting transactions from running simultaneously. We use a combination of several techniques to reduce the odds of conflicts: (1) avoiding preemption of threads running a transaction until the transaction completes, (2) keeping track of conflicts and delaying the restart of a transaction until conflicting transactions have committed, and (3) keeping track of conflicts and only allowing a thread with conflicts to run at low priority. Our approach has been implemented in Linux for Software Transactional Memory (STM), using a shared memory segment for fast communication between the STM library and the scheduler. It requires only small and contained modifications to the operating system. Experimental evaluation demonstrates that our approach significantly reduces the number of aborts while improving transaction throughput on various workloads.
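Technique (2) above, delaying the restart of an aborted transaction until its conflicting transactions have committed, can be sketched as follows. This is an illustrative Python sketch with invented names; the paper's actual mechanism lives inside the Linux scheduler and the STM library, communicating through a shared memory segment.

```python
import threading

class ConflictAwareScheduler:
    """Toy sketch of conflict-aware restart delaying (names are
    illustrative assumptions, not the paper's API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = set()      # ids of transactions currently running
        self._waiting = {}      # aborted tx id -> ids of live conflicting txs

    def begin(self, tx):
        with self._lock:
            self._live.add(tx)

    def commit(self, tx):
        with self._lock:
            self._live.discard(tx)
            # a committed transaction no longer blocks anyone
            for blocked in self._waiting:
                self._waiting[blocked].discard(tx)

    def abort(self, tx, conflicting):
        with self._lock:
            self._live.discard(tx)
            # remember only blockers that are still running
            self._waiting[tx] = {c for c in conflicting if c in self._live}

    def may_restart(self, tx):
        with self._lock:
            # restart only once every recorded conflicting tx has committed
            return not self._waiting.get(tx)
```

For example, a transaction aborted against a still-running transaction is held back by `may_restart` until that transaction commits.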
Scheduling in Transactional Memory Systems: Models, Algorithms, and Evaluations
Transactional memory provides an alternative synchronization mechanism that removes many limitations of traditional lock-based synchronization, so that writing concurrent programs is easier than writing lock-based code on modern multicore architectures. The fundamental module in a transactional memory system is the transaction, which represents a sequence of read and write operations performed atomically on a set of shared resources; transactions may conflict if they access the same shared resources. A transaction scheduling algorithm is used to handle these conflicts and schedule the transactions appropriately. In this dissertation, we study the transaction scheduling problem in several systems that differ in their inter-core communication cost for accessing shared resources. Symmetric communication costs imply tightly-coupled systems, asymmetric communication costs imply large-scale distributed systems, and partially asymmetric communication costs imply non-uniform memory access systems. We make several theoretical contributions, providing tight, near-tight, and/or impossibility results on three performance evaluation metrics (execution time, communication cost, and load) for any transaction scheduling algorithm. We then complement these theoretical results with experimental evaluations, wherever possible, showing their benefits in practical scenarios. To the best of our knowledge, the contributions of this dissertation are either the first of their kind or significant improvements over the best previously known results.
On the Impact of Memory Allocation on High-Performance Query Processing
Somewhat surprisingly, the behavior of analytical query engines is crucially
affected by the dynamic memory allocator used. Memory allocators highly
influence performance, scalability, memory efficiency and memory fairness to
other processes. In this work, we provide the first comprehensive experimental
analysis of the impact of memory allocation on high-performance query engines.
We test five state-of-the-art dynamic memory allocators and discuss their
strengths and weaknesses within our DBMS. The right allocator can increase the
performance of TPC-DS (SF 100) by 2.7x on a 4-socket Intel Xeon server.
Dynamic Prediction-based Scheduling for TM
Transactional memory (TM) provides an intuitive and simple way of writing parallel programs. TMs execute parallel programs speculatively and deliver better performance than conventional lock-based parallel programs. However, in scenarios where an application lacks scope for parallelism, TMs are outperformed by conventional fine-grained locking. TM schedulers, which serialize transactions that face contention, have shown promise in improving the performance of TMs in such scenarios. In this thesis, we develop a Dynamic Prediction based Scheduler (DPS) that exploits novel prediction techniques, such as temporal locality and locality of access across repeated transactions. DPS predicts the access sets of future transactions based on the access patterns of each thread's past transactions. We also propose a novel heuristic, called serialization affinity, which serializes transactions with a probability proportional to the current amount of contention. Using information about the currently executing transactions, the current amount of contention, and the predicted access sets, DPS dynamically serializes transactions to minimize conflicts. We implement DPS in two state-of-the-art STMs, SwissTM and TinySTM. Our results show that in scenarios where the number of threads is higher than the number of cores, DPS improves the performance of these STMs by up to 55% and 3000%, respectively. On the other hand, the overhead of DPS's prediction techniques causes a performance degradation of just 5-8% in some cases when the number of threads is less than the number of cores.
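The serialization affinity heuristic described above can be sketched minimally: serialize with a probability proportional to the current contention level. Here contention is estimated as a recent abort ratio; that estimator and the function name are illustrative assumptions, not DPS's actual implementation.

```python
import random

def should_serialize(aborts_recent, commits_recent, rng=random.random):
    """Serialize a transaction with probability proportional to the
    current contention (estimated as the recent abort ratio) --
    an illustrative sketch of the serialization-affinity idea."""
    total = aborts_recent + commits_recent
    if total == 0:
        return False                    # no history yet: run concurrently
    contention = aborts_recent / total  # value in [0, 1]
    return rng() < contention           # higher contention => serialize more often
```

Under zero contention the transaction always runs concurrently; under total contention it is always serialized, matching the intuition in the abstract.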
Contention management for distributed data replication
PhD Thesis
Optimistic replication schemes provide distributed applications with access
to shared data at lower latencies and greater availability. This is
achieved by allowing clients to replicate shared data and execute actions
locally. A consequence of this scheme is that it raises issues regarding
shared data consistency. An action executed by a client may produce
shared data that conflicts with data at other replicas and, as a
consequence, with subsequent actions that depend on the conflicting
action. This requires a client to roll back to the action that caused the
conflicting data, and to execute some exception handling. This can be
achieved by relying on the application layer to either ignore or handle
shared data inconsistencies when they are discovered during the
reconciliation phase of an optimistic protocol.
Inconsistency of shared data has an impact on the causality relationship
across client actions. In protocol design, it is desirable to preserve the
property of causality between different actions occurring across a distributed
application. Without application-level knowledge, we assume
an action causes all the subsequent actions at the same client. With
application knowledge, we can significantly ease the protocol burden of
provisioning causal ordering, as we can identify which actions do not
cause other actions (even if they precede them). This, in turn, makes
it possible for the client to roll back to past actions and to change
them, without having to alter subsequent actions. Unfortunately, increased
instances of application-level causal relations between actions
lead to a significant overhead in the protocol. Therefore, minimizing the
rollback associated with conflicting actions, while preserving causality,
is desirable, as it lowers exception handling in the application layer.
In this thesis, we present a framework that utilizes causality to create
a scheduler that can inform a contention management scheme to reduce
the rollback associated with conflicting access to shared data.
Our framework uses a backoff contention management scheme to preserve
causality for those optimistic replication systems with high
causality requirements, without the need for application layer knowledge.
We present experiments which demonstrate that our framework reduces
client rollbacks and, more importantly, that the overall throughput of
the system is improved when the contention management scheme is used with
applications that require causality to be preserved across all actions.
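The backoff contention management idea used above can be sketched as a retry loop: a client whose optimistic action conflicts waits an exponentially growing, jittered delay before retrying, so contending clients spread out in time. `ConflictError` and the function name are hypothetical stand-ins for the framework's conflict signal, not its real interface.

```python
import random
import time

class ConflictError(Exception):
    """Hypothetical stand-in for the framework's conflict signal."""

def run_with_backoff(action, max_attempts=8, base_delay=0.001):
    """Retry `action` under jittered exponential backoff -- a minimal
    sketch of a backoff contention-management scheme."""
    for attempt in range(max_attempts):
        try:
            return action()
        except ConflictError:
            # back off: random delay in [0, base * 2^attempt)
            time.sleep(random.uniform(0, base_delay * (2 ** attempt)))
    raise ConflictError("giving up after repeated conflicts")
```

In a real deployment the scheduler would also fold in causal ordering information when deciding which client backs off, as the thesis describes.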
Tailoring Transactional Memory to Real-World Applications
Transactional Memory (TM) promises to provide a scalable mechanism for synchronization in concurrent programs, and to offer ease-of-use benefits to programmers. Since multiprocessor architectures have dominated CPU design, exploiting parallelism in program
A speculative execution approach to provide semantically aware contention management for concurrent systems
PhD Thesis
Most modern platforms offer ample potential for parallel execution of concurrent programs, yet concurrency control is required to exploit parallelism while maintaining program correctness. Pessimistic concurrency control, featuring blocking synchronization and mutual exclusion, has given way to transactional memory, which allows the composition of concurrent code in a manner more intuitive for the application programmer. An important component in any transactional memory technique, however, is the policy for resolving conflicts on shared data, commonly referred to as the contention management policy.
In this thesis, a Universal Construction is described which provides contention management for software transactional memory. The technique differs from existing approaches in that multiple execution paths are explored speculatively and in parallel. In resolving conflicts by state space exploration, we demonstrate that both concurrent conflicts and semantic conflicts can be solved, promoting multithreaded program progression.
We define a model of computation called Many Systems, which defines the execution of concurrent threads as a state space management problem. An implementation is then presented based on concepts from the model, and we extend the implementation to incorporate nested transactions. Results are provided which compare the performance of our approach with an established contention management policy under varying degrees of concurrent and semantic conflict. Finally, we provide performance results from a number of search strategies when nested transactions are introduced.
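Resolving conflicts by state space exploration can be illustrated in miniature: speculatively apply every ordering of the conflicting operations to copies of the state and commit the first outcome a semantic predicate accepts. This is a toy sketch in the spirit of the approach; the names and the flat-dictionary state are illustrative assumptions, not the thesis's implementation.

```python
from itertools import permutations

def resolve_by_exploration(initial_state, ops, acceptable):
    """Explore all interleavings of `ops` on speculative copies of the
    state; commit the first result `acceptable` approves -- an
    illustrative sketch of conflict resolution by state space search."""
    for order in permutations(ops):
        state = dict(initial_state)   # speculative copy; nothing is committed yet
        for op in order:
            op(state)
        if acceptable(state):
            return state              # this interleaving satisfies the semantics
    return None                       # every interleaving conflicts semantically
```

For example, with operations "add 5" and "double" starting from zero, only the order add-then-double yields 10, so exploration selects that interleaving.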
Performance Optimization Strategies for Transactional Memory Applications
This thesis presents tools for Transactional Memory (TM) applications that cover multiple TM systems (software, hardware, and hybrid TM) and use information from all the different layers of the TM software stack. To that end, this thesis addresses a number of challenges in extracting static information, information about run-time behavior, and expert-level knowledge, and develops new methods and strategies for the optimization of TM applications.
Software lock elision for x86 machine code
More than a decade after becoming a topic of intense research, there is no
transactional memory hardware nor any examples of software transactional memory
use outside the research community. Using software transactional memory in large
pieces of software requires copious source code annotations and often means
that standard compilers and debuggers can no longer be used. At the same time,
the overheads associated with software transactional memory fail to motivate
programmers to expend the needed effort to use software transactional
memory. The only way around the overheads in the case of general unmanaged code
is the anticipated availability of hardware support. On the other hand, architects
are unwilling to devote power and area budgets in mainstream microprocessors to
hardware transactional memory, pointing to transactional memory being a
"niche" programming construct. A deadlock has thus ensued that is blocking
transactional memory use and experimentation in the mainstream.
This dissertation covers the design and construction of a software transactional
memory runtime system called SLE_x86 that can potentially break this
deadlock by decoupling transactional memory from programs using it. Unlike most
other STM designs, the core design principle is transparency rather than
performance. SLE_x86 operates at the level of x86 machine code, thereby
becoming immediately applicable to binaries for the popular x86
architecture. The only requirement is that the binary synchronise using known
locking constructs or calls such as those in Pthreads or OpenMP
libraries. SLE_x86 provides speculative lock elision (SLE) entirely in
software, executing critical sections in the binary using transactional
memory. Optionally, the critical sections can also be executed without using
transactions by acquiring the protecting lock.
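The elision scheme just described, running a critical section speculatively and falling back to actually acquiring the protecting lock, can be sketched abstractly. The `try_transactional` hook below is a hypothetical stand-in for an STM runtime, not SLE_x86's real machine-code-level interface.

```python
import threading

class ElidableLock:
    """Sketch of lock elision: attempt the critical section under a
    (stand-in) transactional runner; on failure, take the real lock."""

    def __init__(self, try_transactional=None):
        self._lock = threading.Lock()
        self._try_tx = try_transactional  # callable -> (committed?, result)

    def run(self, critical_section):
        if self._try_tx is not None:
            ok, result = self._try_tx(critical_section)
            if ok:                        # speculation committed: lock never taken
                return result
        with self._lock:                  # fallback: conventional mutual exclusion
            return critical_section()
```

The key property, mirrored from the abstract, is that the same critical section runs unchanged in either mode; the binary only needs to synchronise through a recognised locking construct.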
The dissertation makes a careful analysis of the performance impact of the
demands of the x86 memory consistency model and of the need to transparently
instrument x86 machine code. It shows that both of these problems can be
overcome to reach a reasonable level of performance, where transparent
software transactional memory can perform better than a lock. SLE_x86 can
ensure that programs are ready for transactional memory in any form, without
being explicitly written for it.
Dynamic contention management for distributed applications
PhD Thesis
Distributed applications often make use of replicated state to afford a greater level of
availability and throughput. This is achieved by allowing individual processes to progress
without requiring prior synchronisation. This approach, termed optimistic replication,
results in divergent replicas that must be reconciled to achieve an overall consistent state.
Concurrent operations on shared objects in the replicas result in conflicting updates that
require reconciliatory action to rectify. This typically takes the form of compensatory
execution or simply undoing and rolling back client state.
When considering user interaction with the application, there exist relationships and
intent in the ordering and execution of these operations. The enactment of reconciliation
that determines one action as conflicted may have far-reaching implications for
the user's original intent. In such scenarios, the compensatory action applied to a conflict
may require previous operations to also be undone or compensated such that the user's
intent is maintained. Therefore, the ability to manage contention for the shared data
across the distributed application, to pre-emptively lower conflicts resulting from these
infringements, is desirable. The aim is not to hinder the throughput achieved by the weaker
consistency model known as eventual consistency.
In this thesis, a model is presented for a contention management framework that schedules
access using the expected execution inherent in the application domain to best inform
the contention manager. A backoff scheme is employed to create an access schedule, preserving
user intent for applications that require this high level of maintenance for user
actions. This approach yields a performance improvement, seen in a reduction of the
overall number of conflicts, while also improving overall system
throughput. This thesis describes how the contention management scheme operates and,
through experimentation, the performance benefits obtained.