112 research outputs found
Tailoring Transactional Memory to Real-World Applications
Transactional Memory (TM) promises to provide a scalable mechanism for synchronizationin concurrent programs, and to offer ease-of-use benefits to programmers. Since multiprocessorarchitectures have dominated CPU design, exploiting parallelism in program
Insights into the Fallback Path of Best-Effort Hardware Transactional Memory Systems
DOI 10.1007/978-3-319-43659-3Current industry proposals for Hardware Transactional Memory (HTM) focus on best-effort solutions (BE-HTM) where hardware limits are imposed on transactions. These designs may show a significant performance degradation due
to high contention scenarios and different hardware and operating system limitations that abort transactions, e.g. cache overflows, hardware and software exceptions, etc. To deal with these events and to ensure forward progress, BE-HTM systems usually provide a software fallback path to execute a lock-based version of the code.
In this paper, we propose a hardware implementation of an irrevocability mechanism as an alternative to the software fallback path to gain insight into the hardware improvements that could enhance the execution of such a fallback. Our mechanism anticipates the abort that causes the transaction serialization, and stalls other transactions in the system so that transactional work loss is mini-
mized. In addition, we evaluate the main software fallback path approaches and propose the use of ticket locks that hold precise information of the number of transactions waiting to enter the fallback. Thus, the separation of transactional
and fallback execution can be achieved in a precise manner. The evaluation is carried out using the Simics/GEMS simulator and the complete range of STAMP transactional suite benchmarks. We obtain significant performance benefits of around twice the speedup and an abort reduction of 50% over the software fallback path for a number of benchmarks.Universidad de Málaga. Campus de Excelencia Internacional AndalucÃa Tech
Reusable Concurrent Data Types
This paper contributes to address the fundamental challenge of building Concurrent Data Types (CDT) that are reusable and scalable at the same time. We do so by proposing the abstraction of Polymorphic Transactions (PT): a new programming abstraction that offers different compatible transactions that can run concurrently in the same application. We outline the commonality of the problem in various object-oriented languages and implement PT and a reusable package in Java. With PT, annotating sequential ADTs guarantee novice programmers to obtain an atomic and deadlock-free CDT and let an advanced programmer leverage the application semantics to get higher performance. We compare our polymorphic synchronization against transaction-based, lock-based and lock-free synchronizations on SPARC and x86-64 architectures and we integrate our methodology to a travel reservation benchmark. Although our reusable CDTs are sometimes less efficient than non-composable handcrafted CDTs from the JDK, they outperform all reusable Java CDTs
Lock Holder Preemption Avoidance via Transactional Lock Elision
Abstract In this short paper we show that hardware-based transactional lock elision can provide benefit by reducing the incidence of lock holder preemption, decreasing lock hold times and promoting improved scalability
Recommended from our members
Software lock elision for x86 machine code
More than a decade after becoming a topic of intense research there is no
transactional memory hardware nor any examples of software transactional memory
use outside the research community. Using software transactional memory in large
pieces of software needs copious source code annotations and often means
that standard compilers and debuggers can no longer be used. At the same time,
overheads associated with software transactional memory fail to motivate
programmers to expend the needed effort to use software transactional
memory. The only way around the overheads in the case of general unmanaged code
is the anticipated availability of hardware support. On the other hand, architects
are unwilling to devote power and area budgets in mainstream microprocessors to
hardware transactional memory, pointing to transactional memory being a
"niche" programming construct. A deadlock has thus ensued that is blocking
transactional memory use and experimentation in the mainstream.
This dissertation covers the design and construction of a software transactional
memory runtime system called SLE_x86 that can potentially break this
deadlock by decoupling transactional memory from programs using it. Unlike most
other STM designs, the core design principle is transparency rather than
performance. SLE_x86 operates at the level of x86 machine code, thereby
becoming immediately applicable to binaries for the popular x86
architecture. The only requirement is that the binary synchronise using known
locking constructs or calls such as those in Pthreads or OpenMP
libraries. SLE_x86 provides speculative lock elision (SLE) entirely in
software, executing critical sections in the binary using transactional
memory. Optionally, the critical sections can also be executed without using
transactions by acquiring the protecting lock.
The dissertation makes a careful analysis of the impact on performance due to
the demands of the x86 memory consistency model and the need to transparently
instrument x86 machine code. It shows that both of these problems can be
overcome to reach a reasonable level of performance, where transparent
software transactional memory can perform better than a lock. SLE_x86 can
ensure that programs are ready for transactional memory in any form, without
being explicitly written for it
Consistent state software transactional memory
Dissertação apresentada na Faculdade de
Ciências e Tecnologia da Universidade
Nova de Lisboa para a obtenção do Grau
de Mestre em Engenharia InformáticaAs the multicore CPUs start getting into everyone’s computers, concurrent programming
must start covering, not only the scientific and enterprise applications, but also
every computer application we all use in our daily lives.
Since the introduction of software transactional memory [ST95,HLMWNS03], this
topic has had a strong interest by the scientific community as it has the potential of
greatly facilitating concurrent programming by hiding the concurrency issues under
the transactional layer.
This thesis builds on the TL2 STM engine [DON06], which is one of the top performing
to date. We have explored different design alternatives focusing on performance
and safety. With our research we have achieved performance improvements and better
safety properties of the engine. We have also achieved a much better understanding of
the design alternatives and their impacts.
During the course of this thesis we have come across several tough concurrency
bugs and we have created a list of testing patterns, which proved to be useful in finding
and reproducing several problems.
This thesis describes the cutting edge of STM engine technology, elaborates on the
design of a new STM engine and reports on the experimental results obtained
Consistent and efficient output-streams management in optimistic simulation platforms
Optimistic synchronization is considered an effective means for supporting Parallel Discrete Event Simulations. It relies on a speculative approach, where concurrent processes execute simulation events regardless of their safety, and consistency is ensured via proper rollback mechanisms, upon the a-posteriori detection of causal inconsistencies along the events' execution path. Interactions with the outside world (e.g. generation of output streams) are a well-known problem for rollback-based systems, since the outside world may have no notion of rollback. In this context, approaches for allowing the simulation modeler to generate consistent output rely on either the usage of ad-hoc APIs (which must be provided by the underlying simulation kernel) or temporary suspension of processing activities in order to wait for the final outcome (commit/rollback) associated with a speculatively-produced output. In this paper we present design indications and a reference implementation for an output streams' management subsystem which allows the simulation-model writer to rely on standard output-generation libraries (e.g. stdio) within code blocks associated with event processing. Further, the subsystem ensures that the produced output is consistent, namely associated with events that are eventually committed, and system-wide ordered along the simulation time axis. The above features jointly provide the illusion of a classical (simple to deal with) sequential programming model, which spares the developer from being aware that the simulation program is run concurrently and speculatively. We also show, via an experimental study, how the design/development optimizations we present lead to limited overhead, giving rise to the situation where the simulation run would have been carried out with near-to-zero or reduced output management cost. At the same time, the delay for materializing the output stream (making it available for any type of audit activity) is shown to be fairly limited and constant, especially for good mixtures of I/O-bound vs CPU-bound behaviors at the application level. Further, the whole output streams' management subsystem has been designed in order to provide scalability for I/O management on clusters. © 2013 ACM
Performance models of concurrency control protocols for transaction processing systems
Transaction processing plays a key role in a lot of IT infrastructures. It is widely used in a variety of contexts, spanning from database management systems to concurrent programming tools. Transaction processing systems leverage on concurrency control protocols, which allow them to concurrently process transactions preserving essential properties, as isolation and atomicity. Performance is a critical aspect of transaction processing systems, and it is unavoidably affected by the concurrency control. For this reason, methods and techniques to assess and predict the performance of concurrency control protocols are of interest for many IT players, including application designers, developers and system administrators. The analysis and the proper understanding of the impact on the system performance of these protocols require quantitative approaches. Analytical modeling is a practical approach for building cost-effective computer system performance models, enabling us to quantitatively describe the complex dynamics characterizing these systems. In this dissertation we present analytical performance models of concurrency control protocols. We deal with both traditional transaction processing systems, such as database management systems, and emerging ones, as transactional memories. The analysis focuses on widely used protocols, providing detailed performance models and validation studies. In addition, we propose new modeling approaches, which also broaden the scope of our study towards a more realistic, application-oriented, performance analysis
- …