
    Maintaining the correctness of transactional memory programs

    Dissertation submitted in fulfilment of the requirements for the degree of Doctor in Informatics Engineering.

    This dissertation addresses the challenge of maintaining the correctness of transactional memory programs while improving their parallelism through small transactions and relaxed isolation levels. The efficiency of a transactional memory system depends directly on its level of parallelism, which in turn depends on the conflict rate. A high conflict rate between memory transactions can be addressed by reducing the scope of transactions, but this approach may make the application prone to atomicity violations. Another way to address the issue is to ignore some of the conflicts by using a relaxed isolation level, such as snapshot isolation, at the cost of introducing write-skew serialization anomalies that break the consistency guarantees provided by a stronger property, such as opacity. To tackle the correctness issues raised by atomicity violations and write-skew anomalies, we propose two static analysis techniques: one based on a novel static analysis algorithm that works on a dependency graph of program variables and detects atomicity violations; and a second based on a shape analysis technique supported by separation logic augmented with heap path expressions, a novel representation based on sequences of heap dereferences, which certifies that a transactional memory program executing under snapshot isolation is free from write-skew anomalies. Evaluating the runtime execution of a transactional memory algorithm under snapshot isolation requires a framework that allows an efficient implementation of a multi-version algorithm and, at the same time, enables its comparison with existing transactional memory algorithms. No framework in the Java programming language satisfied both requirements, so we extended an existing software transactional memory framework, which already supported efficient implementations of several transactional memory algorithms, to also support the efficient implementation of multi-version algorithms. The key insight behind this extension is support for storing transactional metadata adjacent to memory locations. We illustrate the benefits of our approach by analyzing its impact on both single- and multi-version transactional memory algorithms across several transactional workloads.

    Fundação para a Ciência e Tecnologia - PhD research grant SFRH/BD/41765/2007, and the research projects Synergy-VM (PTDC/EIA-EIA/113613/2009) and RepComp (PTDC/EIA-EIA/108963/2008).
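    As a concrete illustration of the write-skew anomaly discussed above, the following minimal Java sketch simulates two transactions running under snapshot isolation; the snapshot is modelled by hand with local copies, and every name in it (WriteSkewDemo, the variables x and y, the invariant x + y >= 0) is an illustrative assumption, not code from the dissertation or its framework.

    // A minimal, self-contained sketch of the classic write-skew anomaly that
    // snapshot isolation admits. The two "transactions" are simulated with plain
    // threads working on private snapshots; a real multi-version TM would take
    // the snapshot and detect write-write conflicts for us.
    // Invariant intended by the programmer: x + y >= 0.
    public class WriteSkewDemo {
        static volatile int x = 0, y = 0;

        public static void main(String[] args) throws InterruptedException {
            // Transaction T1: reads both x and y from its snapshot, then writes x.
            Thread t1 = new Thread(() -> {
                int sx = x, sy = y;            // snapshot taken at transaction start
                if (sx + sy >= 0) x = sx - 1;  // invariant holds on the snapshot
            });
            // Transaction T2: reads the same state, then writes y.
            Thread t2 = new Thread(() -> {
                int sx = x, sy = y;
                if (sx + sy >= 0) y = sy - 1;  // invariant holds on the snapshot
            });
            t1.start(); t2.start();
            t1.join(); t2.join();
            // Under snapshot isolation both transactions commit, because their
            // write sets {x} and {y} are disjoint -- yet x + y may now be -2.
            System.out.println("x + y = " + (x + y));
        }
    }

    Because the write sets are disjoint, a snapshot-isolated TM commits both transactions even though no serial order of T1 and T2 could leave x + y at -2; a serializability-based property such as opacity rules this outcome out.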

    Strong Memory Consistency For Parallel Programming

    Correctly synchronizing multithreaded programs is challenging, and errors can lead to program failures (e.g., atomicity violations). Existing memory consistency models rule out some possible failures, but are limited by their dependence on subtle programmer-defined locking code and by providing unintuitive semantics for incorrectly synchronized code. Stronger memory consistency models assist programmers by providing easier-to-understand semantics for memory access interleavings in parallel code. This dissertation proposes a new strong memory consistency model based on ordering-free regions (OFRs), which are spans of dynamic instructions between consecutive ordering constructs (e.g., barriers). Atomicity over ordering-free regions is stronger than that of existing strong memory consistency models, with competitive performance. Ordering-free regions also simplify programmer reasoning by limiting the potential for atomicity violations to fewer points in the program's execution. This dissertation explores both software-only and hardware-supported systems that provide OFR serializability.
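    As a rough illustration (not taken from the dissertation), the Java sketch below marks an ordering-free region in a small program: everything a thread executes between two CyclicBarrier ordering constructs. The worker logic and the balance invariant are invented for the example.

    import java.util.concurrent.CyclicBarrier;

    // A minimal sketch of ordering-free regions (OFRs): the dynamic instructions
    // a thread executes between consecutive ordering constructs (here, barriers).
    // Under OFR serializability each marked region below executes as if atomic,
    // so the check-then-update on `balance` cannot be interleaved by the other
    // thread.
    public class OfrSketch {
        static int balance = 100;                        // shared state
        static final CyclicBarrier barrier = new CyclicBarrier(2);

        static Runnable worker(int amount) {
            return () -> {
                try {
                    barrier.await();                     // ordering construct: OFR begins
                    if (balance >= amount) {             // |
                        balance -= amount;               // | one ordering-free region
                    }                                    // |
                    barrier.await();                     // ordering construct: OFR ends
                } catch (Exception e) { throw new RuntimeException(e); }
            };
        }

        public static void main(String[] args) throws InterruptedException {
            Thread a = new Thread(worker(80)), b = new Thread(worker(60));
            a.start(); b.start(); a.join(); b.join();
            // On a plain JVM this program is racy and balance may go negative;
            // an OFR-serializable system would serialize the two regions, so the
            // result could only be 20 or 40, never -40.
            System.out.println("balance = " + balance);
        }
    }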

    Enhancing the efficiency and practicality of software transactional memory on massively multithreaded systems

    Chip Multithreading (CMT) processors promise to deliver higher performance by running more than one stream of instructions in parallel. To exploit CMT's capabilities, programmers have to parallelize their applications, which is not a trivial task. Transactional Memory (TM) is a parallel programming model that aims to simplify synchronization by raising the level of abstraction between semantic atomicity and the means by which that atomicity is achieved. TM is a promising programming model, but important challenges must still be addressed to make it more practical and efficient in mainstream parallel programming. The first challenge addressed in this dissertation is that of making the evaluation of TM proposals more solid with realistic TM benchmarks, and of being able to run the same benchmarks on different STM systems. We first introduce RMS-TM, a comprehensive benchmark suite to evaluate HTMs and STMs. RMS-TM consists of seven applications from the Recognition, Mining and Synthesis (RMS) domain that are representative of future workloads. RMS-TM features current TM research issues such as nesting and I/O inside transactions, while also providing a range of TM characteristics. Most STM systems are implemented as user-level libraries: the programmer is expected to manually instrument not only transaction boundaries but also individual loads and stores within transactions. This library-based approach is tedious and error-prone, and also makes it difficult to make reliable performance comparisons. To enable an "apples-to-apples" performance comparison, we then develop a software layer that allows researchers to test the same applications with interchangeable STM back ends. The second challenge addressed is that of enhancing the performance and scalability of TM applications running on aggressive multi-core/multi-threaded processors. The performance and scalability of current TM designs, in particular STM designs, do not always meet the programmer's expectations, especially at scale. To overcome this limitation, we propose a new STM design, STM2, based on an assisted execution model in which time-consuming TM operations are offloaded to auxiliary threads while application threads optimistically perform computation. Surprisingly, our results show that STM2 provides, on average, speedups between 1.8x and 5.2x over state-of-the-art STM systems. On the other hand, we notice that assisted-execution systems may show low processor utilization. To alleviate this problem and increase the efficiency of STM2, we enrich STM2 with a runtime mechanism that automatically and adaptively detects the computing demands of application and auxiliary threads, and dynamically partitions hardware resources between the pair through the hardware thread prioritization mechanism implemented in POWER machines. The third challenge is to define what it means for a TM program to be correctly synchronized. The current definition of transactional data race requires all transactions to be totally ordered, "as if" serialized by a global lock, which limits the scalability of TM designs. To remove this constraint, we first propose to relax the current definition of transactional data race to allow a higher level of concurrency. Based on this definition, we propose TRADE, the first practical race detection algorithm for C/C++ TM applications, and implement the corresponding race detection tool. Then, we introduce a new definition of transactional data race that is more intuitive, transparent to the underlying TM implementation, and applicable to a broad set of C/C++ TM programs. Based on this new definition, we propose T-Rex, an efficient and scalable race detection tool for C/C++ TM applications. Using TRADE and T-Rex, we have discovered subtle transactional data races in widely-used STAMP applications that had not been reported before.
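    The per-access instrumentation burden described above can be sketched as follows. The Stm and Txn interfaces, and the GlobalLockStm that makes the sketch executable, are hypothetical stand-ins; they are not the API of RMS-TM, TRADE, T-Rex, or any particular STM library.

    import java.util.function.Consumer;

    // Hypothetical library-based STM API: every shared load and store inside a
    // transaction must be routed through the Txn handle by hand.
    interface Txn {
        long read(Account a);               // instrumented load
        void write(Account a, long v);      // instrumented store
    }

    interface Stm {
        void atomic(Consumer<Txn> body);    // runs body as one transaction
    }

    class Account { long balance; Account(long b) { balance = b; } }

    // Trivial global-lock "STM" so the sketch runs; real STMs commit optimistically.
    class GlobalLockStm implements Stm {
        private final Object lock = new Object();
        public void atomic(Consumer<Txn> body) {
            synchronized (lock) {
                body.accept(new Txn() {
                    public long read(Account a) { return a.balance; }
                    public void write(Account a, long v) { a.balance = v; }
                });
            }
        }
    }

    public class ManualInstrumentation {
        static void transfer(Stm stm, Account from, Account to, long amount) {
            stm.atomic(txn -> {
                long f = txn.read(from);                  // forgetting txn.read here,
                if (f >= amount) {
                    txn.write(from, f - amount);          // or writing from.balance
                    txn.write(to, txn.read(to) + amount); // directly, silently breaks
                }                                         // isolation: a transactional race.
            });
        }

        public static void main(String[] args) {
            Account a = new Account(100), b = new Account(0);
            transfer(new GlobalLockStm(), a, b, 40);
            System.out.println(a.balance + " / " + b.balance);  // 60 / 40
        }
    }

    An instrumenting software layer of the kind the dissertation proposes moves these per-access calls out of the programmer's hands, which is also what makes STM back ends interchangeable.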

    New hardware support for transactional memory and parallel debugging in multicore processors

    This thesis contributes to the area of hardware support for parallel programming by introducing new hardware elements into multicore processors, with the aim of improving the performance of, and optimizing, new tools, abstractions and applications related to parallel programming, such as transactional memory and data race detectors. Specifically, we configure a hardware transactional memory system with signatures as part of the hardware support, and we develop a new hardware filter for reducing the signature size. We also develop the first hardware asymmetric data race detector (which is also able to tolerate such races), likewise based on hardware signatures. Finally, we propose a new hardware signature module that solves some of the problems we found in the previous tools related to the lack of flexibility in hardware signatures.
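    Hardware signatures of this kind are commonly Bloom filters over accessed addresses; the following Java model is a hedged approximation with made-up hash functions and sizes, intended only to show why signature size matters and what the conflict test looks like, not to reproduce the thesis's actual design.

    import java.util.BitSet;

    // A minimal software model of a hardware signature: a Bloom filter over the
    // addresses a transaction reads or writes.
    class Signature {
        private final BitSet bits;
        private final int size;

        Signature(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        // Cheap hashes standing in for the parallel hash circuits in hardware.
        private int h1(long addr) { return (int) Math.floorMod(addr, (long) size); }
        private int h2(long addr) { return (int) Math.floorMod(addr * 31 + 17, (long) size); }

        // Record one memory access in the signature.
        void insert(long addr) {
            bits.set(h1(addr));
            bits.set(h2(addr));
        }

        // May report addresses never inserted (false positives), never misses one.
        boolean mayContain(long addr) {
            return bits.get(h1(addr)) && bits.get(h2(addr));
        }

        // Conflict check between two transactions: any shared set bit is a
        // potential conflict.
        boolean conflictsWith(Signature other) {
            return bits.intersects(other.bits);
        }
    }

    A false positive in the conflict test aborts a transaction needlessly, so shrinking the filter without inflating its false-positive rate is exactly the trade-off a signature-reduction filter targets.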

    Static detection of anomalies in transactional memory programs

    Dissertation presented at the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa in fulfilment of the requirements for the degree of Master in Informatics Engineering.

    Transactional Memory (TM) is an approach to concurrent programming based on the transactional semantics borrowed from database systems. In this paradigm, a transaction is a sequence of actions that may execute in a single logical instant, as though it were the only one being executed at that moment. Unlike concurrent systems based on locks, TM does not enforce that a single thread performs the guarded operations. Instead, as in database systems, transactions execute concurrently, and the effects of a transaction are undone in case of a conflict, as though it had never happened. The advantages of TM are an easier and less error-prone programming model, and a potential increase in scalability and performance. In spite of these advantages, TM is still a young and immature technology and has yet to become an established programming model. It still lacks the paraphernalia of tools and standards we have come to expect from a widely used programming paradigm. Testing and analysis techniques and algorithms for TM programs are also just starting to be addressed by the scientific community, making this leading research work in many of these aspects. This work aims at statically identifying possible runtime anomalies in TM programs. We addressed both low-level data races in TM programs and high-level anomalies resulting from the incorrect splitting of transactions. We have defined and implemented an approach to detect low-level data races in TM programs by converting all the memory transactions into monitor-protected critical regions, synchronized on a newly generated global lock. To validate the approach, we applied our tool to a set of tests, adapted from the literature, that contain well-documented errors. We have also defined and implemented a new approach to the static detection of high-level concurrency anomalies in TM programs. This new approach works by conservatively tracing transactions and matching the interference between each consecutive pair of transactions against a set of defined anomaly patterns. Once again, the approach was validated with well-documented tests adapted from the literature.
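    The second class of anomalies, those caused by incorrectly splitting a transaction, can be illustrated with the hedged Java sketch below. Each synchronized block models one memory transaction (mirroring the global-lock conversion used for the low-level analysis); the stock example and all names are invented for illustration.

    // A sketch of the high-level anomaly produced by splitting one logical
    // transaction in two. Each synchronized block models one memory transaction:
    // individually race-free, yet the pair is not atomic, so another thread can
    // run between them and invalidate the earlier check.
    public class SplitTransaction {
        private static final Object monitor = new Object();
        private static int stock = 1;

        static boolean buy() {
            boolean available;
            synchronized (monitor) {       // transaction 1: the check
                available = stock > 0;
            }
            // <-- another thread may buy here, between the two transactions
            if (available) {
                synchronized (monitor) {   // transaction 2: the act
                    stock -= 1;            // may drive stock below zero
                }
                return true;
            }
            return false;
        }

        public static void main(String[] args) throws InterruptedException {
            Thread a = new Thread(SplitTransaction::buy);
            Thread b = new Thread(SplitTransaction::buy);
            a.start(); b.start(); a.join(); b.join();
            System.out.println("stock = " + stock);   // can print -1
        }
    }

    The consecutive pair (check, act) matches a classic anomaly pattern: the second transaction acts on a value read in the first, which another thread may have invalidated in between.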

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in, and retrieved from, a pre-attentional store during this task.

    Algorithms for Fault Detection and Diagnosis

    Due to the increasing demand for security and reliability in manufacturing and mechatronic systems, the early detection and diagnosis of faults are key to reducing economic losses caused by unscheduled maintenance and downtime, increasing safety, preventing harm to the human beings involved in process operations, and improving the reliability and availability of autonomous systems. The development of algorithms for health monitoring and for fault and anomaly detection, capable of the early detection, isolation, or even prediction of technical component malfunctioning, is becoming more and more crucial in this context. This Special Issue is devoted to new research efforts and results concerning recent advances and challenges in the application of “Algorithms for Fault Detection and Diagnosis” across a wide range of sectors. The aim is to provide a collection of some of the current state-of-the-art algorithms in this context, together with new advanced theoretical solutions.

    The interplay between automatic and controlled processes: experimental contributions to dual-process theories of cognition

    Since its beginnings, psychological science has frequently used dichotomous categories to describe behavior and mental phenomena. The most traditional dual models have left a lasting mark on both the scientific and folk psychological vocabularies with such dichotomies (e.g., conscious vs. unconscious, logical vs. creative, rational vs. emotional). However, while offering an affordable account of how the human cognitive system works, these models appear too simplistic. They are substantially grounded in findings obtained over decades of research in almost all psychological fields, from perception to social processes, which were later merged into a broad systemic theory of human cognition. However, this dual-system theory, which proposed to unify all cognitive dualities into a System 1 (automatic, unconscious, fast, effortless, intuitive, and so on) and a System 2 (controlled, conscious, slow, effortful, rational, and so on), lacks a systematic investigation of its basic assumptions: for instance, that the features are aligned within, and complementary between, the two systems. These properties are essential to the tenets of the theory, since a systemic theory should postulate the interdependence and interrelation of the elements constituting a system. In this view, the central thread linking all the experimental contributions in the present work is that the dual-system theory should hold up when investigating cognitive performance at both low and high levels of complexity (complexity defined as the variety of mechanisms implicated in the phenomena of interest). Across seven studies conducted in three research lines, addressing temporal attention, task-switching, and decision-making, the interaction between automatic and controlled features in each process has proven to be the rule rather than the exception. Thus, the results presented in this work support the idea that the current formulation of the dual-system theory has weak explanatory power, suggesting that decomposition approaches aimed at disentangling the contributions of qualitatively and quantitatively different mechanisms in each cognitive process are needed to advance, or put aside, dual-process theories.