9,612 research outputs found

    TMbarrier: speculative barriers using hardware transactional memory

    Get PDF
    Barrier is a very common synchronization method used in parallel programming. Barriers are used typically to enforce a partial thread execution order, since there may be dependences between code sections before and after the barrier. This work proposes TMbarrier, a new design of a barrier intended to be used in transactional applications. TMbarrier allows threads to continue executing speculatively after the barrier assuming that there are not dependences with safe threads that have not yet reached the barrier. Our design leverages transactional memory (TM) (specifically, the implementation offered by the IBM POWER8 processor) to hold the speculative updates and to detect possible conflicts between speculative and safe threads. Despite the limitations of the best-effort hardware TM implementation present in current processors, experiments show a reduction in wasted time due to synchronization compared to standard barriers.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Factoring out ordered sections to expose thread-level parallelism

    Get PDF
    With the rise of multi-core processors, researchers are taking a new look at extending the applicability auto-parallelization techniques. In this paper, we identify a dependence pattern on which autoparallelization currently fails. This dependence pattern occurs for ordered sections, i.e. code fragments in a loop that must be executed atomically and in original program order. We discuss why these ordered sections prohibit current auto-parallelizers from working and we present a technique to deal with them. We experimentally demonstrate the efficacy of the technique, yielding significant overall program speedups

    A Comparative Analysis of STM Approaches to Reduction Operations in Irregular Applications

    Get PDF
    As a recently consolidated paradigm for optimistic concurrency in modern multicore architectures, Transactional Memory (TM) can help to the exploitation of parallelism in irregular applications when data dependence information is not available up to run- time. This paper presents and discusses how to leverage TM to exploit parallelism in an important class of irregular applications, the class that exhibits irregular reduction patterns. In order to test and compare our techniques with other solutions, they were implemented in a software TM system called ReduxSTM, that acts as a proof of concept. Basically, ReduxSTM combines two major ideas: a sequential-equivalent ordering of transaction commits that assures the correct result, and an extension of the underlying TM privatization mechanism to reduce unnecessary overhead due to reduction memory updates as well as unnecesary aborts and rollbacks. A comparative study of STM solutions, including ReduxSTM, and other more classical approaches to the parallelization of reduction operations is presented in terms of time, memory and overhead.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Load-Sharing Policies in Parallel Simulation of Agent-Based Demographic Models

    Get PDF
    Execution parallelism in agent-Based Simulation (ABS) allows to deal with complex/large-scale models. This raises the need for runtime environments able to fully exploit hardware parallelism, while jointly offering ABS-suited programming abstractions. In this paper, we target last-generation Parallel Discrete Event Simulation (PDES) platforms for multicore systems. We discuss a programming model to support both implicit (in-place access) and explicit (message passing) interactions across concurrent Logical Processes (LPs). We discuss different load-sharing policies combining event rate and implicit/explicit LPs’ interactions. We present a performance study conducted on a synthetic test case, representative of a class of agent-based models

    Improving Transactional Memory Performance for Irregular Applications

    Get PDF
    Postprint de autor publicado posteriormente con este DOI:http://dx.doi.org/10.1016/j.procs.2015.05.398Transactional memory (TM) offers optimistic concurrency support in modern multicore archi- tectures, helping the programmers to extract parallelism in irregular applications when data dependence information is not available before runtime. In fact, recent research focus on ex- ploiting thread-level parallelism using TM approaches. However, the proposed techniques are of general use, valid for any type of application. This work presents ReduxSTM, a software TM system specially designed to extract maxi- mum parallelism from irregular applications. Commit management and conflict detection are tailored to take advantage of both, sequential transaction ordering to assure correct results, and privatization of reduction patterns, a very frequent memory access pattern in irregular applications. Both techniques are used to avoid unnecessary transaction aborts. A function in 300.twolf package from SPEC CPU2000 was taken as a motivating irregular program. This code was parallelized using ReduxSTM and an ordered version of TinySTM, a state-of-the-art TM system. Experimental evaluation shows that ReduxTM exploits more parallelism from the sequential program and obtains better performance than the other system.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
    • …
    corecore