29,679 research outputs found

    Deterministic Consistency: A Programming Model for Shared Memory Parallelism

    Full text link
    The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code, and runtime systems can impose synthetic schedules on legacy parallel code. To parallelize existing serial code, however, we would like a programming model that is naturally deterministic without language restrictions or artificial scheduling. We propose "deterministic consistency", a parallel programming model as easy to understand as the "parallel assignment" construct in sequential languages such as Perl and JavaScript, where concurrent threads always read their inputs before writing shared outputs. DC supports common data- and task-parallel synchronization abstractions such as fork/join and barriers, as well as non-hierarchical structures such as producer/consumer pipelines and futures. A preliminary prototype suggests that software-only implementations of DC can run applications written for popular parallel environments such as OpenMP with low (<10%) overhead for some applications.Comment: 7 pages, 3 figure

    Open Transactions on Shared Memory

    Full text link
    Transactional memory has arisen as a good way for solving many of the issues of lock-based programming. However, most implementations admit isolated transactions only, which are not adequate when we have to coordinate communicating processes. To this end, in this paper we present OCTM, an Haskell-like language with open transactions over shared transactional memory: processes can join transactions at runtime just by accessing to shared variables. Thus a transaction can co-operate with the environment through shared variables, but if it is rolled-back, also all its effects on the environment are retracted. For proving the expressive power of TCCS we give an implementation of TCCS, a CCS-like calculus with open transactions

    A Bulk-Parallel Priority Queue in External Memory with STXXL

    Get PDF
    We propose the design and an implementation of a bulk-parallel external memory priority queue to take advantage of both shared-memory parallelism and high external memory transfer speeds to parallel disks. To achieve higher performance by decoupling item insertions and extractions, we offer two parallelization interfaces: one using "bulk" sequences, the other by defining "limit" items. In the design, we discuss how to parallelize insertions using multiple heaps, and how to calculate a dynamic prediction sequence to prefetch blocks and apply parallel multiway merge for extraction. Our experimental results show that in the selected benchmarks the priority queue reaches 75% of the full parallel I/O bandwidth of rotational disks and and 65% of SSDs, or the speed of sorting in external memory when bounded by computation.Comment: extended version of SEA'15 conference pape

    Sharing Memory between Byzantine Processes using Policy-enforced Tuple Spaces

    Get PDF
    Abstract—Despite the large amount of Byzantine fault-tolerant algorithms for message-passing systems designed through the years, only recent algorithms for the coordination of processes subject to Byzantine failures using shared memory have appeared. This paper presents a new computing model in which shared memory objects are protected by fine-grained access policies, and a new shared memory object, the Policy-Enforced Augmented Tuple Space (PEATS). We show the benefits of this model by providing simple and efficient consensus algorithms. These algorithms are much simpler and require less shared memory operations, using also less memory bits than previous algorithms based on access control lists (ACLs) and sticky bits. We also prove that PEATS objects are universal, i.e., that they can be used to implement any other shared memory object, and present lock-free and wait-free universal constructions. Index Terms—Byzantine fault-tolerance, shared memory algorithms, tuple spaces, consensus, universal constructions. Ç

    Efficient shared memory message passing for inter-VM communications

    Get PDF
    Thanks to recent advances in virtualization technologies, it is now possible to beneïŹt from the ïŹ‚exibility brought by virtual machines at little cost in terms of CPU performance. However on HPC clusters some overheads remain which prevent widespread usage of virtualization. In this article, we tackle the issue of inter-VM MPI communications when VMs are located on the same physical machine. To achieve this we introduce a virtual device which provides a simple message passing API to the guest OS. This interface can then be used to implement an efficient MPI library for virtual machines. The use of a virtual device makes our solution easily portable across multiple guest operating systems since it only requires a small driver to be written for this device. We present an implementation based on Linux, the KVM hypervisor and Qemu as its userspace device emulator. Our implementation achieves near native performance in terms of MPI latency and bandwidth

    The 'what' and 'how' of learning in design, invited paper

    Get PDF
    Previous experiences hold a wealth of knowledge which we often take for granted and use unknowingly through our every day working lives. In design, those experiences can play a crucial role in the success or failure of a design project, having a great deal of influence on the quality, cost and development time of a product. But how can we empower computer based design systems to acquire this knowledge? How would we use such systems to support design? This paper outlines some of the work which has been carried out in applying and developing Machine Learning techniques to support the design activity; particularly in utilising previous designs and learning the design process

    Preliminary basic performance analysis of the Cedar multiprocessor memory system

    Get PDF
    Some preliminary basic results on the performance of the Cedar multiprocessor memory system are presented. Empirical results are presented and used to calibrate a memory system simulator which is then used to discuss the scalability of the system
    • 

    corecore