    Clarifying and compiling C/C++ concurrency: from C++11 to POWER

    The upcoming C and C++ revised standards add concurrency to the languages, for the first time, in the form of a subtle *relaxed memory model* (the *C++11 model*). This aims to permit compiler optimisation and to accommodate the differing relaxed-memory behaviours of mainstream multiprocessors, combining simple semantics for most code with high-performance *low-level atomics* for concurrency libraries. In this paper, we first establish two simpler but provably equivalent models for C++11, one for the full language and another for the subset without consume operations. Subsetting further to the fragment without low-level atomics, we identify a subtlety arising from atomic initialisation and prove that, under an additional condition, the model is equivalent to sequential consistency for race-free programs

    Nemos: a framework for axiomatic and executable specifications of memory consistency models

    Technical report. Conforming to the underlying memory consistency rules is a fundamental requirement for implementing shared memory systems and writing multiprocessor programs. In order to promote understanding and enable automated verification, it is highly desirable that a memory model specification be both declarative and executable. We have developed a specification framework called Nemos (Non-operational yet Executable Memory Ordering Specifications), which employs a uniform notation based on predicate logic to define shared memory semantics in an axiomatic as well as compositional style. In this paper, we present this framework and discuss how constraint logic programming and SAT solving can be used to make these axiomatic specifications executable for memory model analysis, thus supporting precise specification and automatic execution in the same framework. To illustrate our approach, this paper formalizes a collection of well known memory models, including sequential consistency, coherence, PRAM, causal consistency, and processor consistency
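
    As a rough illustration of what an "executable" memory model specification can answer, the sketch below enumerates every sequentially consistent interleaving of the classic store-buffering litmus test and collects the reachable register outcomes. It is a brute-force enumeration, not the predicate-logic, constraint-solving, or SAT-based encoding that Nemos uses; the names `T0`, `T1`, `interleavings`, `run`, and `sc_outcomes` are assumptions introduced only for this example.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# Store-buffering litmus test (hypothetical encoding):
#   thread 0: x = 1; r0 = y        thread 1: y = 1; r1 = x
T0 = [("write", "x", 1), ("read", "y", "r0")]
T1 = [("write", "y", 1), ("read", "x", "r1")]


def run(trace):
    """Execute one total order of operations against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, loc, arg in trace:
        if kind == "write":
            mem[loc] = arg
        else:
            regs[arg] = mem[loc]
    return (regs["r0"], regs["r1"])


def sc_outcomes(t0, t1):
    """Set of (r0, r1) results reachable under sequential consistency."""
    return {run(trace) for trace in interleavings(t0, t1)}


if __name__ == "__main__":
    print(sorted(sc_outcomes(T0, T1)))
```

    Under sequential consistency the outcome `(0, 0)` never appears in the result set, whereas weaker models with store buffering can permit it; comparing such outcome sets is one simple way to tell memory models apart.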

    A Scalable Middleware Solution for Advanced Wide Area Web Services

    To alleviate scalability problems in the Web, many researchers concentrate on how to incorporate advanced caching and replication techniques. Many solutions incorporate object-based techniques. In particular, Web resources are considered as distributed objects offering a well-defined interface. We argue that most proposals ignore two important aspects. First, there is little discussion on what kind of coherence should be provided. Proposing specific caching or replication solutions makes sense only if we know what coherence model they should implement. Second, most proposals treat all Web resources alike. Such a one-size-fits-all approach will never work in a wide-area system. We propose a solution in which Web resources are encapsulated in physically distributed shared objects. Each object should encapsulate not only state and operations, but also the policy by which its state is distributed, cached, replicated, migrated, etc


    The semantics of concurrent data structures is usually given by a sequential specification and a consistency condition. Linearizability is the most popular consistency condition due to its simplicity and general applicability. Nevertheless, for applications that do not require all guarantees offered by linearizability, recent research has focused on improving performance and scalability of concurrent data structures by relaxing their semantics. In this paper, we present local linearizability, a relaxed consistency condition that is applicable to container-type concurrent data structures like pools, queues, and stacks. While linearizability requires that the effect of each operation is observed by all threads at the same time, local linearizability only requires that for each thread T, the effects of its local insertion operations and the effects of those removal operations that remove values inserted by T are observed by all threads at the same time. We investigate theoretical and practical properties of local linearizability and its relationship to many existing consistency conditions. We present a generic implementation method for locally linearizable data structures that uses existing linearizable data structures as building blocks. Our implementations show performance and scalability improvements over the original building blocks and outperform the fastest existing container-type implementations
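
    One way to picture the generic construction mentioned above is a queue that keeps a separate linearizable FIFO backend for each inserting thread: a thread always inserts into its own backend, and a removal tries the caller's backend first and then the others. The sketch below follows that reading; the class name `LocallyLinearizableQueue`, the explicit `tid` parameter, and the lock-protected `deque` backends are assumptions made for illustration and are not taken from the paper.

```python
import threading
from collections import deque


class LocallyLinearizableQueue:
    """Sketch: one lock-protected FIFO backend per inserting thread."""

    def __init__(self):
        self._backends = {}  # thread id -> (lock, deque)
        self._registry_lock = threading.Lock()

    def _backend(self, tid):
        # Create the caller's backend lazily on first use.
        with self._registry_lock:
            if tid not in self._backends:
                self._backends[tid] = (threading.Lock(), deque())
            return self._backends[tid]

    def enqueue(self, value, tid=None):
        # tid is a hypothetical override so the example can be driven
        # deterministically; by default the real thread identity is used.
        tid = threading.get_ident() if tid is None else tid
        lock, items = self._backend(tid)
        with lock:
            items.append(value)

    def dequeue(self, tid=None):
        tid = threading.get_ident() if tid is None else tid
        own = self._backend(tid)
        with self._registry_lock:
            others = [b for t, b in self._backends.items() if t != tid]
        # Try the caller's own backend first, then every other backend.
        for lock, items in [own] + others:
            with lock:
                if items:
                    return items.popleft()
        return None  # simplification: a single scan found nothing


if __name__ == "__main__":
    q = LocallyLinearizableQueue()
    q.enqueue(1, tid="A")
    q.enqueue(2, tid="B")
    q.enqueue(3, tid="A")
    print([q.dequeue(tid="B") for _ in range(4)])
```

    Values inserted by the same thread always leave in insertion order, while values from different threads may be reordered relative to each other. Returning `None` after a single scan is a simplification; a careful emptiness check needs more work than shown here.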

    A generic operational memory model specification framework for multithreaded program verification

    Technical report. Given the complicated nature of modern architectural and language level memory model designs, it is vital to have a systematic approach for specifying memory consistency requirements that can support verification and promote understanding. In this paper, we develop a specification methodology that defines a memory model operationally using a generic transition system with integrated model checking capability to enable formal reasoning about program correctness in a multithreaded environment. Based on a simple abstract machine, our system can be configured to define a variety of consistency models in a uniform notation. We then apply this framework as a taxonomy to formalize several well known memory models. We also provide an alternative specification for the Java memory model based on a proposal from Manson and Pugh and demonstrate how to conduct computer aided analysis for Java thread semantics. Finally, we compare this operational approach with axiomatic approaches and discuss a method to convert a memory model definition from one style to the other

    Consistency in scalable systems

    About the efficiency of partial replication to implement Distributed Shared Memory

    The Distributed Shared Memory (DSM) abstraction is traditionally realized through a distributed memory consistency system (MCS) on top of a message passing system. In this paper we analyze the impossibility of an efficient partial replication implementation of causally consistent DSM. Efficiency is discussed in terms of the control information that processes have to propagate to maintain consistency. We introduce the notions of share graph and hoop to model variable distribution, and the concept of dependency chain to characterize processes that have to manage information about a variable even though they do not read or write that variable. Then, we weaken causal consistency to try to define new consistency criteria weak enough to allow efficient partial replication implementations and strong enough to solve interesting problems. Finally, we prove that PRAM is such a criterion, and illustrate its power with the Bellman-Ford shortest path algorithm
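
    To make the last point concrete, the sketch below simulates a Bellman-Ford computation in which each process owns one distance variable, keeps replicas only of its predecessors' variables (partial replication), and receives each writer's updates over a dedicated FIFO channel, so that writes from a single process are applied in the order they were issued, which is the guarantee PRAM provides. This is a single-threaded simulation written for illustration, not the protocol from the report; `pram_bellman_ford`, `replica`, and `channel` are hypothetical names, and the graph is assumed to have no negative cycles.

```python
import math
from collections import deque


def pram_bellman_ford(edges, n, source):
    """edges: list of directed (u, v, weight); returns distances from source."""
    preds = {v: [] for v in range(n)}
    succs = {u: [] for u in range(n)}
    for u, v, w in edges:
        preds[v].append((u, w))
        succs[u].append(v)

    dist = [math.inf] * n
    dist[source] = 0

    # Partial replication: process v only stores copies of its predecessors' distances.
    replica = {v: {u: math.inf for u, _ in preds[v]} for v in range(n)}
    # One FIFO channel per (writer, reader) pair keeps each writer's updates in order.
    channel = {(u, v): deque() for u, v, _ in edges}
    for v in succs[source]:
        channel[(source, v)].append(0)

    pending = True
    while pending:
        pending = False
        for (u, v), queue in channel.items():
            if not queue:
                continue
            pending = True
            replica[v][u] = queue.popleft()  # apply the oldest write from u
            best = min(replica[v][p] + w for p, w in preds[v])
            if best < dist[v]:
                dist[v] = best
                for s in succs[v]:
                    channel[(v, s)].append(best)  # propagate only to readers of v
    return dist


if __name__ == "__main__":
    graph = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
    print(pram_bellman_ford(graph, 4, 0))
```

    No process tracks information about variables it neither reads nor writes, and no cross-writer ordering metadata is exchanged; only the per-writer FIFO order is relied upon.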