438 research outputs found

    Dynamic deadlock verification for general barrier synchronisation

    Get PDF
    We present Armus, a dynamic verification tool for deadlock detection and avoidance specialised in barrier synchronisation. Barriers are used to coordinate the execution of groups of tasks, and serve as a building block of parallel computing. Our tool verifies more barrier synchronisation patterns than current state-of-the-art. To improve the scalability of verification, we introduce a novel eventbased representation of concurrency constraints, and a graph-based technique for deadlock analysis. The implementation is distributed and fault-tolerant, and can verify X10 and Java programs. To formalise the notion of barrier deadlock, we introduce a core language expressive enough to represent the three most widespread barrier synchronisation patterns: group, split-phase, and dynamic membership. We propose a graph analysis technique that selects from two alternative graph representations: the Wait-For Graph, that favours programs with more tasks than barriers; and the State Graph, optimised for programs with more barriers than tasks. We prove that finding a deadlock in either representation is equivalent, and that the verification algorithm is sound and complete with respect to the notion of deadlock in our core language. Armus is evaluated with three benchmark suites in local and distributed scenarios. The benchmarks show that graph analysis with automatic graph-representation selection can record a 7-fold execution increase versus the traditional fixed graph representation. The performance measurements for distributed deadlock detection between 64 processes show negligible overheads

    Communicating Mobile Processes

    Get PDF
    This paper presents a new model for mobile processes in occam-pi. A process, embedded anywhere in a dynamically evolving network, may suspend itself mid-execution, be safely disconnected from its local environment, moved (by communication along a channel), reconnected to a new environment and reactivated. Upon reactivation, the process resumes execution from the same state (i.e. data values and code positions) it held when it suspended. Its view of its environment is unchanged, since that is abstracted by its synchronisation (e.g. channels and barriers) interface and that remains constant. The environment behind that interface will (usually) be completely different. The mobile process itself may contain any number of levels of dynamic sub-network. This model is simpler and, in some ways, more powerful than our earlier proposal, which required a process to terminate before it could be moved. Its formal semantics and implementation, however, throw up extra challenges. We present details and performance of an initial implementation

    Life of occam-Pi

    Get PDF
    This paper considers some questions prompted by a brief review of the history of computing. Why is programming so hard? Why is concurrency considered an “advanced” subject? What’s the matter with Objects? Where did all the Maths go? In searching for answers, the paper looks at some concerns over fundamental ideas within object orientation (as represented by modern programming languages), before focussing on the concurrency model of communicating processes and its particular expression in the occam family of languages. In that focus, it looks at the history of occam, its underlying philosophy (Ockham’s Razor), its semantic foundation on Hoare’s CSP, its principles of process oriented design and its development over almost three decades into occam-? (which blends in the concurrency dynamics of Milner’s ?-calculus). Also presented will be an urgent need for rationalisation – occam-? is an experiment that has demonstrated significant results, but now needs time to be spent on careful review and implementing the conclusions of that review. Finally, the future is considered. In particular, is there a future

    Portable Inter-workgroup Barrier Synchronisation for GPUs

    Get PDF
    Despite the growing popularity of GPGPU programming, there is not yet a portable and formally-specified barrier that one can use to synchronise across workgroups. Moreover, the occupancy-bound execution model of GPUs breaks assumptions inherent in traditional software execution barriers, exposing them to deadlock. We present an occupancy discovery protocol that dynamically discovers a safe estimate of the occupancy for a given GPU and kernel, allowing for a starvation-free (and hence, deadlock-free) inter-workgroup barrier by restricting the number of workgroups according to this estimate. We implement this idea by adapting an existing, previously non-portable, GPU inter-workgroup barrier to use OpenCL 2.0 atomic operations, and prove that the barrier meets its natural specification in terms of synchronisation. We assess the portability of our approach over eight GPUs spanning four vendors, comparing the performance of our method against alternative methods. Our key findings include: (1) the recall of our discovery protocol is nearly 100%; (2) runtime comparisons vary substantially across GPUs and applications; and (3) our method provides portable and safe inter-workgroup synchronisation across the applications we study

    Safety Verification of Phaser Programs

    Full text link
    We address the problem of statically checking control state reachability (as in possibility of assertion violations, race conditions or runtime errors) and plain reachability (as in deadlock-freedom) of phaser programs. Phasers are a modern non-trivial synchronization construct that supports dynamic parallelism with runtime registration and deregistration of spawned tasks. They allow for collective and point-to-point synchronizations. For instance, phasers can enforce barriers or producer-consumer synchronization schemes among all or subsets of the running tasks. Implementations %of these recent and dynamic synchronization are found in modern languages such as X10 or Habanero Java. Phasers essentially associate phases to individual tasks and use their runtime values to restrict possible concurrent executions. Unbounded phases may result in infinite transition systems even in the case of programs only creating finite numbers of tasks and phasers. We introduce an exact gap-order based procedure that always terminates when checking control reachability for programs generating bounded numbers of coexisting tasks and phasers. We also show verifying plain reachability is undecidable even for programs generating few tasks and phasers. We then explain how to turn our procedure into a sound analysis for checking plain reachability (including deadlock freedom). We report on preliminary experiments with our open source tool

    Towards Model Checking Executable UML Specifications in mCRL2

    Get PDF
    We describe a translation of a subset of executable UML (xUML) into the process algebraic specification language mCRL2. This subset includes class diagrams with class generalisations, and state machines with signal and change events. The choice of these xUML constructs is dictated by their use in the modelling of railway interlocking systems. The long-term goal is to verify safety properties of interlockings modelled in xUML using the mCRL2 and LTSmin toolsets. Initial verification of an interlocking toy example demonstrates that the safety properties of model instances depend crucially on the run-to-completion assumptions

    Programming multicores safely : handling barrier deadlocks

    Get PDF
    Tese de doutoramento, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2015Nowadays, most produced computing devices include multicore processors. Applications that run on these devices only scale if they can compute in parallel. To this end, mainstream programming languages, like Java and C♯, adopted various parallel programming techniqffes. his thesis focffses on a parallel technique, called barrier, used for synchronisation. A barrier coordinates the execution order of parallel activities, by letting them wait for each other. Tasks using barriers are susceptible to the problem of deadlocks, where at least two activities are (indirectly) in a stalemate because of a conflicting ordering of some barriers. Deadlocks are a class of concurrency failures with a big impact in parallel programs. To help make parallel programming more productive, we propose two complementary techniques that handle deadlocks caused by barriers: a runtime verification tool, and a deadlock-free programming model. We present Armus, a runtime verification tool specialised in barrier deadlocks that is distributed, fault-tolerant, and verifies X10 and Java programs. Our technique verifies more barrier synchronisation patterns than existing state-of-the-art techniques. We improve deadlock verification based on graph analysis: our technique selects from two alternative graph representations of concurrency dependencies to hasten deadlock checking. Armus is evaluated with three benchmark suites in local and distributed scenarios. To handle barrier deadlocks at design time we propose a language called SBrenner that extends and formalises a programming model that originates from the Habanero-Java and the X10 languages. The outcome is a deadlock-free programming model that leverages pipeline parallelism. We present an operational semantics and a type system for SBrenner. Offr type system enjoys the properties of progress and subject reduction.Actualmente, a generalidade dos dispositivos de computação inclui um processador multicore. As aplicações que correm em processadores muIticore só aumentam o seu desempenho se computarem em paralelo, aproveitando assim o poder computacional dos núcleos disponíveis. Para este efeito, as linguagens de programação mais populares, tal como Java e C♯, adoptaram, nos últimos anos, várias técnicas de programação paralela. Esta tese lida com uma classe de falhas que origina da utilização de uma técnica de programação paralela, chamada barreira, cuja funcionalidade é a de sincronizar grupos de tarefas. Uma barreira coordena a ordem de execução de um grupo de tarefas, disponibilizando um ponto de execução em que as várias tarefas dum grupo podem esperar umas pelas outras. As tarefas que usam barreiras são vulneráveis ao problema de impasse, em que pelo menos duas tarefas estão (indirectamente) à espera uma da outra em barreiras diferentes sem que qualquer uma das tarefas possa avançar. Os impasses constituem uma classe de falhas, da área de concorrência, com grande impacto em programas paralelos. O nosso objectivo é aumentar a produtividade da programação paralela tratando do problema de impasses em barreiras. Nesta tese propomos duas técnicas complementares para lidar com o problema de impasses: uma ferramenta de verificação especializada em impasses sobre barreiras que é distribuída, tolerante a falhas e verifica aplicações X10 e Java; um modelo de programação isento de impasses

    Formalization of Phase Ordering

    Full text link
    Phasers pose an interesting synchronization mechanism that generalizes many collective synchronization patterns seen in parallel programming languages, including barriers, clocks, and point-to-point synchronization using latches or semaphores. This work characterizes scheduling constraints on phaser operations, by relating the execution state of two tasks that operate on the same phaser. We propose a formalization of Habanero phasers, May-Happen-In-Parallel, and Happens-Before relations for phaser operations, and show that these relations conform with the semantics. Our formalization and proofs are fully mechanized using the Coq proof assistant, and are available online.Comment: In Proceedings PLACES 2016, arXiv:1606.0540

    Automatically Generating CSP Models for Communicating Haskell Processes

    Get PDF
    Tools such as FDR can check whether a CSP model of an implementation is a refinement of a given CSP specification. We present a technique for generating such CSP models of Haskell implementations that use the Communicating Haskell Processes library. Our technique avoids the need for a detailed semantics of the Haskell language, and requires only minimal program annotation. The generated CSP-M model can be checked for deadlock or refinements by FDR, allowing easy use of formal methods without the need to maintain a model of the program implementation alongside the program itself
    corecore