4,060 research outputs found

    Behavioural types for non-uniform memory accesses

    Full text link
    Concurrent programs executing on NUMA architectures consist of concurrent entities (e.g. threads, actors) and data placed on different nodes. Execution of these concurrent entities often reads or updates states from remote nodes. The performance of such systems depends on the extent to which the concurrent entities can be executing in parallel, and on the amount of the remote reads and writes. We consider an actor-based object oriented language, and propose a type system which expresses the topology of the program (the placement of the actors and data on the nodes), and an effect system which characterises remote reads and writes (in terms of which node reads/writes from which other nodes). We use a variant of ownership types for the topology, and a combination of behavioural and ownership types for the effect system.Comment: In Proceedings PLACES 2015, arXiv:1602.0325

    Static Race Detection and Mutex Safety and Liveness for Go Programs

    Get PDF
    Go is a popular concurrent programming language thanks to its ability to efficiently combine concurrency and systems programming. In Go programs, a number of concurrency bugs can be caused by a mixture of data races and communication problems. In this paper, we develop a theory based on behavioural types to statically detect data races and deadlocks in Go programs. We first specify lock safety/liveness and data race properties over a Go program model, using the happens-before relation defined in the Go memory model. We represent these properties of programs in a μ-calculus model of types, and validate them using type-level model-checking. We then extend the framework to account for Go’s channels, and implement a static verification tool which can detect concurrency errors. This is, to the best of our knowledge, the first static verification framework of this kind for the Go language, uniformly analysing concurrency errors caused by a mix of shared memory accesses and asynchronous message-passing communications

    Optimization of Discrete-parameter Multiprocessor Systems using a Novel Ergodic Interpolation Technique

    Full text link
    Modern multi-core systems have a large number of design parameters, most of which are discrete-valued, and this number is likely to keep increasing as chip complexity rises. Further, the accurate evaluation of a potential design choice is computationally expensive because it requires detailed cycle-accurate system simulation. If the discrete parameter space can be embedded into a larger continuous parameter space, then continuous space techniques can, in principle, be applied to the system optimization problem. Such continuous space techniques often scale well with the number of parameters. We propose a novel technique for embedding the discrete parameter space into an extended continuous space so that continuous space techniques can be applied to the embedded problem using cycle accurate simulation for evaluating the objective function. This embedding is implemented using simulation-based ergodic interpolation, which, unlike spatial interpolation, produces the interpolated value within a single simulation run irrespective of the number of parameters. We have implemented this interpolation scheme in a cycle-based system simulator. In a characterization study, we observe that the interpolated performance curves are continuous, piece-wise smooth, and have low statistical error. We use the ergodic interpolation-based approach to solve a large multi-core design optimization problem with 31 design parameters. Our results indicate that continuous space optimization using ergodic interpolation-based embedding can be a viable approach for large multi-core design optimization problems.Comment: A short version of this paper will be published in the proceedings of IEEE MASCOTS 2015 conferenc

    Using program behaviour to exploit heterogeneous multi-core processors

    Get PDF
    Multi-core CPU architectures have become prevalent in recent years. A number of multi-core CPUs consist of not only multiple processing cores, but multiple different types of processing cores, each with different capabilities and specialisations. These heterogeneous multi-core architectures (HMAs) can deliver exceptional performance; however, they are notoriously difficult to program effectively. This dissertation investigates the feasibility of ameliorating many of the difficulties encountered in application development on HMA processors, by employing a behaviour aware runtime system. This runtime system provides applications with the illusion of executing on a homogeneous architecture, by presenting a homogeneous virtual machine interface. The runtime system uses knowledge of a program's execution behaviour, gained through explicit code annotations, static analysis or runtime monitoring, to inform its resource allocation and scheduling decisions, such that the application makes best use of the HMA's heterogeneous processing cores. The goal of this runtime system is to enable non-specialist application developers to write applications that can exploit an HMA, without the developer requiring in-depth knowledge of the HMA's design. This dissertation describes the development of a Java runtime system, called Hera-JVM, aimed at investigating this premise. Hera-JVM supports the execution of unmodified Java applications on both processing core types of the heterogeneous IBM Cell processor. An application's threads of execution can be transparently migrated between the Cell's different core types by Hera-JVM, without requiring the application's involvement. A number of real-world Java benchmarks are executed across both of the Cell's core types, to evaluate the efficacy of abstracting a heterogeneous architecture behind a homogeneous virtual machine. By characterising the performance of each of the Cell processor's core types under different program behaviours, a set of influential program behaviour characteristics is uncovered. A set of code annotations are presented, which enable program code to be tagged with these behaviour characteristics, enabling a runtime system to track a program's behaviour throughout its execution. This information is fed into a cost function, which Hera-JVM uses to automatically estimate whether the executing program's threads of execution would benefit from being migrated to a different core type, given their current behaviour characteristics. The use of history, hysteresis and trend tracking, by this cost function, is explored as a means of increasing its stability and limiting detrimental thread migrations. The effectiveness of a number of different migration strategies is also investigated under real-world Java benchmarks, with the most effective found to be a strategy that can target code, such that a thread is migrated whenever it executes this code. This dissertation also investigates the use of runtime monitoring to enable a runtime system to automatically infer a program's behaviour characteristics, without the need for explicit code annotations. A lightweight runtime behaviour monitoring system is developed, and its effectiveness at choosing the most appropriate core type on which to execute a set of real-world Java benchmarks is examined. Combining explicit behaviour characteristic annotations with those characteristics which are monitored at runtime is also explored. Finally, an initial investigation is performed into the use of behaviour characteristics to improve application performance under a different type of heterogeneous architecture, specifically, a non-uniform memory access (NUMA) architecture. Thread teams are proposed as a method of automatically clustering communicating threads onto the same NUMA node, thereby reducing data access overheads. Evaluation of this approach shows that it is effective at improving application performance, if the application's threads can be partitioned across the available NUMA nodes of a system. The findings of this work demonstrate that a runtime system with a homogeneous virtual machine interface can reduce the challenge of application development for HMA processors, whilst still being able to exploit such a processor by taking program behaviour into account

    The Scythe Statistical Library: An Open Source C++ Library for Statistical Computation

    Get PDF
    The Scythe Statistical Library is an open source C++ library for statistical computation. It includes a suite of matrix manipulation functions, a suite of pseudo-random number generators, and a suite of numerical optimization routines. Programs written using Scythe are generally much faster than those written in commonly used interpreted languages, such as R and \proglang{MATLAB}; and can be compiled on any system with the GNU GCC compiler (and perhaps with other C++ compilers). One of the primary design goals of the Scythe developers has been ease of use for non-expert C++ programmers. Ease of use is provided through three primary mechanisms: (1) operator and function over-loading, (2) numerous pre-fabricated utility functions, and (3) clear documentation and example programs. Additionally, Scythe is quite flexible and entirely extensible because the source code is available to all users under the GNU General Public License.

    Simulation models of shared-memory multiprocessor systems

    Get PDF
    • …
    corecore