321 research outputs found

    High Level Concurrency in C∀

    Get PDF
    Concurrent programs are notoriously hard to write and even harder to debug. Furthermore, concurrent programs must be performant, as concurrency is often introduced into a program to achieve some form of speedup. This thesis presents a suite of high-level concurrent-language features in the new programming language C∀, all of which are implemented with the aim of improving the performance, productivity, and safety of concurrent programs. C∀ is a non-object-oriented programming language that extends C. The foundation for concurrency in C∀ was laid by Thierry Delisle [15], who implemented coroutines, user-level threads, and monitors. This thesis builds upon that work and introduces a suite of new concurrent features as its main contribution. The features include a mutex statement (similar to a C++ scoped lock or Java synchronized statement), Go-like channels and a select statement, and an actor system. The root ideas behind these features are not new, but the C∀ implementations extend the original ideas in performance, productivity, and safety.
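    The mutex statement's analogy to a C++ scoped lock can be made concrete with a short sketch. The example below is plain C++17 rather than C∀, and the Account type and transfer function are illustrative assumptions, not code from the thesis; it shows the block-scoped acquire-and-release discipline that the mutex statement is compared to.

```cpp
// A minimal C++ sketch of the scoped locking the abstract compares the
// C-for-all mutex statement to; the Account type and amounts are illustrative.
#include <mutex>

struct Account {
    std::mutex m;
    double balance = 0.0;
};

void transfer(Account& from, Account& to, double amount) {
    // std::scoped_lock acquires both mutexes deadlock-free and releases
    // them automatically when the block ends, much like a block-scoped
    // mutex statement or a Java synchronized block.
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance   += amount;
}
```

    Because the locks are released automatically at the end of the scope, early returns and exceptions cannot leak a held mutex, which is the kind of safety benefit a block-structured mutex construct targets.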

    Efficiently and Transparently Maintaining High SIMD Occupancy in the Presence of Wavefront Irregularity

    Get PDF
    Demand is increasing for high-throughput processing of irregular streaming applications; examples of such applications from scientific and engineering domains include biological sequence alignment, network packet filtering, automated face detection, and big graph algorithms. With wide SIMD, lightweight threads, and low-cost thread-context switching, wide-SIMD architectures such as GPUs allow considerable flexibility in the way application work is assigned to threads. However, irregular applications are challenging to map efficiently onto wide SIMD because data-dependent filtering or replication of items creates an unpredictable data wavefront of items ready for further processing. Straightforward implementations of irregular applications on a wide-SIMD architecture are prone to load imbalance and reduced occupancy, while more sophisticated implementations require advanced use of parallel GPU operations to redistribute work efficiently among threads. This dissertation presents strategies for addressing the performance challenges of wavefront-irregular applications on wide-SIMD architectures. These strategies are embodied in a developer framework called Mercator that (1) allows developers to map irregular applications onto GPUs according to the streaming paradigm while abstracting from low-level data movement and (2) includes generalized techniques for transparently overcoming the obstacles to high throughput presented by wavefront-irregular applications on a GPU. Mercator forms the centerpiece of this dissertation, and we present its motivation, performance model, implementation, and extensions in this work.
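    The notion of an unpredictable data wavefront can be illustrated with a small host-side sketch. The example below is ordinary CPU-side C++, not Mercator or GPU code, and the item type and filtering predicate are illustrative; the point is only that the number of surviving items is data-dependent, so a fixed one-input-per-lane SIMD mapping leaves lanes idle after the filter.

```cpp
// CPU-side sketch of a data-dependent filter stage: how many items survive
// (the "wavefront" ready for the next stage) is unknown until runtime, which
// is what makes a naive one-thread-per-input SIMD mapping lose occupancy.
// The item type and predicate are illustrative, not part of Mercator.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
    std::vector<int> input = {7, 2, 9, 4, 11, 3, 8};
    std::vector<int> wavefront;

    // Data-dependent filter: only items passing the predicate move on.
    std::copy_if(input.begin(), input.end(), std::back_inserter(wavefront),
                 [](int x) { return x > 5; });

    std::cout << "items ready for next stage: " << wavefront.size() << "\n";
}
```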

    Finding and Tolerating Concurrency Bugs.

    Full text link
    Shared-memory multi-threaded programming is inherently more difficult than single-threaded programming. The main source of complexity is that the threads of an application can interleave in so many different ways. To ensure correctness, a programmer has to test all possible thread interleavings, which, however, is impractical. Many rare thread interleavings remain untested in production systems, and they are the major cause of a majority of concurrency bugs. Given that untested interleavings cause most concurrency bugs, this dissertation explores two possible ways to tackle them. One is to expose untested interleavings during testing to find concurrency bugs. The other is to avoid untested interleavings during production runs to tolerate concurrency bugs. The key is an efficient and effective way to encode and remember tested interleavings. This dissertation first discusses two hypotheses about concurrency bugs: the small scope hypothesis and the value independent hypothesis. Based on these two hypotheses, this dissertation defines a set of interleaving patterns, called interleaving idioms, which are used to encode tested interleavings. The empirical analysis shows that the idiom-based interleaving encoding scheme is able to represent most of the concurrency bugs used in the study. Then, this dissertation discusses an open source testing tool called Maple. It memoizes tested interleavings and actively seeks to expose untested interleavings. The results show that Maple is able to expose concurrency bugs and expose interleavings faster than other conventional testing techniques. Finally, this dissertation discusses two parallel runtime system designs which seek to avoid untested interleavings during production runs to tolerate concurrency bugs. Avoiding untested interleavings significantly improves correctness because most concurrency bugs are caused by untested interleavings. Also, the performance overhead for disallowing untested interleavings is low, as commonly occurring interleavings should have been tested in a well-tested program.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/99765/1/jieyu_1.pd
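    A minimal sketch of the kind of interleaving-dependent bug the dissertation targets is shown below; it is an illustrative example, not code from the study. The unsynchronized read-modify-write can lose updates, but most runs happen to interleave benignly, so the defect can survive testing if the bad interleaving is never exercised.

```cpp
// Minimal illustration (not from the dissertation) of a concurrency bug that
// only appears under particular interleavings: the unsynchronized increment
// on `counter` can lose updates, yet many test runs happen to interleave
// benignly and print the expected 200000.
#include <iostream>
#include <thread>

int counter = 0;                     // shared, intentionally unprotected

void work() {
    for (int i = 0; i < 100000; ++i)
        ++counter;                   // load, add, store: not atomic
}

int main() {
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    std::cout << counter << "\n";    // often 200000, sometimes less
}
```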

    Scheduling of flexible manufacturing systems integrating petri nets and artificial intelligence methods.

    Get PDF
    The work undertaken in this thesis is about the integration of two well-known methodologies: Petri net (PN) modelling/analysis of industrial production processes and Artificial Intelligence (AI) optimisation search techniques. The objective of this integration is to demonstrate its potential in solving a difficult and widely studied problem, the scheduling of Flexible Manufacturing Systems (FMS). This work builds on existing results that clearly show the convenience of PNs as a modelling tool for FMS. It addresses the problem of the integration of PN and AI-based search methods. Whilst this is recognised as a potentially important approach to the scheduling of FMS, there is a lack of any clear evidence that practical systems might be built. This thesis presents a novel scheduling methodology that takes forward the current state of the art in the area. Firstly, it presents a novel modelling procedure based on a new class of PN (cb-NETS) and a language to define the essential features of basic FMS, demonstrating that the inclusion of high-level FMS constraints is straightforward. Secondly, it demonstrates that PN analysis is useful in reducing search complexity and presents two main results: a novel heuristic function based on PN analysis that is more efficient than existing methods and a novel reachability scheme that avoids futile exploration of candidate schedules. Thirdly, it presents a novel scheduling algorithm that overcomes the efficiency drawbacks of previous algorithms; this algorithm satisfactorily overcomes the complexity issue while achieving very promising results in terms of optimality. Finally, this thesis presents a novel hybrid scheduler that demonstrates the convenience of using PNs as a representation paradigm to support hybridisation between traditional OR methods, AI systematic search, and stochastic optimisation algorithms. Initial results show that the approach is promising.
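    The role of a PN as a scheduling model can be sketched in a few lines of code. The example below is an illustrative, generic place/transition net in C++, not the thesis's cb-NETS class or its modelling language; it only shows how a marking encodes resource availability and how firing a transition models allocating a shared machine to a waiting job.

```cpp
// Minimal Petri-net sketch (illustrative, not the thesis's cb-NETS class):
// places hold tokens, a transition is enabled only if every input place has
// a token, and firing moves tokens from inputs to outputs. A shared machine
// is a place whose token is consumed while a job is running.
#include <iostream>
#include <map>
#include <string>
#include <vector>

using Marking = std::map<std::string, int>;

struct Transition {
    std::string name;
    std::vector<std::string> inputs;   // places that must each hold a token
    std::vector<std::string> outputs;  // places that receive a token on firing
};

bool enabled(const Transition& t, const Marking& m) {
    for (const auto& p : t.inputs)
        if (m.at(p) < 1) return false;
    return true;
}

void fire(const Transition& t, Marking& m) {
    for (const auto& p : t.inputs)  --m[p];
    for (const auto& p : t.outputs) ++m[p];
}

int main() {
    Marking m = {{"job_waiting", 2}, {"machine_free", 1},
                 {"job_running", 0}, {"job_done", 0}};
    Transition start = {"start_job",
                        {"job_waiting", "machine_free"},
                        {"job_running"}};

    if (enabled(start, m)) fire(start, m);
    std::cout << "running=" << m["job_running"]
              << " machine_free=" << m["machine_free"] << "\n";
}
```

    A scheduler built on such a model searches over sequences of transition firings; the heuristic and reachability results described above prune that search.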

    Intelligent Management of Inter-Thread Synchronization Dependencies for Concurrent Programs.

    Full text link
    Power dissipation limits and design complexity have made the microprocessor industry less successful in improving the performance of monolithic processors, even though semiconductor technology continues to scale. Consequently, chip multiprocessors (CMPs) have become a standard for all ranges of computing, from cellular phones to high-performance servers. As sufficient thread-level parallelism (TLP) is necessary to exploit the computational power provided by CMPs, most performance-aware programmers need to parallelize their programs. For shared-memory multi-threaded programs, synchronization mechanisms such as mutexes, barriers, and condition variables are used to make threads interact with each other in the way the programmers intended. However, employing synchronization operations in a way that is both correct and efficient is extremely difficult, and there have been trade-offs between programmability and efficiency of using synchronizations. This thesis proposes a collection of techniques that increase the programmability and efficiency of concurrent programs by intelligently managing synchronization operations. First, we focus on mutex locks and unlocks. Many concurrency bug detection tools and automated bug fixers rely on the precise identification of critical sections guarded by lock/unlock operations. We suggest a practical lock/unlock pairing mechanism that combines static analysis with dynamic instrumentation to identify critical sections in POSIX multi-threaded C/C++ programs. Second, we present Dynamic Core Boosting (DCB) to accelerate critical paths in multi-threaded programs. Inter-thread dependencies through synchronizations form critical paths. These critical paths are major performance bottlenecks for concurrent programs, and they are exacerbated by workload imbalances in performance-asymmetric CMPs. DCB coordinates its compiler, runtime subsystem, and architecture to mitigate such performance bottlenecks. Finally, we propose exploiting synchronization operations for better energy efficiency through dynamic power management.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108886/1/netforce_1.pd
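    Why lock/unlock pairing needs more than local static analysis can be seen from a small illustrative example; the names and structure below are ours, not the thesis's tool. When the acquire and the release are wrapped in separate functions, neither function body reveals the extent of the critical section on its own, which is why the abstract's mechanism combines static analysis with dynamic instrumentation.

```cpp
// Illustrative sketch (names are ours, not the thesis's tool) of why pairing
// a lock with its unlock can require whole-program or dynamic information:
// the acquire and the release live in different functions, so the critical
// section guarding `queue_size` is not visible from either function alone.
#include <pthread.h>

pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
int queue_size = 0;

void begin_update() { pthread_mutex_lock(&q_lock); }
void end_update()   { pthread_mutex_unlock(&q_lock); }

void producer() {
    begin_update();      // critical section starts here...
    ++queue_size;
    end_update();        // ...and ends in a different function.
}

int main() { producer(); return 0; }
```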

    On the Performance Estimation and Resource Optimisation in Process Petri Nets

    Get PDF
    Many artificial systems can be modeled as discrete dynamic systems in which resources are shared among different tasks. The performance of such systems, which is usually a system requirement, heavily relies on the number and distribution of such resources. The goal of this paper is twofold: first, to design a technique to estimate the steady-state performance of a given system with shared resources, and second, to propose a heuristic strategy to distribute shared resources so that the system performance is enhanced as much as possible. The systems under consideration are assumed to be large systems, such as service-oriented architecture (SOA) systems, and modeled by a particular class of Petri nets (PNs) called process PNs. In order to avoid the state explosion problem inherent to discrete models, the proposed techniques make intensive use of linear programming (LP) problems

    Deadlock Avoidance in Automated Manufacturing Systems

    Get PDF

    Cellular Automata for Pattern Recognition

    Get PDF

    QUEST/Ada (Query Utility Environment for Software Testing of Ada): The development of a program analysis environment for Ada, task 1, phase 2

    Get PDF
    The results of research and development efforts are described for Task one, Phase two of a general project entitled The Development of a Program Analysis Environment for Ada. The scope of this task includes the design and development of a prototype system for testing Ada software modules at the unit level. The system is called Query Utility Environment for Software Testing of Ada (QUEST/Ada). The prototype for condition coverage provides a platform that implements expert system interaction with program testing. The expert system can modify data in the instrumented source code in order to achieve coverage goals. Given this initial prototype, it is possible to evaluate the rule base in order to develop improved rules for test case generation. The goals of Phase two are the following: (1) to continue to develop and improve the current user interface to support the other goals of this research effort (i.e., those related to improved testing efficiency and increased code reliability); (2) to develop and empirically evaluate a succession of alternative rule bases for the test case generator such that the expert system achieves coverage in a more efficient manner; and (3) to extend the concepts of the current test environment to address the issues of Ada concurrency.
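    The idea of rule-driven test case generation for condition coverage can be sketched as a small search loop. The example below is a hedged illustration in C++ rather than Ada, and the unit under test, the mutation rule, and the coverage flags are assumptions of ours, not part of QUEST/Ada; it only shows the feedback loop of modifying test data until an instrumented condition has been exercised both ways.

```cpp
// Hedged illustration of coverage-driven test data generation, not the
// QUEST/Ada implementation (which instruments Ada and uses an expert system):
// mutate the input until both outcomes of the instrumented condition are seen.
#include <cstdio>
#include <cstdlib>

bool took_true = false, took_false = false;

// Unit under test with one instrumented condition.
int classify(int x) {
    if (x > 100) { took_true = true;  return 1; }
    else         { took_false = true; return 0; }
}

int main() {
    int input = 0;
    // Naive "rule": keep perturbing the input until both branches are covered.
    for (int attempt = 0; attempt < 1000 && !(took_true && took_false); ++attempt) {
        classify(input);
        input = std::rand() % 400;   // mutate the test datum
    }
    std::printf("condition covered both ways: %s\n",
                (took_true && took_false) ? "yes" : "no");
}
```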