10 research outputs found
Safe and Verifiable Design of Concurrent Java Programs
The design of concurrent programs has a reputation for being difficult, and thus potentially dangerous, in safety-critical real-time and embedded systems. Java, whilst it cleans up many insecure aspects of OO programming endemic in C++, suffers from a deceptively simple threads model that is an insecure variant of ideas over 25 years old [1]. Consequently, we cannot directly exploit a range of new CASE tools -- based upon modern developments in parallel computing theory -- that can verify and check the design of concurrent systems for dangers such as deadlock and livelock, which otherwise plague us during testing and maintenance and, more seriously, cause catastrophic failure in service.
Our approach uses recently developed Java class libraries based on Hoare's Communicating Sequential Processes (CSP). The use of CSP greatly simplifies the design of concurrent systems and, in many cases, a parallel approach significantly simplifies systems originally conceived sequentially. New CSP CASE tools permit designs to be verified against formal specifications and checked for deadlock and livelock. Below we introduce CSP and its implementation in Java and develop a small concurrent application. The formal CSP description of the application is provided, as well as that of an equivalent sequential version. FDR is used to verify the correctness of both implementations, their equivalence, and their freedom from deadlock and livelock.
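The CSP style described above builds systems from processes that share no state and interact only by channel communication. The abstract's actual setting is Java class libraries (and FDR for verification); as a language-neutral illustration only, the following Python sketch models three CSP-style processes wired into a pipeline, with bounded queues standing in for channels. All names here (`numbers`, `doubler`, `collect`) are invented for the example, and a bounded queue is a buffered approximation of a true CSP rendezvous.

```python
import threading
import queue

# Illustrative sketch only: CSP processes keep no shared mutable state and
# interact solely over channels. A bounded queue stands in for a (buffered)
# channel; real CSP libraries for Java provide true rendezvous semantics.

def numbers(out, n):
    """Producer process: writes 0..n-1 to its output channel."""
    for i in range(n):
        out.put(i)
    out.put(None)  # end-of-stream signal

def doubler(inp, out):
    """Filter process: doubles each value read from its input channel."""
    while (x := inp.get()) is not None:
        out.put(2 * x)
    out.put(None)

def collect(inp, results):
    """Consumer process: gathers values into a list."""
    while (x := inp.get()) is not None:
        results.append(x)

a, b = queue.Queue(maxsize=1), queue.Queue(maxsize=1)
results = []
procs = [threading.Thread(target=numbers, args=(a, 5)),
         threading.Thread(target=doubler, args=(a, b)),
         threading.Thread(target=collect, args=(b, results))]
for p in procs:
    p.start()
for p in procs:
    p.join()
print(results)  # [0, 2, 4, 6, 8]
```

Because each process owns its loop state and the only interaction points are channel reads and writes, the network's behaviour is exactly what a CSP description of the pipeline would specify, which is what makes such designs amenable to mechanical checking.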
A Design Strategy for Deadlock-Free Concurrent Systems
When building concurrent systems, it would be useful to have a collection of reusable processes
to perform standard tasks. However, without knowing certain details of the inner workings of
these components, one can never be sure that they will not cause deadlock when connected to
some particular network.
Here we describe a hierarchical method for designing complex networks of communicating processes which are deadlock-free. We use this to define a safe and simple method for specifying the communication interface to third-party software components. This work is presented using the CSP model of concurrency and the occam2.1 programming language.
Deadlock-freeness of hexagonal systolic arrays
With the re-emergence of parallel computation for technical applications, the classical concept of systolic arrays is becoming important again. However, for the sake of their operational safety, the question of deadlock must be addressed. For this contribution we used the well-known Roscoe-Dathi method to demonstrate the deadlock-freeness of a systolic array with hexagonal connectivity. Our result implies that it is theoretically safe to deploy such arrays on various platforms. Our proof is valid for all cases in which the computational pattern (input-output behaviour) of the array does not depend on the particular values (contents) of the communicated data.
Deadlock checking by a behavioral effect system for lock handling
Deadlocks are a common error in programs with lock-based concurrency and are hard to avoid or even to detect. One way to prevent deadlock is to statically analyze the program code to spot sources of potential deadlocks. Static approaches often try to confirm that the lock-taking adheres to a given order, or, better, to infer that such an order exists. Such an order precludes situations of cyclic waiting for each other's resources, which constitute a deadlock.
In contrast, we do not enforce or infer an explicit order on locks. Instead we use a behavioral type and effect system that, in a first stage, checks the behavior of each thread or process against its declared behavior, which captures the potential interaction of the thread with the locks. In a second step, on the global level, the state space of the behavior is explored to detect potential deadlocks. We define a notion of deadlock-sensitive simulation to prove the soundness of the abstraction inherent in the behavioral description. Soundness of the effect system is proven by subject reduction, formulated such that it captures deadlock-sensitive simulation.
To render the state space finite, we show two further abstractions of the behavior to be sound: restricting the upper bound on re-entrant lock counters, and similarly abstracting the (in general context-free) behavioral effect into a coarser, tail-recursive description. We prove our analysis sound using a simple, concurrent calculus with re-entrant locks.
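The lock-ordering discipline that this abstract contrasts itself with is worth seeing concretely. The sketch below (not the paper's effect system, and with an invented rank table) shows the classical idea: if every thread acquires locks according to a single global order, no cycle of threads waiting for each other's resources can form, even when callers name the locks in different orders.

```python
import threading

# Sketch of the classical global lock-ordering discipline (the baseline the
# abstract contrasts with, not its behavioral effect system). ORDER is a
# hypothetical global rank assigned to every lock in the program.

lock_a, lock_b = threading.Lock(), threading.Lock()
ORDER = {id(lock_a): 0, id(lock_b): 1}

def acquire_in_order(*locks):
    """Acquire locks sorted by global rank, precluding cyclic waits."""
    for lk in sorted(locks, key=lambda l: ORDER[id(l)]):
        lk.acquire()

def release_all(*locks):
    for lk in locks:
        lk.release()

counter = 0

def worker(first, second):
    global counter
    for _ in range(1000):
        acquire_in_order(first, second)  # rank order is the same in both threads
        counter += 1
        release_all(first, second)

# One thread names the locks (a, b), the other (b, a); naive nested
# acquisition in those orders could deadlock, but the ranking serialises
# every acquisition sequence safely.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))
t1.start(); t2.start(); t1.join(); t2.join()
print(counter)  # 2000
```

The paper's point is that such an order need not be enforced or even exist: its effect system instead abstracts each thread's lock behavior and explores the abstracted state space for deadlocks directly.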
A Pattern-based deadlock-freedom analysis strategy for concurrent systems
Local analysis has long been recognised as an effective tool to combat the state-space explosion problem. In this work, we propose a method that systematises the use of local analysis in the verification of deadlock freedom for concurrent and distributed systems. It combines a strategy for system decomposition with the verification of the decomposed subsystems via adherence to behavioural patterns. At the core of our work are a number of CSP refinement expressions that allow the user of our method to automatically verify all the behavioural restrictions that we impose. We also propose a prototype tool to support our method. Finally, we demonstrate the practical impact our method can have by analysing how it fares when applied to some examples.
Transputer Implementation for the Shell Model and Sd Shell Calculations
This thesis consists of two parts. The first part discusses a new shell model implementation based on communicating sequential processes. The second part contains different shell model calculations, which have been done using an earlier implementation. Sequential processing computers appear to be fast reaching their upper limits of efficiency. Presently they can perform one machine operation in every clock cycle, and silicon technology also seems to have reached its physical limits of miniaturization. Hence new software/hardware approaches should be investigated in order to meet growing computational requirements. Parallel processing has been demonstrated to be one alternative for achieving this objective, but the major problem with this approach is that many algorithms used for the solution of physical problems are not suitable for distribution over a number of processors. In part one of this work we have identified the concurrency in shell model calculations and implemented it on the Meiko Computing Surface. Firstly we explain the motivation for this project and then give a detailed comparison of the different hardware/software that has been available to us and the reasons for our preferred choice. Similarly, we outline the advantages/disadvantages of the available parallel/sequential languages before choosing parallel C as our language of implementation. We describe our new serial implementation DASS, the Dynamic And Structured Shell model, which forms the basis for the parallel version. We have developed a new algorithm for the phase calculation of Slater determinants, which is superior to the previously used occupancy representation method. Both our serial and parallel implementations have adopted this representation. PARALLEL GLASNAST, as we call it (PARALLEL GLASgow Nuclear Algorithmic Technique), is our complete implementation of the inherent parallelism in shell model calculations and is described in detail.
It is based on splitting the whole calculation into three tasks, which can be distributed over the number of processors required by the chosen topology and executed concurrently. We also give a detailed discussion of the communication/synchronization protocols which preserve the available concurrency. We have achieved a complete overlap of the main tasks, one responsible for arithmetically intensive operations and the other searching among, possibly, millions of states. This demonstrates that the implementation of these tasks has enough built-in flexibility that they could be run on any number of processors. Execution times for one and three transputers have been obtained for 28Si, and these are fairly good. We have also undertaken a detailed analysis of how the amount of communication (traffic) between processors changes with the increase in the number of states.
Part two describes shell model calculations for mass-21 nuclei. Many previous calculations have not taken into account the Coulomb interaction, which is responsible for differences between mirror nuclei. They also do not use the valuable information on nucleon occupancies. We have made extensive calculations for the six isobars in mass 21 using the CWC, PW and USD interactions. The results obtained include energy, spin, isospin and electromagnetic transition rates. These results are discussed and conclusions drawn. We concentrate on the comparison of the properties of each mirror pair. This comparison is supplemented by tables, energy level diagrams and occupancy diagrams. As we consider each mirror pair individually, the mixing of states, which is caused by the short-range nuclear force and the Coulomb force, becomes more evident. We have also noticed that some pairs of states swap their places, between a mirror pair, on the occupancy diagram, suggesting that their wave functions might have been swapped.
We have undertaken a detailed study to discover any swapping states. The tests applied to confirm this include comparison of the energies, electromagnetic properties and occupancy information obtained with different interactions. We find that only the 91, 92 states in Al have swapped over. We also report some real energy gaps which exist on the basis of our calculations for Al.
Cherub: A hardware distributed single shared address space memory architecture
Increased computer throughput can be achieved through the use of parallel processing. The granularity of a parallel program is the average number of instructions performed by the tasks constituting it. Coarse-grained programs typically execute huge numbers of instructions per task (≈10^5). The tasks in fine-grained programs are typically short (≈10^3). In general, the finer the program grain, the greater the potential for exploiting parallelism. Amdahl's Law shows that, in the absence of overheads, the more potential parallelism that is realised in an algorithm, the faster it will be. The economical granularity of tasks is determined by the inter-task communications overhead. Break-even occurs when processing is approximately equally divided between useful work and overhead.
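The appeal to Amdahl's Law above can be made quantitative. With parallel fraction p of a program run on n processors, and no overheads, the speedup is S(n) = 1 / ((1 - p) + p/n). The figures below are a worked illustration chosen for this sketch, not values from the thesis:

```python
# Worked illustration of Amdahl's Law as invoked above (example values are
# ours, not the thesis's): speedup on n processors with parallel fraction p.

def amdahl_speedup(p, n):
    """S(n) = 1 / ((1 - p) + p / n), assuming zero overheads."""
    return 1.0 / ((1.0 - p) + p / n)

# The more parallelism realised (larger p), the faster the program runs,
# but the serial fraction (1 - p) caps the speedup at 1 / (1 - p):
print(round(amdahl_speedup(0.50, 200), 2))  # 1.99  (half serial: capped near 2)
print(round(amdahl_speedup(0.95, 200), 2))  # 18.26
print(round(amdahl_speedup(0.99, 200), 2))  # 66.89
```

Real machines add the communications overhead discussed above, which is what moves the economical grain size away from the finest granularity Amdahl's idealisation would favour.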
The two common parallel programming paradigms are shared variable and message passing. Shared variable is, in general, the more natural of the two as it allows implicit communication between tasks. This encourages the programmer to make use of fine-grained tasks. The message passing paradigm requires explicit communication between tasks. This encourages the programmer to use coarser-grained tasks.
Two kinds of parallel architecture have become established. The first is the multiprocessor, which is built around a shared bus giving broadcast communications and a shared memory. This is characterised by low communications overhead, but limited scalability. The second is the multicomputer, which is based on point-to-point communications with larger communications overhead, but good scalability. Quantitatively, the low overhead of the multiprocessor is well matched to fine-grain tasks and, hence, to supporting the shared variable paradigm, while the high overhead of the multicomputer matches it to coarse-grain parallelism and, hence, to the message passing paradigm.
Currently, there appears to be no middle ground in parallel computing; an architecture which can support both several hundred medium-grained (≈10^4 instructions) parallel tasks and the shared variable programming paradigm would be advantageous in many applications.
This thesis asserts that it is possible to implement a new computer architecture, Cherub, which has at least 200 processors and is able to support shared variable programming with an optimal task granularity of around 10^4 instructions. This can be achieved through the combination of a hardware-based distributed shared single address space and a wafer-scale communications network.
To support the thesis, the dissertation first specifies a programmer's interface to Cherub which is simple enough to implement in hardware. It then designs algorithms which provide this interface, allowing the requirements of the underlying network to be estimated. Finally, a wafer-scale communications network is outlined, and simulations are used to demonstrate that it can provide the performance required to successfully implement Cherub.
The pursuit of deadlock freedom
We introduce some combinatorial techniques for establishing the deadlock freedom of concurrent systems, similar to the variant/invariant method of proving loop termination. Our methods are based on local analysis of networks, which is combinatorially far easier than analysing all global states. They are illustrated by proving numerous examples to be free of deadlock, some of which are useful classes of network.