120 research outputs found

    Experiments with parallel algorithms for combinatorial problems

    Get PDF
    In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines constructed so far all use a simple model of parallel computation. Therefore, not every existing parallel machine is equally well suited for each type of algorithm. The adaptation of a certain algorithm to a specific parallel archi- tecture may severely increase the complexity of the algorithm or severely obscure its essence. Little is known about the performance of some standard combinatorial algorithms on existing parallel machines. In this paper we present computational results concerning the solution of knapsack, shortest paths and change-making problems by branch and bound, dynamic programming, and divide and conquer algorithms on the ICL-DAP (an SIMD computer), the Manchester dataflow machine and the CDC-CYBER-205 (a pipeline computer)

    A parallel implementation of SASL

    Get PDF
    The applicative or functional language SASL is investigated from the point of view of an implementation. The aim is to determine and experiment with a run-time environment (SASL parallel machine) which incorporates parallelism so that constituent parts of a program (its sub-expressions) can be processed concurrently. The introduction of parallelism is characterised by two fundamental issues. The type of programs, referred to as parallel and the so called strategy of parallelism, employed by the parallel machine. The former concerns deriving a graph from the program text indicating the order in which things must be done and the notion of "worthwhile" parallelism. In order to obtain a parallel program the original (sequential) program is transformed and/or modified. Certain programs are found to be essentially sequential. Parallelism is expressed as call-by-parallel parameter passing mechanism and by a parallel conditional operator, suggesting speculative parallelism. The issue of the strategy of parallelism concerns the scheme under which a regime of SASL processors combine their effort in processing a parallel program. The objective being to shorten the length of computation carried out by the sequential machine on the initial program. The class of parallel programs seems to be non-trivial and it includes both non-numerical and numerical programs. The "speed-up" by appealing to parallelism for such programs is found to be substantial

    Intelligent cell memory system for real time engineering applications

    Get PDF

    Improving Model-Based Software Synthesis: A Focus on Mathematical Structures

    Get PDF
    Computer hardware keeps increasing in complexity. Software design needs to keep up with this. The right models and abstractions empower developers to leverage the novelties of modern hardware. This thesis deals primarily with Models of Computation, as a basis for software design, in a family of methods called software synthesis. We focus on Kahn Process Networks and dataflow applications as abstractions, both for programming and for deriving an efficient execution on heterogeneous multicores. The latter we accomplish by exploring the design space of possible mappings of computation and data to hardware resources. Mapping algorithms are not at the center of this thesis, however. Instead, we examine the mathematical structure of the mapping space, leveraging its inherent symmetries or geometric properties to improve mapping methods in general. This thesis thoroughly explores the process of model-based design, aiming to go beyond the more established software synthesis on dataflow applications. We starting with the problem of assessing these methods through benchmarking, and go on to formally examine the general goals of benchmarks. In this context, we also consider the role modern machine learning methods play in benchmarking. We explore different established semantics, stretching the limits of Kahn Process Networks. We also discuss novel models, like Reactors, which are designed to be a deterministic, adaptive model with time as a first-class citizen. By investigating abstractions and transformations in the Ohua language for implicit dataflow programming, we also focus on programmability. The focus of the thesis is in the models and methods, but we evaluate them in diverse use-cases, generally centered around Cyber-Physical Systems. These include the 5G telecommunication standard, automotive and signal processing domains. We even go beyond embedded systems and discuss use-cases in GPU programming and microservice-based architectures

    Adaptive architecture-transparent policy control in a distributed graph reducer

    Get PDF
    The end of the frequency scaling era occured around 2005 as the clock frequency has stalled for commodity architectures. Thus performance improvements that could in the past be expected with each new hardware generation needed to originate elsewhere. Almost all computer architectures exhibit substantial and growing levels of parallelism, exploiting which became one of the key sources of performance and scalability improvements. Alas, parallel programming proved much more difficult than sequential, due to the need to specify coordination and parallelism management aspects. Whilst low-level languages place the burden on the programmers reducing productivity and portability, semi-implicit approaches delegate the responsibility to sophisticated compilers and run-time systems. This thesis presents a study of adaptive load distribution based on work stealing using history and ancestry information in a distributed graph reducer for a nonstrict functional language. The results contribute to the exploration of more flexible run-time-system-level parallelism control implementing a semi-explicit model of parallelism, which offers productivity and high level of abstraction by delegating the responsibility of coordination to the run-time system. After characterising a set of parallel functional applications, we study the use of historical information to adapt the choice of the victim to steal from in a work stealing scheduler. We observe substantially lower numbers of messages for data-parallel and nested applications. However, this heuristic fails in cases where past application behaviour is not resembling future behaviour, for instance for Divide-&-Conquer applications with a large number of very fine-grained threads and generators of parallelism that move dynamically across processing elements. This mechanism is not specific to the language and the run-time system, and applies to other work stealing schedulers. Next, we focus on the other key work stealing decision of which sparks that represent potential parallelism to donate, investigating the effect of Spark Colocation on the performance of five Divide-&-Conquer programs run on a cluster of up to 256 PEs. When using Spark Colocation, the distributed graph reducer shares related work resulting in a higher degree of both potential and actual parallelism, and more fine-grained and less variable thread size. We validate this behaviour by observing a reduction in average fetch times, but increased amounts of FETCH messages and of inter-PE pointers for colocation, which nevertheless results in improved load balance for three of the five benchmark programs. The results show high speedups and speedup improvements for Spark Colocation for the three more regular and nested applications and performance degradation for two programs: one that is excessively fine-grained and one exhibiting limited scalability. Overall, Spark Colocation appears most beneficial for higher numbers of PEs, where improved load balance and higher degree of parallelism have more opportunities to pay off. In more general terms, we show that a run-time system can beneficially use historical information on past stealing successes that is gathered dynamically and used within the same run and the ancestry information dynamically reconstructed at run time using annotations. Moreover, the results support the view that different heuristics are beneficial for applications using different parallelism patterns, underlining the advantages of a flexible architecture-transparent approach.The Scottish Informatics and Computer Science Alliance (SICSA

    Context flow architecture

    Get PDF

    Calculi for higher order communicating systems

    Get PDF
    This thesis develops two Calculi for Higher Order Communicating Systems. Both calculi consider sending and receiving processes to be as fundamental as nondeterminism and parallel composition. The first calculus called CHOCS is an extension of Milner's CCS in the sense that all the constructions of CCS are included or may be derived from more fundamental constructs. Most of the mathematical framework of CCS carries over almost unchanged. The operational semantics of CHOCS is given as a labelled transition system and it is a direct extension of the semantics of CCS with value passing. A set of algebraic laws satisfied by the calculus is presented. These are similar to the CCS laws only introducing obvious extra laws for sending and receiving processes. The power of process passing is underlined by a result showing that the recursion operator is unnecessary in the sense that recursion can be simulated by means of process passing and communication. The CHOCS language is also studied by means of a denotational semantics. A major result is the full abstractness of this semantics with respect to the operational semantics. The denotational semantics is used to provide an easy proof of the simulation of recursion. Introducing processes as first class objects yields a powerful metalanguage. It is shown that it is possible to simulate various reduction strategies of the untyped λ-Calculus in CHOCS. As pointed out by Milner, CCS has its limitations when one wants to describe unboundedly expanding systems, e.g. an unbounded number of procedure invocations in an imperative concurrent programming language P with recursive procedures. CHOCS may neatly describe both call-by-value and call-by-reference parameter mechanisms for P. We also consider call-by-name and lazy parameter mechanisms for P. The second calculus is called Plain CHOCS. Essential to the new calculus is the treatment of restriction as a static binding operator on port names. This calculus is given an operational semantics using labelled transition systems which combines ideas from the applicative transition systems described by Abramsky and the transition systems used for CHOCS. This calculus enjoys algebraic properties which are similar to those of CHOCS only needing obvious extra laws for the static nature of the restriction operator. Processes as first class objects enable description of networks with changing interconnection structure and there is a close connection between the Plain CHOCS calculus and the π-Calculus described by Milner, Parrow and Walker: the two calculi can simulate one another. Recently object oriented programming has grown into a major discipline in computational practice as well as in computer science. From a theoretical point of view object oriented programming presents a challenge to any metalanguage since most object oriented languages have no formal semantics. We show how Plain CHOCS may be used to give a semantics to a prototype object oriented language called 0.Open Acess

    Programming Languages and Systems

    Get PDF
    This open access book constitutes the proceedings of the 28th European Symposium on Programming, ESOP 2019, which took place in Prague, Czech Republic, in April 2019, held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019
    corecore