
1,229 research outputs found

A semantics and implementation of a causal logic programming language

Author: Clayton Roger
Cleary John G.
Utting Mark
Publication venue: University of Waikato, Department of Computer Science
Publication date: 11/02/2009

The increasingly widespread availability of multicore and manycore computers demands new programming languages that make parallel programming dramatically easier and less error-prone. This paper describes a semantics for a new class of declarative programming languages that support massive amounts of implicit parallelism.

Research Commons@Waikato

Improving Implicit Parallelism

Author: Calderon Jose Manuel
Publication venue: University of York
Publication date: 01/09/2015

We propose a new technique for exploiting the inherent parallelism in lazy functional programs. Known as implicit parallelism, the goal of writing a sequential program and having the compiler improve its performance by determining what can be executed in parallel has been studied for many years. Our technique abandons the idea that a compiler should accomplish this feat in ‘one shot’ with static analysis and instead allows the compiler to improve upon the static analysis using iterative feedback. We demonstrate that iterative feedback can be relatively simple when the source language is a lazy purely functional programming language. We present three main contributions to the field: the automatic derivation of parallel strategies from a demand on a structure, and two new methods of feedback-directed auto-parallelisation. The first method treats the runtime of the program as a black box and uses the ‘wall-clock’ time as a fitness function to guide a heuristic search on bitstrings representing the parallel setting of the program. The second feedback approach is profile directed. This allows the compiler to use profile data that is gathered by the runtime system as the program executes. This allows the compiler to determine which threads are not worth the overhead of creating them. Our results show that the use of feedback-directed compilation can be a good source of refinement for the static analysis techniques that struggle to account for the cost of a computation. This lifts the burden of ‘is this parallelism worthwhile?’ away from the static phase of compilation and to the runtime, which is better equipped to answer the question.

White Rose E-theses Online
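
The first feedback method named in the abstract above, a black-box search over bitstrings guided by a runtime measurement, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the greedy bit-flip search is only one simple heuristic and is not claimed to be the thesis's algorithm, and `simulated_runtime` is a hypothetical stand-in for timing a real program, with invented savings and overhead figures. Each bit says whether one candidate parallel site is switched on.

```python
from typing import Callable, List


def hill_climb(bits: List[bool], cost: Callable[[List[bool]], float]) -> List[bool]:
    """Greedy bit-flip search: keep a flip only if it lowers the measured cost."""
    best = list(bits)
    best_cost = cost(best)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            candidate = list(best)
            candidate[i] = not candidate[i]  # toggle one parallel site
            candidate_cost = cost(candidate)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
                improved = True
    return best


def simulated_runtime(bits: List[bool]) -> float:
    """Hypothetical stand-in for wall-clock time (assumption, not real data).

    Sites 0 and 2 save 3.0 time units when parallelised, sites 1 and 3 save
    nothing, and every enabled site costs 1.0 unit of thread overhead.
    """
    saving = [3.0, 0.0, 3.0, 0.0]
    overhead = 1.0
    return 10.0 - sum(s - overhead for s, on in zip(saving, bits) if on)


# Start with every site parallelised and let feedback switch off the bad ones.
result = hill_climb([True, True, True, True], simulated_runtime)
print(result, simulated_runtime(result))
```

In a real setting the cost function would run the compiled program and time it, so measurements would be noisy and repeated runs would probably be needed before accepting a flip.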

Exploiting implicit parallelism in SPARC instruction execution

Author: Austin Todd Michael
Publication venue: RIT Scholar Works
Publication date: 01/01/1990

One way to increase the performance of a processing unit is to exploit implicit parallelism. Exploiting this parallelism requires a processor to dynamically select instructions in a serial instruction stream which can be executed in parallel. As operations are computed concurrently, an execution speedup will occur. This thesis studies how effectively implicit parallelism could be exploited in the Scalable Processor Architecture (SPARC) [9], a reduced instruction set architecture developed by Sun Microsystems. First an analysis of SPARC instruction traces will determine the optimal speedup that would be realized by a processor with infinite resources. Next, an analytical model of a parallelizing processor will be developed and used to predict the effects of limited resources on optimal speedup. Lastly, a SPARC simulator will be employed to determine the actual speedup of resource limited configurations, and the results will be correlated with the analytical model.

RIT Scholar Works
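
The "optimal speedup with infinite resources" mentioned in the abstract above is commonly estimated as the trace length divided by the length of the longest dependence chain. The sketch below shows that calculation under loudly stated assumptions that are not taken from the thesis: every instruction takes one cycle, only true (read-after-write) register dependences constrain execution, and control and memory dependences are ignored. The trace format and the name `dataflow_speedup` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical trace entry: (destination register, source registers).
Instr = Tuple[str, List[str]]


def dataflow_speedup(trace: List[Instr]) -> float:
    """Serial cycle count divided by critical-path length (unit latency assumed)."""
    ready: Dict[str, int] = {}  # cycle at which each register's latest value exists
    depth = 0                   # length of the longest dependence chain so far
    for dest, srcs in trace:
        # An instruction may start once all of its source values are ready.
        start = max((ready.get(s, 0) for s in srcs), default=0)
        finish = start + 1
        ready[dest] = finish
        depth = max(depth, finish)
    return len(trace) / depth if depth else 1.0


# Invented six-instruction trace: four independent loads feeding two adds.
trace: List[Instr] = [
    ("r1", []),
    ("r2", []),
    ("r3", ["r1", "r2"]),
    ("r4", []),
    ("r5", ["r3", "r4"]),
    ("r6", []),
]
print(dataflow_speedup(trace))
```

Here the longest chain is r1 to r3 to r5, three cycles, so six serial cycles shrink to three. Limiting the number of instructions issued per cycle would lower this bound, which is the kind of effect the abstract's analytical model is said to predict.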

Identifying, Quantifying, Extracting and Enhancing Implicit Parallelism

Author: Agarwal Mayank
Publication date: 15/05/2009

The shift of the microprocessor industry towards multicore architectures has placed a huge burden on the programmers by requiring explicit parallelization for performance. Implicit Parallelization is an alternative that could ease the burden on programmers by parallelizing applications “under the covers” while maintaining sequential semantics externally. This thesis develops a novel approach for thinking about parallelism, by casting the problem of parallelization in terms of instruction criticality. Using this approach, parallelism in a program region is readily identified when certain conditions about fetch-criticality are satisfied by the region. The thesis formalizes this approach by developing a criticality-driven model of task-based parallelization. The model can accurately predict the parallelism that would be exposed by potential task choices by capturing a wide set of sources of parallelism as well as costs to parallelization. The criticality-driven model enables the development of two key components for Implicit Parallelization: a task selection policy, and a bottleneck analysis tool. The task selection policy can partition a single-threaded program into tasks that will profitably execute concurrently on a multicore architecture in spite of the costs associated with enforcing data-dependences and with task-related actions. The bottleneck analysis tool gives feedback to the programmers about data-dependences that limit parallelism. In particular, there are several “accidental dependences” that can be easily removed with large improvements in parallelism. These tools combine into a systematic methodology for performance tuning in Implicit Parallelization. Finally, armed with the criticality-driven model, the thesis revisits several architectural design decisions, and finds several encouraging ways forward to increase the scope of Implicit Parallelization.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Or-Parallel Prolog Execution on Clusters of Multicores

Author: Rocha Ricardo
Publication venue: OASIcs - OpenAccess Series in Informatics. 2nd Symposium on Languages, Applications and Technologies
Publication date: 01/01/2013

Logic Programming languages, such as Prolog, provide an excellent framework for the parallel execution of logic programs. In particular, the inherent non-determinism in the way logic programs are structured makes Prolog very attractive for the exploitation of implicit parallelism. One of the most noticeable sources of implicit parallelism in Prolog programs is or-parallelism. Or-parallelism arises from the simultaneous evaluation of a subgoal call against the clauses that match that call. Arguably, the most successful model for or-parallelism is environment copying, which has been efficiently used in the implementation of or-parallel Prolog systems both on shared memory and distributed memory architectures. Nowadays, multicores and clusters of multicores are becoming the norm and, although many parallel Prolog systems have been developed in the past, to the best of our knowledge, none of them was specially designed to explore the combination of shared with distributed memory architectures. Motivated by our past experience in designing and developing parallel Prolog systems based on environment copying, we propose a novel computational model to efficiently exploit implicit parallelism from large scale real-world applications specialized for the novel architectures based on clusters of multicores.

Dagstuhl Research Online Publication Server

Scheduling Data-Intensive Tasks on Heterogeneous Many Cores

Author: Kotthaus Helena
Tözün Pinar
Publication date: 01/01/2019

The IT University of Copenhagen's Repository

Threads and Or-Parallelism Unified

Author: Carro
Correia
Gupta
Inês Dutra
Moura
Pontelli
Ricardo Rocha
Vítor Santos Costa
Shen
Stevens
Publication venue: Cambridge University Press (CUP)
Publication date: 26/07/2010

One of the main advantages of Logic Programming (LP) is that it provides an excellent framework for the parallel execution of programs. In this work we investigate novel techniques to efficiently exploit parallelism from real-world applications in low cost multi-core architectures. To achieve these goals, we revive and redesign the YapOr system to exploit or-parallelism based on a multi-threaded implementation. Our new approach takes full advantage of the state-of-the-art fast and optimized YAP Prolog engine and shares the underlying execution environment, scheduler and most of the data structures used to support YapOr's model. Initial experiments with our new approach consistently achieve almost linear speedups for most of the applications, proving itself as a good alternative for exploiting implicit parallelism in the currently available low cost multi-core architectures.

Comment: 17 pages, 21 figures, International Conference on Logic Programming (ICLP 2010)

arXiv.org e-Print Archive

Crossref