Search CORE

1,816 research outputs found

Safe Concurrency Introduction through Slicing

Author: Bozó I.
Cesarini F.
Cheng J.
Nyblom P.
Tóth M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015
Field of study

Traditional refactoring is about modifying the structure of existing code without changing its behaviour, but with the aim of making code easier to understand, modify, or reuse. In this paper, we introduce three novel refactorings for retrofitting concurrency to Erlang applications, and demonstrate how the use of program slicing makes the automation of these refactorings possible

CiteSeerX

Crossref

Kent Academic Repository

Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation

Author: Davidson Gavin
Vanderbauwhede Wim
Publication venue
Publication date: 13/11/2017
Field of study

Massively parallel accelerators such as GPGPUs, manycores and FPGAs represent a powerful and affordable tool for scientists who look to speed up simulations of complex systems. However, porting code to such devices requires a detailed understanding of heterogeneous programming tools and effective strategies for parallelization. In this paper we present a source to source compilation approach with whole-program analysis to automatically transform single-threaded FORTRAN 77 legacy code into OpenCL-accelerated programs with parallelized kernels. The main contributions of our work are: (1) whole-source refactoring to allow any subroutine in the code to be offloaded to an accelerator. (2) Minimization of the data transfer between the host and the accelerator by eliminating redundant transfers. (3) Pragmatic auto-parallelization of the code to be offloaded to the accelerator by identification of parallelizable maps and reductions. We have validated the code transformation performance of the compiler on the NIST FORTRAN 78 test suite and several real-world codes: the Large Eddy Simulator for Urban Flows, a high-resolution turbulent flow model; the shallow water component of the ocean model Gmodel; the Linear Baroclinic Model, an atmospheric climate model and Flexpart-WRF, a particle dispersion simulator. The automatic parallelization component has been tested on as 2-D Shallow Water model (2DSW) and on the Large Eddy Simulator for Urban Flows (UFLES) and produces a complete OpenCL-enabled code base. The fully OpenCL-accelerated versions of the 2DSW and the UFLES are resp. 9x and 20x faster on GPU than the original code on CPU, in both cases this is the same performance as manually ported code.Comment: 12 pages, 5 figures, submitted to "Computers and Fluids" as full paper from ParCFD conference entr

arXiv.org e-Print Archive

Enlighten

Human-Centric Program Synthesis

Author: Crichton Will
Publication venue: OASIcs - OpenAccess Series in Informatics. 10th Workshop on Evaluation and Usability of Programming Languages and Tools (PLATEAU 2019)
Publication date: 26/09/2019
Field of study

Program synthesis techniques offer significant new capabilities in searching for programs that satisfy high-level specifications. While synthesis has been thoroughly explored for input/output pair specifications (programming-by-example), this paper asks: what does program synthesis look like beyond examples? What actual issues in day-to-day development would stand to benefit the most from synthesis? How can a human-centric perspective inform the exploration of alternative specification languages for synthesis? I sketch a human-centric vision for program synthesis where programmers explore and learn languages and APIs aided by a synthesis tool

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

How do particle physicists learn the programming concepts they need?

Author: Kluth Stefan
Pia Maria Grazia
Schoerner-Sadenius Thomas
Steinbach Peter
Publication venue: 'IOP Publishing'
Publication date: 17/04/2015
Field of study

The ability to read, use and develop code efficiently and successfully is a key ingredient in modern particle physics. We report the experience of a training program, identified as "Advanced Programming Concepts", that introduces software concepts, methods and techniques to work effectively on a daily basis in a HEP experiment or other programming intensive fields. This paper illustrates the principles, motivations and methods that shape the "Advanced Computing Concepts" training program, the knowledge base that it conveys, an analysis of the feedback received so far, and the integration of these concepts in the software development process of the experiments as well as its applicability to a wider audience.Comment: 8 pages, 2 figures, CHEP2015 proceeding

arXiv.org e-Print Archive

MPG.PuRe

Towards semi-automatic data-type translation for parallelism in Erlang

Author: Barwell Adam David
Brown Christopher Mark
Castro David
Hammond Kevin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/09/2016
Field of study

As part of our ongoing research programme into programmer-in-the-loop parallelisation, we are studying the problem of introducing alternative data structures to support parallelism. Automated support for data structure transformations makes it easier to produce the best parallelisation for some given program, or even to make parallelisation feasible. We use a refactoring approach to choose and introduce these transformations for specific algorithmic skeletons, structured forms of parallelism that capture common patterns of parallelism. Our approach integrates with the Wrangler refactoring tool for Erlang, and uses the advanced Skel [4] skeleton library for Erlang. This library has previously been shown to give good parallelisations for a number of applications, including a multi-agent system [1] where we have achieved speedups of up to 142.44 on a 61-core machine with 244 threads. We have investigated three widely-used Erlang data structures: lists, binary structures and ETS (Erlang Term Storage) tables. In general, we have found that ETS tables deliver the best parallel performance for the examples that we have considered. However, our results show that simple lists may deliver similar performance to the use of ETS tables, and better performance than using binary structures. This means that we cannot blindly choose to implement a single optimisation as part of the compilation process. Our approach also allows the use of new (possibly user-defined) data structures and other transformations in future, giving a high level of flexibility and generality.Postprin

Crossref

University of St. Andrews - Pure

St Andrews Research Repository

ParaSCAN: A Static Profiler to Help Parallelization

Author: Rajan Hridesh
Sondag Tyler
Upadhyaya Ganesha
Publication venue: Iowa State University Digital Repository
Publication date: 13/05/2014
Field of study

Parallelizing software often starts by profiling to identify program paths that are worth parallelizing. Static profiling techniques, e.g. hot paths, can be used to identify parallelism opportunities for programs that lack representative inputs and in situations where dynamic techniques aren\u27t applicable, e.g. parallelizing compilers and refactoring tools. Existing static techniques for identification of hot paths rely on path frequencies. Relying on path frequencies alone isn\u27t sufficient for identifying parallelism opportunities. We propose a novel automated approach for static profiling that combines both path frequencies and computational weight of the paths. We apply our technique called ParaSCAN to parallelism recommendation, where it is highly effective. Our results demonstrate that ParaSCAN\u27s recommendations cover all the parallelism manually identified by experts with 85% accuracy and in some cases also identifies parallelism missed by the experts

Digital Repository @ Iowa State University (ISU)

The best multicore-parallelization refactoring you've never heard of

Author: Rainey Mike
Publication venue
Publication date: 19/07/2023
Field of study

In this short paper, we explore a new way to refactor a simple but tricky-to-parallelize tree-traversal algorithm to harness multicore parallelism. Crucially, the refactoring draws from some classic techniques from programming-languages research, such as the continuation-passing-style transform and defunctionalization. The algorithm we consider faces a particularly acute granularity-control challenge, owing to the wide range of inputs it has to deal with. Our solution achieves efficiency from heartbeat scheduling, a recent approach to automatic granularity control. We present our solution in a series of individually simple refactoring steps, starting from a high-level, recursive specification of the algorithm. As such, our approach may prove useful as a teaching tool, and perhaps be used for one-off parallelizations, as the technique requires no special compiler support

arXiv.org e-Print Archive