Common Subexpression Elimination in a Lazy Functional Language
Common subexpression elimination is a well-known compiler optimisation that saves time by avoiding the repetition of the same computation. To our knowledge it has not yet been applied to lazy functional programming languages, although there are several advantages. First, the referential transparency of these languages makes the identification of common subexpressions very simple. Second, more common subexpressions can be recognised because they can be of arbitrary type whereas standard common subexpression elimination only shares primitive values. However, because lazy functional languages decouple program structure from data space allocation and control flow, analysing its effects and deciding under which conditions the elimination of a common subexpression is beneficial proves to be quite difficult. We developed and implemented the transformation for the language Haskell by extending the Glasgow Haskell compiler and measured its effectiveness on real-world programs
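As a rough illustration of the idea in this abstract, the sketch below models call-by-need evaluation in Python and compares a program before and after sharing a repeated subexpression. It is not the transformation implemented in the Glasgow Haskell compiler; the names Thunk, expensive, before_cse and after_cse are hypothetical and exist only for this example.

```python
class Thunk:
    """A call-by-need suspension: evaluated at most once, then cached."""

    def __init__(self, compute):
        self._compute = compute
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._compute()
            self._compute = None  # release the closure once the value is known
            self._forced = True
        return self._value


# Counter used to observe how often the repeated computation really runs.
calls = {"expensive": 0}


def expensive(n):
    """Stand-in for a costly pure computation (hypothetical)."""
    calls["expensive"] += 1
    return sum(range(n))


def before_cse(n):
    # The same expression occurs twice, so two separate thunks are built
    # and each one is evaluated on its own.
    left = Thunk(lambda: expensive(n))
    right = Thunk(lambda: expensive(n))
    return left.force() + right.force()


def after_cse(n):
    # The common subexpression is bound once and shared; the second force
    # reuses the cached value.
    shared = Thunk(lambda: expensive(n))
    return shared.force() + shared.force()
```

Both functions return the same result, but after_cse evaluates expensive only once. The cost of sharing is that the cached value stays reachable until its last use, which hints at why judging the benefit of the elimination is hard when space behaviour matters.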
A General Framework for Static Profiling of Parametric Resource Usage
Traditional static resource analyses estimate the total resource usage of a
program, without executing it. In this paper we present a novel resource
analysis whose aim is instead the static profiling of accumulated cost, i.e.,
to discover, for selected parts of the program, an estimate or bound of the
resource usage accumulated in each of those parts. Traditional resource
analyses are parametric in the sense that the results can be functions on input
data sizes. Our static profiling is also parametric, i.e., our accumulated cost
estimates are also parameterized by input data sizes. Our proposal is based on
the concept of cost centers and a program transformation that allows the static
inference of functions that return bounds on these accumulated costs depending
on input data sizes, for each cost center of interest. Such information is much
more useful to the software developer than the traditional resource usage
functions, as it allows identifying the parts of a program that should be
optimized, because of their greater impact on the total cost of program
executions. We also report on our implementation of the proposed technique
using the CiaoPP program analysis framework, and provide some experimental
results. This paper is under consideration for acceptance in TPLP.
Comment: Paper presented at the 32nd International Conference on Logic
Programming (ICLP 2016), New York City, USA, 16-21 October 2016, 22 pages,
LaTe
Common Subexpressions are Uncommon in Lazy Functional Languages
Common subexpression elimination is a well-known compiler optimisation that saves time by avoiding the repetition of the same computation. In lazy functional languages, referential transparency renders the identification of common subexpressions very simple. More common subexpressions can be recognised because they can be of arbitrary type whereas standard common subexpression elimination only shares primitive values. However, because lazy functional languages decouple program structure from data space allocation and control flow, analysing its effects and deciding under which conditions the elimination of a common subexpression is beneficial proves to be quite difficult. We developed and implemented the transformation for the language Haskell by extending the Glasgow Haskell compiler. On real-world programs the transformation showed nearly no effect. The reason is that common subexpressions whose elimination could speed up programs are uncommon in lazy functional languages
The Weak Call-By-Value λ-Calculus is Reasonable for Both Time and Space
We study the weak call-by-value λ-calculus as a model for
computational complexity theory and establish the natural measures for time and
space -- the number of beta-reductions and the size of the largest term in a
computation -- as reasonable measures with respect to the invariance thesis of
Slot and van Emde Boas [STOC 84]. More precisely, we show that, using those
measures, Turing machines and the weak call-by-value λ-calculus can
simulate each other within a polynomial overhead in time and a constant factor
overhead in space for all computations that terminate in (encodings of) 'true'
or 'false'. We consider this result as a solution to the long-standing open
problem, explicitly posed by Accattoli [ENTCS 18], of whether the natural
measures for time and space of the λ-calculus are reasonable, at least
in case of weak call-by-value evaluation.
Our proof relies on a hybrid of two simulation strategies of reductions in
the weak call-by-value λ-calculus by Turing machines, both of which are
insufficient if taken alone. The first strategy is the most naive one in the
sense that a reduction sequence is simulated precisely as given by the
reduction rules; in particular, all substitutions are executed immediately.
This simulation runs within a constant overhead in space, but the overhead in
time might be exponential. The second strategy is heap-based and relies on
structure sharing, similar to existing compilers of eager functional languages.
This strategy only has a polynomial overhead in time, but the space consumption
might require an additional factor of log n, which is essentially due to the
size of the pointers required for this strategy. Our main contribution is the
construction and verification of a space-aware interleaving of the two
strategies, which is shown to yield both a constant overhead in space and a
polynomial overhead in time
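The two natural measures and the naive, substitution-based strategy described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: terms use de Bruijn indices, the input is a closed term, the size function is a plain node count (the paper's exact size measure may differ), and all names (Var, Lam, App, size, subst, step, run) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    index: int  # de Bruijn index


@dataclass(frozen=True)
class Lam:
    body: "Term"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def size(t: Term) -> int:
    """Node count of a term (an assumed, simplified size measure)."""
    if isinstance(t, Var):
        return 1
    if isinstance(t, Lam):
        return 1 + size(t.body)
    return 1 + size(t.fun) + size(t.arg)


def subst(t: Term, k: int, v: Term) -> Term:
    """Replace index k in t by the closed value v (no shifting needed)."""
    if isinstance(t, Var):
        return v if t.index == k else t
    if isinstance(t, Lam):
        return Lam(subst(t.body, k + 1, v))
    return App(subst(t.fun, k, v), subst(t.arg, k, v))


def step(t: Term) -> Optional[Term]:
    """One weak call-by-value step: never reduce under a lambda, and only
    contract a redex whose argument is already an abstraction."""
    if not isinstance(t, App):
        return None
    if isinstance(t.fun, Lam) and isinstance(t.arg, Lam):
        return subst(t.fun.body, 0, t.arg)  # substitution executed immediately
    reduced = step(t.fun)
    if reduced is not None:
        return App(reduced, t.arg)
    reduced = step(t.arg)
    if reduced is not None:
        return App(t.fun, reduced)
    return None


def run(t: Term) -> Tuple[Term, int, int]:
    """Return (normal form, number of beta steps, size of the largest term)."""
    steps, largest = 0, size(t)
    nxt = step(t)
    while nxt is not None:
        t = nxt
        steps += 1
        largest = max(largest, size(t))
        nxt = step(t)
    return t, steps, largest
```

For example, run(App(Lam(App(Var(0), Var(0))), Lam(Var(0)))) returns (Lam(Var(0)), 2, 7): two beta steps for the time measure and a largest intermediate term of seven nodes for the space measure. Because every substitution copies its argument, this strategy stays close to the term size in space but can take far more time than the step count suggests, which is the trade-off the abstract attributes to the naive simulation.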
Dag-calculus: a calculus for parallel computation
Increasing availability of multicore systems has led to greater focus on the design and implementation of languages for writing parallel programs. Such languages support various abstractions for parallelism, such as fork-join, async-finish, and futures. While they may seem similar, these abstractions lead to different semantics, language design and implementation decisions, and can significantly impact the performance of end-user applications. In this paper, we consider the question of whether it would be possible to unify various paradigms of parallel computing. To this end, we propose a calculus, called dag calculus, that can encode fork-join, async-finish, and futures, and possibly others. We describe dag calculus and its semantics, and establish translations from the aforementioned paradigms into dag calculus. These translations establish that dag calculus is sufficiently powerful for encoding programs written in prevailing paradigms of parallelism. We present concurrent algorithms and data structures for realizing dag calculus on multi-core hardware and prove that the proposed techniques are consistent with the semantics. Finally, we present an implementation of the calculus and evaluate it empirically by comparing its performance to highly optimized code from prior work. The results show that the calculus is expressive and that it competes well with, and sometimes outperforms, the state of the art
Time and Space Profiling for Non-Strict, Higher-Order Functional Languages
We present the first profiler for a compiled, non-strict, higher-order, purely functional language capable of measuring time as well as space usage. Our profiler is implemented in a production-quality optimising compiler for Haskell, has low overheads, and can successfully profile large applications. A unique feature of our approach is that we give a formal specification of the attribution of execution costs to cost centres. This specification enables us to discuss our design decisions in a precise framework. Since it is not obvious how to map this specification onto a particular implementation, we also present an implementation-oriented operational semantics, and prove it equivalent to the specification.
1 Motivation and overview
Everyone knows the importance of profiling tools: the best way to improve a program's performance is to concentrate on the parts of the program which are eating the lion's share of the total space and time resources. One would expect profiling tools to be ..
Profiling large-scale lazy functional programs
The LOLITA natural language processing system is an example of one of the ever increasing number of large-scale systems written entirely in a functional programming language. The system consists of over 50,000 lines of Haskell code and is able to perform a number of tasks such as semantic and pragmatic analysis of text, context scanning and query analysis. Such a system is more useful if the results are calculated in real-time, therefore the efficiency of such a system is paramount. For the past three years we have used profiling tools supplied with the Haskell compilers GHC and HBC to analyse and reason about our programming solutions and have achieved good results; however, our experience has shown that the profiling life-cycle is often too long to make a detailed analysis of a large system possible, and the profiling results are often misleading. A profiling system is developed which allows three types of functionality not previously found in a profiler for lazy functional programs. Firstly, the profiler is able to produce results based on an accurate method of cost inheritance. We have found that this reduces the possibility of the programmer obtaining misleading profiling results. Secondly, the programmer is able to explore the results after the execution of the program. This is done by selecting and deselecting parts of the program using a post-processor. This greatly reduces the analysis time as no further compilation, execution or profiling of the program is needed. Finally, the new profiling system allows the user to examine aspects of the run-time call structure of the program. This is useful in the analysis of the run-time behaviour of the program. Previous attempts at extending the results produced by a profiler in such a way have failed due to the exceptionally high overheads. 
Exploration of the overheads produced by the new profiling scheme shows that typical overheads in profiling the LOLITA system are a 10% increase in compilation time, a 7% increase in executable size and a 70% run-time overhead. These overheads mean a considerable saving in time in the detailed analysis of profiling a large, lazy functional program
Fresh Techniques for Memory Profiling of Lazy Functional Programs
Lazy functional languages are known for their semantic elegance. They liberate programmers from many difficult responsibilities, such as the operational details of computations including memory management. However, the productivity and elegant semantics provided by lazy functional languages do not come without a cost. Lazy functional programs often suffer from unpredictable space leaks. For over two decades, various lazy functional implementations have been equipped with memory profiling tools. These tools furnish programmers with valuable information about space demands, but there is still scope for their future development. This dissertation presents two variants of memory profiling tools. The first tool is a hotspot heap profiler which presents information in two forms: profile charts and highlighted hotspots by source occurrence. The profile chart represents a hotspot-construction profile, distributed by hotspot temperatures. Hotspots are also marked in the textual display of source programs with the temperature they represent. Further information about hotspots is given in individual profiles. The second tool is a stack profiler which yields information about producers and construction of stack frames