64 research outputs found

    Empirical Evaluation of Test Coverage for Functional Programs

    The correlation between test coverage and test effectiveness is important to justify the use of coverage in practice. Existing results on imperative programs mostly show that test coverage predicts effectiveness. However, since functional programs are usually structurally different from imperative ones, it is unclear whether the same result holds and whether coverage can be used as a predictor of effectiveness for functional programs. In this paper we report the first empirical study of the correlation between test coverage and test effectiveness on functional programs. We consider four types of coverage: as input coverage, statement/branch coverage and expression coverage; and as oracle coverage, count of assertions and checked coverage. We also consider two types of effectiveness: raw effectiveness and normalized effectiveness. Our results are twofold. (1) In general, the findings on imperative programs still hold for functional programs, warranting the use of coverage in practice. (2) For specific coverage criteria, the results may be unexpected or differ from the imperative ones, calling for further studies on functional programs.
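To make the coverage criteria concrete, here is a minimal, hypothetical Haskell illustration (the function and inputs are ours, not from the study): branch coverage asks whether each branch of a conditional was taken, while expression coverage, as measured for Haskell by tools such as HPC, asks whether each sub-expression was ever evaluated.

```haskell
-- Hypothetical example: a function with two branches.
-- A test suite containing only non-negative inputs evaluates the 'then'
-- expression but never the 'else' expression, so expression coverage
-- stays below 100%.
classify :: Int -> String
classify n = if n >= 0 then "non-negative" else "negative"

-- Two inputs suffice to cover both branch expressions:
testInputs :: [Int]
testInputs = [3, -2]
```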

    Kindergarten Cop : dynamic nursery resizing for GHC

    Generational garbage collectors are among the most popular garbage collectors used in programming language runtime systems. Their performance is known to depend heavily on choosing the appropriate size of the area where new objects are allocated (the nursery). In imperative languages, it is usual to make the nursery as large as possible, within the limits imposed by the heap size. Functional languages, however, have quite different memory behaviour. In this paper, we study the effect that the nursery size has on the performance of lazy functional programs, through the interplay between cache locality and the frequency of collections. We demonstrate that, in contrast with imperative programs, having large nurseries is not always the best solution. Based on these results, we propose two novel algorithms for dynamic nursery resizing that aim to achieve a compromise between good cache locality and the frequency of garbage collections. We present an implementation of these algorithms in the state-of-the-art GHC compiler for the functional language Haskell, and evaluate them using an extensive benchmark suite. In the best case, we demonstrate a reduction in total execution times of up to 88.5%, or an 8.7× overall speedup, compared to using the production GHC garbage collector. On average, our technique gives an improvement of 9.3% in overall performance across a standard suite of 63 benchmarks for the production GHC compiler.
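The trade-off the paper studies can be observed directly in GHC, where the nursery size can be fixed by hand with the RTS flag -A; the program below is an illustrative sketch of ours (not from the paper) of an allocation-heavy computation whose garbage-collection behaviour depends on that setting.

```haskell
-- Illustrative allocation-heavy computation. Running the compiled
-- program with a small nursery, e.g.  ./prog +RTS -A512k -RTS,
-- favours cache locality but triggers frequent minor collections;
-- a large nursery, e.g.  -A64m,  collects rarely but with poorer
-- locality. The paper's algorithms resize this area dynamically.
allocHeavy :: Int -> Int
allocHeavy n = sum (map (* 2) [1 .. n])
```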

    Common Subexpression Elimination in a Lazy Functional Language

    Common subexpression elimination is a well-known compiler optimisation that saves time by avoiding the repetition of the same computation. To our knowledge it has not yet been applied to lazy functional programming languages, although it promises several advantages. First, the referential transparency of these languages makes the identification of common subexpressions very simple. Second, more common subexpressions can be recognised because they can be of arbitrary type, whereas standard common subexpression elimination only shares primitive values. However, because lazy functional languages decouple program structure from data space allocation and control flow, analysing the transformation's effects and deciding under which conditions the elimination of a common subexpression is beneficial proves to be quite difficult. We developed and implemented the transformation for the language Haskell by extending the Glasgow Haskell compiler, and measured its effectiveness on real-world programs.
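The transformation can be sketched in source terms (an illustrative example of ours, not taken from the paper): referential transparency guarantees that binding the repeated subexpression with a let preserves the program's meaning, and laziness ensures the shared thunk is evaluated at most once.

```haskell
-- 'expensive' stands in for a costly computation.
expensive :: Int -> Int
expensive x = x * x + 1

-- Before CSE: 'expensive x' is computed twice.
before :: Int -> Int
before x = expensive x + expensive x

-- After CSE: the common subexpression is shared and computed once.
after :: Int -> Int
after x = let e = expensive x in e + e
```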

    Finding the needle: Stack Traces for GHC

    No full text

    Modular, higher order cardinality analysis in theory and practice

    Since the mid '80s, compiler writers for functional languages (especially lazy ones) have been writing papers about identifying and exploiting thunks and lambdas that are used only once. However, it has proved difficult to achieve both power and simplicity in practice. In this paper, we describe a new, modular analysis for a higher order language, which is both simple and effective. We prove the analysis sound with respect to a standard call-by-need semantics, and present measurements of its use in a full-scale, state-of-the-art optimising compiler. The analysis finds many single-entry thunks and one-shot lambdas and enables a number of program optimisations. This paper extends our preceding conference publication (Sergey et al. 2014, Proceedings of the 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2014). ACM, pp. 335–348) with proofs, an expanded report on the evaluation, and a detailed examination of the factors causing the loss of precision in the analysis.
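The two properties the analysis tracks can be illustrated with a small sketch (ours, not from the paper):

```haskell
-- 't' is a single-entry thunk: it is demanded at most once, so the
-- runtime need not pay to overwrite (update) it with its value.
singleEntry :: Int -> Int
singleEntry x = let t = x * x in t + 1

-- The lambda passed to 'wrap' is one-shot: 'wrap' applies its argument
-- exactly once, so work can safely be floated inside the lambda
-- without risking repeated evaluation.
wrap :: (Int -> Int) -> Int -> Int
wrap f n = f n

example :: Int -> Int
example n = wrap (\k -> k + n) 10
```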

    SC-Haskell: Sequential Consistency in Languages That Minimize Mutable Shared Heap

    A core, but often neglected, aspect of a programming language design is its memory (consistency) model. Sequential consistency (SC) is the most intuitive memory model for programmers, as it guarantees sequential composition of instructions and provides a simple abstraction of shared memory as a single global store with atomic reads and writes. Unfortunately, SC is widely considered to be impractical due to its associated performance overheads. Perhaps contrary to popular opinion, this paper demonstrates that SC is achievable with acceptable performance overheads for mainstream languages that minimize mutable shared heap. In particular, we modify the Glasgow Haskell Compiler to insert fences on all writes to shared mutable memory accessed in nonfunctional parts of the program. For a benchmark suite containing 1,279 programs, SC adds a geomean overhead of less than 0.4% on an x86 machine. The efficiency of SC arises primarily due to the isolation provided by the Haskell type system between purely functional and thread-local imperative computations on the one hand, and imperative computations on the global heap on the other. We show how to use new programming idioms to further reduce the SC overhead; these create a virtuous cycle of less overhead and even stronger semantic guarantees (static data-race freedom).
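The kind of write the modified compiler fences can be sketched in stock Haskell (an illustration of ours, not the paper's implementation): a programmer can already request ordered writes to shared mutable state explicitly via the atomic variants in Data.IORef, which is roughly what the paper's compiler does implicitly for every such write.

```haskell
import Data.IORef

-- An ordered read-modify-write on shared mutable state.
bumpShared :: IORef Int -> IO ()
bumpShared ref = atomicModifyIORef' ref (\n -> (n + 1, ()))

-- An ordered (fenced) publication of a value, in contrast to the
-- unordered 'writeIORef'.
publish :: IORef Int -> Int -> IO ()
publish ref v = atomicWriteIORef ref v

-- Small sequential demonstration of the two operations.
demo :: IO Int
demo = do
  ref <- newIORef 0
  bumpShared ref
  publish ref 42
  readIORef ref
```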

    Architecture aware parallel programming in Glasgow parallel Haskell (GPH)

    General purpose computing architectures are evolving quickly to become manycore and hierarchical: i.e. a core can communicate more quickly locally than globally. To be effective on such architectures, programming models must be aware of the communications hierarchy. This thesis investigates a programming model that aims to share the responsibility of task placement, load balance, thread creation, and synchronisation between the application developer and the runtime system. The main contribution of this thesis is the development of four new architecture-aware constructs for Glasgow parallel Haskell that exploit information about task size and aim to reduce communication for small tasks, preserve data locality, or distribute large units of work. We define a semantics for the constructs that specifies the sets of PEs that each construct identifies, and we check four properties of the semantics using QuickCheck. We report a preliminary investigation of architecture-aware programming models that abstract over the new constructs. In particular, we propose architecture-aware evaluation strategies and skeletons. We investigate three common paradigms, namely data parallelism, divide-and-conquer and nested parallelism, on hierarchical architectures with up to 224 cores. The results show that the architecture-aware programming model consistently delivers better speedup and scalability than existing constructs, together with a dramatic reduction in execution time variability. We present a comparison of functional multicore technologies that reports some of the first ever multicore results for the Feedback Directed Implicit Parallelism (FDIP) and the semi-explicit parallelism (GpH and Eden) languages. The comparison reflects the growing maturity of the field by systematically evaluating four parallel Haskell implementations on a common multicore architecture. The comparison contrasts the programming effort each language requires with the parallel performance delivered. We investigate the minimum thread granularity required to achieve satisfactory performance for three parallel functional language implementations on a multicore platform. The results show that GHC-GUM requires a larger thread granularity than Eden and GHC-SMP, and that the thread granularity rises as the number of cores rises.
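The divide-and-conquer paradigm investigated above can be sketched as a skeleton in plain Haskell (sequential and illustrative only; the actual GpH constructs for sparking and placing the recursive calls are not shown):

```haskell
-- A divide-and-conquer skeleton. In GpH, the two recursive calls in
-- 'combine (go l) (go r)' are the natural candidates for parallel
-- evaluation; the thesis's architecture-aware constructs choose PEs
-- for them based on task size and communication distance.
dc :: (a -> Bool)        -- is the problem trivial?
   -> (a -> b)           -- solve a trivial problem
   -> (a -> (a, a))      -- split a problem in two
   -> (b -> b -> b)      -- combine sub-results
   -> a -> b
dc isLeaf solve split combine = go
  where
    go x
      | isLeaf x  = solve x
      | otherwise = let (l, r) = split x
                    in combine (go l) (go r)

-- Example instantiation: summing a list by repeatedly halving it.
dcSum :: [Int] -> Int
dcSum = dc ((<= 1) . length) sum halve (+)
  where halve xs = splitAt (length xs `div` 2) xs
```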

    In search of a map : using program slicing to discover potential parallelism in recursive functions

    Funding: EU FP7 grant “Parallel Patterns for Adaptive Heterogeneous Multicore Systems” (ICT-288570), EU H2020 grant “RePhrase: Refactoring Parallel Heterogeneous Resource-Aware Applications – a Software Engineering Approach” (ICT-644235), COST Action IC1202 (“Timing Analysis on Code-Level”), EPSRC grant “Discovery: Pattern Discovery and Program Shaping for Manycore Systems” (EP/P020631/1), and Scottish Enterprise grant PS7305CA44.
    Recursion schemes, such as the well-known map, can be used as loci of potential parallelism, where schemes are replaced with an equivalent parallel implementation. This paper formalises a novel technique, using program slicing, that automatically and statically identifies computations in recursive functions that can be lifted out of the function and then potentially performed in parallel. We define a new program slicing algorithm, build a prototype implementation, and demonstrate its use on 12 Haskell examples, including benchmarks from the NoFib suite and functions from the standard Haskell Prelude. In all cases, we obtain the expected results in terms of finding potential parallelism. Moreover, we have tested our prototype against synthetic benchmarks, and found that it has quadratic time complexity. For the NoFib benchmark examples we demonstrate that relative parallel speedups can be obtained (up to 32.93x the sequential performance on 56 hyperthreaded cores).
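An example of the pattern the slicing analysis searches for (our illustration, not one of the paper's 12 benchmarks): the per-element work in the recursion below is independent of the recursive call, so it can be sliced out into a map, whose applications could then run in parallel, leaving only a reduction behind.

```haskell
-- Original recursive function: squaring is entangled with the recursion.
sumSquares :: [Int] -> Int
sumSquares []       = 0
sumSquares (x : xs) = x * x + sumSquares xs

-- After lifting the element-wise computation out as a map:
sumSquares' :: [Int] -> Int
sumSquares' xs = sum (map (\x -> x * x) xs)
```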

    Reducing Memory Allocation in Functional Languages Based on Lazy Evaluation

    Lazy evaluation is an evaluation strategy that delays a computation until its value is actually needed. Because computation proceeds only from the values that become needed, it can eliminate computations unnecessary for the final result and thereby optimise execution. At the same time, by leaving to the language implementation the decision of which values are needed, the programmer no longer has to specify how the computation proceeds, which leads to declarative, concise programs. For example, when handling recursive data structures such as lists, the code that produces a structure and the code that consumes it can be written separately, so the descriptive benefits of lazy evaluation are substantial. While lazy evaluation offers these benefits, implementing a language system that performs it requires careful design around the objects used to delay computation (delayed objects, hereafter thunks). In particular, the time and space costs of allocating thunks are a problem: even when lazy evaluation removes unnecessary computation, the run-time overhead of the program can become large. Aiming at efficient lazy evaluation, much research has therefore been devoted to static analyses that suppress thunk creation. For example, strictness analysis statically determines, from the program text, which computations are needed to produce the program's final result; expressions whose values are certainly needed can be evaluated without being delayed, so efficient code can be generated that creates no thunks for them. It has been confirmed for many programs that strictness analysis suppresses thunk creation, but because it uses only static information from the program text, thunks that could be eliminated by taking dynamic behaviour into account lie outside its scope. For example, when a factor such as how far a list will be consumed is decided only at run time, it is difficult to suppress, by strictness analysis alone, the creation of the thunks that delay the list. To reduce thunks in such cases, this thesis focuses on algebraic data structures defined by linear recursion, such as lists, and proposes Thunk Recycling, a technique that reuses existing thunks. Thunk Recycling destructively updates and reuses already-allocated thunks, suppressing the creation of new ones; for a list, for instance, the thunk that delays production of the rest of the list can be reused. The thesis first describes how Thunk Recycling operates and the mechanisms needed to realise it. To make reuse possible, reusable thunks are distinguished from ordinary thunks. The machinery consists of a compile-time transformation that prevents inconsistencies arising from destructive updates, and a run-time mechanism that performs the reuse; the basic policy of the program transformation is to make the reference to a reusable thunk unique, while the run-time mechanism largely reuses the existing machinery for creating and evaluating thunks. Next, the thesis gives a formal definition of Thunk Recycling and a proof of its correctness: we define a simple functional language, define the Thunk Recycling program transformation for it, define an operational semantics that reuses thunks, and use that semantics to prove that program behaviour is unchanged whether or not Thunk Recycling is applied. The thesis then describes an implementation of Thunk Recycling in the Glasgow Haskell Compiler (GHC), the standard implementation of Haskell, which serves as the basis of much research and incorporates advanced results such as new language concepts. We discuss the various possible design choices for implementing the Thunk Recycling machinery in GHC and the trade-offs that led to each decision. GHC is itself largely written in Haskell and is a large, sophisticated system built in a functional language, so this implementation also serves as a case study of modifying a large functional-language code base; from that perspective, the thesis discusses the lessons learned about implementing lazy functional language systems. Finally, the thesis reports experiments on the GHC implementation using benchmark programs: although the effect on execution time depends on the program, reuse reduced total memory allocation. (Doctoral thesis, The University of Electro-Communications.)
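The allocation pattern targeted by Thunk Recycling can be sketched in source Haskell (illustrative only; the technique itself is a compiler and runtime change, not expressible at source level):

```haskell
-- Consuming this lazily produced list allocates one fresh thunk per
-- tail. Thunk Recycling would instead destructively reuse the thunk
-- just evaluated for the next tail, which the compile-time
-- transformation makes safe by ensuring the reference is unique.
countdown :: Int -> [Int]
countdown 0 = []
countdown n = n : countdown (n - 1)

consume :: Int -> Int
consume n = sum (countdown n)
```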