3 research outputs found

    DynaProg for Scala

    Dynamic programming is an algorithmic technique for solving problems that obey Bellman's principle: optimal solutions depend on optimal solutions to sub-problems. The core idea behind dynamic programming is to memoize intermediate results in matrices to avoid repeated computation. Solving a dynamic programming problem consists of two phases: filling one or more matrices with intermediate solutions to sub-problems, and reconstructing how the final result was obtained (backtracking). In textbooks, problems are usually described in terms of recurrence relations between matrix elements. Expressing dynamic programming problems as recursive formulae over matrix indices can be difficult and is often error-prone, and the notation does not capture the essence of the underlying problem (for example, aligning two sequences). Moreover, writing a correct and efficient parallel implementation requires different competencies and often a significant amount of time. In this project, we present DynaProg, a domain-specific language (DSL) embedded in Scala, to address dynamic programming problems on heterogeneous platforms. DynaProg allows the programmer to write concise programs based on ADP [1], using a pair of parsing grammar and algebra; these programs can then be executed on either the CPU or the GPU. We evaluate the performance of our implementation against existing work and against our own hand-optimized baseline implementations for both the CPU and GPU versions. Experimental results show that plain Scala has a large overhead and is recommended only for small sequences (≤1024), whereas the generated GPU version is comparable with existing implementations: matrix chain multiplication has the same performance as our hand-optimized version (142% of the execution time of [2]) for a sequence of 4096 matrices, Smith-Waterman is twice as slow as [3] on a pair of sequences of 6144 elements, and RNA folding is on par with [4] (95% of its running time) for sequences of 4096 elements.
[1] Robert Giegerich and Carsten Meyer. Algebraic Dynamic Programming.
[2] Chao-Chin Wu, Jenn-Yang Ke, Heshan Lin and Wu Chun Feng. Optimizing dynamic programming on graphics processing units via adaptive thread-level parallelism.
[3] Edans Flavius de O. Sandes and Alba Cristina M. A. de Melo. Smith-Waterman alignment of huge sequences with GPU in linear space.
[4] Guillaume Rizk and Dominique Lavenier. GPU accelerated RNA folding algorithm.
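
The two phases described in the abstract (filling a matrix, then backtracking) can be illustrated with a minimal sketch for matrix chain multiplication, one of the benchmarks mentioned. This is plain Python written by hand with explicit indices, which is the textbook style the abstract contrasts DynaProg with; it is not DynaProg code, and the names `matrix_chain`, `cost` and `split` are hypothetical.

```python
def matrix_chain(dims):
    """Return (minimal scalar multiplications, parenthesization).

    Assumption: matrix i has shape dims[i] x dims[i + 1].
    """
    n = len(dims) - 1                      # number of matrices in the chain
    cost = [[0] * n for _ in range(n)]     # cost[i][j]: best cost for A_i..A_j
    split = [[0] * n for _ in range(n)]    # split[i][j]: best split point k

    # Phase 1: fill the matrices, shortest sub-chains first.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best = None
            for k in range(i, j):
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if best is None or c < best:
                    best = c
                    split[i][j] = k
            cost[i][j] = best

    # Phase 2: backtrack through the recorded split points.
    def paren(i, j):
        if i == j:
            return f"A{i}"
        k = split[i][j]
        return f"({paren(i, k)} x {paren(k + 1, j)})"

    return cost[0][n - 1], paren(0, n - 1)
```

For three matrices of shapes 10x30, 30x5 and 5x60, `matrix_chain([10, 30, 5, 60])` returns a cost of 4500 with the parenthesization `((A0 x A1) x A2)`.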

    EFFICIENT SCHEDULING OF DYNAMIC PROGRAMMING ALGORITHMS ON MULTICORE ARCHITECTURES

    Dynamic programming is one of the Berkeley 13 dwarfs and is widely used for solving various combinatorial and optimization problems, including matrix chain multiplication, longest common subsequence (LCS) and binary (0/1) knapsack. Because the inherent dependences in dynamic programming algorithms are non-uniform, the subproblems must be scheduled effectively onto processing cores to make the best use of multicore technology. The computational matrix of dynamic programming is divided into three parts: the growing region, the stable region and the shrinking region, depending on whether the number of subproblems increases, remains stable or decreases uniformly phase by phase, respectively. We realize parallel implementations of matrix chain multiplication, longest common subsequence and 0/1 knapsack on Intel Xeon X5650 and E5-2695 using OpenMP with different scheduling policies and adequate chunk sizes. We conclude that, for the growing or the shrinking region of the dynamic programming parallelization adopted in this article, the guided schedule is better than the other scheduling schemes. The static or dynamic schedule is better for the stable region. For dynamic programming algorithms in which all three regions are present, more speedup is achieved by applying the mixed scheduling approach than by applying a single scheduling technique to the entire computation. In LCS, approximately 20% more speedup is achieved using the mixed scheduling technique over the conventional single scheduling approach on Intel Xeon E5-2695.
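
The growing, stable and shrinking regions can be made concrete with a small sketch for LCS. It assumes that a "phase" is one anti-diagonal of the table, so that all cells within a phase are independent of each other; the abstract does not spell out this decomposition, and the helper names below are hypothetical. The sketch is sequential Python, not the OpenMP implementation evaluated in the paper.

```python
def diagonal_rows(d, m, n):
    """Rows i of the cells (i, j) with i + j == d, 1 <= i <= m, 1 <= j <= n."""
    return range(max(1, d - n), min(m, d - 1) + 1)


def phase_sizes(m, n):
    """Number of independent subproblems in each anti-diagonal phase."""
    return [len(diagonal_rows(d, m, n)) for d in range(2, m + n + 1)]


def lcs_length_wavefront(a, b):
    """LCS length computed phase by phase along anti-diagonals."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for d in range(2, m + n + 1):
        # Cells on one anti-diagonal depend only on earlier diagonals,
        # so this inner loop is the part a parallel scheduler would split.
        for i in diagonal_rows(d, m, n):
            j = d - i
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]
```

Under this assumption, `phase_sizes(3, 5)` gives `[1, 2, 3, 3, 3, 2, 1]`: the phase size grows, stays constant, then shrinks, which is the pattern the three regions describe.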

    Specialising Parsers for Queries

    Many software systems consist of data processing components that analyse large datasets to gather information and learn from it. Often, only part of the data is relevant for analysis. Data processing systems therefore contain an initial preprocessing step that filters out the unwanted information. While efficient data analysis techniques and methodologies are accessible to non-expert programmers, data preprocessing seems to be forgotten, or worse, ignored. This is despite real performance gains being possible by efficiently preprocessing data. Implementations of the data preprocessing step traditionally have to trade modularity for performance: to achieve the former, one separates the parsing of raw data from filtering it, which leads to slow programs because of the creation of intermediate objects during execution. The efficient version is a low-level implementation that interleaves parsing and querying. In this dissertation we demonstrate a principled and practical technique to convert the modular, maintainable program into its interleaved, efficient counterpart. Key to achieving this objective is the removal, or deforestation, of intermediate objects in a program execution. We first show that by encoding data types using Böhm-Berarducci encodings (often referred to as Church encodings), and combining these with partial evaluation for function composition, we achieve deforestation. This allows us to implement optimisations themselves as libraries, with minimal dependence on an underlying optimising compiler. Next we illustrate the applicability of this approach to parsing and preprocessing queries. The approach is general enough to cover top-down and bottom-up parsing techniques, and deforestation of pipelines of operations on lists and streams. We finally present a set of transformation rules that, for a parser on a nested data format and a query on the structure, produce a parser specialised for the query. As a result we preserve the modularity of writing parsers and queries separately while also minimising resource usage. These transformation rules combine deforested implementations of both libraries to yield an efficient, interleaved result.
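
The idea of deforesting a list pipeline by encoding the list as its fold can be sketched as follows. This is a simplified illustration in Python, not the dissertation's implementation: it assumes a left-fold encoding for simplicity, it performs no partial evaluation, and all names (`from_range`, `map_`, `filter_`, `sum_`) are hypothetical.

```python
def from_range(lo, hi):
    """Encode the list lo, lo+1, ..., hi-1 as a function that folds over it."""
    def fold(step, init):
        acc = init
        for x in range(lo, hi):
            acc = step(acc, x)
        return acc
    return fold


def map_(f, xs):
    """Map over a fold-encoded list by transforming the step function."""
    return lambda step, init: xs(lambda acc, x: step(acc, f(x)), init)


def filter_(p, xs):
    """Filter a fold-encoded list by skipping the step for rejected items."""
    return lambda step, init: xs(
        lambda acc, x: step(acc, x) if p(x) else acc, init)


def sum_(xs):
    """Consume a fold-encoded list with addition."""
    return xs(lambda acc, x: acc + x, 0)


# Squares of the even numbers below 10. Composing map_ and filter_ only
# composes step functions; no intermediate list is built before sum_ runs.
pipeline = map_(lambda x: x * x, filter_(lambda x: x % 2 == 0, from_range(0, 10)))
total = sum_(pipeline)
```

Here the single loop inside `from_range` does all the work, and `total` is 120. In a setting with partial evaluation, the nested lambdas could additionally be inlined away; this sketch does not attempt that step.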