Improving parallel program performance using critical path analysis
A programming tool that performs critical path analysis for parallel programs has been developed. This tool determines the critical path for the program as scheduled onto a parallel computer with P processing elements, the critical path for the program expressed as a data flow graph (where maximal parallelism can be expressed), and the minimum number of processing elements (P_opt) needed to obtain maximum program speedup. Experiments were performed using several versions of a Gaussian elimination program to examine how speedup varied with changes in granularity and critical path length. These experiments showed that when the available number of processing elements P < P_opt, increasing granularity improved program speedup more than reducing the data flow graph's critical path length, whereas when P ≥ P_opt, increasing granularity degraded program speedup while reducing critical path length improved it.
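As a rough illustration of the quantities such a tool reports (this is not the tool itself; all task names and costs below are invented), the following sketch computes the critical path length T_inf of a weighted task DAG and the speedup bound T_1/T_inf that no schedule can exceed:

```python
# Minimal sketch: critical-path length of a weighted task DAG.
# Node weights are task costs; edges are dependencies. All values invented.
from collections import defaultdict

def critical_path(costs, edges):
    """Length of the longest weighted path through the DAG."""
    succs, indeg = defaultdict(list), defaultdict(int)
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1
    order = [n for n in costs if indeg[n] == 0]   # Kahn's topological sort;
    for n in order:                               # order grows as we walk it
        for m in succs[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                order.append(m)
    finish = {}
    for n in order:
        start = max((finish[p] for p, q in edges if q == n), default=0.0)
        finish[n] = start + costs[n]
    return max(finish.values())

costs = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 2.0}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
t_inf = critical_path(costs, edges)   # critical path of the data flow graph
t_1 = sum(costs.values())             # total work on one processing element
print(t_inf, t_1 / t_inf)             # 7.0 and the speedup bound ~1.14
```

No schedule on any number of processing elements can finish faster than T_inf, which is why the abstract's P_opt marks the point where adding processors stops helping.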
Data flow analysis of parallel programs
Data flow analysis is the prerequisite for performing optimizations such as common subexpression elimination or code motion of partially redundant expressions on imperative sequential programs. To apply these transformations to parallel imperative programs, the notion of data flow must be extended to concurrent programs. The additional source language features are: a common address space (shared memory), nested parallel statements (PAR), or-parallelism, critical regions, and message passing. The underlying interleaving semantics of the concurrently executed processes results in the so-called state space explosion, which at first appearance prevents the computation of the meet-over-all-paths solution needed for data flow analysis. For the class of one-bit data flow problems (also known as bit-vector problems) we can show that not all interleavings are needed to compute the meet-over-all-paths solution. Based on that, we can give simple data flow equations representing the data flow effects of the PAR statement. The definition of a parallel control flow graph leads to an efficient extension of Kildall's algorithm to compute the data flow of a concurrent program. The time complexity is the same as for analyzing a "comparable" sequential program.
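To make the bit-vector setting concrete, here is a toy sketch of gen/kill transfer functions with a sequential and a parallel combinator. The PAR rule shown (gens union; a fact is killed only if some branch kills it and no branch can regenerate it in a later interleaving step) is one plausible reading for a may-problem such as reaching definitions; it is illustrative, not necessarily the paper's exact equations.

```python
# Toy gen/kill effects for a bit-vector (one-bit) data flow problem.
# An effect is a pair (gen, kill) of sets; out = (in - kill) | gen.

def seq(f, g):
    """Compose two effects executed in sequence: f, then g."""
    (gf, kf), (gg, kg) = f, g
    return ((gf - kg) | gg, kf | kg)

def par(f, g):
    """Combine two effects whose statements may interleave (PAR).
    A definition survives if *some* interleaving lets it survive."""
    (gf, kf), (gg, kg) = f, g
    return (gf | gg, (kf | kg) - (gf | gg))

s1 = (frozenset({"d1"}), frozenset({"d2"}))   # branch 1: gen d1, kill d2
s2 = (frozenset({"d3"}), frozenset({"d1"}))   # branch 2: gen d3, kill d1
print(seq(s1, s2))   # gen {d3}, kill {d1, d2}: s2 re-kills d1
print(par(s1, s2))   # gen {d1, d3}, kill {d2}: an interleaving keeps d1
```

The point mirrored from the abstract: the parallel effect is computed directly from the two branch effects, with no enumeration of interleavings.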
Testing Data Transformations in MapReduce Programs
MapReduce is a parallel data processing paradigm oriented to processing large volumes of information in data-intensive applications, such as Big Data environments. A characteristic of these applications is that they can have different data sources and data formats. For these reasons, the inputs may contain some poor-quality data that can produce a failure if the program functionality does not properly handle the variety of input data. The output of these programs is obtained from a number of input transformations that represent the program logic. This paper proposes a testing technique called MRFlow that is based on data flow test criteria and oriented to the analysis of transformations between the input and the output in order to detect defects in MapReduce programs. MRFlow is applied over several MapReduce programs and detects several defects.
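As a generic illustration of the kind of input-to-output transformation such tests target (this sketch is not the MRFlow technique itself; the job and the malformed records are invented), consider a word count whose mapper must tolerate dirty input:

```python
# A tiny MapReduce-style word count plus a test feeding it malformed
# records, the poor-quality input the abstract argues tests must cover.
from itertools import groupby

def mapper(record):
    if not isinstance(record, str):   # defensive path for bad records
        return []
    return [(word.lower(), 1) for word in record.split()]

def reducer(key, values):
    return (key, sum(values))

def run_job(records):
    pairs = [kv for r in records for kv in mapper(r)]
    pairs.sort(key=lambda kv: kv[0])                  # the shuffle phase
    return [reducer(k, [v for _, v in group])
            for k, group in groupby(pairs, key=lambda kv: kv[0])]

# One test per transformation path: clean text, a non-string record,
# and a record that differs only in case.
assert run_job(["a b a", None, "B"]) == [("a", 2), ("b", 2)]
```

A data-flow-oriented criterion would demand at least one test exercising each such path from input to output, including the defensive one.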
Structured Parallel Programming with Trees
High-level abstractions for parallel programming are still immature. Computations on complicated data structures such as pointer structures are considered irregular algorithms. General graph structures, which irregular algorithms typically deal with, are difficult to divide and conquer. Because the divide-and-conquer paradigm is essential for load balancing in parallel algorithms and a key to parallel programming, general graphs remain genuinely difficult. Trees, however, lead to divide-and-conquer computations by definition and are sufficiently general and powerful as a programming tool. We therefore deal with abstractions of tree-based computations. Our study started from Matsuzaki's work on tree skeletons. We have improved the usability of tree skeletons by enriching their implementation. Specifically, we have dealt with two issues: we first implemented a loose coupling between skeletons and data structures and developed a flexible tree skeleton library; we then implemented a parallelizer that transforms sequential recursive functions in C into parallel programs that use tree skeletons implicitly. This parallelizer hides the complicated API of tree skeletons and lets programmers use tree skeletons with no burden. Unfortunately, the practicality of tree skeletons has not thereby improved. On the basis of observations from the practice of tree skeletons, we deal with two application domains: program analysis and neighborhood computation. In the domain of program analysis, compilers treat input programs as control-flow graphs (CFGs) and perform analysis on CFGs; program analysis is therefore difficult to divide and conquer. To resolve this problem, we have developed divide-and-conquer methods for program analysis in a syntax-directed manner on the basis of Rosen's high-level approach. Specifically, we have dealt with data-flow analysis based on Tarjan's formalization and value-graph construction based on a functional formalization. In the domain of neighborhood computations, a primary issue is locality: a naive parallel neighborhood computation without locality enhancement causes many cache misses. The divide-and-conquer paradigm is known to be useful for locality enhancement as well. We therefore have applied algebraic formalizations and a tree-segmenting technique derived from tree skeletons to the locality enhancement of neighborhood computations.
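For readers unfamiliar with tree skeletons, the following toy sketch shows the flavor of the interface: a bottom-up tree reduction parameterized by a leaf function and a node combiner. It is sequential and invented for illustration; an actual skeleton library such as the one described above segments the tree and evaluates segments in parallel.

```python
# Sequential sketch of a binary-tree "reduce" skeleton (illustration only).

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def tree_reduce(leaf, node, t):
    """Collapse a tree bottom-up: `leaf` maps leaves, `node` combines."""
    if t.left is None and t.right is None:
        return leaf(t.value)
    return node(t.value,
                tree_reduce(leaf, node, t.left),
                tree_reduce(leaf, node, t.right))

# Evaluate the expression tree 1 + (2 * 3) by divide and conquer.
t = Node("+", Node(1), Node("*", Node(2), Node(3)))
ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
print(tree_reduce(lambda v: v, lambda v, l, r: ops[v](l, r), t))   # 7
```

Because the two recursive calls are independent, this structure is exactly what makes trees amenable to the divide-and-conquer parallelism the abstract relies on.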
Correction of DNA Sequencing Data with Spaced Seeds
The advent of next-generation sequencing technologies has allowed the bridging of wet-lab work and large-scale data analysis into a cohesive workflow; with the increasing speed and efficiency of sequencing organisms, it becomes imperative that we are able to ensure the data produced is correct.
We designed and implemented a new algorithm, QUESS, which is based on using multiple spaced seeds to correct DNA sequencing data from Illumina MiSeq, HiSeq and NextSeq machines, using C++ and OpenMP for parallel computing. We compared our method with ten leading programs, producing consistently better overall results for most tested measures. QUESS has the best average performance among all programs tested and is also competitive in terms of time and space complexity.
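The core idea of a spaced seed can be sketched in a few lines (the seed pattern and function below are invented for illustration, not taken from QUESS): a seed is a 0/1 mask, and reads are matched only on the bases under the 1-positions, so an isolated sequencing error falling on a 0-position does not break the match.

```python
# Illustration of the spaced-seed idea; the pattern below is made up.
SEED = "1101011"   # 1 = position must match, 0 = don't care

def seed_key(read, offset, seed=SEED):
    """Bases of `read` under the seed's match positions, from `offset`."""
    window = read[offset:offset + len(seed)]
    return "".join(base for base, s in zip(window, seed) if s == "1")

print(seed_key("ACGTACGT", 0))   # 'ACTCG': a hashable key for counting
```

Keys like this can be counted across all reads; a base whose keys are rare while a one-letter variant is frequent is a candidate for correction.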
Exploring performance and power properties of modern multicore chips via simple machine models
Modern multicore chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of the running code. Going beyond a simple bottleneck analysis, we employ the recently published Execution-Cache-Memory (ECM) model to describe the single- and multi-core performance of streaming kernels. The model refines the well-known roofline model, since it can predict the scaling and the saturation behavior of bandwidth-limited loop kernels on a multicore chip. The saturation point is especially relevant for considerations of energy consumption. From power dissipation measurements of benchmark programs with vastly different demands on the hardware, we derive a simple, phenomenological power model for the Sandy Bridge processor. Together with the ECM model, we are able to explain many peculiarities in the performance and power behavior of multicore processors, and derive guidelines for energy-efficient execution of parallel programs. Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice-Boltzmann flow solver code.
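The saturation argument can be paraphrased with a few invented numbers (these are stand-ins, not Sandy Bridge measurements, and this is a simplification of the ECM model): if one core sustains a fixed kernel bandwidth, aggregate bandwidth scales linearly until it hits the chip's memory bandwidth roof, and cores added beyond that point cost power without adding performance.

```python
import math

# Made-up stand-in numbers; not measurements from the paper.
bw_core = 7e9    # bytes/s one core sustains on a streaming kernel (assumed)
bw_chip = 40e9   # bytes/s full-chip memory bandwidth (assumed)

def kernel_bw(cores):
    """Aggregate kernel bandwidth: linear scaling until the chip saturates."""
    return min(cores * bw_core, bw_chip)

n_sat = math.ceil(bw_chip / bw_core)   # smallest core count at the roof
print(n_sat)                           # 6 with these numbers
print([kernel_bw(n) / 1e9 for n in range(1, 9)])   # GB/s: 7, 14, ..., 40, 40
```

This is why the saturation point matters for energy: with a power model on top, running a bandwidth-limited kernel on more than n_sat cores burns power for no extra performance.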
Dead code elimination based pointer analysis for multithreaded programs
This paper presents a new approach for optimizing multithreaded programs with pointer constructs. The approach has applications in the area of certified code (proof-carrying code), where a justification or a proof of the correctness of each optimization is required. The optimization meant here is dead code elimination.
Towards optimizing multithreaded programs, the paper presents a new operational semantics for parallel constructs like join-fork constructs, parallel loops, and conditionally spawned threads. The paper also presents a novel type system for flow-sensitive pointer analysis of multithreaded programs. This type system is extended to obtain a new type system for live-variables analysis of multithreaded programs. The live-variables type system is in turn extended to build the third novel type system proposed in this paper, which carries out the optimization of dead code elimination. The justification mentioned above takes the form of a type derivation in our approach.
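The effect of the dead code elimination stage can be illustrated on straight-line code with an ordinary backward liveness pass (the paper formulates this as a type system over multithreaded programs; the toy pass below is single-threaded and assumes side-effect-free assignments):

```python
# Liveness-driven dead code elimination on straight-line three-address code.
# Assumes assignments have no side effects; all names are invented.

def eliminate_dead(code, live_out):
    """code: list of (target, used_vars) pairs; returns kept statements."""
    live, kept = set(live_out), []
    for target, used in reversed(code):
        if target in live:            # result needed downstream: keep it
            kept.append((target, used))
            live.discard(target)
            live |= set(used)
        # otherwise the assignment is dead and is dropped
    return list(reversed(kept))

prog = [("t", ("a", "b")), ("u", ("t",)), ("v", ("a",))]
print(eliminate_dead(prog, live_out={"v"}))   # keeps only ('v', ('a',))
```

In the paper's setting, the analogous liveness facts are carried by type derivations, which then double as the correctness proofs that certified code requires.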