Search CORE

141 research outputs found

Parallel computing 2011, ParCo 2011: book of abstracts

Author: D'Hollander Erik
Publication venue: Academia Press Scientific Publishers
Publication date: 01/01/2011
Field of study

This book contains the abstracts of the presentations at the conference Parallel Computing 2011, 30 August - 2 September 2011, Ghent, Belgiu

Ghent University Academic Bibliography

Structured parallelism discovery with hybrid static-dynamic analysis and evaluation technique

Author: Vasiladiotis Christos
Publication venue: The University of Edinburgh
Publication date: 17/01/2023
Field of study

Parallel computer architectures have dominated the computing landscape for the past two decades; a trend that is only expected to continue and intensify, with increasing specialization and heterogeneity. This creates huge pressure across the software stack to produce programming languages, libraries, frameworks and tools which will efficiently exploit the capabilities of parallel computers, not only for new software, but also revitalizing existing sequential code. Automatic parallelization, despite decades of research, has had limited success in transforming sequential software to take advantage of efficient parallel execution. This thesis investigates three approaches that use commutativity analysis as the enabler for parallelization. This has the potential to overcome limitations of traditional techniques. We introduce the concept of liveness-based commutativity for sequential loops. We examine the use of a practical analysis utilizing liveness-based commutativity in a symbolic execution framework. Symbolic execution represents input values as groups of constraints, consequently deriving the output as a function of the input and enabling the identification of further program properties. We employ this feature to develop an analysis and discern commutativity properties between loop iterations. We study the application of this approach on loops taken from real-world programs in the OLDEN and NAS Parallel Benchmark (NPB) suites, and identify its limitations and related overheads. Informed by these findings, we develop Dynamic Commutativity Analysis (DCA), a new technique that leverages profiling information from program execution with specific input sets. Using profiling information, we track liveness information and detect loop commutativity by examining the code’s live-out values. We evaluate DCA against almost 1400 loops of the NPB suite, discovering 86% of them as parallelizable. Comparing our results against dependence-based methods, we match the detection efficacy of two dynamic and outperform three static approaches, respectively. Additionally, DCA is able to automatically detect parallelism in loops which iterate over Pointer-Linked Data Structures (PLDSs), taken from wide range of benchmarks used in the literature, where all other techniques we considered failed. Parallelizing the discovered loops, our methodology achieves an average speedup of 3.6× across NPB (and up to 55×) and up to 36.9× for the PLDS-based loops on a 72-core host. We also demonstrate that our methodology, despite relying on specific input values for profiling each program, is able to correctly identify parallelism that is valid for all potential input sets. Lastly, we develop a methodology to utilize liveness-based commutativity, as implemented in DCA, to detect latent loop parallelism in the shape of patterns. Our approach applies a series of transformations which subsequently enable multiple applications of DCA over the generated multi-loop code section and match its loop commutativity outcomes against the expected criteria for each pattern. Applying our methodology on sets of sequential loops, we are able to identify well-known parallel patterns (i.e., maps, reduction and scans). This extends the scope of parallelism detection to loops, such as those performing scan operations, which cannot be determined as parallelizable by simply evaluating liveness-based commutativity conditions on their original form

Edinburgh Research Archive

Effective data parallel computing on multicore processors

Author: Byun Jong-Ho
NC DOCKS at The University of North Carolina at Charlotte
Publication venue
Publication date: 01/01/2010
Field of study

The rise of chip multiprocessing or the integration of multiple general purpose processing cores on a single chip (multicores), has impacted all computing platforms including high performance, servers, desktops, mobile, and embedded processors. Programmers can no longer expect continued increases in software performance without developing parallel, memory hierarchy friendly software that can effectively exploit the chip level multiprocessing paradigm of multicores. The goal of this dissertation is to demonstrate a design process for data parallel problems that starts with a sequential algorithm and ends with a high performance implementation on a multicore platform. Our design process combines theoretical algorithm analysis with practical optimization techniques. Our target multicores are quad-core processors from Intel and the eight-SPE IBM Cell B.E. Target applications include Matrix Multiplications (MM), Finite Difference Time Domain (FDTD), LU Decomposition (LUD), and Power Flow Solver based on Gauss-Seidel (PFS-GS) algorithms. These applications are popular computation methods in science and engineering problems and are characterized by unit-stride (MM, LUD, and PFS-GS) or 2-point stencil (FDTD) memory access pattern. The main contributions of this dissertation include a cache- and space-efficient algorithm model, integrated data pre-fetching and caching strategies, and in-core optimization techniques. Our multicore efficient implementations of the above described applications outperform nai¨ve parallel implementations by at least 2x and scales well with problem size and with the number of processing cores

The University of North Carolina at Greensboro

How to Safely Integrate Multiple Applications on Embedded Many-Core Systems by Applying the “Correctness by Construction” Principle

Author: Robert Hilbrich
Publication venue: 'Hindawi Limited'
Publication date
Field of study

Crossref

High-performance computing for acceleration of medical image processing and analysis techniques

Author: Carlos Alex Sander Juvencio Gulo
Publication venue
Publication date: 19/12/2019
Field of study

Repositório Aberto da Universidade do Porto

Recommended from our members

Energy-efficient mobile Web computing

Author: Zhu Yuhao
Publication venue
Publication date: 17/07/2017
Field of study

Next-generation Web services will be primarily accessed through mobile devices. However, mobile devices are low-performance and stringently energy-constrained. In my dissertation, I propose the design of a high-performance and energy-efficient mobile Web computing substrate. It is a hardware/software co-designed system that delivers satisfactory user quality-of-service (QoS) experience on a mobile energy budget. The key insight is that the traditional interfaces between different Web stacks need to be enhanced with new abstractions that express user QoS experience and that expose architectural-level complexities. On the basis of the enhanced interfaces, I propose synergistic cross-layer optimizations across the processor architecture, Web runtime, programming language, and application layers to maximize the whole system efficiency. The contributions made in this dissertation will likely have a long-term impact because the target application domain, the Web, is becoming a universal mobile development platform, and because our solutions target the fundamental computation layers of the Web domain.Electrical and Computer Engineerin

Texas ScholarWorks

SPOT: A DSL for Extending Fortran Programs with Metaprogramming

Author: Jeff Gray
Songqing Yue
Publication venue: 'Hindawi Limited'
Publication date
Field of study

Crossref

Veröffentlichungen und Vorträge 2009 der Mitglieder der Fakultät für Informatik

Author: Karlsruher Institut für Technologie
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2012
Field of study

KITopen

3rd Many-core Applications Research Community (MARC) Symposium. (KIT Scientific Reports ; 7598)

Author: Becker Jürgen
Göhringer Diana
Hübner Michael
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2011
Field of study

This manuscript includes recent scientific work regarding the Intel Single Chip Cloud computer and describes approaches for novel approaches for programming and run-time organization

KITopen