
169 research outputs found

An approach to locality-conscious load balancing and transparent memory hierarchy management with a global-address-space parallel programming model

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

A pattern language for parallelizing irregular algorithms

Author: Monteiro Pedro Miguel Ferreira Costa
Publication venue: Faculdade de Ciências e Tecnologia
Publication date: 01/01/2009

Dissertation presented at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa for the degree of Master in Computer Engineering (Engenharia Informática). In irregular algorithms, the data set’s dependences and distributions cannot be statically predicted. This class of algorithms tends to organize computations in terms of data locality instead of parallelizing control in multiple threads. Thus, opportunities for exploiting parallelism vary dynamically, according to how the algorithm changes data dependences. As such, effective parallelization of such algorithms requires new approaches that account for that dynamic nature. This dissertation addresses the problem of building efficient parallel implementations of irregular algorithms by proposing to extract, analyze and document patterns of concurrency and parallelism present in the Galois parallelization framework for irregular algorithms. Patterns capture formal representations of a tangible solution to a problem that arises in a well-defined context within a specific domain. We document these patterns in a pattern language, i.e., a set of inter-dependent patterns that compose well-documented template solutions that can be reused whenever a certain problem arises in a well-known context.

Repositório da Universidade Nova de Lisboa

Geometry–aware finite element framework for multi–physics simulations: an algorithmic and software-centric perspective

Author: Krause Rolf
Zulian Patrick
Publication date: 03/08/2017

In finite element simulations, the handling of geometrical objects and their discrete representation is a critical aspect in both serial and parallel scientific software environments. The development of codes targeting such environments requires great development effort and many man-hours. In this thesis we approach these issues from three fronts. First, stable and efficient techniques for the transfer of discrete fields between non-matching volume or surface meshes are an essential ingredient for the discretization and numerical solution of coupled multi-physics and multi-scale problems. In particular, L2-projections allow for the transfer of discrete fields between unstructured meshes, both in the volume and on the surface. We present an algorithm for parallelizing the assembly of the L2-transfer operator for unstructured meshes which are arbitrarily distributed among different processes. The algorithm requires no a priori information on the geometrical relationship between the different meshes. Second, the geometric representation is often a limiting factor which imposes a trade-off between how accurately the shape is described, and what methods can be employed for solving a system of differential equations. Parametric finite-elements and bijective mappings between polygons or polyhedra allow us to flexibly construct finite element discretizations with arbitrary resolutions without sacrificing the accuracy of the shape description. Such flexibility allows employing state-of-the-art techniques, such as geometric multigrid methods, on meshes with almost any shape. Last, the way numerical techniques are represented in software libraries and approached from a development perspective affects both the usability and maintainability of such libraries. Completely separating the intent of high-level routines from the actual implementation and technologies allows for portable and maintainable performance. We provide an overview of current trends in the development of scientific software and showcase our open-source library utopia.

RERO DOC Digital Library
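
The L2-transfer idea mentioned in the abstract above can be illustrated in a much simpler setting than the thesis treats. The following is a minimal sketch, assuming one-dimensional meshes and piecewise-constant fields, in which the L2-projection reduces to overlap-weighted averaging. The function names (`overlap`, `l2_transfer_p0`, `apply_transfer`, `integral`) are hypothetical and are not taken from the utopia library or from the thesis.

```python
def overlap(a0, a1, b0, b1):
    """Length of the intersection of the intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def l2_transfer_p0(src_nodes, dst_nodes):
    """Assemble a dense transfer matrix T for piecewise-constant fields.

    Entry T[i][j] is |K_i ∩ S_j| / |K_i|, where K_i is target cell i and
    S_j is source cell j. This is the coupling (overlap) matrix scaled by
    the inverse of the diagonal mass matrix of the target mesh.
    Assumption: both node lists are sorted and cover the same interval.
    """
    rows = []
    for i in range(len(dst_nodes) - 1):
        d0, d1 = dst_nodes[i], dst_nodes[i + 1]
        rows.append([
            overlap(d0, d1, src_nodes[j], src_nodes[j + 1]) / (d1 - d0)
            for j in range(len(src_nodes) - 1)
        ])
    return rows


def apply_transfer(T, values):
    """Matrix-vector product: cell values on the target mesh."""
    return [sum(t * v for t, v in zip(row, values)) for row in T]


def integral(nodes, values):
    """Integral of a piecewise-constant field over its mesh."""
    return sum(v * (nodes[k + 1] - nodes[k]) for k, v in enumerate(values))


if __name__ == "__main__":
    src_nodes = [0.0, 0.5, 1.0]         # two source cells
    dst_nodes = [0.0, 0.25, 0.75, 1.0]  # three non-matching target cells
    T = l2_transfer_p0(src_nodes, dst_nodes)
    print(apply_transfer(T, [1.0, 3.0]))
```

In this simplified setting the transferred field has the same integral as the source field, which is one reason projection-based transfer is attractive for coupled problems. The sketch checks every pair of cells; it does not attempt the parallel intersection detection for distributed unstructured meshes that the abstract describes.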

Analysis and Optimization of Scientific Applications through Set and Relation Abstractions

Author: Tohid (Rastegar Tohid Mohammed), M.
Publication venue: LSU Digital Commons
Publication date: 01/01/2017

Writing high-performance code has steadily become more challenging as the design of computing systems has moved toward parallel processors in the form of multi-core and many-core architectures. This trend has resulted in increasingly heterogeneous architectures and programming models. Moreover, the prevalence of distributed systems, especially in fields relying on supercomputers, has made programming such diverse environments more difficult. To mitigate these challenges, an assortment of tools and programming models has been introduced in the past decade or so. Some efforts focused on the characteristics of the code, such as polyhedral compilers (e.g., Pluto, PPCG), while others took into consideration aspects of the application domain and proposed domain-specific languages (DSLs). DSLs are developed either as stand-alone languages, like Halide for image processing, or as part of a general-purpose language, in which case they are called embedded (e.g., Firedrake, a DSL embedded in Python for solving PDEs using FEM). All these approaches attempt to provide the best input to the underlying common programming models, such as MPI and OpenMP for distributed and shared memory systems, respectively. This dissertation introduces Kaashi, a high-level run-time system embedded in the C++ language, designed to manage memory and execution order of programs with large input data and complex dependencies. Kaashi provides a uniform front-end to multiple back-ends, focusing on distributed systems. Kaashi’s abstractions allow the programmer to define the problem’s data domain as a collection of sets and relations between pairs of such sets. This level of abstraction could enable a series of optimizations that are otherwise very expensive to detect or not feasible at all. Furthermore, Kaashi’s API helps novice programmers write more structured code without getting involved in the details of data management and communication.

Louisiana State University
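
To make the "sets and relations between pairs of sets" idea from the abstract above more concrete, here is a minimal sketch in Python. It is not Kaashi's API (Kaashi is described as embedded in C++, and its interface is not shown in this listing); the `Relation` class, its methods, and the mesh example are all hypothetical illustrations.

```python
from collections import defaultdict


class Relation:
    """A relation between two sets, stored as (source, target) pairs.

    Hypothetical illustration only; not the Kaashi interface.
    """

    def __init__(self, source, target, pairs):
        self.source = set(source)
        self.target = set(target)
        self.pairs = set(pairs)

    def image(self, element):
        """Sorted list of target elements related to `element`."""
        return sorted(t for s, t in self.pairs if s == element)

    def inverse(self):
        """The same relation read from target to source."""
        return Relation(self.target, self.source,
                        {(t, s) for s, t in self.pairs})

    def compose(self, other):
        """Pairs (a, c) such that (a, b) is in self and (b, c) is in other."""
        by_source = defaultdict(set)
        for b, c in other.pairs:
            by_source[b].add(c)
        pairs = {(a, c) for a, b in self.pairs for c in by_source[b]}
        return Relation(self.source, other.target, pairs)


if __name__ == "__main__":
    # Assumed example domain: two triangles sharing an edge.
    cells = {0, 1}
    vertices = {0, 1, 2, 3}
    cell_to_vertex = Relation(cells, vertices,
                              [(0, 0), (0, 1), (0, 2),
                               (1, 1), (1, 2), (1, 3)])
    # Cells that share at least one vertex, derived rather than stored.
    cell_to_cell = cell_to_vertex.compose(cell_to_vertex.inverse())
    print(cell_to_cell.image(0))
```

Describing the data domain this way means derived connectivity, such as which cells touch which, can be computed from declared relations instead of being hand-coded, which is the kind of information a run-time system could plausibly use when deciding data placement.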

Parallelizing Julia with a Non-Invasive DSL

Author: Anderson Todd A.
Kuper Lindsey
Liu Hai
Shpeisman Tatiana
Totoni Ehsan
Vitek Jan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st European Conference on Object-Oriented Programming (ECOOP 2017)
Publication date: 01/01/2017

Computational scientists often prototype software using productivity languages that offer high-level programming abstractions. When higher performance is needed, they are obliged to rewrite their code in a lower-level efficiency language. Different solutions have been proposed to address this trade-off between productivity and efficiency. One promising approach is to create embedded domain-specific languages that sacrifice generality for productivity and performance, but practical experience with DSLs points to some road blocks preventing widespread adoption. This paper proposes a non-invasive domain-specific language that makes as few visible changes to the host programming model as possible. We present ParallelAccelerator, a library and compiler for high-level, high-performance scientific computing in Julia. ParallelAccelerator's programming model is aligned with existing Julia programming idioms. Our compiler exposes the implicit parallelism in high-level array-style programs and compiles them to fast, parallel native code. Programs can also run in "library-only" mode, letting users benefit from the full Julia environment and libraries. Our results show encouraging performance improvements with very few changes to source code required. In particular, few to no additional type annotations are necessary.

Dagstuhl Research Online Publication Server

ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

Author: Antcheva Ilka
Ballintijn Maarten
Bellenot Bertrand
Biskup Marek
Brun Rene
Buncic Nenad
Canal Philippe
Casadei Diego
Couet Olivier
Fine Valery
Franco Leandro
Ganis Gerardo
Gheata Andrei
Goto Masaharu
Iwaszkiewicz Jan
Kreshuk Anna
Maline David Gonzalez
Maunder Richard
Moneta Lorenzo
Naumann Axel
Offermann Eddy
Onuchin Valeriy
Panacek Suzanne
Rademakers Fons
Russo Paul
Segura Diego Marcos
Tadel Matevz
Publication venue: 'Elsevier BV'
Publication date: 31/08/2015

ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks - e.g. data mining in HEP - by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.

arXiv.org e-Print Archive

CERN Document Server
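
The binning of one-dimensional data that the ROOT abstract above attributes to histogram classes can be sketched in a few lines. The following is plain Python, not ROOT's C++ API; the function name `fill_histogram`, its parameters, and the choice to count out-of-range values separately are assumptions made for illustration.

```python
def fill_histogram(values, n_bins, low, high):
    """Count values into n_bins equal-width bins covering [low, high).

    Returns (counts, underflow, overflow), where underflow and overflow
    count the values below `low` and at or above `high`.
    Hypothetical helper for illustration; not part of ROOT.
    """
    counts = [0] * n_bins
    underflow = 0
    overflow = 0
    width = (high - low) / n_bins
    for x in values:
        if x < low:
            underflow += 1
        elif x >= high:
            overflow += 1
        else:
            # Clamp the index to guard against floating-point rounding
            # pushing a value just below `high` into a nonexistent bin.
            index = min(int((x - low) / width), n_bins - 1)
            counts[index] += 1
    return counts, underflow, overflow


if __name__ == "__main__":
    sample = [0.1, 0.4, 0.5, 0.9, 1.0, -0.2]
    print(fill_histogram(sample, n_bins=2, low=0.0, high=1.0))
```

Because each value touches only one counter, histograms filled independently over disjoint chunks of a data set can be merged by adding their counts, which fits the intrinsically parallel style of processing the abstract mentions.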