Support to MPI Applications on the Grid

Author: Fernández-del-Castillo Enol
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 20/06/2012

The current middleware stacks provide varying support for the Message Passing Interface (MPI) programming paradigm. Users face a complex and heterogeneous environment in which too many low-level details have to be specified to execute even the simplest parallel jobs. MPI-Start is a tool that provides an interoperable MPI execution framework across the different middleware implementations. It abstracts the user interfaces from the underlying middleware and allows users to execute parallel applications in a uniform way, thus bridging the gap between HPC and HTC. In this work we present the latest developments in MPI-Start and how it can be integrated in the different middleware stacks available as part of EMI, providing a unified user experience for MPI jobs.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Galois : a system for parallel execution of irregular algorithms

Author: Nguyen Donald Do
Publication date: 04/09/2015

A programming model which allows users to program with high productivity and which produces high performance executions has been a goal for decades. This dissertation makes progress towards this elusive goal by describing the design and implementation of the Galois system, a parallel programming model for shared-memory, multicore machines. Central to the design is the idea that scheduling of a program can be decoupled from the core computational operator and data structures. However, efficient programs often require application-specific scheduling to achieve best performance. To bridge this gap, an extensible and abstract scheduling policy language is proposed, which allows programmers to focus on selecting high-level scheduling policies while delegating the tedious task of implementing the policy to a scheduler synthesizer and runtime system. Implementations of deterministic and prioritized scheduling also are described. An evaluation of a well-studied benchmark suite reveals that factoring programs into operators, schedulers and data structures can produce significant performance improvements over unfactored approaches. Comparison of the Galois system with existing programming models for graph analytics shows significant performance improvements, often orders of magnitude more, due to (1) better support for the restrictive programming models of existing systems and (2) better support for more sophisticated algorithms and scheduling, which cannot be expressed in other systems.

Computer Science

Texas ScholarWorks

Java in the High Performance Computing arena: Research, practice and experience

Author: Doallo Ramón
Expósito Roberto R.
López Taboada Guillermo
Ramos Garea Sabela
Touriño Juan
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

This is a post-peer-review, pre-copyedit version of an article published in Science of Computer Programming. The final authenticated version is available online at: https://doi.org/10.1016/j.scico.2011.06.002

[Abstract] The rising interest in Java for High Performance Computing (HPC) is based on the appealing features of this language for programming multi-core cluster architectures, particularly the built-in networking and multithreading support, and the continuous increase in Java Virtual Machine (JVM) performance. However, its adoption in this area is being delayed by the lack of analysis of the existing programming options in Java for HPC, by the lack of thorough and up-to-date evaluations of their performance, and by the lack of awareness of current research projects in this field, whose solutions are needed in order to boost the adoption of Java in HPC. This paper analyzes the current state of Java for HPC, both for shared and distributed memory programming, presents related research projects, and finally evaluates the performance of current Java HPC solutions and research developments on two shared memory environments and two InfiniBand multi-core clusters. The main conclusions are that: (1) the significant interest in Java for HPC has led to the development of numerous projects, although usually quite modest, which may have prevented a higher development of Java in this field; (2) Java can achieve almost similar performance to natively compiled languages, both for sequential and parallel applications, being an alternative for HPC programming; (3) the recent advances in the efficient support of Java communications on shared memory and low-latency networks are bridging the gap between Java and natively compiled applications in HPC. Thus, the good prospects of Java in this area are attracting the attention of both industry and academia, which can take significant advantage of Java adoption in HPC.

Ministerio de Ciencia e Innovación; TIN2010-16735
Ministerio de Educación, Cultura y Deporte; AP2009-211

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Adapting the interior point method for the solution of linear programs on high performance computers

Author: Ashcroft
Bixby
Chen
Duff
Forrest
Gay
George
Golub
Karmarkar
Lai
Liu
Megiddo
Monteiro
Publication venue: Brunel University
Publication date: 01/01/1991

In this paper we describe a unified algorithmic framework for the interior point method (IPM) of solving Linear Programs (LPs) which allows us to adapt it over a range of high performance computer architectures. We set out the reasons why IPM makes better use of high performance computer architectures than the sparse simplex method. In the inner iteration of the IPM a search direction is computed using Newton or higher order methods. Computationally this involves solving a sparse symmetric positive definite (SSPD) system of equations. The choice of direct and indirect methods for the solution of this system and the design of data structures to take advantage of coarse grain parallel and massively parallel computer architectures are considered in detail. Finally, we present experimental results of solving NETLIB test problems on examples of these architectures and put forward arguments as to why integration of the system within sparse simplex is beneficial.
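
The SSPD solve mentioned in the abstract can be illustrated with a small sketch. It assumes the common normal-equations form M = A D Aᵀ, with D a positive diagonal scaling, and solves M Δy = r by Cholesky factorization, one example of a direct method. The abstract does not state the exact system the authors use, so the form of M, the function names, and the toy data below are all illustrative assumptions. The code is also dense and pure Python for readability, whereas the paper concerns sparse systems.

```python
import math

def form_normal_matrix(A, d):
    """Return M = A * diag(d) * A^T for a dense row-major matrix A (assumed form)."""
    m, n = len(A), len(A[0])
    return [[sum(A[i][k] * d[k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]

def cholesky(M):
    """Return lower-triangular L with M = L * L^T; raise if M is not positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_spd(M, r):
    """Solve M y = r for symmetric positive definite M via Cholesky."""
    L = cholesky(M)
    n = len(r)
    # Forward substitution: L z = r
    z = [0.0] * n
    for i in range(n):
        z[i] = (r[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T y = z
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (z[i] - sum(L[k][i] * y[k] for k in range(i + 1, n))) / L[i][i]
    return y

# Toy data (hypothetical): 2 constraints, 4 variables, positive scaling d.
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, -1.0, 0.0, 1.0]]
d = [1.0, 2.0, 0.5, 4.0]
r = [1.0, 2.0]

M = form_normal_matrix(A, d)
dy = solve_spd(M, r)
```

In an actual IPM the scaling `d` changes at every iteration while the sparsity pattern of M stays fixed, which is one reason direct sparse factorizations are attractive for this step.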

CiteSeerX

Crossref

Brunel University Research Archive

A Comparison of Parallel Graph Processing Implementations

Author: Norris Boyana
Pollard Samuel
Publication date: 16/05/2017

The rapidly growing number of large network analysis problems has led to the emergence of many parallel and distributed graph processing systems; one survey in 2014 identified over 80. Since then, the landscape has evolved; some packages have become inactive while more are being developed. Determining the best approach for a given problem is infeasible for most developers. To enable easy, rigorous, and repeatable comparison of the capabilities of such systems, we present an approach and associated software for analyzing the performance and scalability of parallel, open-source graph libraries. We demonstrate our approach on five graph processing packages: GraphMat, the Graph500, the Graph Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph using synthetic and real-world datasets. We examine previously overlooked aspects of parallel graph processing performance such as phases of execution and energy usage for three algorithms: breadth first search, single source shortest paths, and PageRank, and compare our results to Graphalytics.

Comment: 10 pages, 10 figures, Submitted to EuroPar 2017 and rejected. Revised and submitted to IEEE Cluster 201

arXiv.org e-Print Archive

Crossref