    Programmability of the HPCS Languages: A Case Study with a Quantum Chemistry Kernel (Extended Version)

    Using the High Productivity Language Chapel to Target GPGPU Architectures

    It has been widely shown that GPGPU architectures offer large performance gains compared to their traditional CPU counterparts for many applications. The downside to these architectures is that the current programming models present numerous challenges to the programmer: lower-level languages, explicit data movement, loss of portability, and challenges in performance optimization. In this paper, we present novel methods and compiler transformations that increase productivity by enabling users to easily program GPGPU architectures using the high productivity programming language Chapel. Rather than resorting to different parallel libraries or annotations for a given parallel platform, we leverage a language that has been designed from first principles to address the challenge of programming for parallelism and locality. This also has the advantage of being portable across distinct classes of parallel architectures, including desktop multicores, distributed memory clusters, large-scale shared memory, and now CPU-GPU hybrids. We present experimental results from the Parboil benchmark suite which demonstrate that codes written in Chapel achieve performance comparable to the original versions implemented in CUDA.
    NSF CCF 0702260; Cray Inc. Cray-SRA-2010-01696; 2010-2011 NVIDIA Research Fellowship. Unpublished; not peer reviewed.
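
    As a rough illustration of the programming model the paper targets, consider the hedged Chapel sketch below (not the authors' code; the array names and sizes are made up). The point is that an ordinary data-parallel forall loop carries enough information for a compiler to generate a GPU kernel from it; note that current Chapel releases instead expose GPUs explicitly through here.gpus, a newer mechanism than the one the paper describes.

```chapel
// Data-parallel Chapel loop: the iterations are independent, so the
// compiler may run them across CPU cores or, with the transformations
// the paper describes, lower them to a GPGPU kernel.
config const n = 1000000;

var A, B, C: [1..n] real;

// Whole-array assignments are themselves data-parallel in Chapel.
B = 1.0;
C = 2.0;

// Element-wise vector addition, a typical kernel candidate.
forall i in 1..n do
  A[i] = B[i] + C[i];

writeln("A[1] = ", A[1]);
```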

    A New Parallel Programming Language Fortress: Features and Applications

    Thesis (M.Sc.) -- İstanbul Technical University, Institute of Informatics, 2009. Computer systems are growing very rapidly. DARPA foresaw the need for, and the feasibility of, a peta-scale computer system by the year 2010, and in 2003 it launched a project with several companies. Now, with the project nearing completion and millions of dollars invested in it, its outcome has been three high-performance, high-productivity programming languages. One of these languages is Fortress. Fortress is a strongly typed, block-structured, inherently parallel programming language based on mathematical notation. What makes Fortress interesting is its high-productivity, science-oriented design. In this study, the internal dynamics of Fortress are examined; various tests were carried out to measure its performance, and the results are discussed.

    DART-MPI: An MPI-based Implementation of a PGAS Runtime System

    A Partitioned Global Address Space (PGAS) approach treats a distributed system as if its memory were shared at a global level. Given such a global view of memory, the user can program applications much as they would for shared-memory systems. This greatly simplifies the task of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment which implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of the one-sided communication of the Message Passing Interface (MPI) version 3 (i.e., MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.
    Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14).
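
    DART itself exposes a C API layered on MPI-3 one-sided operations; as a hedged sketch of the global-view style of programming that such a runtime enables, the Chapel fragment below reads and writes a block-distributed array with no explicit message passing (illustrative only, using classic dmapped syntax; this is not DART's API).

```chapel
// Global-view PGAS programming: one logical array spans all locales
// (nodes), and remote elements are accessed directly, with the runtime
// issuing the one-sided transfers under the hood.
use BlockDist;

const Space = {1..100};
const D = Space dmapped Block(boundingBox=Space);  // partitioned across locales
var A: [D] int;

// Every element is assigned in parallel, wherever it happens to live.
forall i in D do
  A[i] = i;

// A task on locale 0 may still touch any element; accesses to remote
// partitions turn into implicit one-sided gets and puts.
writeln("A[1] = ", A[1], ", A[100] = ", A[100]);
```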

    An Incremental Parallel PGAS-based Tree Search Algorithm

    In this work, we show that the Chapel high-productivity language is suitable for the design and implementation of all aspects involved in the conception of parallel tree search algorithms for solving combinatorial problems. Initially, it is possible to hand-optimize the data structures involved in the search process in a way equivalent to C. As a consequence, the single-threaded search in Chapel is on average only 7% slower than its counterpart written in C. Whereas programming a multicore tree search in Chapel is equivalent to C-OpenMP in terms of performance and programmability, its productivity-aware features for distributed programming stand out. It is possible to incrementally conceive a distributed tree search algorithm starting from its multicore counterpart by adding a few lines of code. The distributed implementation performs load balancing among different computer nodes and also exploits all CPU cores of the system. Chapel presents an interesting trade-off between programmability and performance despite the high level of its features. The distributed tree search in Chapel is on average 16% slower than, and reaches up to 80% of the scalability achieved by, its C-MPI+OpenMP counterpart.
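
    A hedged sketch of the incremental step the abstract describes (not the authors' code; Node, explore, and the pool are hypothetical placeholders): the multicore loop becomes distributed mainly by wrapping it in a coforall over the locales.

```chapel
// Hypothetical skeleton of an incremental Chapel tree search.
record Node { var depth: int; }

proc explore(n: Node) { /* expand n, prune, enqueue children ... */ }

var pool: [1..1000] Node;   // initial pool of subproblems

// Multicore version: one forall over the pool uses all local cores.
forall n in pool do
  explore(n);

// Distributed version: a few extra lines fan the same work out to every
// locale (compute node), each of which still uses all of its cores. In
// the real algorithm each locale would take its own slice of the pool.
coforall loc in Locales do on loc {
  forall n in pool do
    explore(n);
}
```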

    User-Defined Data Distributions in High-Level Programming Languages

    One of the characteristic features of today’s high performance computing systems is a physically distributed memory. Efficient management of locality is essential for meeting key performance requirements for these architectures. The standard technique for dealing with this issue has involved the extension of traditional sequential programming languages with explicit message passing, in the context of a processor-centric view of parallel computation. This has resulted in complex and error-prone assembly-style codes in which algorithms and communication are inextricably interwoven. This paper presents a high-level approach to the design and implementation of data distributions. Our work is motivated by the need to improve the current parallel programming methodology by introducing a paradigm supporting the development of efficient and reusable parallel code. This approach is currently being implemented in the context of a new programming language called Chapel, which is being designed within the HPCS project Cascade.
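
    In Chapel this idea surfaces as domain maps: the distribution is named once, on the domain, and the loops and array code are untouched. The sketch below uses the standard Block and Cyclic distributions as stand-ins for a user-defined one (classic dmapped syntax; illustrative only, not code from the paper).

```chapel
// The distribution is a property of the domain, not of the loops that
// use it; swapping Block for Cyclic (or for a user-defined domain map)
// leaves the rest of the code unchanged.
use BlockDist, CyclicDist;

const Space = {1..8, 1..8};

// Block-distributed 2D domain; change this one line to redistribute.
const D = Space dmapped Block(boundingBox=Space);
// const D = Space dmapped Cyclic(startIdx=Space.low);  // alternative

var A: [D] real;

forall (i, j) in D do
  A[i, j] = i + j/10.0;   // each iteration runs where its data lives
```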

    Partitioned Global Address Space Languages

    The Partitioned Global Address Space (PGAS) model is a parallel programming model that aims to improve programmer productivity while at the same time aiming for high performance. The main premise of PGAS is that a globally shared address space improves productivity, but that a distinction between local and remote data accesses is required to allow performance optimizations and to support scalability on large-scale parallel architectures. To this end, PGAS preserves the global address space while embracing awareness of non-uniform communication costs. Today, about a dozen languages exist that adhere to the PGAS model. This survey proposes a definition and a taxonomy along four axes: how parallelism is introduced, how the address space is partitioned, how data is distributed among the partitions, and finally how data is accessed across partitions. Our taxonomy reveals that today's PGAS languages focus on distributing regular data and distinguish only between local and remote data-access costs, whereas the distribution of irregular data and the adoption of richer data-access cost models remain open challenges.
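
    The local/remote distinction at the heart of this taxonomy can be made concrete in a few lines of Chapel, one of the surveyed PGAS languages (an illustrative sketch, not drawn from the survey):

```chapel
// PGAS in miniature: a single partitioned global array, plus the
// ability to ask which partition (locale) owns a given element.
use BlockDist;

const Space = {1..16};
const D = Space dmapped Block(boundingBox=Space);
var A: [D] int;

on Locales[0] {
  A[1] += 1;    // owned by locale 0: a local access
  A[16] += 1;   // owned by the last locale: remote on a multi-locale run
  writeln("A[16] lives on locale ", A[16].locale.id);
}
```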

    Survey of Novel Programming Models for Parallelizing Applications at Exascale

    HPCML: A Modeling Language Dedicated to High-Performance Scientific Computing

    Tremendous computational resources are required to compute complex physical simulations. Unfortunately, computers able to provide such computational power are difficult to program, especially since the rise of heterogeneous hardware architectures. This makes it particularly challenging to exploit supercomputer resources efficiently and sustainably. We think that model-driven engineering can help us tame the complexity of high-performance scientific computing software development by separating the different concerns, such as mathematics, parallelism, or validation. The principles of our approach, named MDE4HPC, stem from this idea. In this paper, we describe the High-Performance Computing Modeling Language (HPCML), a domain-specific modeling language at the center of this approach.