1,483 research outputs found

Non-Strict Independence-Based Program Parallelization Using Sharing and Freeness Information.

Author: Bruynooghe
Bueno
Cabeza
Casas
Codish
Cortesi
Daniel Cabeza Gras
Debray
Gallagher
García de la Banda
Gupta
Haridi
Hermenegildo
Hill
Jacobs
Janson
Karp
King
Li
López-García
Manuel V. Hermenegildo
Mera
Muthukumar
Navas
Pontelli
Ramkumar
Sato
Shen
Søndergaard
Vaucheret
Warren
Zaffanella
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

The current ubiquity of multi-core processors has brought renewed interest in program parallelization. Logic programs allow studying the parallelization of programs with complex, dynamic data structures with (declarative) pointers in a comparatively simple semantic setting. In this context, automatic parallelizers which exploit and-parallelism rely on notions of independence in order to ensure certain efficiency properties. “Non-strict” independence is a more relaxed notion than the traditional notion of “strict” independence; it still ensures the relevant efficiency properties and can allow considerably more parallelism. Non-strict independence cannot be determined solely at run-time (“a priori”) and thus global analysis is a requirement. However, extracting non-strict independence information from available analyses and domains is non-trivial. On the one hand, this paper provides an extended presentation of our classic techniques for compile-time detection of non-strict independence, based on extracting information from (abstract interpretation-based) analyses using the now well understood and popular Sharing + Freeness domain. This includes algorithms for combined compile-time/run-time detection which involve special run-time checks for this type of parallelism. In addition, we propose herein novel annotation (parallelization) algorithms, URLP and CRLP, which are specially suited to non-strict independence. We also propose new ways of using the Sharing + Freeness information to optimize how the run-time environments of goals are kept apart during parallel execution. Finally, we describe the implementation of these techniques in our parallelizing compiler and recall some early performance results. We also provide an extended description of our pictorial representation of sharing and freeness information.

CiteSeerX

Elsevier - Publisher Connector

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Prototyping Parallel Simulations on Manycore Architectures Using Scala: A Case Study

Author: Hill David R.C.
Mazel Claude
Passerat-Palmbach Jonathan
Reuillon Romain
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2013

In the manycore era, every simulation practitioner can take advantage of the computing horsepower delivered by the available high performance computing devices. From multicore CPUs (Central Processing Units) to thousand-thread GPUs (Graphics Processing Units), several architectures are now able to offer great speed-ups to simulations. However, it is often tricky to harness them properly, and even more complicated to implement several variants of the same model to compare the parallelizations. Thus, simulation practitioners would benefit most from a simple way to evaluate the potential benefits of choosing one platform or another to parallelize their simulations. In this work, we study the ability of the Scala programming language to fulfill this need. We compare the features of two frameworks in this study: Scala Parallel Collections and ScalaCL. Both of them provide facilities to set up a data-parallelism approach on Scala collections. The capabilities of the two frameworks are benchmarked with three simulation models as well as a large set of parallel architectures. According to our results, these two Scala frameworks should be considered by the simulation community to quickly prototype parallel simulations and to choose the target platform on which investing in an optimized development will be rewarding.

Crossref

HAL Clermont Université

HAL-Paris1

HAL-Polytechnique

Forecasting of commercial sales with large scale Gaussian Processes

Author: Carmen Marsit (334042)
Jia Chen (8203)
Ke Hao (50181)
Luca Lambertini (72724)
Maya Deyssenroth (4238833)
Shouneng Peng (493132)
Publication date: 01/01/2017

This paper argues that there has not been enough discussion in the field of applications of Gaussian Processes in the fast moving consumer goods industry. Yet this technique can be important because it can, for example, provide automatic feature relevance determination, and the posterior mean can unlock insights on the data. Significant challenges are the large size and high dimensionality of commercial data at a point of sale. The study reviews approaches to Gaussian Process modeling for large data sets, evaluates their performance on commercial sales, and shows the value of this type of model as a decision-making tool for management. Comment: 10 pages, 5 figures

arXiv.org e-Print Archive

Crossref

FigShare

Automatic Parallelization of a Gap Model using Java and OpenCL

Author: Corbara Bruno
Forest Arthur
Hill David R.C.
Pal Julien
Passerat-Palmbach Jonathan
Publication venue: HAL CCSD
Publication date: 22/10/2012

Nowadays, scientists are often disappointed by the outcome when parallelizing their simulations, in spite of all the tools at their disposal. They often invest much time and money, and do not obtain the expected speed-up. This can come from many factors, ranging from a wrong choice of parallel architecture to a model that simply does not meet the criteria to be a good candidate for parallelization. However, when parallelization is successful, the reduced execution time can open new research perspectives and allow exploring larger sets of parameters of a given simulation model. Thus, it is worth investing some time and workforce to figure out whether an algorithm is a good candidate for parallelization. Automatic parallelization tools can be of great help when trying to identify these properties. In this paper, we apply an automatic parallelization approach combining Java and OpenCL to an existing Gap Model. The two technologies are linked with a library from AMD called Aparapi. The latter allowed us to study the behavior of our automatically parallelized model on 10 different platforms, without modifying the source code.

HAL Clermont Université

UPIR: Toward the Design of Unified Parallel Intermediate Representation for Parallel Programming Models

Author: Wang Anjia
Yan Yonghong
Yi Xinyao
Publication date: 28/10/2022

The complexity of heterogeneous computing architectures, as well as the demand for productive and portable parallel application development, has driven the evolution of parallel programming models to become more comprehensive and complex than before. Enhancing the conventional compilation technologies and software infrastructure to be parallelism-aware has become one of the main goals of recent compiler development. In this paper, we propose the design of a unified parallel intermediate representation (UPIR) for multiple parallel programming models and for enabling unified compiler transformation for the models. UPIR specifies three commonly used parallelism patterns (SPMD, data and task parallelism), data attributes, explicit data movement and memory management, and synchronization operations used in parallel programming. We demonstrate UPIR via a prototype implementation in the ROSE compiler for unifying IR for both OpenMP and OpenACC and in both C/C++ and Fortran, for unifying the transformation that lowers both OpenMP and OpenACC code to LLVM runtime, and for exporting UPIR to LLVM MLIR dialect. Comment: Typos corrected. Format update

arXiv.org e-Print Archive

Experimenting with independent and-parallel prolog using standard prolog

Author: Carro Liñares Manuel
Hermenegildo Manuel V.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/10/1991

This paper presents an approach to the study of parallel systems using sequential tools. Independent And-parallelism in Prolog is an example of a parallel processing paradigm in the framework of logic programming, and implementations like &-Prolog uncover the potential performance of parallel processing. But this potential can also be explored using only sequential systems. Since the spirit of this paper is to show how this can be done with a standard system, only standard Prolog is used in the implementations included. These implementations include tests for parallelism in And-Prolog, a correctness-checking meta-interpreter of &-Prolog, and a simulator of parallel execution for &-Prolog.

Archivo Digital UPM

ReduxSTM: Optimizing STM designs for Irregular Applications

Author: Gutierrez-Carrasco Eladio Damian
Pedrero Luque Manuel
Plata-Gonzalez Oscar Guillermo
Romero-Montiel Sergio
Publication venue: 'Elsevier BV'
Publication date: 15/11/2018

Repositorio Institucional Universidad de Málaga