1,483 research outputs found

    Non-Strict Independence-Based Program Parallelization Using Sharing and Freeness Information.

    Get PDF
    The current ubiquity of multi-core processors has brought renewed interest in program parallelization. Logic programs allow studying the parallelization of programs with complex, dynamic data structures with (declarative) pointers in a comparatively simple semantic setting. In this context, automatic parallelizers which exploit and-parallelism rely on notions of independence in order to ensure certain efficiency properties. “Non-strict” independence is a more relaxed notion than the traditional notion of “strict” independence which still ensures the relevant efficiency properties and can allow considerable more parallelism. Non-strict independence cannot be determined solely at run-time (“a priori”) and thus global analysis is a requirement. However, extracting non-strict independence information from available analyses and domains is non-trivial. This paper provides on one hand an extended presentation of our classic techniques for compile-time detection of non-strict independence based on extracting information from (abstract interpretation-based) analyses using the now well understood and popular Sharing + Freeness domain. This includes algorithms for combined compile-time/run-time detection which involve special run-time checks for this type of parallelism. In addition, we propose herein novel annotation (parallelization) algorithms, URLP and CRLP, which are specially suited to non-strict independence. We also propose new ways of using the Sharing + Freeness information to optimize how the run-time environments of goals are kept apart during parallel execution. Finally, we also describe the implementation of these techniques in our parallelizing compiler and recall some early performance results. We provide as well an extended description of our pictorial representation of sharing and freeness information

    Prototyping Parallel Simulations on Manycore Architectures Using Scala: A Case Study

    Get PDF
    International audienceAt the manycore era, every simulation practitioner can take advantage of the com-puting horsepower delivered by the available high performance computing devices. From multicoreCPUs (Central Processing Unit) to thousand-thread GPUs (Graphics Processing Unit), severalarchitectures are now able to offer great speed-ups to simulations. However, it is often tricky toharness them properly, and even more complicated to implement a few declinations of the samemodel to compare the parallelizations. Thus, simulation practitioners would mostly benefit of asimple way to evaluate the potential benefits of choosing one platform or another to parallelizetheir simulations. In this work, we study the ability of the Scala programming language to fulfillthis need. We compare the features of two frameworks in this study: Scala Parallel Collections andScalaCL. Both of them provide facilities to set up a data-parallelism approach on Scala collections.The capabilities of the two frameworks are benchmarked with three simulation models as well asa large set of parallel architectures. According to our results, these two Scala frameworks shouldbe considered by the simulation community to quickly prototype parallel simulations, and choosethe target platform on which investing in an optimized development will be rewarding

    Forecasting of commercial sales with large scale Gaussian Processes

    Full text link
    This paper argues that there has not been enough discussion in the field of applications of Gaussian Process for the fast moving consumer goods industry. Yet, this technique can be important as it e.g., can provide automatic feature relevance determination and the posterior mean can unlock insights on the data. Significant challenges are the large size and high dimensionality of commercial data at a point of sale. The study reviews approaches in the Gaussian Processes modeling for large data sets, evaluates their performance on commercial sales and shows value of this type of models as a decision-making tool for management.Comment: 1o pages, 5 figure

    Automatic Parallelization of a Gap Model using Java and OpenCL

    Get PDF
    International audienceNowadays, scientists are often disappointed by the outcome when parallelizing their simulations, in spite of all the tools at their disposal. They often invest much time and money, and do not obtain the expected speed-up. This can come from many factors going from a wrong parallel architecture choice to a model that simply does not present the criteria to be a good candidate for parallelization. However, when parallelization is successful, the reduced execution time can open new research perspectives, and allow to explore larger sets of parameters of a given simulation model. Thus, it is worth investing some time and workforce to figure out whether an algorithm is a good candidate to parallelization. Automatic parallelization tools can be of great help when trying to identify these properties. In this paper, we apply an automatic parallelization approach combining Java and OpenCL on an existing Gap Model. The two technologies are linked with a library from AMD called Aparapi. The latter allowed us to study the behavior of our automatically parallelized model on 10 different platforms, without modifying the source code

    UPIR: Toward the Design of Unified Parallel Intermediate Representation for Parallel Programming Models

    Full text link
    The complexity of heterogeneous computing architectures, as well as the demand for productive and portable parallel application development, have driven the evolution of parallel programming models to become more comprehensive and complex than before. Enhancing the conventional compilation technologies and software infrastructure to be parallelism-aware has become one of the main goals of recent compiler development. In this paper, we propose the design of unified parallel intermediate representation (UPIR) for multiple parallel programming models and for enabling unified compiler transformation for the models. UPIR specifies three commonly used parallelism patterns (SPMD, data and task parallelism), data attributes and explicit data movement and memory management, and synchronization operations used in parallel programming. We demonstrate UPIR via a prototype implementation in the ROSE compiler for unifying IR for both OpenMP and OpenACC and in both C/C++ and Fortran, for unifying the transformation that lowers both OpenMP and OpenACC code to LLVM runtime, and for exporting UPIR to LLVM MLIR dialect.Comment: Typos corrected. Format update

    Experimenting with independent and-parallel prolog using standard prolog

    Get PDF
    This paper presents an approximation to the study of parallel systems using sequential tools. The Independent And-parallelism in Prolog is an example of parallel processing paradigm in the framework of logic programming, and implementations like <fc-Prolog uncover the potential performance of parallel processing. But this potential can also be explored using only sequential systems. Being the spirit of this paper to show how this can be done with a standard system, only standard Prolog will be used in the implementations included. Such implementations include tests for parallelism in And-Prolog, a correctnesschecking meta-interpreter of <fc-Prolog and a simulator of parallel execution for <fc-Prolog
    • …
    corecore