42 research outputs found
Compilation for heterogeneous targets: automating the required analyses, transformations and decisions
8 pages. International audience. Hardware accelerators, such as FPGA boards or graphics cards, offer an attractive alternative or complement to conventional multi-core processors for many scientific applications. Porting existing applications to them is, however, costly and difficult, and standard compilers, traditionally focused on code generation for sequential processors, lack the abstractions needed for automatic, retargetable code generation for these new targets. This paper presents a set of high-level code transformations that rely on a multi-level abstraction of the architecture of current accelerators and make it possible to build a specific compiler for each target on top of a common infrastructure. These transformations were used to build, with PIPS, two fully automated compilers: one for an FPGA-based embedded processor and one, with PAR4ALL, for NVIDIA GPUs
PyPS a programmable pass manager
International audience. PIPS4U: 1- Complex Environment; 2- Source-to-Source; 3- Model for Code Transformations; 4- Based on a Scripting Language; 5- Abstractions; 6- Control Structures; 7- Target
Quonops©, operational forecasting in underwater acoustics on a computing grid
National audience. Quonops©, operational forecasting in underwater acoustics on a computing grid
NFLlib: NTT-based Fast Lattice Library
International audience. Recent years have witnessed an increased interest in lattice cryptography. Besides its strong security guarantees, its simplicity and versatility make this powerful theoretical tool a promising, competitive alternative to classical cryptographic schemes. In this paper, we introduce NFLlib, an efficient and open-source C++ library dedicated to ideal lattice cryptography in the widely used polynomial ring Z_p[x]/(x^n + 1) for n a power of 2. The library combines algorithmic optimizations (Chinese Remainder Theorem, optimized Number Theoretic Transform) with programming optimization techniques (SSE and AVX2 specializations, C++ expression templates, etc.), and will be fully available under the GPL license. The library compares very favorably to other libraries used in ideal lattice cryptography implementations (namely the generic number theory libraries NTL and flint, which implement polynomial arithmetic, and HElib, the optimized library for lattice homomorphic encryption): restricting the library to the aforementioned polynomial ring yields efficiency gains of several orders of magnitude
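To make the ring arithmetic mentioned in this abstract concrete, here is a minimal Python sketch of multiplication in Z_p[x]/(x^n + 1), done once by the schoolbook method and once through a number theoretic transform. It is not NFLlib code and does not reflect its API: the function names, the toy parameters (p = 17, n = 4, psi = 2) and the naive O(n^2) transform are all assumptions chosen for readability, where an optimized library would use a fast butterfly-based transform. It relies on `pow(x, -1, p)`, which needs Python 3.8 or later.

```python
def negacyclic_schoolbook(a, b, p):
    """Multiply a and b in Z_p[x]/(x^n + 1) by direct expansion."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % p
            else:
                # x^n = -1 in this ring, so the term wraps around with a sign flip.
                c[k - n] = (c[k - n] - a[i] * b[j]) % p
    return c


def naive_ntt(values, root, p):
    """Evaluate the polynomial `values` at the powers of `root` (O(n^2) for clarity)."""
    n = len(values)
    return [
        sum(v * pow(root, i * k, p) for i, v in enumerate(values)) % p
        for k in range(n)
    ]


def negacyclic_ntt_mul(a, b, p, psi):
    """Multiply in Z_p[x]/(x^n + 1) via a transform.

    Assumption: psi is a primitive 2n-th root of unity mod p (psi^n = -1).
    """
    n = len(a)
    omega = psi * psi % p              # primitive n-th root of unity
    psi_inv = pow(psi, -1, p)
    n_inv = pow(n, -1, p)
    # Pre-scaling by powers of psi turns the negacyclic product into a cyclic one.
    ta = naive_ntt([x * pow(psi, i, p) % p for i, x in enumerate(a)], omega, p)
    tb = naive_ntt([x * pow(psi, i, p) % p for i, x in enumerate(b)], omega, p)
    tc = [x * y % p for x, y in zip(ta, tb)]   # pointwise product
    c = naive_ntt(tc, pow(omega, -1, p), p)    # inverse transform, up to a factor n
    return [x * n_inv % p * pow(psi_inv, i, p) % p for i, x in enumerate(c)]


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [5, 6, 7, 8]
    print(negacyclic_schoolbook(a, b, 17))
    print(negacyclic_ntt_mul(a, b, 17, psi=2))
```

Both functions should print the same coefficient list. The transform route replaces the double loop with pointwise products, which is where a fast transform would give the speed-up.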
PENCIL: Towards a Platform-Neutral Compute Intermediate Language for DSLs
We motivate the design and implementation of a platform-neutral compute
intermediate language (PENCIL) for productive and performance-portable
accelerator programming
PIPS Is not (just) Polyhedral Software Adding GPU Code Generation in PIPS
6 pages. International audience. Parallel and heterogeneous computing are growing in audience thanks to the increased performance brought by ubiquitous manycores and GPUs. However, available programming models, like OpenCL or CUDA, are far from straightforward to use. As a consequence, several automated or semi-automated approaches have been proposed to generate hardware-level code from high-level sequential sources. Polyhedral models are becoming more popular because of their combination of expressiveness, compactness, and accurate abstraction of the data-parallel behaviour of programs. These models provide automatic or semi-automatic parallelization and code transformation capabilities that target such modern parallel architectures. PIPS is a quarter-century-old source-to-source transformation framework that initially targeted parallel machines but then evolved to include other targets. PIPS uses abstract interpretation on an integer polyhedral lattice to represent program code, allowing linear relation analysis on integer variables in an interprocedural way. The same representation is used for the dependence test and the convex array region analysis. The polyhedral model is also more classically used to schedule code from linear constraints. In this paper, we illustrate the features of this compiler infrastructure on a hypothetical input code, demonstrating the combination of polyhedral and non-polyhedral transformations. PIPS interprocedural polyhedral analyses are used to generate data transfers and are combined with non-polyhedral transformations to achieve efficient CUDA code generation
A System for Interactive Spatial Analysis via Potential Maps
International audience. This paper presents a new cartographic tool for spatial analysis of social data, using the potential smoothing method. The purpose of this method is to view the spread of a phenomenon (demographic, economic, social, etc.) in a continuous way, at a macroscopic scale, from data sampled on administrative areas. We aim to offer an interactive tool, accessible through the Web, that nevertheless guarantees the confidentiality of the data. The biggest difficulty is the high complexity of the computation, which deals with a large amount of data. A distributed architecture is proposed: map computation is done on the server side, using particular optimization techniques, whereas map visualization and parameterisation of the analysis are done on a web-based client, the two parts communicating through a Web protocol
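The abstract does not give the formula behind potential smoothing, so the following Python sketch rests on an assumption: that the potential at a point is the sum of the sampled values, each weighted by a decreasing function of its distance to that point (a Gaussian here). The names `potential` and `potential_map`, the `span` parameter and the sample layout are hypothetical and only illustrate why the computation is expensive: every map cell visits every sample.

```python
import math


def potential(x, y, samples, span):
    """Potential at (x, y) from `samples`, a list of (sx, sy, value) tuples.

    Assumption: a Gaussian interaction function with scale `span`.
    """
    total = 0.0
    for sx, sy, value in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        total += value * math.exp(-d2 / (2.0 * span ** 2))
    return total


def potential_map(samples, span, xs, ys):
    """Evaluate the potential on a regular grid; cost is len(xs) * len(ys) * len(samples)."""
    return [[potential(x, y, samples, span) for x in xs] for y in ys]


if __name__ == "__main__":
    # Two hypothetical administrative areas reduced to their centroids.
    samples = [(0.0, 0.0, 100.0), (4.0, 0.0, 50.0)]
    grid = potential_map(samples, span=1.5, xs=[0.0, 2.0, 4.0], ys=[0.0, 1.0])
    for row in grid:
        print([round(v, 2) for v in row])
```

Under this assumption only the smoothed grid needs to leave the server, which fits the confidentiality goal described in the abstract.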
IV Grid Plugtests: composing dedicated tools to run an application efficiently on Grid'5000
Efficiently exploiting the resources of the whole of Grid'5000 with a single application requires solving several issues:
1) resource reservation;
2) deployment of the application's processes;
3) scheduling of the application's tasks.
For the IV Grid Plugtests, we used a dedicated tool for each issue.
The N-Queens contest rules imposed ProActive for resource reservation (issue 1).
Issue 2 was solved using TakTuk, which can deploy a large set of remote nodes. Deployed nodes themselves take part in the deployment, following an adaptive algorithm that makes it very efficient.
For issue 3, we wrote our application with the Athapascan API, whose model is based on the concepts of tasks and shared data. The application is described as a data-flow graph using the Shared and Fork keywords. This high-level abstraction of the hardware gives us an efficient execution with the Kaapi runtime engine, which uses a work-stealing scheduling algorithm to balance the workload between all the distributed processes
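As a rough illustration of the work-stealing idea mentioned for issue 3, the Python sketch below counts N-Queens solutions by splitting the problem into one task per first-row column and balancing those tasks across simulated workers. It is a sequential simulation under stated assumptions, not Athapascan or Kaapi code: the function names, the round-robin loop and the "steal from the fullest queue" policy are hypothetical simplifications, and no real parallelism or distribution takes place.

```python
from collections import deque


def count_completions(n, cols):
    """Count N-Queens solutions extending `cols` (cols[r] = column of the queen in row r)."""
    row = len(cols)
    if row == n:
        return 1
    total = 0
    for c in range(n):
        # A column is safe if no earlier queen shares it or a diagonal with it.
        if all(c != pc and abs(c - pc) != row - pr for pr, pc in enumerate(cols)):
            total += count_completions(n, cols + [c])
    return total


def solve_with_work_stealing(n, workers=4):
    """Return (solution count, tasks executed per worker)."""
    queues = [deque() for _ in range(workers)]
    # All tasks start on worker 0, so the others can only progress by stealing.
    queues[0].extend([c] for c in range(n))
    executed = [0] * workers
    total = 0
    while any(queues):
        for w in range(workers):
            if queues[w]:
                task = queues[w].pop()           # owner works from one end
            else:
                victim = max(range(workers), key=lambda v: len(queues[v]))
                if not queues[victim]:
                    continue                     # nothing left to steal
                task = queues[victim].popleft()  # thief takes from the other end
            total += count_completions(n, task)
            executed[w] += 1
    return total, executed


if __name__ == "__main__":
    print(solve_with_work_stealing(8))
```

Owner and thief take tasks from opposite ends of the deque, a common choice in work-stealing schedulers because it limits contention on the owner's end.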
Par4All: From Convex Array Regions to Heterogeneous Computing
2 pages. International audience. Recent compilers offer an incremental way to convert software for accelerators. For instance, the PGI Accelerator [14] or HMPP [3] require the use of directives. The programmer must select the pieces of source code that are to be executed on the accelerator, providing optional directives that act as hints for data allocations and transfers. The compiler generates all code automatically. [...] Unlike these approaches, Par4All [13] is an automatic parallelizing and optimizing compiler for sequential C and Fortran programs, funded by the HPC Project startup. The purpose of this source-to-source compiler is to integrate several compilation tools into an easy-to-use yet powerful compiler that automatically transforms existing programs to target various hardware platforms