58 research outputs found

Prefetching in functional languages

Author: Baer Jean-Loup
Cahoon Brendon
Salvucci Jérémie
Publication venue: Proceedings of the 2020 ACM SIGPLAN International Symposium on Memory Management
Publication date: 16/06/2020

Functional programming languages contain a number of runtime and language features, such as garbage collection, indirect memory accesses, linked data structures and immutability, that interact with a processor’s memory system. These conspire to cause a variety of unintuitive memory performance effects. For example, it is slower to traverse linked lists and arrays of data that have been sorted than to traverse the same data in the order it was allocated. We seek to understand these issues and mitigate them in a manner consistent with functional languages, taking advantage of the features themselves where possible. For example, immutability and garbage collection force linked lists to be allocated roughly sequentially in memory, even when the data pointed to within each node is not. We add language primitives for software prefetching to the OCaml language to exploit this, and observe significant performance improvements on a variety of micro- and macro-benchmarks, resulting in speedups of up to 2× on the out-of-order superscalar Intel Haswell and Xeon Phi Knights Landing systems, and up to 3× on the in-order Arm Cortex-A53. Arm Limite

Crossref

Apollo (Cambridge)

Cost-effective compiler directed memory prefetching and bypassing

Author: Ayguadé Parra Eduard
Baer Jean-Loup
Ortega Fernández Daniel
Valero Cortés Mateo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2002

Ever-increasing memory latencies and deeper pipelines push memory farther from the processor. Prefetching techniques aim to bridge these two gaps by fetching data in advance into both the L1 cache and the register file. Our main contribution in this paper is a hybrid approach to the prefetching problem that combines software and hardware prefetching in a cost-effective way: it needs very little hardware support and has minimal impact on the design of the processor pipeline. The prefetcher is built on top of static memory instruction bypassing, which is in charge of bringing prefetched values into the register file. In this paper we also present a thorough analysis of the limits of both prefetching and memory instruction bypassing. We also compare our prefetching technique with a prior speculative proposal that attacked the same problem, and we show that, at much lower cost, our hybrid solution is better than a realistic implementation of speculative prefetching and bypassing. On average, our hybrid implementation achieves a 13% speed-up over a version with software prefetching on a subset of numerical applications, and an average of 43% over a version with no software prefetching (up to 102% for specific benchmarks).

Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Matrice de connexion minimale d'une matrice de précédence donnée

Author: Baer Jean-Loup
Publication venue: EDP Sciences
Publication date: 01/01/1969

Crossref

Numérisation de Documents Anciens Mathématiques

Matrice de connexion minimale d'une matrice de précédence donnée

Author: Baer Jean-Loup
Publication venue
Publication date: 01/01/1969

Numérisation de Documents Anciens Mathématiques

International Conference on Parallel Processing

Author: Baer Jean-Loup
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1977

CERN Document Server

Matrice de connexion minimale d'une matrice de précédence donnée

Author: Jean-Loup Baer
Publication venue: EDP Sciences
Publication date: 04/05/2009

EDP Sciences OAI-PMH repository (1.2.0)

Etude critique et données de compilation du langage Cobol

Author: Baer Jean-Loup
Publication venue: HAL CCSD
Publication date: 27/06/1963

First third-cycle thesis in computer science defended at IMAG. The goal of this study is to present the Cobol language briefly (in French); to compare it with the commercial languages that preceded it and critique it as a commercial language; to compare it with Algol and critique it as a rather informal language; to outline a compiler for a word-oriented binary machine, with all the difficulties that entails; and to conclude on the future of Cobol.

Thèses en Ligne

Hal - Université Grenoble Alpes

Reducing Memory Latency via Non-blocking and Prefetching Caches

Author: Jean-Loup Baer
Tien-Fu Chen
Publication venue
Publication date: 01/01/1992

Non-blocking caches and prefetching caches are two techniques for hiding memory latency by exploiting the overlap of processor computations with data accesses. A non-blocking cache allows execution to proceed concurrently with cache misses as long as dependency constraints are observed, thus exploiting post-miss operations. A prefetching cache generates prefetch requests to bring data into the cache before it is actually needed, thus allowing overlap with pre-miss computations. In this paper, we evaluate the effectiveness of these two hardware-based schemes. We propose a hybrid design based on the combination of these approaches. We also consider compiler-based optimizations to enhance the effectiveness of non-blocking caches. Results from instruction-level simulations on the SPEC benchmarks show that the hardware prefetching caches generally outperform non-blocking caches. Also, the relative effectiveness of non-blocking caches is more adversely affected by an increase in memory latency.

CiteSeerX