200 research outputs found

    TreeMatch: A process placement algorithm for multicore architectures

    ComPAR/RenPAR 2013 conference. National audience. Over the past few years, clusters of NUMA nodes built from multicore processors have become widespread. Programming these architectures efficiently is a real challenge given their complex hierarchy. To take full advantage of them, this structure must be taken into account precisely and the application's communication pattern mapped onto it. Doing so reduces communication costs and yields gains in the application's overall execution time. We present here how we use the communication pattern on one side and a faithful representation of the architecture on the other to produce a permutation of the processes of a given application, thereby reducing communication costs.
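    The abstract does not reproduce the TreeMatch algorithm itself; the sketch below only illustrates the general idea of communication-aware placement, i.e. grouping heavily communicating processes onto the same node. The greedy grouping, the flat node/cores-per-node model and the function name are illustrative assumptions, not the paper's recursive, tree-based method.

```python
# Hypothetical sketch of communication-aware process placement
# (not the actual TreeMatch algorithm): greedily group the most
# heavily communicating processes onto the same node.

def greedy_placement(comm_matrix, n_nodes, cores_per_node):
    """Return permutation[rank] = core index, packing communicating ranks together."""
    n = len(comm_matrix)
    assert n <= n_nodes * cores_per_node
    unplaced = set(range(n))
    groups = []
    while unplaced:
        # Seed a new node with the unplaced process that communicates the most.
        seed = max(unplaced, key=lambda p: sum(comm_matrix[p]))
        group = [seed]
        unplaced.remove(seed)
        while len(group) < cores_per_node and unplaced:
            # Add the process exchanging the most data with the current group.
            best = max(unplaced,
                       key=lambda p: sum(comm_matrix[p][q] for q in group))
            group.append(best)
            unplaced.remove(best)
        groups.append(group)

    permutation = [0] * n
    for node, group in enumerate(groups):
        for slot, rank in enumerate(group):
            permutation[rank] = node * cores_per_node + slot
    return permutation

# Toy example: ranks 0-1 and ranks 2-3 communicate heavily, so each pair
# lands on the same node.  Output: [0, 1, 2, 3]
comm = [[0, 9, 1, 1],
        [9, 0, 1, 1],
        [1, 1, 0, 8],
        [1, 1, 8, 0]]
print(greedy_placement(comm, n_nodes=2, cores_per_node=2))
```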

    TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers

    International audience. Reading and writing data efficiently from the storage system is necessary for most scientific simulations to achieve good performance at scale. Many software solutions have been developed to reduce the I/O bottleneck. One well-known strategy, in the context of collective I/O operations, is the two-phase I/O scheme, in which a subset of processes is selected to aggregate contiguous pieces of data before performing the reads or writes. In this paper, we present TAPIOCA, an MPI-based library implementing an efficient topology-aware two-phase I/O algorithm. We show how TAPIOCA takes advantage of double buffering and one-sided communication to reduce the idle time during data aggregation as much as possible. We also introduce the cost model behind a topology-aware aggregator placement that optimizes data movement. We validate our approach at large scale on two leadership-class supercomputers, Mira (IBM BG/Q) and Theta (Cray XC40), and present results obtained with TAPIOCA on a micro-benchmark and on the I/O kernel of a large-scale simulation. On both architectures, we show a substantial improvement in I/O performance compared with the default MPI I/O implementation. On BG/Q with GPFS, for instance, our algorithm yields a twelvefold performance improvement, while on the Cray XC40 system with a Lustre filesystem we achieve a fourfold improvement.
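    TAPIOCA's own API is not given in the abstract, so the sketch below only illustrates the underlying two-phase collective I/O scheme using standard MPI-IO through mpi4py, with ROMIO's collective-buffering hints standing in for explicit aggregator selection; the hint values, file name and data layout are assumptions.

```python
# Minimal two-phase (collective) I/O sketch with standard MPI-IO via
# mpi4py; the hints below are ROMIO collective-buffering hints, used
# here only to stand in for aggregator placement.
# Run with e.g.: mpiexec -n 8 python collective_write.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

info = MPI.Info.Create()
info.Set("cb_nodes", "2")             # number of aggregator processes (illustrative)
info.Set("romio_cb_write", "enable")  # force collective buffering on writes

local = np.full(1024, rank, dtype=np.float64)   # this rank's contiguous block
fh = MPI.File.Open(comm, "out.dat",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY, info)
fh.Write_at_all(rank * local.nbytes, local)     # phase 1: exchange to aggregators,
fh.Close()                                      # phase 2: aggregators write to storage
info.Free()
```

    The exchange performed inside the collective write is the aggregation step whose idle time TAPIOCA targets with double buffering and one-sided communication.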

    Matching communication pattern with underlying hardware architecture

    International audience.

    Topology and affinity aware hierarchical and distributed load-balancing in Charm++

    International audience. The evolution of massively parallel supercomputers makes two issues particularly visible: load imbalance and poor management of data locality in applications. With the growing number of cores and the drastic decrease in the amount of memory per core, meeting high performance requirements demands particular care for load balancing and, as much as possible, for data locality. One way to take this locality issue into account relies on the placement of the processing entities, and load-balancing techniques are relevant for improving application performance. With large-scale platforms in mind, we developed a hierarchical and distributed algorithm whose aim is to perform topology-aware load balancing tailored for Charm++ applications. This algorithm relies on LibTopoMap for network awareness and on TREEMATCH to determine a relevant placement of the processing entities. We show that the proposed algorithm improves the overall execution time both for real applications and for a synthetic benchmark. For the latter, we demonstrate scalability up to one million processing entities.
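    As a concrete illustration of the kind of decision such a load balancer makes, here is a minimal, hypothetical rebalancing step within a single node; it is not the paper's algorithm (which relies on LibTopoMap and TREEMATCH for topology awareness), and the data structures, threshold and function name are assumptions.

```python
# Hypothetical intra-node rebalancing step: migrate tasks from
# overloaded cores to the least-loaded core, heaviest tasks first,
# only when the move strictly reduces the imbalance.

def rebalance_node(task_loads, assignment, cores):
    """task_loads: {task: load}; assignment: {task: core}; cores: list of core ids."""
    core_load = {c: 0.0 for c in cores}
    for task, core in assignment.items():
        core_load[core] += task_loads[task]
    target = sum(task_loads.values()) / len(cores)

    for core in cores:
        tasks = sorted((t for t, c in assignment.items() if c == core),
                       key=task_loads.get, reverse=True)
        for task in tasks:
            if core_load[core] <= target:
                break                       # this core is no longer overloaded
            dest = min(cores, key=core_load.get)
            if dest == core or core_load[dest] + task_loads[task] >= core_load[core]:
                continue                    # migrating this task would not help
            assignment[task] = dest
            core_load[core] -= task_loads[task]
            core_load[dest] += task_loads[task]
    return assignment

# Toy example: core 0 starts with load 7 and core 1 with load 2;
# the result is a 4/5 split across the two cores.
loads = {"a": 4.0, "b": 3.0, "c": 1.0, "d": 1.0}
print(rebalance_node(loads, {"a": 0, "b": 0, "c": 1, "d": 1}, [0, 1]))
```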

    Formal Detection of Attentional Tunneling in Human Operator-Automation Interactions

    The allocation of visual attention is a key factor for humans operating complex systems under time pressure with multiple information sources. In some situations, attentional tunneling is likely to appear, leading to excessive focus and poor decision making. In this study, we propose a formal approach, based on machine learning techniques, to detect the occurrence of such an attentional impairment. An experiment was conducted to provoke attentional tunneling, during which psycho-physiological and oculomotor data were collected from 23 participants. Data from 18 participants were used to train an adaptive neuro-fuzzy inference system (ANFIS). From a machine learning point of view, the classification performance of the trained ANFIS demonstrated the validity of this approach. Furthermore, the resulting classification rules were consistent with the attentional tunneling literature. Finally, the classifier robustly detected attentional tunneling on test data from four participants.

    Fleury-sur-Orne – Rue Louise-Michel, tramway maintenance centre

    The excavation revealed the ditches of two partial Neolithic funerary monuments and one complete double monument of the Passy type. They lie in the continuity of the Fleury-sur-Orne « Les Hauts de l'Orne » necropolis, one of the monuments from the evaluation having already been partially excavated in 2014 (mon. 24). The other partial monument, no. 7, lies in the northern part of the excavation area. It measures more than 118 m long by 15 m wide. Its ditches are sub-parallel...

    Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers

    International audience. Reading and writing data efficiently from storage systems is critical for high-performance data-centric applications. These I/O systems are increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach in which selected processes are put in charge of aggregating data from a set of neighbors and writing the aggregated data to storage; bandwidth use can thus be optimized while contention is reduced. In this work, we take the network topology into account when mapping aggregators and propose an optimized buffering system to reduce the aggregation cost. We validate our approach using micro-benchmarks and the I/O kernel of a large-scale cosmology simulation, and show I/O operations up to 15× faster than a standard MPI I/O implementation.
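    As a rough illustration of aggregator mapping, the sketch below elects one aggregator per compute node using MPI's shared-memory communicator split and gathers node-local data onto it before writing (via mpi4py). The equal block sizes, the file name and the use of Split_type as the sole topology criterion are assumptions; the paper's cost-model-driven placement and optimized buffering system are not reproduced here.

```python
# Hypothetical node-level aggregation sketch: one aggregator per node
# gathers its neighbors' data, and only the aggregators perform writes.
# Run with e.g.: mpiexec -n 8 python aggregate_write.py
import numpy as np
from mpi4py import MPI

world = MPI.COMM_WORLD
rank = world.Get_rank()

# Ranks sharing a node end up in the same communicator; its rank 0 aggregates.
node_comm = world.Split_type(MPI.COMM_TYPE_SHARED, key=rank)
is_aggregator = node_comm.Get_rank() == 0

local = np.full(256, float(rank))
gathered = node_comm.gather(local, root=0)      # intra-node aggregation

# Communicator containing only the aggregators (COMM_NULL on other ranks).
agg_comm = world.Split(0 if is_aggregator else MPI.UNDEFINED, key=rank)

if is_aggregator:
    block = np.concatenate(gathered)            # one contiguous block per node
    fh = MPI.File.Open(agg_comm, "aggregated.dat",
                       MPI.MODE_CREATE | MPI.MODE_WRONLY)
    # Assumes equally sized node blocks so the offsets are disjoint.
    fh.Write_at_all(agg_comm.Get_rank() * block.nbytes, block)
    fh.Close()
```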