8 research outputs found

    Integration of Scheduler Knowledge into CiGri Control Loop

    Even with the progress made in the field of Cloud and High Performance Computing, in both hardware and programming structures, state-of-the-art schedulers for these systems are still unable to use computing nodes at full capacity. Economic and environmental concerns call for higher utilization of computing clusters' power. Idle resources in data centers can be harvested by injecting best-effort jobs with CiGri, a scalable grid middleware, in a way that minimizes unused resources while not perturbing higher-priority jobs. This injection has so far been controlled with control-theory approaches in a purely reactive fashion, independently of the scheduler and the Resource and Job Management System (RJMS). This work explores a more pro-active approach, where information from the scheduler is used in a feed-forward control loop in order to achieve better performance. This is done by conducting experimental campaigns to assess newly identified scheduler signals.
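
    The feed-forward idea can be made concrete with a minimal sketch: a reactive PI term on the measured idle resources is combined with a pro-active term driven by a scheduler signal. Everything below (function names, gains, the "predicted releases" signal) is illustrative, not the CiGri implementation.

        # Minimal sketch of the feed-forward idea (illustrative names and
        # gains, not the CiGri code): a reactive PI term on measured idle
        # resources is combined with a pro-active term driven by a scheduler
        # signal announcing resources about to be released.

        def control_step(idle_measured, predicted_releases, setpoint, state,
                         kp=0.5, ki=0.1, kf=0.8):
            """Return the number of best-effort jobs to inject this step."""
            error = idle_measured - setpoint                 # feedback signal
            state["integral"] += error
            feedback = kp * error + ki * state["integral"]   # reactive PI part
            feedforward = kf * predicted_releases            # pro-active part
            return max(0, round(feedback + feedforward))

        state = {"integral": 0.0}
        print(control_step(idle_measured=12, predicted_releases=4,
                           setpoint=2, state=state))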

    Monte Carlo validation of a mu-SPECT imaging system on the lightweight grid CiGri

    To appear in Future Generation Computer Systems. Monte Carlo Simulations (MCS) are nowadays widely used in the field of nuclear medicine for system and algorithm design. They are valuable for accurately reproducing experimental data, but at the expense of long computing times. An efficient solution for shorter elapsed time has recently been proposed: grid computing. The aim of this work is to validate a small-animal gamma camera MCS and to confirm the usefulness of grid computing for such a study. Good agreement between measured and simulated data was achieved, and a crunching factor of up to 70 was attained on a lightweight campus grid.
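
    Under its usual grid-computing definition (an assumption here, since the abstract does not define it), the crunching factor C is the ratio of the cumulative sequential computing time to the elapsed wall-clock time on the grid:

        C = T_sequential / T_elapsed

    so a crunching factor of 70 means that a simulation campaign requiring, say, 70 hours of sequential computation finishes in roughly one hour on the campus grid.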

    Controlling a cluster of computing resources: the model free control approach

    The recent feedback control approaches adopted in cluster and file-server management of High Performance Computing (HPC) systems all use model-based control; these model-based control laws inherit the striking flaws associated with this approach, including system modelling errors and complex parameter estimation, hence the need to take an alternative approach in this study. In this work, we propose to use the model-free control approach, with the following features: 1) a 'virtual model' representing the unknown system dynamics, 2) elimination of the need for complex system parameter estimation, and 3) lower computational costs. This study first explores and adopts model-free control using the intelligent PID control law, which assures both robustness and quality reference tracking with respect to time-varying system dynamics. Results from both simulations and real-time experiments are presented to explore the capabilities of this approach. The model-free control approach guaranteed good reference tracking and robustness to external disturbances, which are key performance metrics in HPC applications. This MFC approach, via an intelligent Proportional controller, showed a 16.51% improvement in cluster usage without overloading the file server, compared to the 15.81% improvement of the classical Proportional-Integral controller.
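
    For readers unfamiliar with model-free control, the sketch below illustrates the general intelligent Proportional (iP) scheme in the Fliess-Join style that the abstract refers to: the unknown dynamics are lumped into an ultra-local model dy/dt = F + alpha*u, F is re-estimated from measurements at each step, and the control cancels it. The gains, toy plant, and signal names are assumptions for illustration, not the controller of the paper.

        # Sketch of an intelligent Proportional (iP) controller in the
        # model-free control style: the unknown dynamics are lumped into the
        # ultra-local model dy/dt = F + alpha*u, and F is re-estimated from
        # measurements at each step. Gains and the toy plant are illustrative.

        def ip_controller(y, y_prev, y_ref, dy_ref, u_prev, dt,
                          alpha=1.0, kp=2.0):
            dy = (y - y_prev) / dt        # numerical output derivative
            F_hat = dy - alpha * u_prev   # estimate of the lumped term F
            e = y - y_ref                 # tracking error
            return (dy_ref - F_hat - kp * e) / alpha

        # Closed loop on a plant the controller never sees:
        # dy/dt = -0.5*y + 0.8*u
        y, y_prev, u, dt = 0.0, 0.0, 0.0, 0.05
        for _ in range(200):
            u_new = ip_controller(y, y_prev, y_ref=1.0, dy_ref=0.0,
                                  u_prev=u, dt=dt)
            y_prev = y
            y += dt * (-0.5 * y + 0.8 * u_new)
            u = u_new
        print(round(y, 3))  # settles near the reference 1.0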

    3rd EGEE User Forum

    We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of the contents and a summary of the main conclusions coming from the Forum for that topic. The first chapter gathers all the plenary session keynote addresses, followed by a sequence of chapters covering the application-flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the large number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum.

    Contribution to the infrastructure convergence between high performance computing and large-scale data processing

    The amount of data produced, whether in the scientific community or the commercial world, is constantly growing. The field of Big Data has emerged to handle large amounts of data on distributed computing infrastructures. High-Performance Computing (HPC) infrastructures are traditionally used for the execution of compute-intensive workloads. However, the HPC community is also facing an increasing need to process large amounts of data derived from high-definition sensors and large physics apparatuses. The convergence of the two fields, HPC and Big Data, is currently taking place. In fact, the HPC community already uses Big Data tools, which are not always integrated correctly, especially at the level of the file system and the Resource and Job Management System (RJMS). In order to understand how we can leverage HPC clusters for Big Data usage, and what the challenges are for HPC infrastructures, we have studied multiple aspects of the convergence. We initially provide a survey of software provisioning methods, with a focus on data-intensive applications. We contribute a new RJMS collaboration technique called BeBiDa, which is based on 50 lines of code whereas similar solutions use at least 1000 times more. We evaluate this mechanism in real conditions and in a simulated environment with our simulator Batsim. Furthermore, we provide extensions to Batsim to support I/O, and showcase the development of a generic file system model along with a Big Data application model. This allows us to complement the BeBiDa real-conditions experiments with simulations, while enabling us to study file system dimensioning and trade-offs. All the experiments and analyses of this work have been done with reproducibility in mind. Based on this experience, we propose to integrate the development workflow and data analysis into the reproducibility mindset, and give feedback on our experiences with a list of best practices.
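
    As a rough illustration of how an RJMS collaboration technique like BeBiDa can stay so small, the sketch below shows the general pattern of per-job prolog/epilog hooks that lend nodes back and forth between an HPC RJMS and a Big Data resource manager. The decommission/recommission helpers and node names are placeholders, not the actual BeBiDa code.

        # Illustrative sketch of the BeBiDa pattern (not the actual BeBiDa
        # code): the HPC RJMS keeps priority over the nodes, and its per-job
        # prolog/epilog hooks lend nodes back and forth with the Big Data
        # resource manager. decommission/recommission stand in for whatever
        # commands the Big Data stack provides.

        def decommission(node):
            print(f"remove Big Data worker from {node}")     # placeholder

        def recommission(node):
            print(f"return {node} to the Big Data manager")  # placeholder

        def prolog(job_nodes):
            """Runs just before an HPC job starts on its allocated nodes."""
            for node in job_nodes:
                decommission(node)

        def epilog(job_nodes):
            """Runs just after the HPC job ends."""
            for node in job_nodes:
                recommission(node)

        prolog(["node-1", "node-2"])   # HPC job takes the nodes
        epilog(["node-1", "node-2"])   # nodes go back to Big Data jobs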

    Evaluations of the Lightweight Grid CIGRI upon the Grid5000 Platform

    A widely used method for large-scale experiment execution on P2P or cluster computing platforms is the exploitation of idle resources. Specifically, in the case of clusters, administrators share their cluster's idle cycles in computational grids for the execution of so-called bag-of-tasks applications. Fault tolerance and scheduling are some of the important challenges that have arisen in this research field. In this paper, we present a simple, scalable, fault-tolerant and user-transparent approach for harnessing idle cluster resources to execute grid bag-of-tasks applications. Our main interest lies in the large-scale deployment and evaluation of our lightweight grid computing approach under real-life parameters. In this context, we experiment with CIGRI and a fully transparent system-level checkpointing feature for scheduling and turnaround-time optimisation. We discuss the value of experimentation in computer science, we propose reproducible experiments based on real workload traces, and we explain how our experimental methodology contributes to the development and evaluation of our grid platform.
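
    The harvesting scheme described here can be summarized with a small sketch: tasks from the bag are injected as best-effort jobs, and a task preempted by a higher-priority job is resubmitted and resumes from its last system-level checkpoint. The submit/wait_result interface is hypothetical, for illustration only.

        # Illustrative sketch (not the CIGRI implementation) of the idea:
        # a bag of independent tasks is injected as best-effort jobs on idle
        # nodes, and a task preempted by a priority job is resubmitted,
        # resuming from its last system-level checkpoint instead of
        # restarting from scratch.

        from collections import deque

        def run_bag_of_tasks(tasks, submit, wait_result):
            """tasks: iterable of task descriptions.
            submit(task): launch task as a best-effort job, returns a handle.
            wait_result(handle): 'done' or 'preempted' (checkpoint kept)."""
            queue = deque(tasks)
            while queue:
                task = queue.popleft()
                handle = submit(task)          # runs only on idle nodes
                if wait_result(handle) == "preempted":
                    queue.append(task)         # resubmit; resumes from checkpoint

        # Toy drivers standing in for the grid middleware:
        def submit(task):
            return task

        def wait_result(handle):
            return "done"

        run_bag_of_tasks(["t1", "t2", "t3"], submit, wait_result)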