INRIA a CCSD electronic archive server

Adaptive Optimal Control of MapReduce Performance, Availability and Costs

Author: Berekmeri Mihaly
Bouchenak Sara
Cerf Sophie
Marchand Nicolas
Robu Bogdan
Publication venue: HAL CCSD
Publication date: 19/07/2016

International audience. MapReduce is a popular programming model for distributed data processing and Big Data applications running on clouds. Extensive research has been conducted to improve either the dependability or the performance of MapReduce, ranging from adaptive and on-demand fault-tolerance solutions and adaptive task scheduling techniques to optimized job execution mechanisms. This paper investigates an optimization-based solution to control MapReduce systems in order to provide guarantees in terms of both performance and availability while reducing utilization costs. We follow a control-theoretic approach to MapReduce cluster scaling and admission control. Moreover, we aim to be robust to changes in MapReduce and in its environment by adapting the controller online to those changes. This paper highlights the major challenges of combining system adaptation and optimal control to take the best of both approaches. CCS Concepts: • Networks → Cloud computing; • Software and its engineering → Software configuration management and version control systems; • Computer systems organization → Dependable and fault-tolerant systems and networks.

HAL

Adaptive Modelling and Control in Distributed Systems

Author: Berekmeri Mihaly
Bouchenak Sara
Cerf Sophie
Marchand Nicolas
Robu Bogdan
Publication venue: HAL CCSD
Publication date: 28/09/2015

International audience. Companies have growing amounts of data to store and process. In response to these new processing challenges, Google developed MapReduce, a parallel programming paradigm that is becoming the major tool for Big Data processing. Even though MapReduce is used by most IT companies, ensuring its performance while minimizing costs is a real challenge that requires a high level of expertise. Modelling and control of MapReduce have been developed in recent years; however, many problems remain because of the software's high variability. To tackle this issue, this paper proposes an online model estimation algorithm for MapReduce systems. An adaptive control strategy is developed and implemented to guarantee response-time performance under a concurrent workload while minimizing resource use. Results have been validated using a 40-node MapReduce cluster under a data-intensive Business Intelligence workload running on Grid'5000, a French national cloud. The experiments show that the adaptive control algorithm manages to guarantee performance and low costs even in a highly variable environment.

Towards Control of MapReduce Performance and Availability

Author: Berekmeri Mihaly
Bouchenak Sara
Cerf Sophie
Marchand Nicolas
Robu Bogdan
Publication venue: HAL CCSD
Publication date: 01/06/2016

International audience. MapReduce is a popular programming model for distributed data processing and Big Data applications. Extensive research has been conducted to improve either the dependability or the performance of MapReduce, ranging from adaptive and on-demand fault-tolerance solutions and adaptive task scheduling techniques to optimized job execution mechanisms. This paper investigates a novel solution that controls MapReduce systems and provides guarantees in terms of both performance and availability, while reducing utilization costs. We follow a control-theoretic approach to MapReduce cluster scaling and admission control. Preliminary results based on a simulation environment, previously validated on a real MapReduce cluster, show the effectiveness of the proposed control solutions for a Hadoop MapReduce cluster.

HAL

Cost Function based Event Triggered Model Predictive Controllers - Application to Big Data Cloud Services

Author: Berekmeri Mihaly
Bouchenak Sara
Cerf Sophie
Marchand Nicolas
Robu Bogdan
Publication venue: HAL CCSD
Publication date: 12/12/2016

International audience. High-rate cluster reconfiguration is a costly issue in Big Data Cloud services. Current control solutions manage to scale the cluster according to the workload; however, they do not try to minimize the number of system reconfigurations. Event-based control is known to reduce the number of control updates, typically by waiting for the system states to degrade below a given threshold before reacting. However, computing systems often have exogenous inputs (such as client connections) with delayed impacts, which make it possible to anticipate state degradation. In this paper, a novel event-triggered approach is proposed. The triggering mechanism relies on a Model Predictive Controller (MPC) and is defined on the value of the optimal cost function instead of the state or output error. This controller reduces the number of control changes in the normal operation mode through constraints in the MPC formulation, while still reacting quickly to changes in the exogenous inputs. This novel control approach is evaluated using a model validated on a real Big Data system. The controller efficiently scales the cluster according to specifications while reducing the number of reconfigurations.

Crossref

Directory of Open Access Journals

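Apprendre à déceler le potentiel de développement des situations de travail : l’exemple de conseillers agricoles face aux enjeux de l’agro‑écologie [Learning to detect the developmental potential of work situations: the example of agricultural advisors facing the challenges of agroecology]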

Author: Cerf Marianne
Duhamel Sophie
Olry Paul
Publication venue: OpenEdition
Publication date: 01/04/2021

For professionals who work with and for others, and whose activities and situations are evolving and not yet stabilized, one challenge is to reconstruct, with their beneficiaries, the purpose of their activities and to reconfigure their work situations. This is the case for agricultural advisors who must support farmers in changing their practices to address environmental problems. A peer-exchange scheme was set up so that these advisors could collectively rethink what makes their action effective today and dare to act differently tomorrow. We conducted participant observation within this scheme. We studied how, individually and collectively, the participants took up two “milieus” designed by the facilitators around an emblematic advisory situation in crop production: the field tour. Our descriptions and analyses show the different ways in which the advisors appropriate the developmental potential of these designed situations in order to change their own activity in real advisory situations. We discuss the professional development that this reflects. Finally, we highlight possible avenues for improving the facilitation work to better accommodate the diversity of the participants and of their work environments.

OpenEdition

Playing with power at runtime: Slightly slowed applications, major energy savings

Author: Bleuse Raphaël
Cerf Sophie
Rutten Éric
Publication venue: HAL CCSD
Publication date: 10/10/2022

National audience. Power sobriety of data centers and high-performance computing (HPC) systems is becoming an important design issue, as the global energy consumption of Information Technologies (IT) is rising to considerable levels. This question is all the more complex as these systems are increasingly heterogeneous and variable in their behavior with respect to performance and power consumption. As applications struggle to make use of increasingly heterogeneous compute nodes, maintaining high efficiency (performance per watt) for the whole platform becomes a challenge. Additionally, applications tend to present phases (I/O, compute- or memory-intensive, checkpointing) that vary over time, and to be executed in an environment subject to external constraints (e.g., concurrency or energy envelope). This increasing complexity makes HPC less predictable offline (prior to execution). Therefore, dealing with time variations and unpredictable disturbances demands runtime management. In this work, we realize dynamic adaptation using feedback control, which falls within the scope of autonomic computing and relies on control theory. In particular, we address the problem of controlling the power allocated to processors, and hence their energy consumption and performance. Feedback control makes it possible to reduce energy consumption by decreasing the speed with a limited and configurable performance loss, exploiting periods where read/write operations slow down progress. The proposed controller has an easily configured behavior: the user has to supply only an acceptable degradation level.
An HPC application such as our system undergoes many variations in its behavior, depending on (i) the cluster, (ii) the node, (iii) the run, and even (iv) the time within a run. We evaluate our approach on top of an existing resource management framework, the Argo Node Resource Manager, deployed on several clusters of Grid'5000, using a standard memory-bound HPC benchmark. Our results show the existence of a family of trade-offs to save energy, depending on the allowed degradation (from 0 to 20%). In particular, our control approach saves, on average, 22% energy at the cost of a 7% increase in execution time, and reaches up to 25% energy savings with the adaptation. Our solution has been shown to be robust to variations between machines (from one node to another) and between runs (from one execution of the application to another). The experiments conducted in this work require instrumenting low-level software stacks. Conducting this work on top of Grid'5000 was key, as it allowed us to study various hardware setups (varying number of sockets, varying amount of memory) and their impact on the controller. The presence of clusters composed of homogeneous hardware allowed us to study the robustness of the devised control with respect to the variability in hardware performance despite identical specifications. Finally, our work relied on power measures as provided by the integrated sensors: we could extend this work by exploiting the available power sensors. Our future work will tackle three remaining challenges: (i) handling various types of phases and their chaining in an application, (ii) distributed execution (a different powercap enforced on each processor or core), and (iii) non-instrumented applications (for which instrumentation is not possible).

INRIA a CCSD electronic archive server

Toward an Easy Configuration of Location Privacy Protection Mechanisms

Author: Ben Mokhtar Sonia
Bouchenak Sara
Boutet Antoine
Cerf Sophie
Marchand Nicolas
Primault Vincent
Robu Bogdan
Publication venue: HAL CCSD
Publication date: 12/12/2016

Oral communication with poster. International audience. The widespread adoption of Location-Based Services (LBSs) has come with controversy about privacy. While leveraging location information improves services through geo-contextualization, it raises privacy concerns, as new knowledge can be inferred from location records, such as home and work places, habits or religious beliefs. To overcome this problem, several Location Privacy Protection Mechanisms (LPPMs) have been proposed in the literature in recent years. However, every mechanism comes with its own configuration parameters that directly impact the privacy guarantees and the resulting utility of the protected data. In this context, it can be difficult for a non-expert system designer to choose appropriate configuration parameters according to the expected privacy and utility. In this paper, we present a framework enabling the easy configuration of LPPMs. To achieve that, our framework performs an offline, in-depth automated analysis of LPPMs to provide the formal relationship between their configuration parameters and both the privacy and the utility metrics. This framework is modular: by using different metrics, a system designer is able to fine-tune her LPPM according to her expected privacy and utility guarantees (i.e., the guarantee itself and the level of this guarantee). To illustrate the capability of our framework, we analyse Geo-Indistinguishability (a well-known differentially private LPPM) and provide the formal relationship between its configuration parameter and two privacy and utility metrics.

Crossref

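Développer la capacité des conseillers à agir face à la diversité des situations de conseil en grande culture [Developing advisors' capacity to act in the face of diverse advisory situations in arable farming]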

Author: Cerf Marianne
Guillot Marie-Noël
Olry Paul
Omon Bertrand
Petit Marie-Sophie
Publication venue: OpenEdition
Publication date: 12/09/2013

Facing the challenges of an agro-ecological transition, arable farming advisors have to develop new agronomic reasoning and new skills or, more precisely, new capacities to act. To support this professional development, we propose an analytical framework for the advisory activity, intended to explain the difficulties advisors encounter in their interactions with farmers. It stresses the operations carried out by an advisor to orient his performance and highlights the need for the advisor to reflect on how he coordinates with his environment and with “the others” to build a common action. We analyse three advisory situations conducted by the same advisor and show that this framework enables us to identify the obstacles met in the coordination process and in the coupling of the advisor’s activity and the situation. We discuss how such results can support a reflexive process oriented towards the professional development of the advisors.

OpenEdition

Automatic Privacy and Utility Preservation of Mobility Data: A Nonlinear Model-Based Approach

Author: Ben Mokhtar Sonia
Bouchenak Sara
Boutet Antoine
Cerf Sophie
Chen Lydia Y.
Marchand Nicolas
Primault Vincent
Robu Bogdan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 30/11/2018

International audience. The widespread use of mobile devices and location-based services has generated a large number of mobility databases. While processing these data is highly valuable, privacy issues can occur if personal information is revealed. Prior work has investigated ways to protect mobility data by providing a wide range of Location Privacy Protection Mechanisms (LPPMs). However, the privacy level of the protected data varies significantly depending on the protection mechanism used, its configuration, and the characteristics of the mobility data. Meanwhile, the protected data still needs to enable some useful processing. To tackle these issues, we present PULP, a framework that finds a suitable protection mechanism and automatically configures it for each user in order to achieve user-defined objectives in terms of both privacy and utility. PULP uses nonlinear models to capture the impact of each LPPM on data privacy and utility levels. Our framework is evaluated with two protection mechanisms from the literature and four real-world mobility datasets. Results show the efficiency of PULP, its robustness and its adaptability. Comparisons between LPPM configurators and the state of the art further illustrate that PULP better realizes users’ objectives and that its computation is orders of magnitude faster.

INRIA a CCSD electronic archive server

HAL