51 research outputs found

    Toward Behavioral Modeling of a Grid System: Mining the Logging and Bookkeeping files

    Grid systems are complex heterogeneous systems, and modeling them is a highly challenging goal. This paper models the jobs handled by the EGEE grid by mining the Logging and Bookkeeping files. The goal is to discover meaningful job clusters, going beyond the coarse categories of "successfully terminated jobs" and "other jobs". The presented approach is a three-step process: i) data slicing is used to alleviate the job heterogeneity and afford discriminant learning; ii) constructive induction proceeds by learning discriminant hypotheses from each data slice; iii) finally, double clustering is used on the representation built by constructive induction, and the clusters are validated according to the stability criteria proposed by Meila (2006). Lastly, the job clusters are submitted to the experts and some meaningful interpretations are found.

    Discovering Linear Models of Grid Workload

    Despite extensive research focused on enabling QoS for grid users through economic and intelligent resource provisioning, no consensus has emerged on the most promising strategies. On top of intrinsically challenging problems, the complexity and size of the data have so far drastically limited the number of comparative experiments. An alternative to experimenting on real, large, and complex data is to look for well-founded and parsimonious representations. The goal of this paper is to answer a set of preliminary questions that may help steer the design of such representations along feasible paths: is it possible to exhibit consistent models of the grid workload? If such models do exist, which classes of models are more appropriate, considering both simplicity and descriptive power? How can we actually discover such models? And finally, how can we assess the quality of these models on a statistically rigorous basis? Our main contributions are twofold. First, we found that grid workload models can consistently be discovered from the real data, and that limiting the range of models to piecewise linear time series models is sufficiently powerful. Second, we present a bootstrapping strategy for building more robust models from the limited samples at hand. This study is based on exhaustive information representative of a significant fraction of e-science computing activity in Europe.

    Predicting Bounds on Queuing Delay in the EGEE grid

    Predicting the performance of schedulers is a notoriously difficult task. As a consequence, grid users might be tempted to work around the standard grid middleware by designing specific strategies, which would be counterproductive if generally adopted. On the other hand, Machine Learning has been successfully applied to performance prediction in distributed and shared environments. This paper reports on experiments on predicting the basic parameters of scheduling in the EGEE framework.

    Multi-objective reinforcement learning for responsive grids

    The original publication is available at www.springerlink.com. Grids organize resource sharing, a fundamental requirement of large scientific collaborations. Seamless integration of grids into everyday use requires responsiveness, which can be provided by elastic Clouds, in the Infrastructure as a Service (IaaS) paradigm. This paper proposes a model-free resource provisioning strategy supporting both requirements. Provisioning is modeled as a continuous action-state space, multi-objective reinforcement learning (RL) problem, under realistic hypotheses; simple utility functions capture the high-level goals of users, administrators, and shareholders. The model-free approach falls under the general program of autonomic computing, where the incremental learning of the value function associated with the RL model provides the so-called feedback loop. The RL model includes an approximation of the value function through an Echo State Network. Experimental validation on a real data set from the EGEE grid shows that introducing a moderate level of elasticity is critical to ensure a high level of user satisfaction.

    Discovering Piecewise Linear Models of Grid Workload

    Despite extensive research focused on enabling QoS for grid users through economic and intelligent resource provisioning, no consensus has emerged on the most promising strategies. On top of intrinsically challenging problems, the complexity and size of the data have so far drastically limited the number of comparative experiments. An alternative to experimenting on real, large, and complex data is to look for well-founded and parsimonious representations. This study is based on exhaustive information about the gLite-monitored jobs from the EGEE grid, representative of a significant fraction of e-science computing activity in Europe. Our main contributions are twofold. First, we found that workload models for this grid can consistently be discovered from the real data, and that limiting the range of models to piecewise linear time series models is sufficiently powerful. Second, we present a bootstrapping strategy for building more robust models from the limited samples at hand.

    Grid Scheduling for Interactive Analysis

    Grids are facing the challenge of moving from batch systems to interactive computing. In the 1970s, standalone computer systems met this challenge, and this was the starting point of pervasive computing. Meeting this challenge will allow grids to be the infrastructure for ambient intelligence and ubiquitous computing. This paper shows that EGEE, the world's largest grid, does not yet provide the services required for interactive computing, but that it is amenable to this evolution through relatively modest changes to the middleware. A case study on medical image analysis exemplifies the particular needs of ultra-short jobs.

    Non-Markovian Reinforcement Learning for Reactive Grid scheduling

    Two questions recur when solving real-world policy search problems. First, the variables defining the so-called Markov Decision Process are often continuous, which makes it necessary either to discretize the state/action space or to use a regression model, often non-linear, to approximate the Q-function needed in the reinforcement learning paradigm. Second, the Markovian hypothesis is assumed, although it is often highly questionable and can lead to unacceptably suboptimal policies. In this paper, the job scheduling problem in grid infrastructure is modeled as a continuous action-state space, multi-objective reinforcement learning problem, under realistic assumptions; the high-level goals of users, administrators, and shareholders are captured through simple utility functions. Formalizing the problem as a partially observable Markov decision process (POMDP), we detail the algorithm of fitted Q-function learning using an Echo State Network. The experiment, conducted on a simulation of real grid activity, demonstrates the significant gain of the method over the native scheduling infrastructure and over a classic feed-forward back-propagated neural network (FFNN) for Q-function learning in the most difficult cases.

    Efficient fault monitoring with Collaborative Prediction

    Journées scientifiques mésocentres et France Grilles. Isolating users from the inevitable faults in large distributed systems is critical to Quality of Experience. We formulate the problem of probe selection for fault prediction based on end-to-end probing as a Collaborative Prediction (CP) problem. On an extensive experimental dataset from the EGI grid, the combination of the Maximum Margin Matrix Factorization approach to CP and Active Learning shows excellent performance, reducing the number of probes typically by 80% to 90%.

    The Grid Observatory

    The goal of the Grid Observatory project (GO) is to contribute to an experimental theory of large grid systems by integrating the collection of data on the behaviour of the flagship European Grid Infrastructure (EGI) and its users, the development of models, and an ontology for the domain knowledge. The GO gives access to a database of grid usage traces available to the wider computer science community without the need for grid credentials. The paper presents the architecture of the digital curation process enacted by the GO and examples of how the collected traces are exploited.

    Grid Analysis of Radiological Data

    IGI-Global Medical Information Science Discoveries Research Award 2009. Grid technologies and infrastructures can contribute to harnessing the full power of computer-aided image analysis into clinical research and practice. Given the volume of data, the sensitivity of medical information, and the joint complexity of medical datasets and computations expected in clinical practice, the challenge is to fill the gap between the grid middleware and the requirements of clinical applications. This chapter reports on the goals, achievements and lessons learned from the AGIR (Grid Analysis of Radiological Data) project. AGIR addresses this challenge through a combined approach. On one hand, leveraging the grid middleware through core grid medical services (data management, responsiveness, compression, and workflows) targets the requirements of medical data processing applications. On the other hand, grid-enabling a panel of applications ranging from algorithmic research to clinical use cases both exploits and drives the development of the services.