
    The archive solution for distributed workflow management agents of the CMS experiment at LHC

    The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as the central HDFS and Hadoop Spark cluster. The system leverages modern technologies, such as a document-oriented database and the Hadoop ecosystem, to provide the flexibility needed to reliably process, store, and aggregate O(1M) documents on a daily basis. We describe the data transformation, the short- and long-term storage layers, and the query language, along with the aggregation pipeline developed to visualize various performance metrics and assist CMS data operators in assessing the performance of the CMS computing system.
    Comment: This is a pre-print of an article published in Computing and Software for Big Science. The final authenticated version is available online at: https://doi.org/10.1007/s41781-018-0005-
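    A minimal sketch of the kind of aggregation such a pipeline performs, assuming PySpark over job report documents stored as JSON on HDFS; the path and the field names ("site", "cpu_time", "exit_code") are illustrative placeholders, not the actual WMArchive schema.

```python
# Sketch only: aggregating framework job report (FWJR) documents on HDFS with PySpark.
# The HDFS path and field names are assumptions for illustration.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fwjr-aggregation").getOrCreate()

# Read one day's worth of FWJR documents from the long-term HDFS storage layer.
fwjr = spark.read.json("hdfs:///path/to/fwjr/2018/01/01/*.json")

# Aggregate simple per-site performance metrics for operator dashboards.
metrics = (
    fwjr.groupBy("site")
        .agg(
            F.count("*").alias("n_jobs"),
            F.avg("cpu_time").alias("avg_cpu_time"),
            F.sum(F.when(F.col("exit_code") != 0, 1).otherwise(0)).alias("n_failed"),
        )
)

metrics.show()
```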

    Gaining insight from large data volumes with ease

    Efficient handling of large data volumes has become a necessity in today's world. It is driven by the desire to gain more insight from the data and a better understanding of user trends, which can be transformed into economic incentives (profits, cost reduction, and various optimizations of data workflows and pipelines). In this paper, we discuss how modern technologies are transforming well-established patterns in HEP communities. New data insight can be achieved by embracing Big Data tools for a variety of use cases, from analytics and monitoring to training Machine Learning models on a terabyte scale. We provide concrete examples within the context of the CMS experiment where Big Data tools already play, or will play, a significant role in daily operations.
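    As a hedged illustration of one use case named above, the sketch below trains a classifier with Spark MLlib directly on distributed data instead of collecting it to a single node; the Parquet path and column names ("n_events", "wall_time", "read_bytes", "failed") are assumptions, not a CMS dataset.

```python
# Sketch only: distributed model training on large tabular data with Spark MLlib.
# Path and column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("cms-bigdata-sketch").getOrCreate()

# Monitoring records previously landed on HDFS in a columnar format.
records = spark.read.parquet("hdfs:///path/to/monitoring/records")

# Assemble a feature vector and fit a simple classifier across the cluster.
assembler = VectorAssembler(
    inputCols=["n_events", "wall_time", "read_bytes"], outputCol="features"
)
model = LogisticRegression(featuresCol="features", labelCol="failed").fit(
    assembler.transform(records)
)

print(model.coefficients)
```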

    Towards Provenance and Traceability in CRISTAL for HEP

    This paper discusses the CRISTAL object lifecycle management system and its use in provenance data management and the traceability of system events. The software was initially used to capture the construction and calibration of the CMS ECAL detector at CERN for later use by physicists in their data analysis. Further uses of CRISTAL in different projects (CMS, neuGRID, and N4U) are presented as examples of its flexible data model. From these examples, applications are drawn for the High Energy Physics domain, and some initial ideas for its use in data preservation in HEP are outlined in detail in this paper. Investigations are currently underway to gauge the feasibility of using the N4U Analysis Service, or a derivative of it, to address the requirements of data and analysis logging and provenance capture within the HEP long-term data analysis environment.
    Comment: 5 pages and 1 figure. 20th International Conference on Computing in High Energy and Nuclear Physics (CHEP13), 14-18 October 2013, Amsterdam, Netherlands. To appear in Journal of Physics: Conference Series
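    For illustration only, a minimal data structure of the kind a provenance and traceability record might take; this is not CRISTAL's data model, and all field names are assumptions.

```python
# Sketch only: a plain provenance/traceability record for a single analysis or system event.
# Field names are illustrative assumptions, not the CRISTAL schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceRecord:
    """Trace of a single analysis or system event."""
    item_id: str                      # object whose lifecycle is being tracked
    activity: str                     # e.g. "calibration", "reconstruction"
    agent: str                        # user or service that performed the step
    inputs: list[str] = field(default_factory=list)   # identifiers of inputs used
    outputs: list[str] = field(default_factory=list)  # identifiers of produced outputs
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


record = ProvenanceRecord(
    item_id="ECAL-module-0042",
    activity="calibration",
    agent="physicist@cern.ch",
    inputs=["raw-run-123"],
    outputs=["calib-constants-v1"],
)
print(record)
```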

    Automatic log analysis with NLP for the CMS workflow handling

    The central Monte Carlo production of the CMS experiment utilizes the WLCG infrastructure and manages thousands of tasks daily, each comprising up to thousands of jobs. The distributed computing system is bound to sustain a certain rate of failures of various types, which are currently handled by computing operators a posteriori. Within the context of computing operations and operational intelligence, we propose a Machine Learning technique to learn from the operators with a view to reducing the operational workload and delays. This work continues CMS efforts in operational intelligence to reach accurate predictions with Machine Learning. We present an approach that treats the log files of the workflows as regular text in order to leverage modern techniques from Natural Language Processing (NLP). In general, log files contain a substantial amount of text that is not human language. Therefore, different log parsing approaches are studied in order to map the log files' words to high-dimensional vectors. These vectors are then used as a feature space to train a model that predicts the action the operator has to take. This approach has the advantage that the information in the log files is extracted automatically and the format of the logs can be arbitrary. In this work the performance of the log file analysis with NLP is presented and compared to previous approaches.
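    A minimal sketch of the general approach described above, not the paper's exact pipeline: treat each workflow log as plain text, vectorize it (TF-IDF here, whereas the paper studies several log-parsing and embedding schemes), and train a classifier that predicts the operator action. The file layout and the label file are assumptions for illustration.

```python
# Sketch only: log files as text -> TF-IDF vectors -> classifier predicting operator action.
# "logs/*.log" and "labels.txt" (one label per log, same order) are assumed inputs.
from pathlib import Path
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

logs = [p.read_text(errors="ignore") for p in sorted(Path("logs").glob("*.log"))]
actions = Path("labels.txt").read_text().splitlines()  # e.g. "acdc", "clone", "kill"

X_train, X_test, y_train, y_test = train_test_split(logs, actions, test_size=0.2)

model = make_pipeline(
    TfidfVectorizer(max_features=50_000, token_pattern=r"\S+"),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))
```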

    3rd EGEE User Forum

    We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of its contents and a summary of the main conclusions drawn from the Forum on that topic. The first chapter gathers all the plenary session keynote addresses; following this there is a sequence of chapters covering the application-flavoured sessions, followed by chapters with a Computer Science and Grid Technology flavour. The final chapter covers the large number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum.

    Grid Virtualization Engine: Design, Implementation, and Evaluation
