
    The archive solution for distributed workflow management agents of the CMS experiment at LHC

    The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as the central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document-oriented database and the Hadoop ecosystem to provide the necessary flexibility to reliably process, store, and aggregate \mathcal{O}(1M) documents on a daily basis. We describe the data transformation, the short- and long-term storage layers, and the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.
    Comment: This is a pre-print of an article published in Computing and Software for Big Science. The final authenticated version is available online at: https://doi.org/10.1007/s41781-018-0005-
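    The daily aggregation described above lends itself to the Hadoop ecosystem the abstract mentions. A minimal PySpark sketch of such a pipeline follows; the HDFS paths and the field names (site, exit_code, wall_time) are illustrative assumptions, not the actual WMArchive schema.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("wmarchive-aggregation").getOrCreate()

# Long-term storage layer: job reports kept as JSON documents on HDFS
# (hypothetical path; the real layout is not given in the abstract).
reports = spark.read.json("hdfs:///wmarchive/fwjr/2018/01/*")

# Daily aggregation of per-site job outcomes, of the kind that could feed
# the performance dashboards; field names are assumed for illustration.
daily = (reports
         .groupBy("site", "exit_code")
         .agg(F.count("*").alias("jobs"),
              F.avg("wall_time").alias("avg_wall_time")))

daily.write.mode("overwrite").parquet("hdfs:///wmarchive/aggregated/daily")
```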

    Developing Resource Usage Service in WLCG

    According to the Memorandum of Understanding (MoU) of the Worldwide LHC Computing Grid (WLCG) project, participating sites are required to provide resource usage or accounting data to the Grid Operational Centre (GOC) to enrich the understanding of how shared resources are used, and to provide information for improving the effectiveness of resource allocation. As a multi-grid environment, the accounting process of WLCG is currently enabled by four accounting systems, each of which was developed independently by a constituent grid project. These accounting systems were designed and implemented based on project-specific local understandings of requirements, and therefore lack interoperability. In order to automate the accounting process in WLCG, three transportation methods are being introduced for streaming accounting data metered by heterogeneous accounting systems into the GOC at Rutherford Appleton Laboratory (RAL) in the UK, where accounting data are aggregated and accumulated throughout the year. These transportation methods, however, were introduced on a per-accounting-system basis, i.e. each targets a particular accounting system, making them hard to reuse and customize to new requirements. This paper presents the design of the WLCG-RUS system, a standards-compatible solution providing a consistent process for streaming resource usage data across various accounting systems, while ensuring interoperability, portability, and customizability.
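    To illustrate the kind of consistency layer a WLCG-RUS-style service aims at, the sketch below maps a record from one source accounting system onto a common usage-record type, loosely modeled on the OGF Usage Record schema. The field names and the adapter are assumptions for illustration, not the actual WLCG-RUS interfaces.

```python
from dataclasses import dataclass

@dataclass
class UsageRecord:
    """Common record, loosely modeled on the OGF Usage Record schema."""
    site: str
    vo: str
    cpu_seconds: float
    wall_seconds: float

def from_site_record(raw: dict) -> UsageRecord:
    # One adapter per source accounting system; the keys are assumed here.
    return UsageRecord(site=raw["Site"], vo=raw["VO"],
                       cpu_seconds=float(raw["CpuDuration"]),
                       wall_seconds=float(raw["WallDuration"]))

rec = from_site_record({"Site": "RAL-LCG2", "VO": "cms",
                        "CpuDuration": "3600", "WallDuration": "4200"})
print(rec)
```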

    BDWatchdog: real-time monitoring and profiling of Big Data applications and frameworks

    This is a post-peer-review, pre-copyedit version of an article published in Future Generation Computer Systems. The final authenticated version is available online at: https://doi.org/10.1016/j.future.2017.12.068
    [Abstract] Current Big Data applications are characterized by a heavy use of system resources (e.g., CPU, disk) generally distributed across a cluster. To effectively improve their performance, there is a critical need for an accurate analysis of both Big Data workloads and frameworks. This means fully understanding how the system resources are being used in order to identify potential bottlenecks, from resource bottlenecks to code bottlenecks. This paper presents BDWatchdog, a novel framework that allows real-time and scalable analysis of Big Data applications by combining time series for resource monitoring and flame graphs for code profiling, focusing on the processes that make up the workload rather than the underlying instances on which they are executed. This shift from traditional system-based monitoring to a process-based analysis is interesting for new paradigms such as software containers or serverless computing, where the focus is put on applications and not on instances. BDWatchdog has been evaluated on a Big Data cloud-based service deployed at the CESGA supercomputing center. The experimental results show that a process-based analysis allows for a more effective visualization and overall improves the understanding of Big Data workloads. BDWatchdog is publicly available at http://bdwatchdog.dec.udc.es.
    Ministerio de Economía, Industria y Competitividad; TIN2016-75845-P. Ministerio de Educación; FPU15/0338
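    As a rough illustration of process-based (rather than host-based) monitoring, the sketch below samples per-process CPU and memory with psutil and yields time-series points. It is a minimal approximation of the idea, not BDWatchdog's actual collectors or storage backend.

```python
import time
import psutil

def sample(process_names, interval=1.0):
    """Yield one point per matching process per sampling interval."""
    while True:
        ts = time.time()
        for p in psutil.process_iter(["name", "cpu_percent", "memory_info"]):
            if p.info["name"] in process_names:
                yield {"time": ts,
                       "process": p.info["name"],
                       "cpu_percent": p.info["cpu_percent"],
                       "rss_bytes": p.info["memory_info"].rss}
        time.sleep(interval)

# In practice the points would be written to a time series database and
# visualized; printing stands in for that sink here.
for point in sample({"java", "python"}):
    print(point)
```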

    Automatic Rescaling and Tuning of Big Data Applications on Container-Based Virtual Environments

    Programa Oficial de Doutoramento en Investigación en Tecnoloxías da Información. 524V01
    [Abstract] Current Big Data applications have evolved significantly from their origins, moving from mostly batch workloads to more complex ones that may involve many processing stages using different technologies, or even operate in real time. Moreover, to deploy these applications, commodity clusters have in some cases been replaced by newer and more flexible paradigms such as the Cloud, or even emerging ones such as serverless computing, both usually involving virtualization techniques. This Thesis proposes two frameworks that provide alternative ways to perform in-depth analysis and improved resource management for Big Data applications deployed on virtual environments based on software containers. On the one hand, the BDWatchdog framework is capable of performing real-time, fine-grain analysis in terms of system resource monitoring and code profiling. On the other hand, a framework for the dynamic and real-time scaling of resources according to several tuning policies is described. The first proposed policy revolves around the automatic scaling of the containers' resources according to the real usage of the applications, thus providing a serverless environment. Furthermore, an alternative policy focused on energy management is presented, implementing power capping and budgeting functionalities that can be applied to containers, applications, or even users. Overall, the frameworks proposed in this Thesis showcase novel ways of analyzing and tuning, even in real time, the resources given to Big Data applications in container clusters. The presented use cases are examples of this, showing that Big Data applications can be adapted to newer technologies or paradigms without having to lose their distinctive characteristics.
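    A conceptual sketch of the serverless-style policy described above might look as follows: each container's CPU allocation is periodically adjusted toward its observed usage plus a safety margin. The Container type, the numbers, and the margin are illustrative assumptions, not the Thesis' actual framework API.

```python
from dataclasses import dataclass

@dataclass
class Container:
    name: str
    cpu_usage: float   # observed usage, in cores
    cpu_limit: float   # current allocation, in cores

def rescale(containers, margin=1.2, min_limit=0.5):
    """Shrink or grow each container's CPU limit toward its observed usage."""
    for c in containers:
        c.cpu_limit = max(min_limit, c.cpu_usage * margin)

cluster = [Container("spark-worker-1", cpu_usage=3.1, cpu_limit=8.0)]
rescale(cluster)
print(round(cluster[0].cpu_limit, 2))  # 3.72: the idle allocation is reclaimed
```

    A power-budgeting policy could follow the same loop, capping limits so that the sum across a user's containers stays under an assigned budget.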

    Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies

    Grid is an infrastructure that involves the integrated and collaborative use of computers, networks, databases and scientific instruments owned and managed by multiple organizations. Grid applications often involve large amounts of data and/or computing resources that require secure resource sharing across organizational boundaries. This makes Grid application management and deployment a complex undertaking. Grid middlewares provide users with seamless computing ability and uniform access to resources in the heterogeneous Grid environment. Several software toolkits and systems have been developed all over the world, most of which are the results of academic research projects. This chapter focuses on four of these middlewares: UNICORE, Globus, Legion and Gridbus. It also presents our implementation of a resource broker for UNICORE, as this functionality was not previously supported. A comparison of these systems on the basis of their architecture, implementation model, and several other features is included.
    Comment: 19 pages, 10 figures
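    The broker functionality added to UNICORE boils down to matchmaking between job requirements and site capabilities. The sketch below shows that step in schematic form; the site attributes and the least-loaded scoring rule are assumptions for illustration, not the actual broker's algorithm.

```python
def select_site(sites, job):
    """Pick a site that satisfies the job's requirements, if any."""
    candidates = [s for s in sites
                  if s["free_cpus"] >= job["cpus"]
                  and job["os"] in s["operating_systems"]]
    # Prefer the least loaded matching site.
    return min(candidates, key=lambda s: s["load"]) if candidates else None

sites = [{"name": "siteA", "free_cpus": 16,
          "operating_systems": ["Linux"], "load": 0.4},
         {"name": "siteB", "free_cpus": 4,
          "operating_systems": ["Linux"], "load": 0.1}]
print(select_site(sites, {"cpus": 8, "os": "Linux"})["name"])  # siteA
```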

    A NeISS collaboration to develop and use e-infrastructure for large-scale social simulation

    The National e-Infrastructure for Social Simulation (NeISS) project is focused on developing e-infrastructure to support social simulation research. Part of NeISS aims to provide an interface for running contemporary dynamic demographic social simulation models as developed in the GENESIS project. These GENESIS models operate at the individual person level and are stochastic. This paper focuses on support for a simple demographic change model that has a daily time step and is typically run for a number of years. A portal-based Graphical User Interface (GUI) has been developed as a set of standard portlets. One portlet is for specifying model parameters and setting a simulation running. Another is for comparing the results of different simulation runs. Other portlets are for monitoring submitted jobs and for interfacing with an archive of results. A layer of programs enacted by the portlets stages data in and submits jobs to a Grid computer, which then runs a specific GENESIS model executable. Once a job is submitted, some details are communicated back to a job monitoring portlet. Once the job is completed, results are stored and made available for download and further processing. Collectively we call the system the Genesis Simulator. Progress in the development of the Genesis Simulator was presented at the UK e-Science All Hands Meeting in September 2011 by way of a video-based demonstration of the GUI and an oral presentation of a working paper. Since then, an automated framework has been developed to run simulations for a number of years in yearly time steps. The demographic models have also been improved in a number of ways. This paper summarises the work to date, presents some of the latest results, and considers the next steps we are planning in this work.
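    The automated yearly-step framework described above can be pictured as a simple driver loop in which each simulated year seeds the next. The sketch below uses a toy birth/death model as a hypothetical stand-in for staging data and submitting the actual GENESIS executable to the Grid; the rates and seeding scheme are assumptions.

```python
import random

def run_one_year(population, year, seed):
    """Toy stand-in for one yearly run of the stochastic demographic model."""
    rng = random.Random(seed)
    births = sum(1 for _ in range(population) if rng.random() < 0.012)
    deaths = sum(1 for _ in range(population) if rng.random() < 0.009)
    return population + births - deaths

population = 100_000
for year in range(2011, 2021):
    population = run_one_year(population, year, seed=year)
    print(year, population)  # each run's results archived for later comparison
```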