The archive solution for distributed workflow management agents of the CMS experiment at LHC
The CMS experiment at the CERN LHC developed the Workflow Management Archive
system to persistently store unstructured framework job report documents
produced by distributed workflow management agents. In this paper we present
its architecture, implementation, deployment, and integration with the CMS and
CERN computing infrastructures, such as the central HDFS and Hadoop Spark cluster.
The system leverages modern technologies such as a document oriented database
and the Hadoop eco-system to provide the necessary flexibility to reliably
process, store, and aggregate O(1M) documents on a daily basis. We
describe the data transformation, the short and long term storage layers, the
query language, along with the aggregation pipeline developed to visualize
various performance metrics to assist CMS data operators in assessing the
performance of the CMS computing system.
Comment: This is a pre-print of an article published in Computing and Software
for Big Science. The final authenticated version is available online at:
https://doi.org/10.1007/s41781-018-0005-
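As a rough illustration of the aggregation step this abstract describes, the sketch below groups job report documents by site and summarises them. It is a minimal sketch under stated assumptions: the field names `site`, `exit_code`, and `cpu_seconds` and the site labels are hypothetical and are not taken from the archive system itself.

```python
from collections import defaultdict


def aggregate_reports(reports):
    """Group job-report dicts by site and summarise job counts and CPU time.

    The keys "site", "exit_code" and "cpu_seconds" are hypothetical field
    names chosen for this sketch, not the real document schema.
    """
    totals = defaultdict(lambda: {"jobs": 0, "failed": 0, "cpu_seconds": 0.0})
    for doc in reports:
        # Documents are unstructured, so tolerate missing fields.
        entry = totals[doc.get("site", "unknown")]
        entry["jobs"] += 1
        if doc.get("exit_code", 0) != 0:
            entry["failed"] += 1
        entry["cpu_seconds"] += float(doc.get("cpu_seconds", 0.0))
    return {site: dict(stats) for site, stats in totals.items()}


# Hypothetical sample documents.
reports = [
    {"site": "SITE_A", "exit_code": 0, "cpu_seconds": 120.0},
    {"site": "SITE_A", "exit_code": 1, "cpu_seconds": 30.0},
    {"site": "SITE_B", "exit_code": 0, "cpu_seconds": 60.0},
    {"exit_code": 0},
]
summary = aggregate_reports(reports)
```

At the daily volumes mentioned above, the same grouping would more plausibly run as a distributed job on the Hadoop/Spark side; the in-memory version only shows the shape of the computation.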
Developing Resource Usage Service in WLCG
According to the Memorandum of Understanding (MoU) of the World-wide LHC Computing Grid (WLCG) project, participating sites are required to provide resource usage or accounting data to the Grid Operational Centre (GOC) to enrich the understanding of how shared resources are used, and to provide information for improving the effectiveness of resource allocation. As a multi-grid environment, the accounting process of WLCG is currently enabled by four accounting systems, each of which was developed independently by constituent grid projects. These accounting systems were designed and implemented based on project-specific local understanding of requirements, and therefore lack interoperability. In order to automate the accounting process in WLCG, three transportation methods are being introduced for streaming accounting data metered by heterogeneous accounting systems into the GOC at Rutherford Appleton Laboratory (RAL) in the UK, where accounting data are aggregated and accumulated throughout the year. These transportation methods, however, were introduced on a per-accounting-system basis, i.e. each targets a particular accounting system, making them hard to reuse and to customize to new requirements. This paper presents the design of the WLCG-RUS system, a standards-compatible solution providing a consistent process for streaming resource usage data across various accounting systems, while ensuring interoperability, portability, and customization.
BDWatchdog: real-time monitoring and profiling of Big Data applications and frameworks
This is a post-peer-review, pre-copyedit version of an article published in Future Generation Computer Systems. The final authenticated version is available online at: https://doi.org/10.1016/j.future.2017.12.068
[Abstract] Current Big Data applications are characterized by a heavy use of system resources (e.g., CPU, disk) generally distributed across a cluster. To effectively improve their performance there is a critical need for an accurate analysis of both Big Data workloads and frameworks. This means fully understanding how the system resources are being used in order to identify potential bottlenecks, from resource to code bottlenecks. This paper presents BDWatchdog, a novel framework that allows real-time and scalable analysis of Big Data applications by combining time series for resource monitoring and flame graphs for code profiling, focusing on the processes that make up the workload rather than the underlying instances on which they are executed. This shift from traditional system-based monitoring to a process-based analysis is relevant for new paradigms such as software containers or serverless computing, where the focus is put on applications and not on instances. BDWatchdog has been evaluated on a Big Data cloud-based service deployed at the CESGA supercomputing center. The experimental results show that a process-based analysis allows for a more effective visualization and overall improves the understanding of Big Data workloads. BDWatchdog is publicly available at http://bdwatchdog.dec.udc.es.
Ministerio de Economía, Industria y Competitividad; TIN2016-75845-P
Ministerio de Educación; FPU15/0338
Automatic Rescaling and Tuning of Big Data Applications on Container-Based Virtual Environments
Official Doctoral Programme in Information Technology Research (Programa Oficial de Doutoramento en Investigación en Tecnoloxías da Información). 524V01
[Abstract]
Current Big Data applications have significantly evolved from their origins, moving
from mostly batch workloads to more complex ones that may involve many processing
stages using different technologies or even working in real time. Moreover, to
deploy these applications, commodity clusters have been in some cases replaced
in favor of newer and more flexible paradigms such as the Cloud or even emerging
ones such as serverless computing, usually involving virtualization techniques.
This Thesis proposes two frameworks that provide alternative ways to perform in-depth
analysis and improved resource management for Big Data applications deployed
on virtual environments based on software containers. On the one hand,
the BDWatchdog framework is capable of performing real-time, fine-grained analysis
in terms of system resource monitoring and code profiling. On the other hand, a
framework for the dynamic and real-time scaling of resources according to several
tuning policies is described. The first proposed policy revolves around the automatic
scaling of the containers’ resources according to the real usage of the applications,
thus providing a serverless environment. Furthermore, an alternative policy focused
on energy management is presented in a scenario where power capping and budgeting
functionalities are implemented for containers, applications or even users.
Overall, the frameworks proposed in this Thesis aim to showcase how novel ways
of analyzing and tuning the resources given to Big Data applications in container
clusters are possible, even in real time. The supported use cases that were presented
are examples of this, and show how Big Data applications can be adapted to newer
technologies or paradigms without having to lose their distinctive characteristics.
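The first policy in this abstract, scaling a container's resources according to the application's real usage, can be sketched as a simple threshold rule. This is a minimal sketch, not the thesis's actual policy: the function name `rescale_limit`, the thresholds, the step size, and the bounds are all assumptions chosen for illustration.

```python
def rescale_limit(limit, usage, low=0.5, high=0.9, step=0.25,
                  min_limit=1.0, max_limit=16.0):
    """Return a new resource limit (e.g. CPU cores) for one container.

    Hypothetical rule: grow the limit by `step` when utilisation exceeds
    `high`, shrink it by `step` when utilisation drops below `low`, and
    otherwise leave it unchanged. Results are clamped to the given bounds.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    utilisation = usage / limit
    if utilisation > high:
        return min(max_limit, limit * (1 + step))
    if utilisation < low:
        return max(min_limit, limit * (1 - step))
    return limit


grown = rescale_limit(4.0, 3.9)      # utilisation 0.975, above `high`
shrunk = rescale_limit(4.0, 1.0)     # utilisation 0.25, below `low`
unchanged = rescale_limit(4.0, 3.0)  # utilisation 0.75, inside the band
```

A real controller would apply such a rule periodically to monitored usage and would likely smooth the measurements first, so that short spikes do not trigger repeated rescaling.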
Research and development of accounting system in grid environment
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.
The Grid has been recognised as the next-generation distributed computing paradigm, seamlessly integrating heterogeneous resources across administrative domains into a single virtual system. An increasing number of scientific and business projects employ Grid computing technologies for large-scale resource sharing and collaboration. Early adopters of Grid computing technologies implemented custom middleware to bridge gaps between heterogeneous computing backbones. These custom solutions form the basis of the emerging Open Grid Services Architecture (OGSA), which aims to address common concerns of Grid systems by defining a set of interoperable and reusable Grid services. One of the common concerns defined in OGSA is the Grid accounting service. The main objective of the Grid accounting service is to ensure that resources are shared within a Grid environment in an accountable manner by metering and logging accurate resource usage information. This thesis discusses the origins and fundamentals of Grid computing and the accounting service in the context of the OGSA profile. A prototype based on OGSA accounting-related standards was developed and evaluated, enabling the sharing of accounting data in a multi-Grid environment, the World-wide LHC Computing Grid (WLCG). Based on this prototype and the lessons learned, a generic middleware solution was also implemented as a toolkit that eases the migration of existing accounting systems to standards compatibility.
Engineering and Physical Sciences Research Council (EPSRC), Stanford University
Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies
Grid is an infrastructure that involves the integrated and collaborative use
of computers, networks, databases and scientific instruments owned and managed
by multiple organizations. Grid applications often involve large amounts of
data and/or computing resources that require secure resource sharing across
organizational boundaries. This makes Grid application management and
deployment a complex undertaking. Grid middlewares provide users with seamless
computing ability and uniform access to resources in the heterogeneous Grid
environment. Several software toolkits and systems have been developed, most of
which are results of academic research projects, all over the world. This
chapter will focus on four of these middlewares--UNICORE, Globus, Legion and
Gridbus. It also presents our implementation of a resource broker for UNICORE
as this functionality was not supported in it. A comparison of these systems on
the basis of the architecture, implementation model and several other features
is included.
Comment: 19 pages, 10 figures
A NeISS collaboration to develop and use e-infrastructure for large-scale social simulation
The National e-Infrastructure for Social Simulation (NeISS) project is focused on
developing e-Infrastructure to support social simulation research. Part of NeISS aims to
provide an interface for running contemporary dynamic demographic social simulation
models as developed in the GENESIS project. These GENESIS models operate at the
individual person level and are stochastic. This paper focuses on support for a simplistic
demographic change model that has daily time steps and is typically run for a number
of years.
A portal based Graphical User Interface (GUI) has been developed as a set
of standard portlets. One portlet is for specifying model parameters and setting a
simulation running. Another is for comparing the results of different simulation runs.
Other portlets are for monitoring submitted jobs and for interfacing with an archive of
results. A layer of programs enacted by the portlets stage data in and submit jobs to a
Grid computer which then runs a specific GENESIS model program executable. Once a
job is submitted, some details are communicated back to a job monitoring portlet. Once
the job is completed, results are stored and made available for download and further
processing. Collectively we call the system the Genesis Simulator.
Progress in the development of the Genesis Simulator was presented at the UK
e-Science All Hands Meeting in September 2011 by way of a video based demonstration
of the GUI, and an oral presentation of a working paper. Since then, an automated
framework has been developed to run simulations for a number of years in yearly time
steps. The demographic models have also been improved in a number of ways. This
paper summarises the work to date, presents some of the latest results and considers the
next steps we are planning in this work.