    Testing performance of standards-based protocols in DPM

    To promote wider use of non-proprietary protocols in grid storage systems, we test the performance of WebDAV and pNFS transport with the DPM storage solution. We find that the standards-based protocols perform similarly to the proprietary protocols currently in use, despite some issues with the state of the implementation itself. We thus conclude that there is no performance-based reason to avoid using such protocols for data management in future.

    A NeISS collaboration to develop and use e-infrastructure for large-scale social simulation

    The National e-Infrastructure for Social Simulation (NeISS) project is focused on developing e-Infrastructure to support social simulation research. Part of NeISS aims to provide an interface for running contemporary dynamic demographic social simulation models as developed in the GENESIS project. These GENESIS models operate at the individual person level and are stochastic. This paper focuses on support for a simplistic demographic change model that has a daily time step and is typically run for a number of years. A portal-based Graphical User Interface (GUI) has been developed as a set of standard portlets. One portlet is for specifying model parameters and setting a simulation running. Another is for comparing the results of different simulation runs. Other portlets are for monitoring submitted jobs and for interfacing with an archive of results. A layer of programs enacted by the portlets stages data in and submits jobs to a Grid computer, which then runs a specific GENESIS model program executable. Once a job is submitted, some details are communicated back to a job monitoring portlet. Once the job is completed, results are stored and made available for download and further processing. Collectively we call the system the Genesis Simulator. Progress in the development of the Genesis Simulator was presented at the UK e-Science All Hands Meeting in September 2011 by way of a video-based demonstration of the GUI and an oral presentation of a working paper. Since then, an automated framework has been developed to run simulations for a number of years in yearly time steps. The demographic models have also been improved in a number of ways. This paper summarises the work to date, presents some of the latest results and considers the next steps we are planning in this work.

    Taking the C out of CVMFS

    The CERN Virtual Machine File System (CVMFS) is best known as a distribution mechanism for the WLCG VOs' experiment software; as a result, almost all the existing expertise is in installing clients that mount the central CERN repositories. We report the results of an initial experiment in using the cvmfs server packages to provide a Glasgow-based repository aimed at software provisioning for small UK-local VOs. In general, although the documentation is sparse, server configuration is reasonably easy with some experimentation. We discuss the advantages of local CVMFS repositories for sites, with some examples from our test VOs, vo.optics.ac.uk and neiss.org.uk.

    Monitoring in a grid cluster

    The monitoring of a grid cluster (or of any piece of reasonably scaled IT infrastructure) is a key element in the robust and consistent running of that site. Several factors are important to the selection of a useful monitoring framework, including ease of use, reliability, and data input and output. It is critical that data can be drawn from different instrumentation packages and collected in the framework to allow for a uniform view of the running of a site. It is also very useful to allow different views and transformations of this data, so that it can be manipulated for different purposes, perhaps unknown at the initial time of installation. In this context, we present the findings of an investigation of the Graphite monitoring framework and its use at the ScotGrid Glasgow site. In particular, we examine the messaging system used by the framework and means to extract data from different tools, including the existing framework Ganglia, which is in use at many sites, in addition to adapting and parsing data streams from external monitoring frameworks and websites.

    Extending DIRAC File Management with Erasure-Coding for efficient storage

    The state of the art in Grid-style data management is to achieve increased resilience of data via multiple complete replicas of data files across multiple storage endpoints. While this is effective, it is not the most space-efficient approach to resilience, especially when the reliability of individual storage endpoints is sufficiently high that only a few will be inactive at any point in time. We report on work performed as part of GridPP, extending the DIRAC File Catalogue and file management interface to allow the placement of erasure-coded files: each file is distributed as N identically-sized chunks of data striped across a vector of storage endpoints, encoded such that any M chunks can be lost and the original file can still be reconstructed. The tools developed are transparent to the user and, as well as allowing uploading and downloading of data to Grid storage, also provide the possibility of parallelising access across all of the distributed chunks at once, improving data transfer and IO performance. We expect this approach to be of most interest to smaller VOs, who have tighter bounds on the storage available to them, but larger (WLCG) VOs may be interested as their total data increases during Run 2. We provide an analysis of the costs and benefits of the approach, along with future development and implementation plans in this area. In general, overheads for multiple file transfers are the largest obstacle to the competitiveness of this approach at present. Comment: 21st International Conference on Computing for High Energy and Nuclear Physics (CHEP2015).
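
    To make the "N chunks, any M may be lost" idea concrete, the following is a minimal sketch of the simplest possible erasure code: k data chunks plus one XOR parity chunk, so N = k + 1 and M = 1. This is an illustrative assumption, not the coding scheme used in the work described above, and the names `encode` and `decode` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-sized chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")  # zero-pad so all chunks match
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def decode(chunks: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the file from N = k + 1 chunks, at most one of which is None."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost chunk")
    if missing:
        # XOR of all surviving chunks equals the lost chunk.
        survivors = [c for c in chunks if c is not None]
        chunks = list(chunks)
        chunks[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity chunk and the zero padding.
    return b"".join(chunks[:-1])[:original_len]
```

    With k = 4 this scheme stores 5/4 of the original file size while surviving the loss of any one chunk, whereas a second full replica stores twice the file size for the same tolerance. Tolerating M > 1 losses needs a stronger code (for example Reed-Solomon), which this sketch does not attempt.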

    A voyage to Arcturus: a model for automated management of a WLCG Tier-2 facility

    With the current trend towards "On Demand Computing" in big data environments, it is crucial that the deployment of services and resources becomes increasingly automated. Deployment based on cloud platforms is available for large-scale data centre environments, but these solutions can be too complex and heavyweight for smaller, resource-constrained WLCG Tier-2 sites. Along with a greater desire for bespoke monitoring and collection of Grid-related metrics, a more lightweight and modular approach is desired. In this paper we present a model for a lightweight automated framework which can be used to build WLCG grid sites, based on "off the shelf" software components. As part of the research into an automation framework, the use of both IPMI and SNMP for physical device management will be included, as well as the use of SNMP as a monitoring/data sampling layer, such that more comprehensive decision making can take place and potentially be automated. This could lead to reduced down times and better performance, as services are recognised to be in a non-functional state by autonomous systems.

    Analysis and improvement of data-set level file distribution in Disk Pool Manager

    Of the three most widely used implementations of the WLCG Storage Element specification, Disk Pool Manager[1, 2] (DPM) has the simplest implementation of file placement balancing (StoRM doesn't attempt this, leaving it up to the underlying filesystem, which can be very sophisticated in itself). DPM uses a round-robin algorithm (with optional filesystem weighting) for placing files across filesystems and servers. This does a reasonable job of evenly distributing files across the storage array provided to it. However, it does not offer any guarantees of the evenness of distribution of that subset of files associated with a given "dataset" (which often maps onto a "directory" in the DPM namespace (DPNS)). It is useful to consider a concept of "balance", where an optimally balanced set of files indicates that the files are distributed evenly across all of the pool nodes. The best-case performance of the round-robin algorithm is to maintain balance; it has no mechanism to improve balance. In the past year or more, larger DPM sites have noticed load spikes on individual disk servers, and suspected that these were exacerbated by excesses of files from popular datasets on those servers. We present here a software tool which analyses file distribution for all datasets in a DPM SE, providing a measure of the poorness of file location in this context. Further, the tool provides a list of file movement actions which will improve dataset-level file distribution, and can action those file movements itself. We present results of such an analysis on the UKI-SCOTGRID-GLASGOW Production DPM.
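
    The gap between array-level and dataset-level balance can be seen in a small sketch. The code below is a toy model, not DPM's placement code or the tool described above: it uses unweighted round-robin, and the `imbalance` measure (largest per-server count minus the ideal even share) is a hypothetical stand-in for whatever metric the tool actually reports.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, Iterable, List, Tuple


def round_robin_place(
    files: Iterable[Tuple[str, str]], servers: List[str]
) -> Dict[str, Counter]:
    """Assign (dataset, filename) pairs to servers in strict rotation.

    Returns, for each dataset, a count of its files on every server.
    """
    placement: Dict[str, Counter] = {}
    rotation = cycle(servers)
    for dataset, _name in files:
        counts = placement.setdefault(dataset, Counter({s: 0 for s in servers}))
        counts[next(rotation)] += 1
    return placement


def imbalance(counts: Counter) -> int:
    """Hypothetical measure: excess of the busiest server over an even share."""
    total = sum(counts.values())
    ideal = -(-total // len(counts))  # ceiling of total / number of servers
    return max(counts.values()) - ideal


# Two datasets written concurrently, so their files arrive interleaved.
servers = ["s0", "s1", "s2", "s3"]
files = [("A" if i % 2 == 0 else "B", f"file{i}") for i in range(8)]
placement = round_robin_place(files, servers)

overall = sum(placement.values(), Counter())
print("overall imbalance:", imbalance(overall))
print("dataset A imbalance:", imbalance(placement["A"]))
```

    In this toy run every server ends up with two files, so the array as a whole is balanced, yet all four files of dataset A sit on only two of the four servers. A job reading dataset A would therefore load half the pool, which is the kind of hot spot described above.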

    Enabling object storage via shims for grid middleware

    The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called "Cloud" storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requirement for "POSIX-like" filesystem structures on top of SEs makes the disjunction seem larger. As modern Object Stores provide many features that most Grid SEs do not (block-level striping, parallel access, automatic file repair, etc.), it is of interest to see how easily we can provide interfaces to typical Object Stores via plugins and shims for Grid tools, and how well experiments can adapt their data models to them. We present an evaluation of, and first-deployment experiences with, (for example) Xrootd-Ceph interfaces for direct object-store access, as part of an initiative within GridPP[1] hosted at RAL. Additionally, we discuss the tradeoffs and experience of developing plugins for the currently popular Ceph parallel distributed filesystem for the GFAL2 access layer, at Glasgow.

    Evaluation of containers as a virtualisation alternative for HEP workloads

    In this paper the emerging technology of Linux containers is examined and evaluated for use in the High Energy Physics (HEP) community. Key technologies required to enable containerisation will be discussed, along with emerging technologies used to manage container images. An evaluation of the requirements for containers within HEP will be made, and benchmarking will be carried out to assess performance over a range of HEP workflows. The use of containers will be placed in a broader context and recommendations on future work will be given.

    Using the Autopilot pattern to deploy container resources at a WLCG Tier-2

    Containers are becoming ubiquitous within the WLCG, with CMS announcing a requirement for its sites to provide Singularity during 2018. The ubiquity of containers means it is now possible to reify the combination of an application and its configuration into a single easy-to-deploy unit, avoiding the need to make use of a myriad of configuration management tools such as Puppet, Ansible or Salt. This allows industry-standard DevOps techniques, such as Continuous Integration (CI) and Continuous Deployment (CD), to be used within the operations domain, which can lead to faster upgrades and greater system security. One interesting technique is the Autopilot pattern, which provides mechanisms for application life-cycle management that are accessible from within the container itself. Using modern service discovery techniques, each container manages its own configuration, monitors its own health, and adapts to changing requirements through the use of event triggers. In this paper, we expand on previous work to create and deploy resources to a WLCG Tier-2 via containers, and investigate the viability of using the Autopilot pattern at a WLCG site to deploy and manage computational resources.