Extending DIRAC File Management with Erasure-Coding for efficient storage
The state of the art in Grid style data management is to achieve increased
resilience of data via multiple complete replicas of data files across multiple
storage endpoints. While this is effective, it is not the most space-efficient
approach to resilience, especially when the reliability of individual storage
endpoints is sufficiently high that only a few will be inactive at any point in
time. We report on work performed as part of GridPP\cite{GridPP}, extending the
DIRAC File Catalogue and file management interface to allow the placement of
erasure-coded files: each file distributed as N identically-sized chunks of
data striped across a vector of storage endpoints, encoded such that any M
chunks can be lost and the original file can be reconstructed. The tools
developed are transparent to the user and, as well as allowing uploading and
downloading of data to Grid storage, also provide the possibility of
parallelising access across all of the distributed chunks at once, improving
data transfer and IO performance. We expect this approach to be of most
interest to smaller VOs, who have tighter bounds on the storage available to
them, but larger (WLCG) VOs may be interested as their total data increases
during Run 2. We provide an analysis of the costs and benefits of the approach,
along with future development and implementation plans in this area. In
general, overheads for multiple file transfers are currently the largest
obstacle to the competitiveness of this approach.
Comment: 21st International Conference on Computing for High Energy and
Nuclear Physics (CHEP2015
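The chunking scheme described in the abstract above (N equal-sized chunks, any M of which may be lost) can be illustrated with a minimal sketch. This is not the DIRAC tooling: it assumes the simplest possible code, a single XOR parity chunk, so M = 1 and N = k + 1 for k data chunks. The function names `encode`, `decode` and `storage_overhead` are hypothetical and chosen only for this illustration.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> Tuple[List[bytes], int]:
    """Split data into k equal-sized chunks and append one XOR parity chunk.

    Returns the N = k + 1 chunks and the original length (needed to strip
    the zero padding again on decode).
    """
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity], len(data)


def decode(chunks: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the file from N chunks, at most one of which is None (lost)."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost chunk")
    repaired = list(chunks)
    if missing:
        present = [c for c in chunks if c is not None]
        # XOR of all surviving chunks equals the lost chunk.
        repaired[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity chunk and the padding.
    return b"".join(repaired[:-1])[:original_len]


def storage_overhead(n: int, m: int) -> float:
    """Bytes stored per byte of user data for N chunks tolerating M losses."""
    return n / (n - m)


if __name__ == "__main__":
    chunks, length = encode(b"example payload for grid storage", k=4)
    chunks[2] = None  # simulate one unavailable storage endpoint
    print(decode(chunks, length))
    print(storage_overhead(5, 1))  # 1.25, versus 2.0 for two full replicas
```

Under these assumptions, surviving one lost endpoint costs 1.25 times the file size, where full replication needs two complete copies. Codes that tolerate M > 1 losses (for example Reed-Solomon) follow the same N / (N - M) overhead but need a more involved encoder than the XOR used here.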
Using the Autopilot pattern to deploy container resources at a WLCG Tier-2
Containers are becoming ubiquitous within the WLCG, with CMS announcing a requirement for its sites to provide Singularity during 2018. The ubiquity of containers means it is now possible to reify the combination of an application and its configuration into a single easy-to-deploy unit, avoiding the need to make use of a myriad of configuration management tools such as Puppet, Ansible or Salt. This makes it possible to use industry-standard DevOps techniques within the operations domain, such as Continuous Integration (CI) and Continuous Deployment (CD), which can lead to faster upgrades and greater system security. One interesting technique is the Autopilot pattern, which provides mechanisms for application life-cycle management that are accessible from within the container itself. Using modern service discovery techniques, each container manages its own configuration, monitors its own health, and adapts to changing requirements through the use of event triggers. In this paper, we expand on previous work to create and deploy resources to a WLCG Tier-2 via containers, and investigate the viability of using the Autopilot pattern at a WLCG site to deploy and manage computational resources.
Enabling object storage via shims for grid middleware
The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called "Cloud" storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requirement for "POSIX-like" filesystem structures on top of SEs makes the disjunction seem larger. As modern Object Stores provide many features that most Grid SEs do not (block level striping, parallel access, automatic file repair, etc.), it is of interest to see how easily we can provide interfaces to typical Object Stores via plugins and shims for Grid tools, and how well experiments can adapt their data models to them. We present an evaluation of, and first-deployment experiences with, (for example) Xrootd-Ceph interfaces for direct object-store access, as part of an initiative within GridPP[1] hosted at RAL. Additionally, we discuss the tradeoffs and experience of developing plugins for the currently-popular Ceph parallel distributed filesystem for the GFAL2 access layer, at Glasgow.
Evaluation of containers as a virtualisation alternative for HEP workloads
In this paper the emerging technology of Linux containers is examined and evaluated for use in the High Energy Physics (HEP) community. Key technologies required to enable containerisation will be discussed along with emerging technologies used to manage container images. An evaluation of the requirements for containers within HEP will be made and benchmarking will be carried out to assess performance over a range of HEP workflows. The use of containers will be placed in a broader context and recommendations on future work will be given.
Storageless and caching Tier-2 models in the UK context
Operational and other pressures have led to WLCG experiments moving increasingly to a stratified model for Tier-2 resources, where "fat" Tier-2s ("T2Ds") and "thin" Tier-2s ("T2Cs") provide different levels of service.
In the UK, this distinction is also encouraged by the terms of the current GridPP5 funding model. In anticipation of this, testing has been performed on the implications, and potential implementation, of such a distinction in our resources.
In particular, this presentation reports the results of testing of storage T2Cs, where the "thin" nature is expressed by the site having either no local data storage, or only a thin caching layer; data is streamed or copied from a "nearby" T2D when needed by jobs.
In OSG, this model has been adopted successfully for CMS AAA sites; but the network topology and capacity in the USA are significantly different from those in the UK (and much of Europe).
We present the results of several operational tests: the in-production University College London (UCL) site, which runs ATLAS workloads using storage at the Queen Mary University of London (QMUL) site; the Oxford site, which has had scaling tests performed against T2Ds in various locations in the UK (to test network effects); and the Durham site, which has been testing the specific ATLAS caching solution of "Rucio Cache" integration with ARC's caching layer.
Caching technologies for Tier-2 sites: a UK perspective
Pressures from both WLCG VOs and externalities have led to a desire to "simplify" data access and handling for Tier-2 resources across the Grid. This has mostly been imagined in terms of reducing book-keeping for VOs, and total replicas needed across sites. One common direction of motion is to increase the amount of remote access to data for jobs, which is also seen as enabling the development of administratively-cheaper Tier-2 subcategories, reducing manpower and equipment costs. Caching technologies are often seen as a "cheap" way to ameliorate the increased latency (and decreased bandwidth) introduced by ubiquitous remote-access approaches, but the usefulness of caches is strongly dependent on the reuse of the data thus cached. We report on work done in the UK at four GridPP Tier-2 sites - ECDF, Glasgow, RALPP and Durham - to investigate the suitability of transparent caching via the recently-rebranded XCache (Xrootd Proxy Cache) for both ATLAS and CMS workloads, and to support workloads by other caching approaches (such as the ARC CE Cache).
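The dependence of cache usefulness on data reuse, noted in the abstract above, can be made concrete with a small simulation. This is a sketch only: it assumes a least-recently-used (LRU) eviction policy and a capacity counted in whole files, neither of which is taken from the paper or from XCache itself, and the function name `lru_hit_rate` is hypothetical.

```python
from collections import OrderedDict
from typing import Hashable, Sequence


def lru_hit_rate(accesses: Sequence[Hashable], capacity: int) -> float:
    """Fraction of accesses served from an LRU cache holding `capacity` files."""
    cache: "OrderedDict[Hashable, bool]" = OrderedDict()
    hits = 0
    for name in accesses:
        if name in cache:
            hits += 1
            cache.move_to_end(name)  # mark as most recently used
        else:
            cache[name] = True  # fetch from the remote site and store
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used file
    return hits / len(accesses) if accesses else 0.0


if __name__ == "__main__":
    # Every job reads a different file: the cache never helps.
    no_reuse = list(range(1000))
    print(lru_hit_rate(no_reuse, capacity=100))  # 0.0

    # Jobs repeatedly read the same 10 files: only the first reads miss.
    heavy_reuse = [i % 10 for i in range(1000)]
    print(lru_hit_rate(heavy_reuse, capacity=100))  # 0.99
```

With no reuse, every read still goes to the remote site and the cache adds cost without benefit; with heavy reuse, nearly all reads are served locally. Real workloads fall between these extremes, which is why measuring actual access patterns matters before deploying a cache.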
Cavity QED and correlation effects in sharply intersecting conducting structures
EThOS - Electronic Theses Online Service, GB, United Kingdo