Autonomic Management of Large Clusters and Their Integration into the Grid
We present a framework for the co-ordinated, autonomic management of multiple clusters in a compute center and their integration into a Grid environment. Site autonomy and the automation of administrative tasks are prime aspects of this framework. The system behavior is continuously monitored in a steering cycle, and appropriate actions are taken to resolve any problems. All presented components have been implemented in the course of the EU project DataGrid: the Lemon monitoring components, the FT fault-tolerance mechanism, the quattor system for software installation and configuration, the RMS job and resource management system, and the Gridification scheme that integrates clusters into the Grid.
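As an illustration only (none of this code comes from the paper), the steering cycle described above can be sketched as a monitor-evaluate-act loop; all metric names, thresholds and actions below are hypothetical stand-ins for the Lemon and FT components:

```python
import time

# Hypothetical thresholds; the real framework derives these from its
# monitoring configuration (Lemon in the paper).
THRESHOLDS = {"cpu_load": 0.90, "disk_usage": 0.95}


def collect_metrics():
    """Stand-in for the monitoring step; returns current sensor values."""
    return {"cpu_load": 0.42, "disk_usage": 0.97}


def take_corrective_action(metric, value):
    """Stand-in for the fault-tolerance/actuation step (FT in the paper)."""
    print(f"resolving {metric}={value:.2f} above threshold")


def steering_cycle(cycles=3, interval_s=1.0):
    """Monitor-evaluate-act loop, run a fixed number of times for the demo."""
    for _ in range(cycles):
        metrics = collect_metrics()
        for name, value in metrics.items():
            if value > THRESHOLDS.get(name, float("inf")):
                take_corrective_action(name, value)
        time.sleep(interval_s)


if __name__ == "__main__":
    steering_cycle()
```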
CERN Tape Archive - from development to production deployment
The first production version of the CERN Tape Archive (CTA) software is planned to be released during 2019. CTA is designed to replace CASTOR as the CERN tape archive solution, to face the scalability and performance challenges arriving with LHC Run 3. In this paper, we describe the main commonalities and differences between CTA and CASTOR. We outline the functional enhancements and integration steps required to add the CTA tape back-end to an EOS disk storage system. We present and discuss the different deployment and migration scenarios for replacing the five CASTOR instances at CERN, including a description of how the File Transfer Service (FTS) will interface with EOS and CTA.
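The archival data flow summarised in the abstract (a transfer service writes a file into the EOS disk layer, which hands it to the CTA tape back-end for archival) can be illustrated with the following toy model; the class and method names are invented for illustration and do not reflect the actual EOS, CTA or FTS interfaces:

```python
from collections import deque

class TapeArchive:
    """Toy stand-in for the CTA tape back-end: queues files, then 'writes' them."""
    def __init__(self):
        self.queue = deque()
        self.on_tape = set()

    def queue_for_archival(self, path):
        self.queue.append(path)

    def flush_to_tape(self):
        while self.queue:
            path = self.queue.popleft()
            self.on_tape.add(path)          # pretend the tape write succeeded
            print(f"archived {path}")

class DiskBuffer:
    """Toy stand-in for the EOS disk layer in front of the tape archive."""
    def __init__(self, tape):
        self.tape = tape
        self.files = {}

    def write(self, path, data):
        self.files[path] = data
        self.tape.queue_for_archival(path)  # notify the tape back-end on close

# A transfer service (FTS in the abstract) would simply write into the disk
# buffer; archival to tape then happens asynchronously.
tape = TapeArchive()
eos = DiskBuffer(tape)
eos.write("/eos/experiment/raw/run123.dat", b"...")
tape.flush_to_tape()
```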
CERN Tape Archive: a distributed, reliable and scalable scheduling system
The CERN Tape Archive (CTA) provides a tape backend to disk systems and, in conjunction with EOS, manages the data of the LHC experiments at CERN.
Magnetic tape storage offers the lowest cost per unit volume today, followed by hard disks and flash. In addition, current tape drives deliver solid bandwidth (typically 360 MB/s per device), but at the cost of high latencies, both for mounting a tape in the drive and for positioning when accessing non-adjacent files. As a consequence, the transfer scheduler should queue transfer requests until the volume of queued data warrants a tape mount. In spite of these transfer latencies, user-interactive operations should still have low latency.
The scheduling system for CTA was built from the experience gained with CASTOR. Its implementation ensures reliability and predictable performance, while simplifying development and deployment. As CTA is expected to be used for a long time, lock-in to vendors or technologies was minimized.
Finally, quality assurance systems were put in place to validate reliability and performance while allowing a fast and safe development turnaround.
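To make the queueing rationale above concrete, here is a minimal, purely illustrative sketch of a scheduler that holds back requests until the queued volume justifies the cost of a tape mount, while an explicit "interactive" flag bypasses the wait; the threshold value and names are assumptions, not CTA's actual policy:

```python
from dataclasses import dataclass, field

MOUNT_THRESHOLD_BYTES = 500 * 1024**3   # assumed value: mount once ~500 GiB is queued

@dataclass
class TransferRequest:
    path: str
    size_bytes: int
    interactive: bool = False            # e.g. a user waiting at a prompt

@dataclass
class TapeQueue:
    pending: list = field(default_factory=list)

    def submit(self, request):
        self.pending.append(request)
        # Mount immediately for interactive requests, otherwise only once the
        # queued volume amortises the mount and positioning latency.
        queued = sum(r.size_bytes for r in self.pending)
        if request.interactive or queued >= MOUNT_THRESHOLD_BYTES:
            self.mount_and_drain()

    def mount_and_drain(self):
        print(f"mounting tape, transferring {len(self.pending)} files")
        self.pending.clear()

q = TapeQueue()
q.submit(TransferRequest("/data/fileA", 200 * 1024**3))
q.submit(TransferRequest("/data/fileB", 350 * 1024**3))   # crosses the threshold
```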
CERN Tape Archive: production status, migration from CASTOR and new features
During 2019 and 2020, the CERN Tape Archive (CTA) will receive new data from LHC experiments and import existing data from CASTOR, which will be phased out for LHC experiments before Run 3.
This contribution will present the status of CTA as a service, of its integration with EOS and FTS, and of the data flow chains of the LHC experiments.
The latest enhancements and additions to the software, as well as the development outlook, will be presented. With the development of the repack function, a necessary behind-the-scenes feature, CTA can now take over custodial data and handle media migration, compaction and failures. Further metadata handling optimisations doubled the maximum file rate performance to 200 Hz per queue.
New retrieve scheduling options are being developed at the request of experiments, with optional FIFO behaviour to ensure better control of the timing of dataset retrieval, and fair-share support for competing activities within the same VO.
Support for multiple backend databases (Oracle, PostgreSQL, MySQL) has been developed at CERN and contributed by external institutes.
This contribution will also report on the challenges of and solutions for migrating data from the decades-old CASTOR to CTA. The practical example of the preparation for the migration of ATLAS data will be presented.
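The retrieve-scheduling options mentioned above (FIFO ordering within an activity plus fair-share weights between activities of the same VO) can be sketched as follows; the weights, activity names and selection rule are illustrative assumptions, not the implemented CTA algorithm:

```python
import random
from collections import deque

# Assumed fair-share weights per activity within one VO.
WEIGHTS = {"reprocessing": 3, "user-analysis": 1}

queues = {activity: deque() for activity in WEIGHTS}

def enqueue(activity, request):
    queues[activity].append(request)       # FIFO within each activity

def next_request():
    """Pick an activity with probability proportional to its weight,
    then serve its oldest queued request (FIFO)."""
    candidates = [a for a, q in queues.items() if q]
    if not candidates:
        return None
    weights = [WEIGHTS[a] for a in candidates]
    chosen = random.choices(candidates, weights=weights, k=1)[0]
    return queues[chosen].popleft()

enqueue("reprocessing", "retrieve dataset-1/file-001")
enqueue("user-analysis", "retrieve ntuple-42")
print(next_request())
```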
CERN Services for Long Term Data Preservation: Paper - iPRES 2016 - Swiss National Library, Bern
In this paper we describe the services that are offered by CERN [3] for the long-term preservation of High Energy Physics (HEP) data, with the Large Hadron Collider (LHC) as a key use case. Data preservation is a strategic goal for European High Energy Physics (HEP) [9], as well as for the HEP community worldwide, and we position our work in this global context. Specifically, we target the preservation of the scientific data, together with the software, documentation and computing environment needed to process, (re-)analyse or otherwise (re-)use the data. The target data volumes range from hundreds of petabytes (PB, 10^15 bytes) to hundreds of exabytes (EB, 10^18 bytes) for a target duration of several decades. The Use Cases driving data preservation are presented together with metrics that allow us to measure how close we are to meeting our goals, including the possibility of formal certification for at least part of this work. Almost all of the services that we describe are fully generic, the exception being Analysis Preservation, which has some domain-specific aspects (where the basic technology could nonetheless be adapted).
Trends in computing technologies and markets: The HEPiX TechWatch WG
Driven by the need to carefully plan and optimise the resources for the next data-taking periods of Big Science projects, such as CERN’s Large Hadron Collider and others, sites started a common activity, the HEPiX Technology Watch Working Group, tasked with tracking the evolution of technologies and markets of concern to the data centres. The talk will give an overview of general and semiconductor markets, server markets, CPUs and accelerators, memories, storage and networks; it will highlight important areas of uncertainty and risk.
Status Report of the DPHEP Collaboration: A Global Effort for Sustainable Data Preservation in High Energy Physics
Data from High Energy Physics (HEP) experiments are collected with significant financial and human effort and are mostly unique. An inter-experimental study group on HEP data preservation and long-term analysis was convened as a panel of the International Committee for Future Accelerators (ICFA). The group was formed by large collider-based experiments and investigated the technical and organizational aspects of HEP data preservation. An intermediate report was released in November 2009 addressing the general issues of data preservation in HEP and an extended blueprint paper was published in 2012. In July 2014 the DPHEP collaboration was formed as a result of the signature of the Collaboration Agreement by seven large funding agencies (others have since joined or are in the process of acquisition) and in June 2015 the first DPHEP Collaboration Workshop and Collaboration Board meeting took place. This status report of the DPHEP collaboration details the progress during the period from 2013 to 2015 inclusive.