173 research outputs found
ATLAS Distributed Computing Evolution: Developments and Demonstrators Towards HL-LHC
The computing challenges at the HL-LHC require fundamental changes to the distributed computing models that have served the experiments well throughout the LHC era. ATLAS planning for HL-LHC computing started in 2020 with a Conceptual Design Report outlining the various challenges to explore. This was followed in 2022 by a roadmap defining concrete milestones and the associated effort required. Today, ATLAS is proceeding further with a set of "demonstrators": focused R&D on specific topics described in the roadmap. The demonstrators cover areas such as optimised tape writing and access, data recreation on demand, and the use of commercial clouds.
WLCG Authorisation from X.509 to Tokens
The WLCG Authorisation Working Group was formed in July 2017 with the objective to understand and meet the needs of a future-looking Authentication and Authorisation Infrastructure (AAI) for WLCG experiments. Much has changed since the early 2000s, when X.509 certificates presented the most suitable choice for authorisation within the grid; progress in token-based authorisation and identity federation has provided an interesting alternative with notable advantages in usability and compatibility with external (commercial) partners. The need for interoperability in this new model is paramount as infrastructures and research communities become increasingly interdependent. Over the past two years, the working group has made significant steps towards identifying a system to meet the technical needs highlighted by the community during staged requirements-gathering activities. Enhancement work has been possible thanks to externally funded projects, allowing existing AAI solutions to be adapted to our needs. A cornerstone of the infrastructure is the reliance on a common token schema in line with evolving standards and best practices, allowing for maximum compatibility and easy cooperation with peer infrastructures and services. We present the work of the group and an analysis of the anticipated changes in the authorisation model in moving from X.509 to token-based authorisation. A concrete example of token integration in Rucio is presented.
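To illustrate what such a common token schema looks like in practice, here is a minimal Python sketch of inspecting a WLCG-profile access token with the PyJWT library; the issuer URL and claim values are invented for illustration, and signature verification is deliberately skipped.

    # Sketch: inspecting a WLCG Common JWT Profile access token with PyJWT.
    # Issuer URL and claim values below are illustrative, not real.
    import jwt  # PyJWT

    def inspect_wlcg_token(encoded_token: str) -> dict:
        # A real relying party must verify the signature against the
        # issuer's published keys; verification is skipped here for brevity.
        claims = jwt.decode(encoded_token, options={"verify_signature": False})
        assert claims.get("wlcg.ver") == "1.0"  # WLCG profile version claim
        return claims

    # Shape of a typical payload under the WLCG profile:
    example_claims = {
        "wlcg.ver": "1.0",
        "iss": "https://iam.example.org/",         # hypothetical issuer
        "sub": "b0b-1234-opaque-id",               # opaque subject identifier
        "aud": "https://wlcg.cern.ch/jwt/v1/any",  # "any service" audience
        "exp": 1700000000,
        "scope": "storage.read:/ storage.modify:/atlas",  # capability-based authz
        "wlcg.groups": ["/atlas", "/atlas/production"],   # group-based authz
    }

The schema foresees both capability-based (scope) and group-based (wlcg.groups) authorisation, so individual services can adopt whichever model fits them.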
Updates to the ATLAS Data Carousel Project
The High Luminosity upgrade to the LHC (HL-LHC) is expected to deliver scientific data at the multi-exabyte scale. In order to address this unprecedented data storage challenge, the ATLAS experiment launched the Data Carousel project in 2018. Data Carousel is a tape-driven workflow whereby bulk production campaigns with input data resident on tape are executed by staging and promptly processing a sliding window of that data on a disk buffer, such that only a small fraction of the inputs is pinned on disk at any one time. Data Carousel is now in production for ATLAS in Run 3. In this paper, we provide updates on recent Data Carousel R&D projects, including data-on-demand and tape smart writing. Data-on-demand removes from disk data that has not been accessed for a predefined period; when users request such data, it is either staged back from tape or recreated by following the original production steps. Tape smart writing employs intelligent algorithms for file placement on tape in order to retrieve data more efficiently, and is our long-term strategy for achieving optimal tape usage in Data Carousel.
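A minimal sketch of the sliding-window idea, assuming hypothetical stage_from_tape, process, and release_from_buffer helpers; the production implementation lives inside the ATLAS workflow management system and is asynchronous rather than sequential.

    # Sliding-window staging: at most `window_size` inputs are pinned on
    # the disk buffer at any one time. stage_from_tape(), process() and
    # release_from_buffer() are hypothetical placeholders.
    from collections import deque

    def data_carousel(input_files, window_size=100):
        window = deque()
        for f in input_files:
            if len(window) >= window_size:
                release_from_buffer(window.popleft())  # free buffer space
            stage_from_tape(f)    # tape -> disk buffer
            window.append(f)
            process(f)            # prompt processing while staged
        while window:
            release_from_buffer(window.popleft())      # drain the last window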
Methods of Data Popularity Evaluation in the ATLAS Experiment at the LHC
The ATLAS Experiment at the LHC generates petabytes of data that are distributed among 160 computing sites all over the world and processed continuously by various central production and user analysis tasks. The popularity of data is typically measured as the number of accesses and plays an important role in resolving data management issues: deleting, replicating, and moving data between tapes, disks, and caches. These data management procedures have so far been carried out in a semi-manual mode, and we have now focused our efforts on automating them, making use of historical knowledge about existing data management strategies. In this study we describe the sources of information about data popularity and demonstrate their consistency. Based on the calculated popularity measurements, various distributions were obtained. Auxiliary information about replication and task processing allowed us to evaluate the correspondence between the number of tasks with popular data executed per site and the number of replicas per site. We also examine the popularity of user analysis data, which is much less predictable than that of central production and requires more indicators than just the number of accesses.
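As a concrete illustration of the basic measure, a minimal sketch that counts accesses per dataset over a fixed window; the access-record layout (dataset name plus timestamp) is an assumption for the example.

    # Sketch: popularity as the number of accesses per dataset within a
    # time window. The record layout {"dataset", "timestamp"} is assumed.
    from collections import Counter
    from datetime import datetime, timedelta, timezone

    def popularity(access_log, window_days=90):
        cutoff = datetime.now(timezone.utc) - timedelta(days=window_days)
        return Counter(
            rec["dataset"]
            for rec in access_log
            if rec["timestamp"] >= cutoff
        )

    # most_common(10) -> replication candidates; datasets absent from the
    # counter -> candidates for deletion or for migration to tape.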
Extending Rucio with modern cloud storage support
Rucio is a software framework designed to facilitate scientific collaborations in efficiently organising, managing, and accessing extensive volumes of data through customizable policies. The framework enables data distribution across globally distributed locations and heterogeneous data centres, integrating various storage and network technologies into a unified federated entity. Rucio offers advanced features like distributed data recovery and adaptive replication, and it exhibits high scalability, modularity, and extensibility.
Originally developed to meet the requirements of the high-energy physics experiment ATLAS, Rucio has been continuously expanded to support LHC experiments and diverse scientific communities. Recent R&D projects within these communities have evaluated the integration of both private and commercially-provided cloud storage systems, leading to the development of additional functionalities for seamless integration within Rucio. Furthermore, the underlying systems, FTS and GFAL/Davix, have been extended to cater to specific use cases.
This contribution focuses on the technical aspects of this work, particularly the challenges encountered in building a generic interface for self-hosted cloud storage, such as MinIO or the CEPH S3 Gateway, and established providers like Google Cloud Storage and Amazon Simple Storage Service. Additionally, the integration of decentralised clouds like SEAL is explored. Key aspects, including authentication and authorisation, direct and remote access, and throughput and cost estimation, are highlighted, along with shared experiences from daily operations.
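One recurring building block in such integrations is time-limited signed-URL access; below is a minimal boto3 sketch against an S3-compatible endpoint (endpoint, credentials, bucket, and object key are placeholders), which applies equally to MinIO, the CEPH S3 Gateway, or Amazon S3.

    # Sketch: presigned-URL read access on any S3-compatible store.
    # Endpoint, credentials, bucket and object key are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.example.org",  # self-hosted or commercial
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    # The URL grants read access for one hour without exposing the
    # long-term credentials to the client that uses it.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "atlas-data", "Key": "scope/file.root"},
        ExpiresIn=3600,
    )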
The ATLAS experiment software on ARM
With the increased dataset obtained during Run 3 of the LHC at CERN, and an expected further increase of more than one order of magnitude for the HL-LHC, the ATLAS experiment is reaching the limits of the current data processing model in terms of traditional CPU resources based on the x86_64 architecture, and an extensive programme of software upgrades towards the HL-LHC has been set up. The ARM architecture is becoming a competitive and energy-efficient alternative: surveys indicate its increased presence in HPCs and commercial clouds, some WLCG sites have expressed their interest, and chip makers are developing their next-generation solutions on ARM architectures, sometimes combining ARM and GPU processors in the same chip. Consequently, it is important that the ATLAS software embraces the change and is able to successfully exploit this architecture. We report on the successful porting to ARM of the Athena software framework, which is used by ATLAS for both online and offline computing operations, and on the successful validation of simulation workflows running on ARM resources. For this we have set up an ATLAS Grid site using ARM-compatible middleware and containers on Amazon Web Services (AWS) ARM resources. The ARM version of Athena is fully integrated in the regular software build system and distributed in the same way as other software releases. In addition, the workflows have been integrated into the HEPscore benchmark suite, the planned WLCG-wide replacement of the HepSpec06 benchmark used for Grid site pledges. In the overall porting process we have used resources on AWS, Google Cloud Platform (GCP) and CERN. A performance comparison of the different architectures and resources will be discussed.
Evolution of the open-source data management system Rucio for LHC Run-3 and beyond
Rucio, the distributed data management system of the ATLAS experiment, already manages more than 400 petabytes of physics data on the grid. Rucio was incrementally improved throughout LHC Run-2 and is currently being prepared for the HL-LHC era of the experiment. Alongside these improvements, the system is evolving into a full-scale generic data management system for applications beyond ATLAS, or even beyond high-energy physics. This contribution focuses on the development roadmap of Rucio for LHC Run-3, covering topics such as event-level data management, generic metadata support, and increased usage of networks and tapes. At the same time, Rucio is evolving beyond the original ATLAS requirements. This includes additional authentication mechanisms, generic database compatibility, deployment and packaging of the software stack in containers, and a paradigm shift to a full-scale open-source project.
Rucio - Scientific data management
Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is now continuously extended to support the LHC experiments and other diverse scientific communities. In this article, we detail the fundamental concepts of Rucio, describe the architecture along with implementation details, and give operational experience from production usage.
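To make the fundamental concepts tangible, a brief sketch of the Rucio Python client follows; the scope, dataset name, and RSE expression are placeholder values, and a configured Rucio server with an authenticated account is assumed.

    # Sketch: registering a dataset and requesting replication via a rule.
    # Scope, names and RSE expression are placeholders; a configured
    # Rucio server and account are assumed.
    from rucio.client import Client

    client = Client()

    # Data identifiers (DIDs) are scope:name pairs; files attach to datasets.
    client.add_dataset(scope="user.jdoe", name="user.jdoe.my.dataset")
    client.attach_dids(
        scope="user.jdoe", name="user.jdoe.my.dataset",
        dids=[{"scope": "user.jdoe", "name": "file.root"}],
    )

    # Declarative replication: ask for two copies on Tier-1 storage and
    # let the Rucio daemons schedule the necessary transfers.
    client.add_replication_rule(
        dids=[{"scope": "user.jdoe", "name": "user.jdoe.my.dataset"}],
        copies=2,
        rse_expression="tier=1",
    )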
Accelerating science: The usage of commercial clouds in ATLAS Distributed Computing
The ATLAS experiment at CERN is one of the largest scientific machines built to date and will have ever-growing computing needs as the Large Hadron Collider collects an increasingly large volume of data over the next 20 years. ATLAS is conducting R&D projects on Amazon Web Services and Google Cloud as complementary resources for distributed computing, focusing on some of the key features of commercial clouds: lightweight operation, elasticity, and availability of multiple chip architectures.
The proof of concept phases have concluded with the cloud-native, vendor-agnostic integration with the experiment's data and workload management frameworks. Google Cloud has been used to evaluate elastic batch computing, ramping up ephemeral clusters of up to O(100k) cores to process tasks requiring quick turnaround. Amazon Web Services has been exploited for the successful physics validation of the Athena simulation software on ARM processors.
We have also set up an interactive facility for physics analysis, allowing end-users to spin up private, on-demand clusters for parallel computing with up to 4,000 cores, or to run GPU-enabled notebooks and jobs for machine learning applications.
The success of the proof of concept phases has led to the extension of the Google Cloud project, where ATLAS will study the total cost of ownership of a production cloud site over 15 months with 10k cores on average, fully integrated with distributed grid computing resources, and will continue the R&D projects.
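For a sense of scale, a back-of-envelope estimate of the compute volume covered by that TCO study, using only the figures quoted above:

    # Rough scale of the 15-month, 10k-core TCO study quoted above.
    cores = 10_000               # average core count
    hours = 15 * 30 * 24         # ~15 months ~= 10,800 hours
    print(f"{cores * hours:.1e} core-hours")  # ~1.1e+08 core-hours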