BigPanDA monitoring system evolution in the ATLAS Experiment
Monitoring services play a crucial role in the day-to-day operation of distributed computing systems. The ATLAS Experiment at the LHC uses the Production and Distributed Analysis workload management system (PanDA WMS), which allows a million computational jobs to run daily at over 170 computing centers of the WLCG and on opportunistic resources, utilizing 600k cores simultaneously on average. The BigPanDA monitor is an essential part of the monitoring infrastructure for the ATLAS Experiment, providing a wide range of views, from top-level summaries down to a single computational job and its logs. Over the past few years of PanDA WMS advancement in the ATLAS Experiment, several new components were developed, such as Harvester, iDDS, Data Carousel, and Global Shares. Due to its modular architecture, the BigPanDA monitor naturally grew into a platform where the relevant data from all PanDA WMS components and accompanying services are accumulated and displayed in the form of interactive charts and tables. Moreover, the system has been adopted by other experiments beyond HEP. In this paper we describe the evolution of the BigPanDA monitoring system, the development of new modules, and the integration process into other experiments.
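As an illustration of how an external tool might consume the kind of aggregated data such a monitor exposes, the minimal Python sketch below requests a per-site job summary over HTTP and counts jobs by status. The endpoint URL, query parameters, and response fields are hypothetical stand-ins for this illustration and are not taken from the BigPanDA documentation.

import requests

# Hypothetical example of pulling a job summary from a BigPanDA-style monitor.
# The URL, parameters, and field names below are assumptions made for this
# illustration; the real service may expose different endpoints and schemas.
MONITOR_URL = "https://bigpanda.example.org/jobs/"  # placeholder host

def fetch_job_summary(site: str, hours: int = 12) -> dict:
    """Request a JSON summary of recent jobs for one computing site."""
    params = {"computingsite": site, "hours": hours, "json": 1}  # assumed parameters
    response = requests.get(MONITOR_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

def count_by_status(summary: dict) -> dict:
    """Aggregate job records by status (e.g. running, finished, failed)."""
    counts: dict = {}
    for job in summary.get("jobs", []):  # assumed response layout
        status = job.get("jobstatus", "unknown")
        counts[status] = counts.get(status, 0) + 1
    return counts

if __name__ == "__main__":
    print(count_by_status(fetch_job_summary("EXAMPLE_SITE")))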
New science on the Open Science Grid
The Open Science Grid (OSG) includes work to enable new science, new scientists, and new modalities in support of computationally based research. Significant sociological and organizational changes are frequently required in the transformation from existing to new ways of working. OSG leverages its deliverables to the large-scale physics experiment member communities to benefit new communities at all scales through activities in education, engagement, and the distributed facility. This paper gives both a brief general description and specific examples of new science enabled on the OSG. More information is available at the OSG web site: www.opensciencegrid.org.
An intelligent Data Delivery Service for and beyond the ATLAS experiment
The intelligent Data Delivery Service (iDDS) has been developed to cope with the huge increase in computing and storage resource usage expected in the coming LHC data taking. iDDS has been designed to intelligently orchestrate workflow and data management systems, decoupling data pre-processing, delivery, and main processing in various workflows. It is an experiment-agnostic service built around a workflow-oriented structure to work with existing and emerging use cases in ATLAS and other experiments. Here we present the motivation for iDDS, its design schema and architecture, use cases and current status, and plans for the future.
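To make the decoupling idea above concrete, the toy Python sketch below separates a data-delivery stage from the main processing stage with a buffer between them, so processing can start as soon as the first inputs are staged. This is a conceptual illustration only, not iDDS code; all names in it are invented for the example.

import queue
import threading

# Conceptual toy of decoupled delivery and processing: a "delivery" thread
# stages inputs into a bounded buffer while an independent "processing"
# thread consumes whatever is ready. Not iDDS code; names are illustrative.
buffer: "queue.Queue" = queue.Queue(maxsize=4)

def deliver(files):
    for name in files:
        buffer.put(name)   # stage / pre-process the input
    buffer.put(None)       # sentinel: nothing more to deliver

def process():
    while (item := buffer.get()) is not None:
        print(f"processing {item}")  # main processing runs as data arrives

files = [f"dataset_{i}.root" for i in range(8)]
t1 = threading.Thread(target=deliver, args=(files,))
t2 = threading.Thread(target=process)
t1.start(); t2.start()
t1.join(); t2.join()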
AI4EIC Hackathon: PID with the ePIC dRICH
The inaugural AI4EIC Hackathon unfolded as a high-point satellite event during the second AI4EIC Workshop at William & Mary. The workshop itself boasted over two hundred participants in a hybrid format and delved into the myriad applications of Artificial Intelligence and Machine Learning (AI/ML) for the Electron-Ion Collider (EIC). The workshop aimed to catalyze advancements in AI/ML with applications ranging from accelerator and detector technologies (highlighted by the ongoing work on the ePIC detector and the potential development of a second detector for the EIC) to data analytics, reconstruction, and particle identification, as well as the synergies between theoretical and experimental research. Complementing the technical agenda was an enriched educational outreach program that featured tutorials from leading AI/ML experts representing academia, national laboratories, and industry. The hackathon, held on the final day, showcased international participation with ten teams from around the globe. Each team, comprising up to four members, focused on the dual-radiator Ring Imaging Cherenkov (dRICH) detector, an integral part of the particle identification (PID) system in ePIC. The data for the hackathon were generated using the ePIC software suite. While the hackathon presented questions of increasing complexity, its challenges were designed with deliberate simplifications to serve as a preliminary step toward the integration of machine learning and deep learning techniques in PID with the dRICH detector. This article encapsulates the key findings and insights gained from this unique experience.
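As context for the kind of simplified task such a PID challenge poses, the Python sketch below trains a binary pion/kaon classifier on synthetic, dRICH-inspired features (a toy ring radius and photon count). The features, toy data model, and classifier choice are assumptions for illustration only and are not taken from the actual hackathon problem or the ePIC software suite.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Illustrative only: synthetic stand-in for dRICH-like observables such as a
# reconstructed Cherenkov ring radius and the number of detected photons.
rng = np.random.default_rng(seed=0)
n = 5000
momentum = rng.uniform(5.0, 50.0, size=n)                          # GeV/c, assumed range
is_kaon = rng.integers(0, 2, size=n)                               # 0 = pion, 1 = kaon
ring_radius = 40.0 - 2.0 * is_kaon + rng.normal(0.0, 1.5, size=n)  # toy separation
n_photons = rng.poisson(lam=20 - 3 * is_kaon)

X = np.column_stack([momentum, ring_radius, n_photons])
X_train, X_test, y_train, y_test = train_test_split(X, is_kaon, test_size=0.3, random_state=0)

clf = GradientBoostingClassifier().fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"toy pion/kaon separation AUC: {auc:.3f}")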
The Open Science Grid status and architecture
The Open Science Grid (OSG) provides a distributed facility where the Consortium members provide guaranteed and opportunistic access to shared computing and storage resources. The OSG project[1] is funded by the National Science Foundation and the Department of Energy Scientific Discovery through Advanced Computing program. The OSG project provides specific activities for the operation and evolution of the common infrastructure. The US ATLAS and US CMS collaborations contribute to and depend on OSG as the US infrastructure contributing to the Worldwide LHC Computing Grid, on which the LHC experiments distribute and analyze their data. Other stakeholders include the STAR RHIC experiment, the Laser Interferometer Gravitational-Wave Observatory (LIGO), the Dark Energy Survey (DES), and several Fermilab Tevatron experiments (CDF, D0, MiniBooNE, etc.). The OSG implementation architecture brings a pragmatic approach to enabling vertically integrated, community-specific distributed systems over a common horizontal set of shared resources and services. More information can be found at the OSG web site: www.opensciencegrid.org.
Utilizing Distributed Heterogeneous Computing with PanDA in ATLAS
In recent years, advanced and complex analysis workflows have gained increasing importance in the ATLAS experiment at CERN, one of the large scientific experiments at the LHC. Support for such workflows has allowed users to exploit remote computing resources and service providers distributed worldwide, overcoming limitations on local resources and services. The spectrum of computing options keeps increasing across the Worldwide LHC Computing Grid (WLCG), volunteer computing, high-performance computing, commercial clouds, and emerging service levels like Platform-as-a-Service (PaaS), Container-as-a-Service (CaaS), and Function-as-a-Service (FaaS), each one providing new advantages and constraints. Users can significantly benefit from these providers, but at the same time it is cumbersome to deal with multiple providers, even in a single analysis workflow, given the fine-grained requirements coming from the nature and characteristics of their applications. In this paper, we will first highlight issues in geographically distributed heterogeneous computing, such as the insulation of users from the complexities of dealing with remote providers, smart workload routing, complex resource provisioning, seamless execution of advanced workflows, workflow description, pseudo-interactive analysis, and the integration of PaaS, CaaS, and FaaS providers. We will also outline solutions developed in ATLAS with the Production and Distributed Analysis (PanDA) system and future challenges for LHC Run 4.
Distributed Machine Learning Workflow with PanDA and iDDS in LHC ATLAS
Machine Learning (ML) has become one of the important tools for High Energy Physics analysis. As the size of the datasets at the Large Hadron Collider (LHC) increases, and the search spaces grow ever larger in order to exploit the full physics potential, more and more computing resources are required for processing these ML tasks. In addition, complex advanced ML workflows are being developed in which one task may depend on the results of previous tasks. Making use of the vast distributed CPU/GPU resources in the WLCG for these large, complex ML tasks has become an active research area. In this paper, we present our efforts enabling the execution of distributed ML workflows on the Production and Distributed Analysis (PanDA) system and the intelligent Data Delivery Service (iDDS). First, we describe how PanDA and iDDS deal with large-scale ML workflows, including the implementation to process workloads on diverse and geographically distributed computing resources. Next, we report real-world use cases, such as Hyperparameter Optimization, Monte Carlo toy confidence-limit calculations, and Active Learning. Finally, we conclude with future plans.
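To make the dependency-driven pattern concrete, the Python sketch below runs a minimal hyperparameter-optimization loop in which each trial is submitted as an independent task and a downstream step consumes results as they complete. The local process pool here is a hypothetical stand-in for a distributed backend; none of it represents the actual PanDA/iDDS API.

import random
from concurrent.futures import ProcessPoolExecutor, as_completed

# Toy objective standing in for a training job that a distributed backend
# (e.g. PanDA/iDDS) would execute remotely; here it just runs locally.
def evaluate(params: dict) -> float:
    """Pretend to train a model and return a validation loss."""
    lr, depth = params["lr"], params["depth"]
    return (lr - 0.01) ** 2 + 0.001 * depth + random.uniform(0.0, 0.01)

def random_search(n_trials: int = 20) -> dict:
    """Submit trials as independent tasks and keep the best completed one."""
    trials = [{"lr": random.uniform(1e-4, 1e-1), "depth": random.randint(2, 10)}
              for _ in range(n_trials)]
    best = None
    with ProcessPoolExecutor() as pool:
        futures = {pool.submit(evaluate, p): p for p in trials}
        for fut in as_completed(futures):  # downstream step depends on upstream results
            loss, params = fut.result(), futures[fut]
            if best is None or loss < best[0]:
                best = (loss, params)
    return {"loss": best[0], "params": best[1]}

if __name__ == "__main__":
    print(random_search())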
Jet energy measurement with the ATLAS detector in proton-proton collisions at √s = 7 TeV
The jet energy scale and its systematic uncertainty are determined for jets measured with the ATLAS detector at the LHC in proton-proton collision data at a centre-of-mass energy of √s = 7 TeV corresponding to an integrated luminosity of 38 pb-1. Jets are reconstructed with the anti-kt algorithm with distance parameters R = 0.4 or R = 0.6. Jet energy and angle corrections are determined from Monte Carlo simulations to calibrate jets with transverse momenta pT ≥ 20 GeV and pseudorapidities |η| < 4.5. The jet energy systematic uncertainty is estimated using the single isolated hadron response measured in situ and in test-beams, exploiting the transverse momentum balance between central and forward jets in events with dijet topologies and studying systematic variations in Monte Carlo simulations. The jet energy uncertainty is less than 2.5% in the central calorimeter region (|η| < 0.8) for jets with 60 ≤ pT < 800 GeV, and is maximally 14% for pT < 30 GeV in the most forward region 3.2 ≤ |η| < 4.5. The jet energy is validated for jet transverse momenta up to 1 TeV to the level of a few percent using several in situ techniques by comparing a well-known reference such as the recoiling photon pT, the sum of the transverse momenta of tracks associated to the jet, or a system of low-pT jets recoiling against a high-pT jet. More sophisticated jet calibration schemes are presented based on calorimeter cell energy density weighting or hadronic properties of jets, aiming for an improved jet energy resolution and a reduced flavour dependence of the jet response. The systematic uncertainty of the jet energy determined from a combination of in situ techniques is consistent with the one derived from single hadron response measurements over a wide kinematic range. The nominal corrections and uncertainties are derived for isolated jets in an inclusive sample of high-pT jets. Special cases such as event topologies with close-by jets, or selections of samples with an enhanced content of jets originating from light quarks, heavy quarks or gluons are also discussed and the corresponding uncertainties are determined. © 2013 CERN for the benefit of the ATLAS collaboration.
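As a reminder of how the in-situ validation described above is quantified, the response is commonly expressed as the average transverse-momentum balance of the calibrated jet against a well-measured reference object. The LaTeX below is a generic illustration of that balance, with the reference taken from the abstract's examples (recoiling photon, associated tracks, or a recoil system of low-pT jets); it is not a formula quoted from the paper.

% Generic in-situ pT-balance response used to validate the jet energy scale.
\begin{equation}
  \mathcal{R} = \left\langle \frac{p_{\mathrm{T}}^{\mathrm{jet}}}{p_{\mathrm{T}}^{\mathrm{ref}}} \right\rangle,
  \qquad
  p_{\mathrm{T}}^{\mathrm{ref}} \in \left\{ p_{\mathrm{T}}^{\gamma},\ \textstyle\sum_{\mathrm{tracks}} p_{\mathrm{T}},\ p_{\mathrm{T}}^{\mathrm{recoil}} \right\}
\end{equation}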