2,300 research outputs found
Reinforce-lib: A Reinforcement Learning Library for Scientific Research
Reinforcement Learning (RL) has already achieved several breakthroughs on complex, high-dimensional, and even multi-agent tasks, attracting increasing interest from beyond the research community. Although very powerful in principle, its applicability is still largely limited to solving games and control problems, leaving plenty of opportunities to apply and develop RL algorithms for (but not limited to) scientific domains like physics and biology. Apart from the domain of interest, the applicability of RL is also limited by numerous difficulties encountered while training agents, such as training instabilities and sensitivity to hyperparameters. For these reasons, we propose a modern, modular, simple, and understandable Python RL library called reinforce-lib. Our main aim is to enable newcomers, practitioners, and researchers to easily employ RL to solve new scientific problems. Our library is available at https://github.com/Luca96/reinforce-lib
Predicting dataset popularity for the CMS experiment
The CMS experiment at the LHC accelerator at CERN relies on its computing
infrastructure to stay at the frontier of High Energy Physics, searching for
new phenomena and making discoveries. Even though computing plays a significant
role in physics analysis, we rarely use its data to predict the system behavior
itself. Basic information about computing resources, user activities and site
utilization can be really useful for improving the throughput of the system and
its management. In this paper, we discuss a first CMS analysis of dataset
popularity based on CMS meta-data which can be used as a model for dynamic data
placement and provide the foundation of a data-driven approach for the CMS
computing infrastructure. Comment: Submitted to proceedings of 17th International workshop on Advanced
Computing and Analysis Techniques in physics research (ACAT
Machine Learning as a Service for High Energy Physics on heterogeneous computing resources
Machine Learning (ML) techniques are ubiquitous in the High-Energy Physics (HEP) domain and will also play a significant role in the upcoming High-Luminosity LHC (HL-LHC) upgrade foreseen at CERN: a huge amount of data will be produced by the LHC and collected by the experiments, posing challenges at the exascale. Although ML models are successfully applied in many use cases (online and offline reconstruction, particle identification, detector simulation, Monte Carlo generation, to name just a few), there is a constant search for scalable, performant, and production-quality operations of ML-enabled workflows. In addition, the scenario is complicated by the gap between HEP physicists and ML experts, caused by the specificity of some parts of typical HEP workflows and solutions, and by the difficulty of formulating HEP problems in a way that matches the skills of the Computer Science (CS) and ML community and hence its potential ability to step in and help. Among other factors, one technical obstacle is the difference in the data formats used by ML practitioners and physicists: the former mostly use flat-format data representations, while the latter usually store data in tree-based objects via the ROOT data format. Another obstacle to the further development of ML techniques in HEP is the difficulty of securing adequate computing resources for training and inference of ML models in a scalable and transparent way, in terms of CPU vs GPU vs TPU vs other resources, as well as local vs cloud resources. This creates a technical barrier that prevents a relatively large portion of HEP physicists from fully accessing the potential of ML-enabled systems for scientific research. To close this gap, a Machine Learning as a Service for HEP (MLaaS4HEP) solution is presented as a product of R&D activities within the CMS experiment.
It offers a service that is capable of directly reading ROOT-based data, using the ML solution provided by the user, and ultimately serving predictions from pre-trained ML models “as a service” accessible via the HTTP protocol. This solution can be used by physicists or by experts outside the HEP domain, and it provides access to local or remote data storage without requiring any modification of, or integration with, the experiment-specific framework. Moreover, MLaaS4HEP is built with a modular design that allows independent resource allocation, which opens up the possibility of training ML models on PB-size datasets remotely accessible from the WLCG sites without physically downloading the data into local storage.
To prove the feasibility and utility of the MLaaS4HEP service with large datasets, and thus be ready for the near future when an increase in the data produced is expected, an exploration of different hardware resources is required. In particular, this work aims to provide the MLaaS4HEP service with transparent access to heterogeneous resources, which opens up the usage of more powerful resources without requiring any effort from the user during the access and use phase.
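The data-format gap mentioned in this abstract (tree-based events with variable-length branches versus the flat, fixed-width rows most ML tools expect) can be illustrated with a minimal sketch. This is not the MLaaS4HEP implementation: the function name `flatten_event`, the branch names, and the padding scheme are assumptions chosen only for illustration, and plain Python lists stand in for ROOT data.

```python
from typing import Dict, List, Sequence, Union

Branch = Union[float, Sequence[float]]


def flatten_event(
    event: Dict[str, Branch],
    jagged_widths: Dict[str, int],
    pad_value: float = 0.0,
) -> List[float]:
    """Turn one tree-like event into a flat, fixed-width row.

    Branches listed in `jagged_widths` are variable-length; each is
    truncated or padded to the requested width. All other branches are
    treated as scalars. Branches are visited in sorted name order so
    that every event yields the same column layout.
    """
    row: List[float] = []
    for name in sorted(event):
        value = event[name]
        if name in jagged_widths:
            width = jagged_widths[name]
            items = [float(v) for v in value][:width]            # truncate
            row.extend(items + [pad_value] * (width - len(items)))  # pad
        else:
            row.append(float(value))
    return row


# Hypothetical events: a scalar branch and a variable-length branch.
events = [
    {"met": 12.5, "jet_pt": [50.0, 30.0, 20.0]},
    {"met": 7.0, "jet_pt": [40.0]},
]
flat_rows = [flatten_event(e, {"jet_pt": 2}) for e in events]
```

Every row has the same length regardless of how many entries the jagged branch held, which is the property a flat-format ML pipeline needs; the choice of width and padding value is a modelling decision left to the user.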
Prototype of a cloud native solution of Machine Learning as Service for HEP
To favor the usage of Machine Learning (ML) techniques in High-Energy Physics (HEP) analyses, it would be useful to have a service that performs the entire ML pipeline (reading the data, training an ML model, and serving predictions) directly on ROOT files of arbitrary size from local or remote distributed data sources. The MLaaS4HEP framework aims to provide this kind of solution. It was successfully validated with a CMS physics use case, which gave important feedback about the needs of analysts. For instance, we introduced the possibility for the user to provide pre-processing operations, such as defining new branches and applying cuts. To provide a real service for the user and to integrate it into the INFN Cloud, we started working on MLaaS4HEP cloudification. This would allow the use of cloud resources and work in a distributed environment. In this work, we provide updates on this topic, and in particular, we discuss our first working prototype of the service. It includes an OAuth2 proxy server as the authentication/authorization layer, an MLaaS4HEP server, an XRootD proxy server for enabling access to remote ROOT data, and the TensorFlow as a Service (TFaaS) service in charge of the inference phase. With this architecture the user is able to submit ML pipelines, after being authenticated and authorized, using local or remote ROOT files simply via HTTP calls.
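The user-provided pre-processing mentioned in this abstract (defining new branches and applying cuts) can be sketched as follows. This is an illustrative assumption, not the MLaaS4HEP interface: the function `preprocess`, its arguments, and the branch names are hypothetical, and a dictionary of equal-length lists stands in for ROOT branches.

```python
import math
from typing import Callable, Dict, List

Row = Dict[str, float]


def preprocess(
    branches: Dict[str, List[float]],
    new_branches: Dict[str, Callable[[Row], float]],
    cuts: List[Callable[[Row], bool]],
) -> Dict[str, List[float]]:
    """Add derived branches, then keep only events passing every cut."""
    names = list(branches)
    n_events = len(branches[names[0]]) if names else 0

    # Build one row (event) at a time from the columnar input.
    rows = [{name: branches[name][i] for name in names} for i in range(n_events)]

    # Derived branches are computed before cuts so cuts can use them.
    for row in rows:
        for new_name, compute in new_branches.items():
            row[new_name] = compute(row)

    kept = [row for row in rows if all(cut(row) for cut in cuts)]

    out_names = names + [n for n in new_branches if n not in names]
    return {name: [row[name] for row in kept] for name in out_names}


# Hypothetical usage: derive a transverse momentum and cut on it.
data = {"px": [3.0, 0.5, 6.0], "py": [4.0, 0.5, 8.0]}
selected = preprocess(
    data,
    new_branches={"pt": lambda r: math.hypot(r["px"], r["py"])},
    cuts=[lambda r: r["pt"] > 1.0],
)
```

Computing the derived branches before evaluating the cuts lets a cut refer to a quantity the user has just defined, as in the `pt` selection above.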
- …