3,092 research outputs found

Disk storage management for LHCb based on Data Popularity estimator

Authors: Charpentier Philippe, Hushchyn Mikhail, Ustyuzhanin Andrey
Publication venue: 'IOP Publishing'
Publication date: 01/10/2015

This paper presents an algorithm that provides recommendations for optimizing the LHCb data storage. The LHCb data storage system is a hybrid system: all datasets are kept as archives on magnetic tapes, and the most popular datasets are also kept on disks. The algorithm takes the dataset usage history and metadata (size, type, configuration, etc.) to generate a recommendation report. This article presents how we use machine learning algorithms to predict future data popularity. Using these predictions, it is possible to estimate which datasets should be removed from disk. We use regression algorithms and time series analysis to find the optimal number of replicas for datasets that are kept on disk. Based on the data popularity and the optimized number of replicas, the algorithm minimizes a loss function to find the optimal data distribution. The loss function represents all requirements for data distribution in the data storage system. We demonstrate how our algorithm helps to save disk space and to reduce waiting times for jobs using this data.
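The removal recommendation described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: exponential smoothing stands in for the trained popularity model, a greedy selection stands in for the loss-function minimization, and the names `Dataset`, `predicted_popularity`, and `recommend_removals` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    name: str
    size_tb: float
    weekly_accesses: List[int]  # usage history, oldest week first


def predicted_popularity(history: List[int], alpha: float = 0.5) -> float:
    """Exponentially smoothed access count (assumed stand-in for an ML model)."""
    estimate = float(history[0]) if history else 0.0
    for count in history[1:]:
        estimate = alpha * count + (1 - alpha) * estimate
    return estimate


def recommend_removals(datasets: List[Dataset], space_to_free_tb: float) -> List[str]:
    """Greedily pick the least popular datasets until enough disk space is freed."""
    ranked = sorted(datasets, key=lambda d: predicted_popularity(d.weekly_accesses))
    removed: List[str] = []
    freed = 0.0
    for dataset in ranked:
        if freed >= space_to_free_tb:
            break
        removed.append(dataset.name)
        freed += dataset.size_tb
    return removed


# Invented example catalog; sizes and access counts are illustrative only.
catalog = [
    Dataset("A", 10.0, [50, 40, 45]),
    Dataset("B", 20.0, [5, 0, 0]),
    Dataset("C", 5.0, [0, 0, 1]),
]
report = recommend_removals(catalog, space_to_free_tb=22.0)
```

With this invented catalog, the two rarely accessed datasets are recommended for removal and the frequently accessed one stays on disk; the tape archive copy is unaffected in either case.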

arXiv.org e-Print Archive

Crossref

CERN Document Server

Numerical optimization for Artificial Retina Algorithm

Authors: Belous Mikhail, Borisyak Maxim, Derkach Denis, Ustyuzhanin Andrey
Publication venue: 'IOP Publishing'
Publication date: 25/09/2017

High-energy physics experiments rely on reconstruction of the trajectories of particles produced at the interaction point. This is a challenging task, especially in the high track multiplicity environment generated by p-p collisions at LHC energies. A typical event includes hundreds of signal examples (interesting decays) and a significant amount of noise (uninteresting examples). This work describes a modification of the Artificial Retina algorithm for fast track finding: numerical optimization methods were adopted for fast local track search. This approach allows for a considerable reduction of the total computational time per event. Test results on a simplified simulated model of the LHCb VELO (VErtex LOcator) detector are presented. The approach is also well suited to parallel implementation on GPGPUs, which looks very attractive in the context of upcoming detector upgrades.
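The idea of replacing a scan over track parameters with a local numerical search can be sketched as follows. This is a toy under stated assumptions, not the paper's implementation: tracks are straight lines through the origin in 2D (`y = slope * x`), the retina response is taken as a sum of Gaussian weights of the hit residuals, and plain gradient ascent with a fixed step is used as the optimizer. The function names, `sigma`, and the step size are all illustrative choices.

```python
import math
from typing import List, Tuple

Hit = Tuple[float, float]  # (x, y) position of a detector hit


def retina_response(slope: float, hits: List[Hit], sigma: float = 0.2) -> float:
    """Sum of Gaussian weights of the residuals between hits and the line."""
    return sum(
        math.exp(-((y - slope * x) ** 2) / (2 * sigma ** 2)) for x, y in hits
    )


def response_gradient(slope: float, hits: List[Hit], sigma: float = 0.2) -> float:
    """Analytic derivative of retina_response with respect to the slope."""
    grad = 0.0
    for x, y in hits:
        residual = y - slope * x
        weight = math.exp(-(residual ** 2) / (2 * sigma ** 2))
        grad += weight * residual * x / sigma ** 2
    return grad


def local_track_search(
    start_slope: float, hits: List[Hit], step: float = 5e-4, iterations: int = 500
) -> float:
    """Gradient ascent from a seed slope towards a local response maximum."""
    slope = start_slope
    for _ in range(iterations):
        slope += step * response_gradient(slope, hits)
    return slope


# Four hits close to the line y = 0.5 * x plus one unrelated noise hit.
hits = [(1.0, 0.52), (2.0, 0.98), (3.0, 1.51), (4.0, 2.0), (2.0, 3.0)]
seed = 0.4
best = local_track_search(seed, hits)
```

Starting from the seed, the search climbs to the nearby maximum close to a slope of 0.5, while the distant noise hit contributes almost nothing to the response. Each seed can be optimized independently, which is what makes this kind of search a natural candidate for parallel execution.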

arXiv.org e-Print Archive

CERN Document Server

GRID Storage Optimization in Transparent and User-Friendly Way for LHCb Datasets

Authors: Charpentier Philippe, Haen Christophe, Hushchyn Mikhail, Ustyuzhanin Andrey
Publication venue: 'IOP Publishing'
Publication date: 12/05/2017

The LHCb collaboration is one of the four major experiments at the Large Hadron Collider at CERN. Many petabytes of data are produced by the detectors and Monte-Carlo simulations. The LHCb Grid interware LHCbDIRAC is used to make data available to all collaboration members around the world. The data is replicated to Grid sites in different locations. However, the Grid disk storage is limited and does not allow keeping replicas of each file at all sites. Thus it is essential to optimize the number of replicas to achieve better Grid performance. In this study, we present a new approach to data replication and distribution based on data popularity prediction. The popularity prediction is based on the data access history and metadata, and uses machine learning techniques and time series analysis methods.
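As a rough illustration of tying the number of replicas to predicted popularity, the sketch below forecasts accesses with a moving average and maps the forecast to a bounded replica count. The forecasting rule, the `accesses_per_replica` capacity figure, and the replica limits are assumptions made for this example; they are not taken from the study or from LHCbDIRAC.

```python
import math
from typing import Sequence


def forecast_accesses(history: Sequence[int], window: int = 4) -> float:
    """Moving average of the most recent access counts (assumed forecaster)."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def replica_count(
    history: Sequence[int],
    accesses_per_replica: float = 100.0,  # hypothetical load one replica serves
    min_replicas: int = 1,
    max_replicas: int = 4,
) -> int:
    """Replicas needed to serve the forecast load, clamped to site limits."""
    needed = math.ceil(forecast_accesses(history) / accesses_per_replica)
    return max(min_replicas, min(max_replicas, needed))


cold = replica_count([0, 0, 0, 0])
warm = replica_count([120, 180, 260, 240])
hot = replica_count([900, 1000, 1100, 1000])
```

A dataset with no recent accesses keeps the minimum number of replicas, a moderately used one gets two, and a heavily used one is capped at the maximum so that limited disk space is not exhausted by a single dataset.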

arXiv.org e-Print Archive

Crossref

CERN Document Server

Event Index - an LHCb Event Search System

Authors: Artemov Alexey, Kazeev Nikita, Redkin Artem, Ustyuzhanin Andrey
Publication venue: 'IOP Publishing'
Publication date: 01/01/2015

During LHC Run 1, the LHCb experiment recorded around $10^{11}$ collision events. This paper describes Event Index - an event search system. Its primary function is to quickly select subsets of events from a combination of conditions, such as the estimated decay channel or number of hits in a subdetector. Event Index is essentially Apache Lucene optimized for read-only indexes distributed over independent shards on independent nodes. Comment: Report for the proceedings of the CHEP-2015 conference
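The query pattern described here, selecting events that satisfy a combination of conditions across independent shards, can be sketched in plain Python. This is a toy inverted index, not Apache Lucene and not the Event Index code; the `Shard` class, the `search_all` helper, and the field names `decay` and `velo_hits` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Term = Tuple[str, str]  # (field, value), e.g. ("decay", "Bs2MuMu")


class Shard:
    """One independent index: maps each term to the event IDs containing it."""

    def __init__(self) -> None:
        self._postings: Dict[Term, Set[int]] = defaultdict(set)

    def add(self, event_id: int, fields: Dict[str, str]) -> None:
        for term in fields.items():
            self._postings[term].add(event_id)

    def search(self, terms: Iterable[Term]) -> Set[int]:
        """Events on this shard that match every term (logical AND)."""
        sets = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*sets) if sets else set()


def search_all(shards: List[Shard], terms: List[Term]) -> List[int]:
    """Query every shard independently and merge the partial results."""
    matches: Set[int] = set()
    for shard in shards:
        matches |= shard.search(terms)
    return sorted(matches)


# Invented events spread over two shards.
shard_a, shard_b = Shard(), Shard()
shard_a.add(1, {"decay": "Bs2MuMu", "velo_hits": "high"})
shard_a.add(2, {"decay": "D2KPi", "velo_hits": "high"})
shard_b.add(3, {"decay": "Bs2MuMu", "velo_hits": "low"})
shard_b.add(4, {"decay": "Bs2MuMu", "velo_hits": "high"})

selected = search_all(
    [shard_a, shard_b], [("decay", "Bs2MuMu"), ("velo_hits", "high")]
)
```

Because each shard answers the query using only its own postings, shards can live on separate nodes and be queried in parallel, and a read-only index never needs cross-shard coordination for updates.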

arXiv.org e-Print Archive

Crossref

CERN Document Server

Archivio della ricerca- Università di Roma La Sapienza

Cherenkov Detectors Fast Simulation Using Neural Networks

Authors: Derkach Denis, Kazeev Nikita, Ratnikov Fedor, Ustyuzhanin Andrey, Volokhova Alexandra
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

We propose a way to simulate Cherenkov detector response using a generative adversarial neural network to bypass low-level details. This network is trained to reproduce high-level features of the simulated detector events based on input observables of incident particles. This allows a dramatic increase in simulation speed. We demonstrate that this approach provides simulation precision which is consistent with the baseline and discuss possible implications of these results. Comment: In proceedings of 10th International Workshop on Ring Imaging Cherenkov Detectors

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Reproducible Experiment Platform

Authors: Baranov Alexander, Khairullin Egor, Likhomanenko Tatiana, Rogozhnikov Alex, Ustyuzhanin Andrey
Publication venue: 'IOP Publishing'
Publication date: 01/10/2015

Data analysis in fundamental sciences is nowadays an essential process that pushes the frontiers of our knowledge and leads to new discoveries. At the same time, the complexity of those analyses is increasing fast due to a) the enormous volumes of datasets being analyzed, b) the variety of techniques and algorithms one has to check inside a single analysis, and c) the distributed nature of research teams, which requires special communication media for knowledge and information exchange between individual researchers. There is a lot of resemblance between the techniques and problems arising in the areas of industrial information retrieval and particle physics. To address those problems we propose Reproducible Experiment Platform (REP), a software infrastructure to support a collaborative ecosystem for computational science. It is a Python-based solution for research teams that allows running computational experiments on shared datasets, obtaining repeatable results, and making consistent comparisons of the obtained results. We present some key features of REP based on case studies which include trigger optimization and physics analysis studies at the LHCb experiment. Comment: 21st International Conference on Computing in High Energy Physics (CHEP2015), 6 pages

arXiv.org e-Print Archive

Crossref

CERN Document Server