Search CORE

117 research outputs found

ROOT Status and Future Developments

Author: Brun Rene
Canal Philippe
Goto Masaharu
Rademakers Fons
Publication date: 16/06/2003

In this talk we will review the major additions and improvements made to the ROOT system in the last 18 months and present our plans for future developments. The additions and improvements range from modifications to the I/O sub-system to allow users to save and restore objects of classes that have not been instrumented by special ROOT macros, to the addition of a geometry package designed for building, browsing, tracking and visualizing detector geometries. Other improvements include enhancements to the quick analysis sub-system (TTree::Draw()), the addition of classes that allow inter-file object references (TRef, TRefArray), better support for templated and STL classes, amelioration of the Automatic Script Compiler and the incorporation of new fitting and mathematical tools. Efforts have also been made to increase the modularity of the ROOT system with the introduction of more abstract interfaces and the development of a plug-in manager. In the near future, we intend to continue the development of PROOF and its interfacing with GRID environments. We plan on providing an interface between Geant3, Geant4 and Fluka and the new geometry package. The ROOT GUI classes will finally be available on Windows and we plan to release a GUI inspector and builder. In the last year, ROOT has drawn the endorsement of additional experiments and institutions. It is now officially supported by CERN and used as a key I/O component by the LCG project. Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 5 pages, MSWord, pSN MOJT00

arXiv.org e-Print Archive

CERN Document Server

Increasing Parallelism in the ROOT I/O Subsystem

Author: Amadio Guilherme
Bockelman Brian
Canal Philippe
Piparo Danilo
Tejedor Enric
Zhang Zhe
Publication date: 09/04/2018

When processing large amounts of data, the rate at which reading and writing can take place is a critical factor. High energy physics data processing relying on ROOT is no exception. The recent parallelisation of LHC experiments' software frameworks and the analysis of the ever increasing amount of collision data collected by experiments have further emphasized this issue, underlining the need to increase the implicit parallelism expressed within the ROOT I/O. In this contribution we highlight the improvements to the ROOT I/O subsystem that targeted satisfactory scaling behaviour in a multithreaded context. The effect of parallelism on the individual steps that ROOT chains together to read and write data, namely (de)compression, (de)serialisation and access to the storage backend, is discussed. Performance measurements are presented through real-life examples coming from CMS production workflows on traditional server platforms and on highly parallel architectures such as Intel Xeon Phi.
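As a rough illustration of one of the steps named in this abstract, the sketch below decompresses several independent buffers on a thread pool. It uses only the Python standard library; the function name `decompress_all` and the toy buffers are assumptions made for illustration, and this is not ROOT's implementation.

```python
# Hypothetical sketch: parallel decompression of independent buffers.
# It only illustrates the idea of overlapping the decompression step.
import zlib
from concurrent.futures import ThreadPoolExecutor

def decompress_all(buffers, workers=4):
    """Decompress each buffer on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, buffers))

# Toy data: eight independent chunks, compressed separately.
chunks = [bytes([i]) * 1000 for i in range(8)]
compressed = [zlib.compress(c) for c in chunks]

restored = decompress_all(compressed)
```

Because each buffer is compressed on its own, the buffers can be handed to different threads without coordination. How much speed-up this gives depends on buffer sizes and on the hardware.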

arXiv.org e-Print Archive

CERN Document Server

Bitmap indices for fast end-user physics analysis in ROOT

Author: Kesheng Wu
Kurt Stockinger
Philippe Canal
Rene Brun
Publication venue: Elsevier
Publication date: 01/01/2006

Most physics analysis jobs involve multiple selection steps on the input data. These selection steps are called cuts or queries. A common strategy to implement these queries is to read all input data from files and then process the queries in memory. In many applications the variables used to define these queries are a relatively small portion of the overall data set; therefore, reading all variables into memory takes an unnecessarily long time. In this paper we describe an integration effort that can significantly reduce this unnecessary reading by using an efficient compressed bitmap index technology. The primary advantage of this index is that it can process arbitrary combinations of queries very efficiently, while most other indexing technologies suffer from the "curse of dimensionality" as the number of queries increases. By integrating this index technology with the ROOT analysis framework, the end-users can benefit from the added efficiency without having to modify their analysis programs. Our performance results show that for multi-dimensional queries, bitmap indices outperform the traditional analysis method by up to a factor of 10.
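The idea of answering a multi-variable cut with bitmaps can be sketched as follows. This is a minimal, uncompressed illustration, not the compressed index technology the paper describes; the helper names (`build_index`, `bins_at_or_above`, `rows`), the bin edges and the toy events are all assumptions made for the example.

```python
# Hypothetical sketch of a binned bitmap index. Each bin of each variable
# gets one bitmask (a Python int); bit i is set when event i falls in the bin.
from bisect import bisect_right

def build_index(values, edges):
    """Return one bitmask per bin; bin k holds edges[k-1] <= v < edges[k]."""
    masks = [0] * (len(edges) + 1)
    for row, v in enumerate(values):
        masks[bisect_right(edges, v)] |= 1 << row
    return masks

def bins_at_or_above(masks, edges, threshold):
    """OR together all bins with v >= threshold (threshold must be a bin edge)."""
    first = edges.index(threshold) + 1
    out = 0
    for m in masks[first:]:
        out |= m
    return out

def rows(mask):
    """Decode a bitmask back into a sorted list of event numbers."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

# Toy data: two variables, five events.
pt = [12.0, 35.5, 8.1, 50.2, 27.3]
eta = [0.3, 2.1, 1.0, 0.7, 2.4]
pt_edges, eta_edges = [10.0, 20.0, 30.0], [1.0, 2.0]

pt_index = build_index(pt, pt_edges)
eta_index = build_index(eta, eta_edges)

# Cut: pt >= 20 AND eta < 2.0, evaluated with bitwise operations only.
pt_cut = bins_at_or_above(pt_index, pt_edges, 20.0)
eta_cut = eta_index[0] | eta_index[1]  # bins eta < 1.0 and 1.0 <= eta < 2.0
selected = rows(pt_cut & eta_cut)
```

Combining cuts on several variables is a single bitwise AND per extra variable, which is why such an index copes well with multi-dimensional queries. Only the events in `selected` would then need to be read from file.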

CiteSeerX

Boosting RDataFrame performance with transparent bulk event processing

Author: Blomer Jakob
Canal Philippe
Guiraud Enrico
Naumann Axel
Publication venue: EDP Sciences
Publication date: 01/01/2024

RDataFrame is ROOT’s high-level interface for Python and C++ data analysis. Since it first became available, RDataFrame adoption has grown steadily and it is now poised to be a major component of analysis software pipelines for LHC Run 3 and beyond. Thanks to its design inspired by declarative programming principles, RDataFrame enables the development of high-performance, highly parallel analyses without requiring expert knowledge of multi-threading and I/O: user logic is expressed in terms of self-contained, small computation kernels tied together by a high-level API. This design completely decouples analysis logic from its actual execution, and opens several interesting avenues for workflow optimization. In particular, in this work we explore the benefits of moving internal data processing from an event-by-event to a bulk-by-bulk loop. This refactoring dramatically reduces the framework’s runtime overheads; in collaboration with the I/O layer it improves data access patterns; it exposes information that optimizing compilers might use to auto-vectorize the invocation of user-defined computations; finally, while existing user-facing interfaces remain unaffected, it becomes possible to additionally offer interfaces that explicitly expose bulks of events, useful e.g. for the injection of GPU kernels into the analysis workflow. In order to inform similar future R&D, design challenges will be presented, as well as an investigation of the relevant time-memory trade-off backed by novel performance benchmarks.
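The difference between an event-by-event and a bulk-by-bulk loop can be sketched in a few lines. The code below is a generic illustration and does not use or reproduce the RDataFrame API; the function names, the `overhead_calls` counter (a stand-in for per-iteration framework bookkeeping) and the bulk size are assumptions.

```python
# Hypothetical sketch: the same user kernel driven by two loop structures.

def run_event_by_event(events, kernel):
    """Pay the framework bookkeeping once per event."""
    results, overhead_calls = [], 0
    for event in events:
        overhead_calls += 1  # stands in for per-iteration bookkeeping
        results.append(kernel(event))
    return results, overhead_calls

def run_bulk_by_bulk(events, kernel, bulk_size):
    """Pay the bookkeeping once per bulk, then run a tight inner loop."""
    results, overhead_calls = [], 0
    for start in range(0, len(events), bulk_size):
        overhead_calls += 1  # bookkeeping is paid once per bulk
        bulk = events[start:start + bulk_size]
        results.extend(kernel(e) for e in bulk)
    return results, overhead_calls

def square(x):
    return x * x

events = list(range(10))
per_event, n_event_calls = run_event_by_event(events, square)
per_bulk, n_bulk_calls = run_bulk_by_bulk(events, square, bulk_size=4)
```

Both loops give the same results, but the bulk version runs the bookkeeping 3 times instead of 10. A larger `bulk_size` lowers that count further at the cost of holding more events in memory at once, which is one way to read the time-memory trade-off the abstract mentions.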

Directory of Open Access Journals

Software Challenges For HL-LHC Data Analysis

Author: Amadio Guilherme
An Sitong
Bellenot Bertrand
Blomer Jakob
Brann Kim Albertsson
Canal Philippe
Couet Olivier
Galli Massimiliano
Guiraud Enrico
Hageboeck Stephan
Linev Sergey
Moneta Lorenzo
Naumann Axel
Padulano Vincenzo Eduardo
Pla Xavier Valls
Rademakers Fons
ROOT Team
Saavedra Enric Tejedor
Shadura Oksana
Tadel Alja Mrak
Tadel Matevz
Vassilev Vassil
Vila Pere Mato
Wunsch Stefan
Publication date: 04/05/2020

The high energy physics community is discussing where investment is needed to prepare software for the HL-LHC and its unprecedented challenges. The ROOT project has been one of the central software players in high energy physics for decades. From its experience and expectations, the ROOT team has distilled a comprehensive set of areas that should see research and development in the context of data analysis software, to make the best use of the HL-LHC's physics potential. This work shows what these areas could be, why the ROOT team believes investing in them is needed, which gains are expected, and where related work is ongoing. It can serve as an indication for future research proposals and cooperation.

arXiv.org e-Print Archive

CERN Document Server

ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

Author: Antcheva Ilka
Ballintijn Maarten
Bellenot Bertrand
Biskup Marek
Brun Rene
Buncic Nenad
Canal Philippe
Casadei Diego
Couet Olivier
Fine Valery
Franco Leandro
Ganis Gerardo
Gheata Andrei
Goto Masaharu
Iwaszkiewicz Jan
Kreshuk Anna
Maline David Gonzalez
Maunder Richard
Moneta Lorenzo
Naumann Axel
Offermann Eddy
Onuchin Valeriy
Panacek Suzanne
Rademakers Fons
Russo Paul
Segura Diego Marcos
Tadel Matevz
Publication venue: Elsevier BV
Publication date: 31/08/2015

ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks - e.g. data mining in HEP - by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.
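The "vertical data storage" idea in this abstract can be sketched with the Python standard library: each variable is written as its own compressed buffer, so reading one variable does not touch the others. This is an illustration of the general technique only, not the TTree file format; `write_columns`, `read_column`, the fixed byte order and the toy events are assumptions.

```python
# Hypothetical sketch of column-wise (vertical) storage.
import struct
import zlib

def write_columns(rows, names):
    """Store each variable as a separately compressed buffer of doubles."""
    buffers = {}
    for name in names:
        values = [row[name] for row in rows]
        # ">" fixes the byte order so the bytes do not depend on the machine.
        raw = struct.pack(f">{len(values)}d", *values)
        buffers[name] = zlib.compress(raw)
    return buffers

def read_column(buffers, name):
    """Decompress and decode a single variable; other buffers stay untouched."""
    raw = zlib.decompress(buffers[name])
    return list(struct.unpack(f">{len(raw) // 8}d", raw))

# Toy events with three variables each.
events = [
    {"pt": 12.0, "eta": 0.3, "phi": 1.1},
    {"pt": 35.5, "eta": 2.1, "phi": -0.4},
    {"pt": 8.1, "eta": 1.0, "phi": 2.9},
]

buffers = write_columns(events, ["pt", "eta", "phi"])
pt = read_column(buffers, "pt")
mean_pt = sum(pt) / len(pt)
```

An analysis that needs only `pt` reads and decompresses one buffer out of three. With hundreds of variables per event, the fraction of data left untouched grows accordingly.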

arXiv.org e-Print Archive

CERN Document Server

Contract Aware Components, 10 years after

Author: A. Keller
Albert Benveniste
Ali Koudri
Ananda Basu
Antoine Beugnard
Arindam Chakrabarti
Bernholdt et al.
Bertrand Meyer
Brice Morin
Carlos Canal
E. Bruneton
Eveline C. Kaboré
Guillaume Waignier
Gwen Salaün
Henrik Thane
Ivica Crnkovic
Javier Cámara
Jean-Marc Jézéquel
John Cheesman
Joseph Sifakis
L. Baresi
Luca de Alfaro
Matthias Klusch
Meriem Ouederni
Noël Plouzeau
Philip K. McKinley
Philippe Collet
Richard Soley
Sébastien Saudrais
T. Ravichandran
Vladimir Tosic
Werner Damm
Publication venue: Open Publishing Association
Publication date: 01/10/2010

The notion of contract aware components was published roughly ten years ago and is now becoming mainstream in several fields where the usage of software components is seen as critical. The goal of this paper is to survey domains such as Embedded Systems or Service Oriented Architecture where the notion of contract aware components has been influential. For each of these domains we briefly describe what has been done with this idea and we discuss the remaining challenges. Comment: In Proceedings WCSI 2010, arXiv:1010.233

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

Directory of Open Access Journals

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

ROOT for the HL-LHC: data format

Author: Bellenot Bertrand
Blomer Jakob
Canal Philippe
Couet Olivier
Gomez Javier Lopez
Gruber Bernhard Manfred
Guiraud Enrico
Hahnfeld Jonas
Linev Sergey
Moneta Lorenzo
Naumann Axel
Padulano Vincenzo Eduardo
Rembser Jonas
Tadel Alja Mrak
Tadel Matevz
Tejedor Enric
Vassilev Vassil
Publication date: 09/04/2022

This document discusses the state, roadmap, and risks of the foundational components of ROOT with respect to the experiments at the HL-LHC (Run 4 and beyond). As foundational components, the document considers in particular the ROOT input/output (I/O) subsystem. The current HEP I/O is based on the TFile container file format and the TTree binary event data format. The work going into the new RNTuple event data format aims at superseding TTree, to make RNTuple the production ROOT event data I/O that meets the requirements of Run 4 and beyond.

arXiv.org e-Print Archive

CERN Document Server

ROOT’s RNTuple I/O Subsystem: The Path to Production

Author: Blomer Jakob
Canal Philippe
de Geus Florine
Hahnfeld Jonas
Lazzari Miotto Giovanna
Lopez-Gomez Javier
Naumann Axel
Padulano Vincenzo Eduardo
Publication venue: EDP Sciences
Publication date: 01/01/2024

The RNTuple I/O subsystem is ROOT’s future event data file format and access API. It is driven by the expected data volume increase at upcoming HEP experiments, e.g. at the HL-LHC, and recent opportunities in the storage hardware and software landscape such as NVMe drives and distributed object stores. RNTuple is a redesign of the TTree binary format and API and has been shown to deliver substantially faster data throughput and better data compression compared both to TTree and to industry standard formats. In order to let HENP computing workflows benefit from RNTuple’s superior performance, however, the I/O stack needs to connect efficiently to the rest of the ecosystem, from grid storage to (distributed) analysis frameworks to (multithreaded) experiment frameworks for reconstruction and ntuple derivation. With the RNTuple binary format soon arriving at its first production release, we present RNTuple’s feature set, integration efforts, and its performance impact on the time-to-solution. We show the latest performance figures of RDataFrame analysis code of realistic complexity, comparing RNTuple and TTree as data sources. We discuss RNTuple’s approach to functionality critical to the HENP I/O (such as multithreaded writes, fast data merging, schema evolution) and we provide an outlook on the road to its use in production.

Directory of Open Access Journals

I/O performance studies of analysis workloads on production and dedicated resources at CERN

Author: Blomer Jakob
Canal Philippe
Duellmann Dirk
Guiraud Enrico
Naumann Axel
Padulano Vincenzo Eduardo
Panzer-Steindel Bernd
Peters Andreas
Schulz Markus
Sciabà Andrea
Smith David
Publication venue: EDP Sciences
Publication date: 01/01/2024

The recent evolution of the analysis frameworks and physics data formats of the LHC experiments provides the opportunity of using central analysis facilities with a strong focus on interactivity and short turnaround times, to complement the more common distributed analysis on the Grid. In order to plan for such facilities, it is essential to know in detail the performance of the combination of a given analysis framework, of a specific analysis and of the installed computing and storage resources. This contribution describes performance studies carried out at CERN, using the EOS disk-based storage, either directly or through an XCache instance, from both batch resources and high-performance compute nodes which could be used to build an analysis facility. A variety of benchmarks, both synthetic and based on real-world physics analyses and their corresponding input datasets, are utilized. In particular, the RNTuple format from the ROOT project is put to the test and compared to the latest version of the TTree format, and the impact of caches is assessed. In addition, we assessed the difference in performance between the use of storage system specific protocols, like XRootd, and FUSE. The results of this study are intended to be a valuable input in the design of analysis facilities, at CERN and elsewhere.

Directory of Open Access Journals