HEP C++ meets reality
In 2007 the CMS experiment first reported some initial findings on the impedance mismatch between HEP use of C++ and the current generation of compilers and CPUs. Since then we have continued our analysis of the CMS experiment code base, including the external packages we use. We have found that large amounts of C++ code have been written largely ignoring the physical reality of the resulting machine code and run-time execution costs, including and especially software developed by experts. We report on a wide range of issues affecting typical high energy physics code, in the form of coding pattern, impact, lesson and improvement.
Optimizing CMS build infrastructure via Apache Mesos
The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC)
at CERN consists of 6M lines of in-house code, developed over a decade by
nearly 1000 physicists, as well as a comparable amount of general use
open-source code. A critical ingredient to the success of the construction and
early operation of the WLCG was the convergence, around the year 2000, on the
use of a homogeneous environment of commodity x86-64 processors and Linux.
Apache Mesos is a cluster manager that provides efficient resource isolation
and sharing across distributed applications, or frameworks. It can run Hadoop,
Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of
nodes. We present how we migrated our continuous integration system to schedule
jobs on a relatively small Apache Mesos enabled cluster and how this resulted
in better resource usage, higher peak performance and lower latency thanks to
the dynamic scheduling capabilities of Mesos.
Comment: Submitted to proceedings of the 21st International Conference on
Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan
IGUANA Architecture, Framework and Toolkit for Interactive Graphics
IGUANA is a generic interactive visualisation framework based on a C++
component model. It provides powerful user interface and visualisation
primitives in a way that is not tied to any particular physics experiment or
detector design. The article describes interactive visualisation tools built
using IGUANA for the CMS and D0 experiments, as well as generic GEANT4 and
GEANT3 applications. It covers features of the graphical user interfaces, 3D
and 2D graphics, high-quality vector graphics output for print media, various
textual, tabular and hierarchical data views, and integration with the
application through control panels, a command line and different
multi-threading models.
Comment: Presented at the 2003 Computing in High Energy and Nuclear Physics
conference (CHEP03), La Jolla, CA, USA, March 2003. 6 pages LaTeX, 4 eps figures. PSN
MOLT008. More and higher-resolution figures at
http://iguana.web.cern.ch/iguana/snapshot/main/gallery.htm
Managing software build infrastructure at ALICE using Hashicorp Nomad
The ALICE experiment at CERN uses a cluster consisting of virtual and bare-metal machines to build and test proposed changes to the ALICE Online–Offline (O2) software in addition to building and publishing regular software releases.
Nomad is a free and open-source job scheduler for containerised and non-containerised applications developed by HashiCorp. It is integrated into an ecosystem of related software, including Consul and Vault, providing a consistent interface to orchestration, monitoring and secret storage. At ALICE, it recently replaced Apache Mesos, Aurora and Marathon as the primary tool for managing our computing resources.
First, we will describe the architecture of the build cluster at the ALICE experiment. After giving an overview of the advantages that Nomad gives us in managing our computing workload, and our reasons for switching away from the Mesos software stack, we will present concrete examples of improvements in monitoring and automatic configuration of web services that we are already benefiting from. Finally, we will discuss where we see opportunities for future work in integrating the ALICE build infrastructure more deeply with Nomad, in order to take advantage of its larger feature set compared to Mesos.
The O2 software framework and GPU usage in ALICE online and offline reconstruction in Run 3
ALICE has upgraded many of its detectors for LHC Run 3 to operate in continuous readout mode, recording Pb–Pb collisions at a 50 kHz interaction rate without a trigger. This results in the need to process data in real time at rates 100 times higher than during Run 2. In order to tackle such a challenge we introduced O2, a new computing system and the associated infrastructure. Designed and implemented during the LHC long shutdown 2, O2 is now in production, taking care of all the data processing needs of the experiment. O2 is designed around the message passing paradigm, enabling resilient, parallel data processing for both the synchronous (to LHC beam) and asynchronous data taking and processing phases. The main purpose of the synchronous online reconstruction is detector calibration and raw data compression. This synchronous processing is dominated by the TPC detector, which produces by far the largest data volume, and TPC reconstruction runs fully on GPUs. When there is no beam in the LHC, the powerful GPU-equipped online computing farm of ALICE is used for the asynchronous reconstruction, which creates the final reconstructed output for analysis from the compressed raw data. Since the majority of the compute performance of the online farm is in the GPUs, and since the asynchronous processing is not dominated by the TPC in the way the synchronous processing is, there is an ongoing effort to offload a significant amount of compute load from other detectors to the GPU as well.
RootInteractive tool for multidimensional statistical analysis, machine learning and analytical model validation
The ALICE experiment [1] at CERN’s LHC is specifically designed for investigating heavy ion collisions. The upgraded ALICE accommodates a tenfold increase in Pb–Pb luminosity and a two-order-of-magnitude surge in minimum bias events. To address the challenges of high detector occupancy and event pile-ups, advanced multidimensional data analysis techniques, including machine learning (ML), are indispensable. Despite ML’s popularity, the complexity of its models presents interpretation challenges, and oversimplification in analysis often leads to inaccuracies.
Our objective was to develop RootInteractive, a tool for multidimensional statistical analysis. This tool simplifies data analysis across dimensions, visualizes functions with uncertainties, and validates assumptions and approximations. In RootInteractive, it is crucial to easily define the functional composition of analytical parametric and non-parametric functions, exploit symmetries, and define multidimensional "invariant" functions and corresponding alarms.
RootInteractive [2] adopts a declarative programming paradigm, ensuring user-friendliness for experts, students, and educators. It facilitates interactive visualization, n-dimensional histogramming/projection, and information extraction on both the Python/C++ server and the JavaScript client. The tool supports client/server applications in Jupyter or standalone client-side applications. Through data compression, datasets with O(10^7) entries and O(25) attributes can be interactively analyzed in a browser using O(0.5-1) GB of memory. Representative downsampling and reweighting/pre-aggregation enable the effective analysis of one year of ALICE data for various purposes.