385 research outputs found
CernVM-FS at Extreme Scales
The CernVM File System (CVMFS) provides the software distribution backbone for High Energy and Nuclear Physics experiments and many other scientific communities in the form of a globally available shared software area. It was designed for the software distribution problem of experiment software for LHC Runs 1 and 2. For LHC Run 3, and even more so for the HL-LHC (Runs 4-6), the complexity of the experiment software stacks and their build pipelines is substantially larger. For instance, software is distributed for several CPU architectures, often in the form of containers that include base operating-system libraries; the number of external packages, such as machine learning libraries, has multiplied; and a shift from C++ to more Python-heavy software stacks results in more and smaller files needing to be distributed. For CVMFS, the new software landscape means an order-of-magnitude increase in several key metrics. This contribution reports on the performance and reliability engineering of the file system client to sustain current and expected future software access load. Concretely, the impact of the newly designed file system cache management is shown, including significant performance improvements for HEP-representative benchmark workloads and an up to 25% performance increase in software build time when the build tools reside on CVMFS. Operational improvements presented include better network failure handling, error reporting, and integration with container runtimes. Finally, a pilot study using zstd as the compression algorithm shows that it could bring significant improvements in remote data access times.
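The contribution does not spell out the cache-replacement scheme, and the actual CVMFS client is written in C++; purely as an illustration of the kind of cache management involved, here is a minimal least-recently-used (LRU) eviction sketch in Python (the class name and file names are invented):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        # Mark as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            # Drop the oldest (least recently used) entry.
            self._entries.popitem(last=False)

cache = LRUCache(capacity=2)
cache.put("libA.so", b"...")
cache.put("libB.so", b"...")
cache.get("libA.so")          # touch libA so libB becomes the eviction victim
cache.put("libC.so", b"...")  # capacity exceeded: libB.so is evicted
```

A policy of this shape keeps hot software files resident while cold ones age out; a real client cache additionally has to deal with size quotas and concurrent access.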
A cluster randomised feasibility study of an adolescent incentive intervention to increase uptake of HPV vaccination.
BACKGROUND: Uptake of human papillomavirus (HPV) vaccination is suboptimal among some groups. We aimed to determine the feasibility of undertaking a cluster randomised controlled trial (RCT) of incentives to improve HPV vaccination uptake by increasing consent form return. METHODS: An equal-allocation, two-arm cluster RCT design was used. We invited 60 London schools to participate. Those agreeing were randomised to either a standard invitation or incentive intervention arm, in which Year 8 girls had the chance to win a £50 shopping voucher if they returned a vaccination consent form, regardless of whether consent was provided. We collected data on school and parent participation rates and questionnaire response rates. Analyses were descriptive. RESULTS: Six schools completed the trial and only 3% of parents opted out. The response rate was 70% for the girls' questionnaire and 17% for the parents'. In the intervention arm, 87% of girls returned a consent form compared with 67% in the standard invitation arm. The proportion of girls whose parents gave consent for vaccination was higher in the intervention arm (76%) than the standard invitation arm (61%). CONCLUSIONS: An RCT of an incentive intervention is feasible. The intervention may improve vaccination uptake but a fully powered RCT is needed. British Journal of Cancer advance online publication: 22 August 2017; doi:10.1038/bjc.2017.284 www.bjcancer.com
Multidifferential study of identified charged hadron distributions in Z-tagged jets in proton-proton collisions at 13 TeV
Jet fragmentation functions are measured for the first time in proton-proton collisions for charged pions, kaons, and protons within jets recoiling against a Z boson. The charged-hadron distributions are studied longitudinally and transversely to the jet direction for jets with transverse momentum above 20 GeV and within the pseudorapidity range of the LHCb acceptance. The data sample was collected with the LHCb experiment at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 1.64 fb^-1. Triple differential distributions as a function of the hadron longitudinal momentum fraction, hadron transverse momentum, and jet transverse momentum are also measured for the first time. This helps constrain transverse-momentum-dependent fragmentation functions. Differences in the shapes and magnitudes of the measured distributions for the different hadron species provide insights into the hadronization process for jets predominantly initiated by light quarks. Comment: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-013.html (LHCb public pages)
Study of the decay
The decay is studied in proton-proton collisions at a center-of-mass energy of TeV using data corresponding to an integrated luminosity of 5 collected by the LHCb experiment. In the system, the state observed at the BaBar and Belle experiments is resolved into two narrower states, and , whose masses and widths are measured to be where the first uncertainties are statistical and the second systematic. The results are consistent with a previous LHCb measurement using a prompt sample. Evidence of a new state is found with a local significance of , whose mass and width are measured to be and , respectively. In addition, evidence of a new decay mode is found with a significance of . The relative branching fraction of with respect to the decay is measured to be , where the first uncertainty is statistical, the second systematic, and the third originates from the branching fractions of charm hadron decays. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-028.html (LHCb public pages)
Measurement of the ratios of branching fractions and
The ratios of branching fractions and are measured, assuming isospin symmetry, using a sample of proton-proton collision data corresponding to 3.0 fb^-1 of integrated luminosity recorded by the LHCb experiment during 2011 and 2012. The tau lepton is identified in the decay mode . The measured values are and , where the first uncertainty is statistical and the second is systematic. The correlation between these measurements is . Results are consistent with the current average of these quantities and are at a combined 1.9 standard deviations from the predictions based on lepton flavor universality in the Standard Model. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-039.html (LHCb public pages)
Compute, Storage and Throughput Trade-offs for High-Energy Physics Data Acquisition
Nowadays, the large majority of research insights are gained through computer-aided analyses. Before any analysis, data needs to be acquired and prepared.
Depending on the source of the data, acquisition can be a complex process handled by so-called Data Acquisition Systems (DAQs). Real-time DAQs pose unique challenges, either in latency or in throughput. At the European Organization for Nuclear Research (CERN), where High Energy Physics (HEP) experiments collide particles, real-time DAQs are deployed to filter down the vast amount of data (for the LHCb experiment: up to 4 TB/s). When working with large amounts of data, compression can ease limitations in storage capacity and transfer rates. This work surveys existing compression techniques and evaluates them for real-time HEP DAQs.
The first part characterizes popular general-purpose compression algorithms and their performance on the ARM aarch64, IBM ppc64le, and Intel x86_64 CPU architectures. Their performance is found to be independent of the underlying CPU architecture, making each architecture a viable choice. Scaling and robustness depend on the number of simultaneous multithreading (SMT) threads available: high SMT counts scale better but are less robust in performance. When it comes to "green" computing, ARM outperforms IBM by a factor of 2.8 and Intel by a factor of 1.3.
The second part designs a co-scheduling policy that improves the integration of compression devices. This policy allows efficient and fair distribution of performance between (independent) host and device workloads. It needs only two metrics, power consumption and memory bandwidth, and requires no code changes to the host workload. Solely with NUMA binding and either polling or interrupts for communication with the device, performance increases for resource-unsaturated host workloads by a factor of 1.5-4.0 for the device and 1.8-2.3 for the host. For resource-saturated host workloads, it increases by a factor of 1.8-1.9 for the device but decreases by 0.1-0.4 for the host.
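The abstract does not give the policy's decision rule; as a loose sketch of what a two-metric policy could look like (the function name, the 90% headroom cut-off, and all thresholds are invented for illustration, not taken from the thesis):

```python
def choose_comm_mode(host_mem_bw_gbs, mem_bw_limit_gbs, power_w, power_budget_w):
    """Pick how the host communicates with the compression device.

    Polling gives the device lower latency but burns a CPU core;
    interrupts free that core for a resource-saturated host workload.
    The 90% headroom thresholds are illustrative only.
    """
    saturated = (host_mem_bw_gbs >= 0.9 * mem_bw_limit_gbs
                 or power_w >= 0.9 * power_budget_w)
    return "interrupts" if saturated else "polling"

# An unsaturated host leaves headroom, so the device-side worker may poll.
mode = choose_comm_mode(host_mem_bw_gbs=40, mem_bw_limit_gbs=100,
                        power_w=150, power_budget_w=300)
```

The appeal of such a rule is that both inputs can be read from hardware counters at runtime, which is what lets the policy work without any changes to the host workload's code.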
The third part evaluates two compression techniques that exploit domain knowledge: Huffman coding and lossy autoencoders. Huffman coding on the original data compresses 40-260% better than any tested general-purpose algorithm. Huffman coding on delta-encoded data, however, performs poorly for HEP data.
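To make the Huffman-plus-delta idea concrete, here is a self-contained sketch. The drifting-signal sample is invented and deliberately favors delta encoding, so it shows the mechanism rather than reproducing the poor HEP-data result reported above:

```python
import heapq
from collections import Counter

def huffman_code_lengths(data):
    """Return {symbol: code length in bits} from a Huffman tree."""
    freq = Counter(data)
    if len(freq) == 1:  # degenerate case: a single symbol still needs 1 bit
        return {next(iter(freq)): 1}
    # Heap items: (frequency, tiebreak, {symbol: depth-so-far}).
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        fa, _, a = heapq.heappop(heap)
        fb, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (fa + fb, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def compressed_bits(data):
    """Total size in bits of data under its own Huffman code."""
    lengths = huffman_code_lengths(data)
    freq = Counter(data)
    return sum(freq[s] * lengths[s] for s in freq)

def delta_encode(samples):
    """Store each sample as the difference to its predecessor."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

# A slowly drifting signal: 32 distinct raw values, but almost all deltas
# are 0 or 1, so the delta distribution is far more skewed.
samples = [1000 + i // 8 for i in range(256)]
raw_bits = compressed_bits(samples)
delta_bits = compressed_bits(delta_encode(samples))
```

Whether delta encoding helps depends entirely on how strongly it concentrates the symbol distribution, which is presumably why it backfires on the HEP data studied in the thesis.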
Autoencoders are a popular machine learning technique. Two data representations, including one-hot encoding, and many hyperparameters are tested. However, all configurations turn out to be too lossy; further advances in neural networks with large layers are needed before autoencoders become competitive.
The last part performs a cost-benefit analysis of the previously presented compression techniques, based on power savings and capital expenses. Applied to the real-time LHCb DAQ, it concludes that only compression accelerators are an economically viable choice. Huffman coding on absolute values achieves a higher compression ratio than any general-purpose solution but is too slow. More research would be needed to find a better-fitting compression technique based on domain knowledge.
While the context of this work is real-time DAQs in the HEP community, with their specific requirements and limitations, we believe the results are generic enough to apply to the majority of environments and data characteristics.
Adding Cross-Platform Support to a High-Throughput Software Stack and Exploration of Vectorization Libraries
This master thesis was written at the LHCb experiment at CERN. It is part of the initiative to improve software in view of the upcoming upgrade in 2021, which will significantly increase the amount of acquired data. The thesis consists of two parts. The first part explores different vectorization libraries and their usefulness for the LHCb collaboration. The second part adds cross-platform support to the LHCb software stack. Here, the LHCb stack is successfully ported to ARM (aarch64) and its performance is analyzed; at the end of the thesis, the port to PowerPC (ppc64le) still awaits performance analysis. The main goal of porting the stack is a cost-performance evaluation of the different platforms, to find the most cost-efficient hardware for the new server farm for the upgrade. For this, selected vectorization libraries are extended to support the PowerPC and ARM platforms. Although the same compiler is used, platform-specific changes to the compilation flags are required. In the evaluation of the ARM port, the cost-performance analysis favors the tested Intel machine. Future steps are to analyze the performance of the PowerPC port and to improve cross-platform support in selected vectorization libraries. The long-term goal is adding cross-platform support to the LHCb stack.
Characterization of data compression across CPU platforms and accelerators
The ever increasing amount of generated data makes it more and more beneficial to utilize compression to trade computations for data movement and reduced storage requirements. Lately, dedicated accelerators have been introduced to offload compression tasks from the main processor. However, research is lacking when it comes to the system costs of incorporating compression. This is especially true for the influence of the CPU platform and accelerators on the compression. This work will show that for general-purpose lossless compression algorithms the following can be recommended: (1) snappy for high throughput, but low compression ratio; (2) zstandard level 2 for moderate throughput and compression ratio; (3) xz level 5 for low throughput, but high compression ratio. It will also show that the selected platforms (ARM, IBM or Intel) have no influence on the algorithm's performance. Furthermore, it will show that the accelerator's zlib implementation achieves a compression ratio comparable to zlib level 2 on a CPU, while having up to 17× the throughput and utilizing over 80% less CPU resources. This suggests that the overhead of offloading compression is limited but present. Overall, this work will allow system designers to identify deployment opportunities for compression while considering integration constraints.
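Since snappy and zstandard are not in the Python standard library, a ratio-versus-throughput measurement in the spirit of this characterization can be sketched with stdlib codecs, using zlib level 2 (the moderate option above) and xz preset 5 as stand-ins; the payload is synthetic and highly repetitive, so the absolute numbers mean little:

```python
import time
import zlib
import lzma

def measure(name, compress, data):
    """Compress once and report (name, compression ratio, MB/s)."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(data) / len(out), len(data) / elapsed / 1e6

# Synthetic, highly repetitive payload; real DAQ data compresses far less.
data = b"high-energy physics event record " * 20000

results = [
    measure("zlib level 2", lambda d: zlib.compress(d, 2), data),
    measure("xz preset 5", lambda d: lzma.compress(d, preset=5), data),
]
for name, ratio, mbps in results:
    print(f"{name}: ratio {ratio:.0f}:1 at {mbps:.0f} MB/s")
```

A serious characterization would also pin CPU frequency, repeat each run, and sweep block sizes, but the ratio/throughput trade-off the recommendations are built on already shows up in a sketch this small.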
Porting the LHCb Stack from x86 (Intel) to aarch64 (ARM) and ppc64le (PowerPC)
LHCb is undergoing major changes in its data selection and processing chain for the upcoming LHC Run 3 starting in 2021. With this in sight, several initiatives have been launched to optimise the software stack. This contribution discusses porting the LHCb stack from the x86_64 architecture to both aarch64 and ppc64le, with the goal of evaluating the performance and cost of the computing infrastructure for the High Level Trigger (HLT). This requires porting a stack with more than five million lines of code and finding working versions of the external libraries provided by LCG. Across all software packages, the biggest challenge is the growing use of vectorisation, as many vectorisation libraries are specialised for the x86 architecture and lack support for other architectures. In spite of these challenges, we have successfully ported the LHCb High Level Trigger code to aarch64 and ppc64le. This contribution discusses the status of and plans for the porting of the software, as well as the LHCb approach to tackling code vectorisation in a platform-independent way.