A Roadmap for HEP Software and Computing R&D for the 2020s
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.
COVID-19 symptoms at hospital admission vary with age and sex: results from the ISARIC prospective multinational observational study
Background:
The ISARIC prospective multinational observational study is the largest cohort of hospitalized patients with COVID-19. We present relationships of age, sex, and nationality to presenting symptoms.
Methods:
International, prospective observational study of 60 109 hospitalized symptomatic patients with laboratory-confirmed COVID-19 recruited from 43 countries between 30 January and 3 August 2020. Logistic regression was performed to evaluate relationships of age and sex to published COVID-19 case definitions and the most commonly reported symptoms.
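The kind of logistic regression described in these methods can be sketched as follows. This is a toy illustration fit by plain gradient descent; the variable names, effect sizes, and simulated data are hypothetical, not values from the ISARIC dataset.

```python
import numpy as np

# Toy logistic regression of a symptom on age and sex, fit by gradient descent.
# All numbers below are illustrative assumptions, not ISARIC results.
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(1, 95, size=n)
sex = rng.integers(0, 2, size=n).astype(float)      # 0 = female, 1 = male

# Simulated outcome loosely mimicking the reported pattern: a typical symptom
# that peaks in middle age and is slightly more common in men.
true_logit = -0.5 + 0.06 * age - 0.0006 * age**2 + 0.3 * sex
has_symptom = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Design matrix: intercept, scaled age, scaled age^2, sex
X = np.column_stack([np.ones(n), age / 50, (age / 50) ** 2, sex])
beta = np.zeros(X.shape[1])
for _ in range(2000):                               # plain gradient descent
    p = 1 / (1 + np.exp(-X @ beta))
    beta -= 0.1 * X.T @ (p - has_symptom) / n

odds_ratio_sex = np.exp(beta[3])                    # odds of symptom, male vs female
print(f"odds ratio (male vs female): {odds_ratio_sex:.2f}")
```

An odds ratio above 1 for the sex term would correspond to the symptom being more commonly reported in men, which is the direction the study reports for the typical symptoms.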
Results:
‘Typical’ symptoms of fever (69%), cough (68%) and shortness of breath (66%) were the most commonly reported. 92% of patients experienced at least one of these. Prevalence of typical symptoms was greatest in 30- to 60-year-olds (respectively 80, 79, 69%; at least one 95%). They were reported less frequently in children (≤ 18 years: 69, 48, 23; 85%), older adults (≥ 70 years: 61, 62, 65; 90%), and women (66, 66, 64; 90%; vs. men 71, 70, 67; 93%, each P < 0.001). The most common atypical presentations under 60 years of age were nausea and vomiting and abdominal pain, and over 60 years was confusion. Regression models showed significant differences in symptoms with sex, age and country.
Interpretation:
This international collaboration has allowed us to report reliable symptom data from the largest cohort of patients admitted to hospital with COVID-19. Adults over 60 and children admitted to hospital with COVID-19 are less likely to present with typical symptoms. Nausea and vomiting are common atypical presentations under 30 years. Confusion is a frequent atypical presentation of COVID-19 in adults over 60 years. Women are less likely to experience typical symptoms than men.
Global burden of 288 causes of death and life expectancy decomposition in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
BACKGROUND Regular, detailed reporting on population health by underlying cause of death is fundamental for public health decision making. Cause-specific estimates of mortality and the subsequent effects on life expectancy worldwide are valuable metrics to gauge progress in reducing mortality rates. These estimates are particularly important following large-scale mortality spikes, such as the COVID-19 pandemic. When systematically analysed, mortality rates and life expectancy allow comparisons of the consequences of causes of death globally and over time, providing a nuanced understanding of the effect of these causes on global populations. METHODS The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 cause-of-death analysis estimated mortality and years of life lost (YLLs) from 288 causes of death by age, sex, location, and year in 204 countries and territories and 811 subnational locations for each year from 1990 until 2021. The analysis used 56 604 data sources, including data from vital registration and verbal autopsy as well as surveys, censuses, surveillance systems, and cancer registries, among others. As with previous GBD rounds, cause-specific death rates for most causes were estimated using the Cause of Death Ensemble model (a modelling tool developed for GBD to assess the out-of-sample predictive validity of different statistical models and covariate permutations and combine those results to produce cause-specific mortality estimates), with alternative strategies adapted to model causes with insufficient data, substantial changes in reporting over the study period, or unusual epidemiology. YLLs were computed as the product of the number of deaths for each cause-age-sex-location-year and the standard life expectancy at each age. As part of the modelling process, uncertainty intervals (UIs) were generated using the 2·5th and 97·5th percentiles from a 1000-draw distribution for each metric.
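The YLL and uncertainty-interval computation described above can be sketched in a few lines. The life-expectancy table and death counts here are made-up placeholders, not GBD estimates; only the structure of the calculation follows the methods text.

```python
import numpy as np

# Toy sketch of the YLL and uncertainty-interval computation described above.
# The reference table and death counts are illustrative, not GBD data.
rng = np.random.default_rng(42)

# Hypothetical standard life expectancy at a few ages of death
standard_life_expectancy = {0: 88.9, 25: 64.6, 50: 40.6, 75: 18.4}

# 1000 draws of death counts per age group (stand-in for the model posterior)
deaths_draws = {age: rng.poisson(lam=100, size=1000)
                for age in standard_life_expectancy}

# YLLs = deaths x standard life expectancy at the age of death, per draw
yll_draws = sum(deaths_draws[a] * le for a, le in standard_life_expectancy.items())

point_estimate = np.mean(yll_draws)
lower, upper = np.percentile(yll_draws, [2.5, 97.5])   # 95% uncertainty interval
print(f"YLLs: {point_estimate:.0f} (95% UI {lower:.0f}-{upper:.0f})")
```

Carrying the full draw-level distribution through the calculation, rather than only point estimates, is what lets the interval reflect uncertainty from every modelling stage.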
We decomposed life expectancy by cause of death, location, and year to show cause-specific effects on life expectancy from 1990 to 2021. We also used the coefficient of variation and the fraction of population affected by 90% of deaths to highlight concentrations of mortality. Findings are reported in counts and age-standardised rates. Methodological improvements for cause-of-death estimates in GBD 2021 include the expansion of the under-5-years age group to include four new age groups, enhanced methods to account for stochastic variation of sparse data, and the inclusion of COVID-19 and other pandemic-related mortality (which includes excess mortality associated with the pandemic, excluding COVID-19, lower respiratory infections, measles, malaria, and pertussis). For this analysis, 199 new country-years of vital registration cause-of-death data, 5 country-years of surveillance data, 21 country-years of verbal autopsy data, and 94 country-years of other data types were added to those used in previous GBD rounds. FINDINGS The leading causes of age-standardised deaths globally were the same in 2019 as they were in 1990; in descending order, these were ischaemic heart disease, stroke, chronic obstructive pulmonary disease, and lower respiratory infections. In 2021, however, COVID-19 replaced stroke as the second-leading age-standardised cause of death, with 94·0 deaths (95% UI 89·2-100·0) per 100 000 population. The COVID-19 pandemic shifted the rankings of the leading five causes, lowering stroke to the third-leading and chronic obstructive pulmonary disease to the fourth-leading position. In 2021, the highest age-standardised death rates from COVID-19 occurred in sub-Saharan Africa (271·0 deaths [250·1-290·7] per 100 000 population) and Latin America and the Caribbean (195·4 deaths [182·1-211·4] per 100 000 population).
The lowest age-standardised death rates from COVID-19 were in the high-income super-region (48·1 deaths [47·4-48·8] per 100 000 population) and southeast Asia, east Asia, and Oceania (23·2 deaths [16·3-37·2] per 100 000 population). Globally, life expectancy steadily improved between 1990 and 2019 for 18 of the 22 investigated causes. Decomposition of global and regional life expectancy showed the positive effect that reductions in deaths from enteric infections, lower respiratory infections, stroke, and neonatal deaths, among others, have had on survival over the study period. However, a net reduction of 1·6 years occurred in global life expectancy between 2019 and 2021, primarily due to increased death rates from COVID-19 and other pandemic-related mortality. Life expectancy was highly variable between super-regions over the study period, with southeast Asia, east Asia, and Oceania gaining 8·3 years (6·7-9·9) overall, while having the smallest reduction in life expectancy due to COVID-19 (0·4 years). The largest reduction in life expectancy due to COVID-19 occurred in Latin America and the Caribbean (3·6 years). Additionally, 53 of the 288 causes of death were highly concentrated in locations with less than 50% of the global population as of 2021, and these causes of death have become progressively more concentrated since 1990, when only 44 causes showed this pattern. The concentration phenomenon is discussed heuristically with respect to enteric and lower respiratory infections, malaria, HIV/AIDS, neonatal disorders, tuberculosis, and measles. INTERPRETATION Long-standing gains in life expectancy and reductions in many of the leading causes of death have been disrupted by the COVID-19 pandemic, the adverse effects of which were spread unevenly among populations. Despite the pandemic, there has been continued progress in combatting several notable causes of death, leading to improved global life expectancy over the study period.
Each of the seven GBD super-regions showed an overall improvement from 1990 to 2021, obscuring the negative effect in the years of the pandemic. Additionally, our findings regarding regional variation in causes of death driving increases in life expectancy hold clear policy utility. Analyses of shifting mortality trends reveal that several causes, once widespread globally, are now increasingly concentrated geographically. These changes in mortality concentration, alongside further investigation of changing risks, interventions, and relevant policy, present an important opportunity to deepen our understanding of mortality-reduction strategies. Examining patterns in mortality concentration might reveal areas where successful public health interventions have been implemented. Translating these successes to locations where certain causes of death remain entrenched can inform policies that work to improve life expectancy for people everywhere. FUNDING Bill & Melinda Gates Foundation
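The concentration metric mentioned in the methods (the fraction of population affected by 90% of deaths) can be sketched with synthetic data. The construction below is an assumed reading of that metric, and all numbers are made up: sort locations by death count, accumulate until 90% of deaths are covered, and report the share of global population those locations hold.

```python
import numpy as np

# Toy illustration of a mortality-concentration metric: the share of global
# population living in the locations that account for 90% of deaths from a
# given cause. Data are synthetic; the construction is an assumption.
rng = np.random.default_rng(1)
population = rng.integers(1_000_000, 100_000_000, size=204).astype(float)
death_rate = rng.lognormal(mean=0.0, sigma=1.5, size=204)   # skewed per-capita rates
deaths = population * death_rate

# Sort locations from highest to lowest death count
order = np.argsort(deaths)[::-1]
cum_deaths = np.cumsum(deaths[order]) / deaths.sum()
k = np.searchsorted(cum_deaths, 0.90) + 1                   # locations covering 90% of deaths

fraction_of_population = population[order][:k].sum() / population.sum()
print(f"{k} locations hold {fraction_of_population:.0%} of the population")
```

A small fraction here would indicate a highly concentrated cause of death, the pattern the study reports for 53 of the 288 causes in 2021.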
CernVM Workshop 2019
In preparation for the post-LHC era, the Future Circular Collider (FCC) Collaboration is undertaking design studies for multiple accelerator projects, with emphasis on proton-proton and electron-positron high-energy frontier machines. From the beginning of the collaboration, the development of a software stack with common and interchangeable packages has played an important role in simulation, reconstruction, and analysis studies. As with the existing LHC experiments, these packages need to be built and deployed for different platforms and compilers. For these tasks FCC relies on Spack, a package manager recently adopted by other experiments in the High-Energy Physics (HEP) community. Despite its warm reception, the integration of Spack with CVMFS for software distribution, while already possible, exposes some limitations. This talk provides an overview of these difficulties in the context of the FCC software stack.
Parallelization and optimization of a High Energy Physics analysis with ROOT’s RDataFrame and Spark
After a remarkable era full of great discoveries, particle physics has an ambitious and broad experimental programme aiming to expand the limits of our knowledge about the nature of our universe. The roadmap for the coming decades is full of big challenges, with the Large Hadron Collider (LHC) at the forefront of scientific research. The discovery of the Higgs boson in 2012 is a major milestone in the history of physics that has inspired many new theories based on its presence. Located at CERN, the European Organization for Nuclear Research, the LHC will remain the most powerful accelerator in the world for years to come. In order to extend its discovery potential, a major hardware upgrade will take place in the 2020s to increase the number of collisions produced by a factor of five beyond its design value. The upgraded version of the LHC, called the High-Luminosity LHC (HL-LHC), is expected to produce some 30 times more data than the LHC has currently produced. As the total amount of LHC data already collected is close to an exabyte (10^18 bytes), the foreseen evolution in hardware technology will not be enough to face the challenge of processing those increasing volumes of data. Therefore, software will have to cover that gap: there is a need for tools to easily express physics analyses in a high-level way, as well as to automatically parallelize those analyses on new parallel and distributed infrastructures. The High Energy Physics (HEP) community has developed specialized solutions for processing experiment data over decades. However, HEP data sizes are becoming more common in other fields. Recent breakthroughs in Big Data and Cloud platforms raise the question of whether such technologies could be applied to the domain of physics data analysis.
This thesis is carried out in the context of a collaboration between different CERN departments with the aim of providing web-based interactive services as the entry point for scientists to cutting-edge distributed data processing frameworks such as Spark. In that context, this thesis aims to exploit the aforementioned services to run a real analysis of the CERN TOTEM experiment on 4.7 TB of data. In addition, this thesis explores the benefits of a new high-level programming model for physics analysis, called RDataFrame, by translating the original code of the TOTEM analysis to use RDataFrame. The following sections describe for the first time the detailed process of translating a data analysis of this magnitude to the programming model offered by RDataFrame. Moreover, we compare the performance between both codes and provide results gathered from local and distributed analyses. Results are promising and show how the processing time of the dataset can be reduced by multiple orders of magnitude with the new analysis model.
Modern Software Stack Building for HEP
High-Energy Physics has evolved a rich set of software packages that need to work harmoniously to carry out the key software tasks needed by experiments. The problem of consistently building and deploying these packages as a coherent software stack is one that is shared across the HEP community. To that end, the HEP Software Foundation Packaging Working Group has worked to identify common solutions that can be used across experiments, with an emphasis on consistent, reproducible builds and easy deployment into CernVM-FS or containers via CI systems. We based our approach on well-identified use cases and requirements from many experiments. In this paper we summarise the work of the group in the last year and how we have explored various approaches based on package managers from industry and the scientific computing community. We give details about a solution based on the Spack package manager which has been used to build the software required by the SuperNEMO and FCC experiments and trialled for a multi-experiment software stack, Key4hep. We discuss changes that needed to be made to Spack to satisfy all our requirements, and we show how support for a build environment for software developers is provided.
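A Spack-based stack of the kind described above is typically declared as a Spack environment. The fragment below is an illustrative sketch only; the package names, versions, and paths are examples, not the actual Key4hep or FCC configuration.

```yaml
# Illustrative spack.yaml for a small HEP-style stack.
# Specs and paths are hypothetical examples.
spack:
  specs:
    - root +python            # example packages a stack might pin
    - geant4
    - dd4hep
  concretizer:
    unify: true               # resolve one consistent set of dependencies
  view: /opt/stack/view       # merged prefix usable as a developer environment
```

Declaring the stack this way is what makes builds reproducible: the same environment file concretized with the same Spack state yields the same package graph, which can then be installed into CernVM-FS or a container image by CI.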
Distributed data analysis with ROOT RDataFrame
Widespread distributed processing of big datasets has been around for more than a decade now thanks to Hadoop, but only recently have higher-level abstractions been proposed for programmers to easily operate on those datasets, e.g. Spark. ROOT has joined that trend with its RDataFrame tool for declarative analysis, which currently supports local multi-threaded parallelisation. However, RDataFrame's programming model is general enough to accommodate multiple implementations or backends: users could write their code once and execute it as-is locally or distributedly, just by selecting the corresponding backend.
This abstract introduces PyRDF, a new Python library developed on top of RDataFrame to seamlessly switch from local to distributed environments with no changes in the application code. In addition, PyRDF has been integrated with a service for web-based analysis, SWAN, where users can dynamically plug in new resources, as well as write, execute, monitor and debug distributed applications via an intuitive interface.
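The declarative model that makes this backend-swapping possible can be illustrated with a self-contained toy: transformations are recorded lazily and only executed when a result is requested, so the same recorded graph could in principle be handed to a local or a distributed executor. This is a mock-up written for illustration, not ROOT's actual RDataFrame API.

```python
# Toy sketch of a declarative, lazily evaluated analysis chain in the spirit
# of RDataFrame. Illustrative mock-up only; not ROOT's actual API.
class ToyDataFrame:
    def __init__(self, rows, ops=None):
        self._rows = rows
        self._ops = ops or []                     # operations recorded, not yet run

    def Filter(self, predicate):
        return ToyDataFrame(self._rows, self._ops + [("filter", predicate)])

    def Define(self, name, expr):
        return ToyDataFrame(self._rows, self._ops + [("define", name, expr)])

    def Count(self):
        # Only here does the recorded chain actually execute (the "event loop").
        rows = [dict(r) for r in self._rows]
        for op in self._ops:
            if op[0] == "filter":
                rows = [r for r in rows if op[1](r)]
            else:
                for r in rows:
                    r[op[1]] = op[2](r)
        return len(rows)

data = [{"pt": p} for p in (10, 25, 40, 55)]
n = (ToyDataFrame(data)
     .Define("pt2", lambda r: r["pt"] ** 2)
     .Filter(lambda r: r["pt2"] > 800)
     .Count())
print(n)   # -> 2 (only pt = 40 and pt = 55 pass)
```

Because `Count()` is the only point where work happens, a library like PyRDF can intercept the recorded operations and ship them to Spark workers instead of running them locally, with no change to the user's code.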
Numerical study of the steering system for a dune buggy
Computer-aided design of the steering system of motor vehicles