A new development cycle of the Statistical Toolkit
The Statistical Toolkit is an open source system specialized in the
statistical comparison of distributions. It addresses requirements common to
different experimental domains, such as simulation validation (e.g. comparison
of experimental and simulated distributions), regression testing in the course
of the software development process, and detector performance monitoring.
Various sets of statistical tests have been added to the existing collection to
deal with the one sample problem (i.e. the comparison of a data distribution to
a function, including tests for normality, categorical analysis and the
estimate of randomness). Improved algorithms and software design contribute to
the robustness of the results. A simple user layer dealing with primitive data
types facilitates the use of the toolkit both in standalone analyses and in
large scale experiments.Comment: To be published in the Proc. of CHEP (Computing in High Energy
Physics) 201
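As a rough illustration of the one-sample problem described above, the following Python sketch performs the same kind of comparison using SciPy as a stand-in; it does not show the Statistical Toolkit's own API, and the data are purely synthetic.

# A minimal sketch, assuming SciPy as a stand-in for the toolkit's one-sample tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=500)  # stand-in for a measured or simulated distribution

# One-sample problem: compare the data distribution to a reference function,
# here the standard normal CDF, with a Kolmogorov-Smirnov test.
ks_stat, ks_p = stats.kstest(data, "norm")

# A dedicated normality test (Shapiro-Wilk) on the same sample.
sw_stat, sw_p = stats.shapiro(data)

print(f"KS: D={ks_stat:.3f}, p={ks_p:.3f}; Shapiro-Wilk: W={sw_stat:.3f}, p={sw_p:.3f}")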
Distributed computing in the LHC era
A large, worldwide distributed scientific community is intensively running physics analyses on the first data collected at the LHC. In order to prepare for this unprecedented computing challenge, the four LHC experiments have developed distributed computing models capable of serving, processing and archiving the large number of events produced by data taking, amounting to about 15 petabytes per year. The experiments' workflows for event reconstruction from raw data, production of simulated events and physics analysis on skimmed data generate hundreds of thousands of jobs per day, running on a complex distributed computing fabric. All this is possible thanks to reliable Grid services, which have been developed, deployed at the needed scale and thoroughly tested by the WLCG Collaboration during the last ten years. In order to provide a concrete example, this paper concentrates on the CMS computing model and on the CMS experience with the first data at the LHC.
The computing of the LHC experiments
The LHC experiments have thousands of collaborators distributed worldwide who expect to run their physics analyses on the collected data. Each of the four experiments will run hundreds of thousands of jobs per day, including event reconstruction from raw data, analysis on skimmed data, and production of simulated events. At the same time, tens of petabytes of data will have to be readily available on a complex distributed computing fabric for a period of at least ten years. These challenging goals have prompted the development and deployment of reliable Grid services, which have been thoroughly tested and brought to the needed scale over the last years. This paper concentrates on the CMS computing needs for data taking at the LHC and highlights the INFN-Grid contribution to the effort.
An electromagnetic shashlik calorimeter with longitudinal segmentation
A novel technique for longitudinal segmentation of shashlik calorimeters has
been tested in the CERN West Area beam facility. A 25-tower, very fine sampling
e.m. calorimeter has been built with vacuum photodiodes inserted in the first 8
radiation lengths to sample the initial development of the shower. Results
concerning energy resolution, impact point reconstruction and electron/pion
separation are reported.
Comment: 13 pages, 12 figures
Integrated Depths for Partially Observed Functional Data
Partially observed functional data are frequently encountered in applications and have attracted increasing interest in the literature. We address the problem of measuring the centrality of a datum in a partially observed functional sample. We propose an integrated functional depth for partially observed functional data, dealing with the very challenging case where partial observability can occur systematically on any observation of the functional dataset. In particular, differently from many techniques for partially observed functional data, we do not require that some functional datum be fully observed, nor do we require the existence of a common domain on which all of the functional data are recorded. Because of this, our proposal can also be used in those frequent situations where reconstruction methods and other techniques for partially observed functional data are inapplicable. By means of simulation studies, we demonstrate the very good performance of the proposed depth on finite samples. Our proposal enables the use of benchmark methods based on depths, originally introduced for fully observed data, in the case of partially observed functional data. This includes the functional boxplot, the outliergram and the depth-versus-depth classifiers. We illustrate our proposal on two case studies: the first concerns a problem of outlier detection in German electricity supply functions, the second a classification problem with data obtained from medical imaging. Supplementary materials for this article are available online.
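To make the idea concrete, the following NumPy sketch integrates a simple pointwise (univariate half-space) depth only over each curve's observed domain; it is an illustration under our own simplifying assumptions, not the depth construction proposed in the paper.

import numpy as np

def integrated_depth(curves):
    # curves: (n_curves, n_grid) array; NaN marks grid points outside a curve's observation domain.
    n_curves, n_grid = curves.shape
    depths = np.full(n_curves, np.nan)
    for i in range(n_curves):
        observed = np.where(~np.isnan(curves[i]))[0]
        point_depths = []
        for t in observed:
            values = curves[:, t]
            values = values[~np.isnan(values)]  # only the curves observed at this grid point
            below = np.mean(values <= curves[i, t])
            above = np.mean(values >= curves[i, t])
            point_depths.append(min(below, above))  # univariate half-space depth at t
        if point_depths:
            depths[i] = np.mean(point_depths)  # average over the observed domain only
    return depths

# Toy example: 5 curves on a 10-point grid with different missing stretches.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 10))
X[0, :3] = np.nan
X[3, 7:] = np.nan
print(integrated_depth(X))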
Microbial vs thermogenic gas hydrates in the South Falkland Basin: BSR distribution and fluid origin
The South Falkland Basin hosts a working petroleum system, as well as one of the most recently discovered gas hydrate provinces of the South Atlantic Ocean. Using three-dimensional reflection seismic data, a series of bottom-simulating reflections (BSRs) are interpreted within two contrasting settings: (1) the thrust-cored anticlines, developed by the oblique convergence of the Scotia and South American plates, and (2) the foreland basin, formed to the north of this plate boundary. These BSRs are interpreted as the base of the gas hydrate stability zone, and are associated with seismic indicators of underlying free-gas accumulations and overlying hydrate-bearing sediments. In the foreland basin, the BSR is laterally continuous for tens of kilometres, whereas in the fold belt, BSR occurrences are restricted to limited portions of the thrust-cored anticline crests. These observations, calibrated with sedimentological analyses and gas geochemistry, argue that the gas source for the gas hydrates within the thrust-cored anticlines is unrelated to in-situ microbial generation of methane, but is instead associated with the vertical seepage of thermogenic fluids from the deeper cores of the anticlines. In contrast, the nature of the sediments in the foreland basin appears more favourable for the generation of shallow microbial methane. This study highlights that, in specific tectonic and depositional environments, the character of the BSR observed on reflection seismic data, even with only limited support from in-situ data, can be used to predict the most likely source of natural gas hydrate systems.
Mining discharge letters for diagnoses validation and quality assessment
We present two projects where text mining techniques are applied to free text documents written by clinicians. In the first, text mining is applied to discharge letters related to patients with diagnoses of acute myocardial infarction (by ICD9CM coding). The aim is to extract information on diagnoses in order to validate them and to integrate administrative databases. In the second, text mining is applied to discharge letters related to patients who received a diagnosis of heart failure (by ICD9CM coding). The aim is to assess the presence of follow-up instructions from doctors to patients, as an aspect of information continuity and of the continuity and quality of care. Results show that text mining is a promising tool both for diagnosis validation and for quality of care assessment.
The running of the electromagnetic coupling alpha in small-angle Bhabha scattering
A method to determine the running of alpha from a measurement of small-angle
Bhabha scattering is proposed and worked out. The method is suited to high
statistics experiments at e+e- colliders, which are equipped with luminometers
in the appropriate angular region. A new simulation code predicting small-angle
Bhabha scattering is also presented.
Comment: 15 pages, 3 Postscript figures
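In standard notation (a sketch of the usual parametrization, which need not match the paper's exact conventions), the method exploits the dependence of the small-angle Bhabha cross section on the running coupling evaluated at the spacelike momentum transfer t:

\[
  \alpha(t) = \frac{\alpha(0)}{1 - \Delta\alpha(t)},
  \qquad
  \frac{d\sigma}{dt} \simeq \frac{d\sigma_0}{dt}\,
  \left|\frac{\alpha(t)}{\alpha(0)}\right|^{2}\bigl(1 + \varepsilon(t)\bigr),
\]

where \(d\sigma_0/dt\) denotes the cross section computed with the constant coupling \(\alpha(0)\) and \(\varepsilon(t)\) collects radiative corrections; measuring the t-dependence of the cross section then gives access to \(\Delta\alpha(t)\).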