On the Higgs cross section at N3LO+N3LL and its uncertainty
We consider the inclusive production of a Higgs boson in gluon fusion, and we
study the impact of threshold resummation at next-to-next-to-next-to-leading
logarithmic accuracy (N3LL) on the recently computed fixed-order prediction
at next-to-next-to-next-to-leading order (N3LO). We propose a conservative,
yet robust, way of estimating the perturbative uncertainty from missing higher
(fixed or logarithmic) orders. We compare our results with two other
methods of estimating the uncertainty from missing higher orders: the
Cacciari-Houdeau Bayesian approach to theory errors, and the use of algorithms
to accelerate the convergence of the perturbative series. We confirm that the
best convergence happens at N3LO+N3LL, and we conclude that a
reliable estimate of the uncertainty from missing higher orders on the Higgs
cross section at 13 TeV is approximately %.
Comment: 27 pages, 6 figures. Version to be published in JHEP
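As a toy illustration of the Cacciari-Houdeau-style idea of sizing missing higher orders from the last computed term, the sketch below sums a truncated perturbative series; the coupling value and coefficients are made up for illustration and are not the paper's calculation:

```python
# Toy illustration of estimating "missing higher orders" (MHO) from the
# last computed term of a truncated perturbative series
#   sigma ~ sigma_0 * sum_k c_k * alpha_s^k.
# All numbers are illustrative, NOT the actual Higgs coefficients.

alpha_s = 0.118                      # hypothetical fixed coupling value
coeffs = [1.0, 9.0, 35.0, 100.0]     # illustrative c_0..c_3

# Partial sums of the series, order by order
partial = []
total = 0.0
for k, c in enumerate(coeffs):
    total += c * alpha_s ** k
    partial.append(total)

# Cacciari-Houdeau-flavoured estimate: use the last included term as a
# proxy for the size of the first missing order.
last_term = coeffs[-1] * alpha_s ** (len(coeffs) - 1)
rel_mho = last_term / partial[-1]

print(f"highest-order partial sum: {partial[-1]:.4f}")
print(f"relative MHO uncertainty:  {rel_mho:.1%}")
```

The estimate shrinks as more orders are included, mirroring the improved convergence the abstract describes.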
Top Quark Pair Production beyond NNLO
We construct an approximate expression for the total cross section for the
production of a heavy quark-antiquark pair in hadronic collisions at
next-to-next-to-next-to-leading order (N3LO) in α_s. We use a
technique which exploits the analyticity of the Mellin-space cross section, and
the information on its singularity structure coming from large-N (soft gluon,
Sudakov) and small-N (high energy, BFKL) all-order resummations, previously
introduced and used in the case of Higgs production. We validate our method by
comparing to available exact results up to NNLO. We find that N3LO
corrections increase the predicted top pair cross section at the LHC by about
4% over the NNLO prediction.
Comment: 34 pages, 9 figures; final version, to be published in JHEP;
reference added, minor improvements
Not proper ROC curves as new tool for the analysis of differentially expressed genes in microarray experiments
Background: Most microarray experiments are carried out to identify genes whose expression varies in relation to specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values across two or more groups are considered not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods.

Results: We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely, 9 follicular lymphomas and 11 chronic lymphocytic leukemias). Moreover, NBC comprised two sub-classes: 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC and all escaped standard selection procedures based on AUC and t statistics. Moreover, a simple inspection of the shape of these plots allowed us to identify the two subclasses within a single class in 13 cases (81%).

Conclusion: NPRC represent a new and useful tool for the analysis of microarray data.
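The ABCR idea can be sketched numerically. In the toy example below (our own hypothetical data, not the paper's dataset), the case group hides an up-regulated and a down-regulated subclass, so the empirical ROC curve hugs TPR = 0.5: the AUC looks uninformative while ABCR is clearly positive.

```python
# Toy comparison of AUC vs ABCR on a "not proper" ROC curve.

def roc_points(cases, controls):
    """Empirical ROC curve: (FPR, TPR) pairs at every threshold."""
    thresholds = sorted(set(cases) | set(controls), reverse=True)
    pts = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(x >= t for x in cases) / len(cases)
        fpr = sum(x >= t for x in controls) / len(controls)
        pts.append((fpr, tpr))
    pts.append((1.0, 1.0))
    return pts

def auc(pts):
    """Area under the ROC curve (trapezoidal rule)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

def abcr(pts):
    """Area between the ROC curve and the rising diagonal."""
    return sum((x2 - x1) * (abs(y1 - x1) + abs(y2 - x2)) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

# Case group splits into a high- and a low-expression subclass, so its
# mean is close to the controls' mean.
cases = [9.0, 9.5, 10.0, 1.0, 1.5, 2.0]
controls = [5.0, 5.5, 6.0, 6.5, 4.5, 5.2]

pts = roc_points(cases, controls)
print(f"AUC  = {auc(pts):.2f}")   # 0.50: the gene would escape AUC-based selection
print(f"ABCR = {abcr(pts):.2f}")  # 0.25: the hidden structure is detected
```

Here AUC is exactly 0.5 while ABCR equals 0.25, which is the sense in which ABCR generalizes AUC to not proper curves.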
CONFIDERAI: a novel CONFormal Interpretable-by-Design score function for Explainable and Reliable Artificial Intelligence
Everyday life is increasingly influenced by artificial intelligence, and
there is no question that machine learning algorithms must be designed to be
reliable and trustworthy for everyone. Specifically, computer scientists
consider an artificial intelligence system safe and trustworthy if it fulfills
five pillars: explainability, robustness, transparency, fairness, and privacy.
In addition to these five, we propose a sixth fundamental aspect: conformity,
that is, the probabilistic assurance that the system will behave as the machine
learner expects. In this paper, we propose a methodology to link conformal
prediction with explainable machine learning by defining CONFIDERAI, a new
score function for rule-based models that leverages both the predictive
ability of rules and the geometrical position of points within the rules'
boundaries. We also address the problem of defining regions of the feature
space where conformal guarantees are satisfied, by exploiting techniques to
control the number of non-conformal samples in conformal regions based on
support vector data description (SVDD). The overall methodology is tested with
promising results on benchmark and real datasets, such as DNS tunneling
detection and cardiovascular disease prediction.
Comment: 12 pages, 7 figures, 1 algorithm; international journal
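CONFIDERAI itself is not reproduced here, but the split-conformal machinery it builds on can be sketched in a few lines. The "rule" model, the data, and all numbers below are our own toy stand-ins; the actual score additionally weighs rule quality and a point's position inside rule boundaries.

```python
# Generic split-conformal prediction sketch (illustrative only).
import math
import random

random.seed(0)

def rule_model(x):
    # Stand-in predictor; a real rule-based model would go here.
    return 2.0 * x + 1.0

# Synthetic calibration set drawn from y = 2x + 1 + Gaussian noise
calib = [(x, 2.0 * x + 1.0 + random.gauss(0.0, 0.3))
         for x in (random.uniform(0.0, 1.0) for _ in range(200))]

# Nonconformity score: absolute residual on the calibration set
scores = sorted(abs(y - rule_model(x)) for x, y in calib)

# Conformal quantile giving coverage >= 1 - alpha under exchangeability
alpha = 0.1
n = len(scores)
k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
q = scores[k]

# Prediction interval for a new point: model(x) +- q
x_new = 0.5
interval = (rule_model(x_new) - q, rule_model(x_new) + q)
print(f"90% prediction interval at x = 0.5: "
      f"[{interval[0]:.2f}, {interval[1]:.2f}]")
```

The guarantee is marginal: over exchangeable data, the interval covers the true value with probability at least 1 - alpha, which is the "probabilistic assurance" sense of conformity above.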
Gene expression modeling through positive Boolean functions
In the framework of gene expression data analysis, the selection of biologically relevant sets of genes and the discovery of new subclasses of diseases at the bio-molecular level represent two significant problems. Unfortunately, in both cases the correct solution is usually unknown, and the evaluation of the performance of gene selection and clustering methods is difficult and in many cases unfeasible. A natural approach to this complex issue consists in developing an artificial model for the generation of biologically plausible gene expression data, thus allowing one to know in advance the set of relevant genes and the functional classes involved in the problem. In this work we propose a mathematical model, based on positive Boolean functions, for the generation of synthetic gene expression data. Despite its simplicity, this model is sufficiently rich to account for the specific peculiarities of gene expression, including biological variability, viewed as a sort of random source. As an applicative example, we also provide some data simulations and numerical experiments for the analysis of the performance of gene selection methods.
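A minimal sketch of this kind of generator follows. It is our own simplified construction, not the paper's model: the gene count, threshold, noise level, and the specific monotone DNF are all illustrative choices.

```python
# Synthetic expression profiles whose class label is a positive
# (monotone) Boolean function of thresholded "relevant" genes.
import random

random.seed(1)
N_GENES, N_RELEVANT, THRESHOLD = 50, 4, 0.5

def positive_dnf(bits):
    # Monotone DNF over the relevant genes: (g0 AND g1) OR (g2 AND g3).
    # Positive: flipping any bit from False to True never flips the
    # output from True to False.
    return (bits[0] and bits[1]) or (bits[2] and bits[3])

def sample():
    expr = [random.random() for _ in range(N_GENES)]    # base expression
    bits = [x > THRESHOLD for x in expr[:N_RELEVANT]]   # discretise relevant genes
    label = int(positive_dnf(bits))                     # functional class
    noisy = [x + random.gauss(0.0, 0.05) for x in expr] # biological variability
    return noisy, label

data = [sample() for _ in range(10)]
for expr, label in data[:3]:
    print(label, [round(v, 2) for v in expr[:5]])
```

Because the relevant genes and the labelling function are known by construction, any gene selection method can be scored exactly against the ground truth.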
Clostridium difficile outbreak: epidemiological surveillance, infection prevention and control
INTRODUCTION: Clostridium difficile infection (CDI) is currently considered the most common cause of health care-associated infections. The aim is to describe the trend of CDI in an Italian hospital and to assess the efficacy of the measures adopted to manage the burden.
METHODS: We examined CDI from 2016 to 2018. The incidence rate of CDI was calculated as the number of newly infected persons per month divided by the overall length of stay (incidence per 10,000 patient-days). Changes in the CDI rate over the period considered were analysed using a joinpoint regression model.
RESULTS: Thanks to the monitoring activity, it was possible to adopt a new protocol to manage CDI: CDI episodes decreased from 85 in 2017 to 31 in 2018 (a 63% decrease). The joinpoint regression model was a useful tool, identifying a statistically significant decrease during 2017 (slope = -15.84; p = 0.012).
CONCLUSIONS: Reports based on routine laboratory data can accurately measure the population burden of CDI with limited surveillance resources. This activity can help target prevention programs and evaluate their effects.
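The incidence measure described in the methods can be sketched as follows; the monthly counts below are hypothetical, not the hospital's actual data.

```python
# Monthly CDI incidence per 10,000 patient-days (hypothetical counts).

monthly = {              # month -> (new CDI cases, total patient-days)
    "2017-01": (9, 21000),
    "2017-02": (7, 19500),
    "2018-01": (3, 20800),
}

def incidence_per_10k(cases, patient_days):
    """New cases divided by overall length of stay, per 10,000 patient-days."""
    return cases / patient_days * 10_000

for month, (cases, days) in sorted(monthly.items()):
    print(month, round(incidence_per_10k(cases, days), 2))
```

A monthly series like this is exactly what a joinpoint regression would then be fitted to, to locate significant changes in trend.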
On Convergence Properties of Pocket Algorithm
The problem of finding optimal weights for a single threshold neuron, starting from a general training set, is considered. Among the variety of possible learning techniques, the pocket algorithm has a proper convergence theorem which asserts its optimality.
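A minimal sketch of a common formulation of the pocket algorithm follows; here the "pocket" keeps the weights with the best training accuracy seen so far (presentations differ in the exact pocketing criterion), and the data are a toy linearly separable set.

```python
# Pocket algorithm sketch: perceptron updates on random examples, with
# the best-so-far weight vector kept "in the pocket".
import random

def pocket(train, n_iters=10_000, seed=0):
    rng = random.Random(seed)
    dim = len(train[0][0])
    w = [0.0] * (dim + 1)                 # weights + bias (last entry)
    pocket_w, best_correct = w[:], -1

    def predict(w, x):
        s = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
        return 1 if s >= 0 else -1

    for _ in range(n_iters):
        x, y = rng.choice(train)
        if predict(w, x) == y:
            continue
        # Perceptron update on a misclassified example
        w = [wi + y * xi for wi, xi in zip(w, x)] + [w[-1] + y]
        # Pocket step: keep the weights with the best training accuracy
        correct = sum(predict(w, xi) == yi for xi, yi in train)
        if correct > best_correct:
            best_correct, pocket_w = correct, w[:]
    return pocket_w, best_correct

# Toy separable set: label = sign(x0 - x1)
train = [((0.0, 1.0), -1), ((1.0, 0.0), 1),
         ((0.2, 0.9), -1), ((0.8, 0.1), 1)]
w, correct = pocket(train)
print(correct, "of", len(train), "training points correctly classified")
```

On separable data the plain perceptron already converges; the pocket step matters on non-separable data, where it is what the convergence theorem mentioned above is about.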