GNA: new framework for statistical data analysis
We report on the status of GNA, a new framework for fitting large-scale
physical models. GNA utilizes the data flow concept within which a model is
represented by a directed acyclic graph. Each node is an operation on an array
(matrix multiplication, derivative or cross section calculation, etc.). The
framework enables the user to create flexible and efficient large-scale lazily
evaluated models, handle large numbers of parameters, propagate parameters'
uncertainties while taking into account possible correlations between them, fit
models, and perform statistical analysis. The main goal of the paper is to give
an overview of the main concepts and methods as well as reasons behind their
design. Detailed technical information is to be published in further works.
Comment: 9 pages, 3 figures, CHEP 2018, submitted to EPJ Web of Conference
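The data flow idea described in this abstract (a model as a directed acyclic graph of array operations, evaluated lazily) can be illustrated with a minimal sketch. This is not GNA's API: the `Node` and `Parameter` classes, the `taint` method, and the `evaluations` counter are hypothetical names chosen for illustration, and plain Python lists stand in for arrays.

```python
class Node:
    """One operation in the graph; recomputes only when an input has changed."""

    def __init__(self, func, *inputs):
        self.func = func
        self.inputs = inputs
        self.dependents = []
        self._cache = None
        self._tainted = True   # nothing has been computed yet
        self.evaluations = 0   # hypothetical counter, used to show laziness
        for node in inputs:
            node.dependents.append(self)

    def taint(self):
        # Mark this node and everything downstream as out of date.
        if not self._tainted:
            self._tainted = True
            for node in self.dependents:
                node.taint()

    def value(self):
        # Lazy evaluation: compute on demand, then reuse the cached result.
        if self._tainted:
            args = [node.value() for node in self.inputs]
            self._cache = self.func(*args)
            self._tainted = False
            self.evaluations += 1
        return self._cache


class Parameter(Node):
    """A leaf node that holds a value instead of computing one."""

    def __init__(self, value):
        super().__init__(lambda: None)
        self._cache = value
        self._tainted = False

    def set(self, value):
        self._cache = value
        for node in self.dependents:
            node.taint()


# A toy model: total = sum(scale * data + offset)
scale = Parameter(2.0)
offset = Parameter(1.0)
data = Parameter([1.0, 2.0, 3.0])
scaled = Node(lambda k, xs: [k * x for x in xs], scale, data)
shifted = Node(lambda b, xs: [x + b for x in xs], offset, scaled)
total = Node(sum, shifted)

first = total.value()    # evaluates scaled, shifted and total once each
offset.set(0.0)          # taints shifted and total, but not scaled
second = total.value()   # scaled is served from its cache
```

In this sketch, changing `offset` leaves `scaled.evaluations` at 1, which is the kind of saving a lazily evaluated graph aims for when a fit varies only a few of many parameters.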
Automatic Derivation of Statistical Data Analysis Algorithms: Planetary Nebulae and Beyond
AUTOBAYES is a fully automatic program synthesis system for the data analysis domain. Its input is a declarative problem description in the form of a statistical model; its output is documented and optimized C/C++ code. The synthesis process relies on the combination of three key techniques. Bayesian networks are used as a compact internal representation mechanism that enables problem decompositions and guides the algorithm derivation. Program schemas are used as independently composable building blocks for the algorithm construction; they can encapsulate advanced algorithms and data structures. A symbolic-algebraic system is used to find closed-form solutions for problems and emerging subproblems. In this paper, we describe the application of AUTOBAYES to the analysis of planetary nebulae images taken by the Hubble Space Telescope. We explain the system architecture and present in detail the automatic derivation of the scientists’ original analysis [1] as well as a refined analysis using clustering models. This study demonstrates that AUTOBAYES is now mature enough to be applied to realistic scientific data analysis tasks.
Statistical Data Analysis for Energy Communities
The objectives of the European energy transition entail an increasing use of electricity, especially in the residential sector. Member states are invited to promote energy policies that involve stakeholders directly. Energy Communities (ECs) are intended as local institutions that could drive this change, creating local-scale energy entities that cooperate to exchange energy. The purpose of this study is to investigate energy consumption by identifying a linear regression model that forecasts electric energy demand at the municipal scale for residential end users. This work analyses the electric consumption of 1,201 municipalities in Piedmont (north-west Italy), evaluating the main energy-related variables. Information is obtained from online databases and georeferenced with a GIS tool. The identified model shows that the most influential variables are the population, the number of members per family, the education level, and the income. Among building features, the most influential are the dwelling area, the number of occupied dwellings, the age of buildings, and their maintenance condition. The statistical GIS-based methodology proposed in this study is replicable and can be applied to other contexts. A forecasting model that predicts energy demand can support the preliminary decision-making process by defining the scale of ECs and their optimal configuration for balancing energy demand and local production.
HistFitter software framework for statistical data analysis
We present a software framework for statistical data analysis, called
HistFitter, that has been used extensively by the ATLAS Collaboration to
analyze big datasets originating from proton-proton collisions at the Large
Hadron Collider at CERN. Since 2012 HistFitter has been the standard
statistical tool in searches for supersymmetric particles performed by ATLAS.
HistFitter is a programmable and flexible framework to build, book-keep, fit,
interpret and present results of data models of nearly arbitrary complexity.
Starting from an object-oriented configuration, defined by users, the framework
builds probability density functions that are automatically fitted to data and
interpreted with statistical tests. A key innovation of HistFitter is its
design, which is rooted in core analysis strategies of particle physics. The
concepts of control, signal and validation regions are woven into its very
fabric. These are progressively treated with statistically rigorous built-in
methods. Being capable of working with multiple data models at once, HistFitter
introduces an additional level of abstraction that allows for easy bookkeeping,
manipulation and testing of large collections of signal hypotheses. Finally,
HistFitter provides a collection of tools to present results with
publication-quality style through a simple command-line interface.
Comment: 35 pages (excluding appendix) and 10 figures. Code publicly available
at: http://cern.ch/histfitte
Finding Universal Relations using Statistical Data Analysis
We present applications of statistical data analysis methods from both bi-
and multivariate statistics to find suitable sets of neutron star features that
can be leveraged for accurate and EoS-independent (universal) relations.
To this end, we investigate the ability of various correlation measures such as
Distance Correlation and Mutual Information in identifying universally related
pairs of neutron star features. We also evaluate relations produced by methods
of multivariate statistics such as Principal Component Analysis to assess their
suitability for producing universal relations with multiple independent
variables. As part of our analyses, we also put forward multiple entirely novel
relations, including multivariate relations for the -mode frequency of
neutron stars with reduced error when compared to existing, bivariate
relations.
Comment: 19 pages, 29 figures, 6 table
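As an illustration of one correlation measure named in this abstract, the following is a minimal sketch of the standard (biased) sample distance correlation for two scalar features. It is not the authors' code: the function names are hypothetical, and the inputs are synthetic numbers rather than neutron star data.

```python
from math import sqrt


def _centered_distances(values):
    """Pairwise |u - v| matrix with row, column and grand means removed."""
    n = len(values)
    d = [[abs(u - v) for v in values] for u in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Mean of the element-wise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation of two equally long scalar sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = _mean_product(a, b)
    dvar2_x = _mean_product(a, a)
    dvar2_y = _mean_product(b, b)
    if dvar2_x == 0.0 or dvar2_y == 0.0:
        return 0.0  # a constant feature carries no dependence information
    # Clamp at zero to absorb tiny negative values from rounding.
    return sqrt(max(dcov2, 0.0) / sqrt(dvar2_x * dvar2_y))


# Synthetic inputs: a symmetric grid, a linear map of it, and its square.
grid = [i / 10 for i in range(-10, 11)]
linear = [2 * v + 1 for v in grid]
quadratic = [v * v for v in grid]

dcor_linear = distance_correlation(grid, linear)
dcor_quadratic = distance_correlation(grid, quadratic)
```

For the linear pair the value is 1 up to rounding. For the quadratic pair it stays clearly above zero even though the Pearson correlation vanishes by symmetry, which is why such measures can flag nonlinear relations between feature pairs.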