
    GNA: new framework for statistical data analysis

    We report on the status of GNA, a new framework for fitting large-scale physical models. GNA uses the data flow concept, in which a model is represented by a directed acyclic graph. Each node is an operation on an array (matrix multiplication, derivative or cross section calculation, etc.). The framework enables the user to create flexible and efficient large-scale lazily evaluated models, handle large numbers of parameters, propagate parameters' uncertainties while taking into account possible correlations between them, fit models, and perform statistical analysis. The main goal of the paper is to give an overview of the main concepts and methods as well as the reasons behind their design. Detailed technical information is to be published in further works. Comment: 9 pages, 3 figures, CHEP 2018, submitted to EPJ Web of Conferences
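
    The data flow idea in this abstract can be illustrated with a minimal sketch. It is not GNA's API: the `Node` and `Parameter` classes, their methods, and the example values are all hypothetical, chosen only to show a directed acyclic graph of array operations that is evaluated lazily and re-evaluated only after an input changes.

```python
import numpy as np


class Node:
    """One array operation in a directed acyclic graph (illustrative, not GNA's API)."""

    def __init__(self, func, *inputs):
        self.func = func
        self.inputs = inputs
        self.dependents = []
        self._cache = None
        self._valid = False
        self.evaluations = 0  # counts how often func actually ran
        for node in inputs:
            node.dependents.append(self)

    def invalidate(self):
        # Mark this node and everything downstream as stale; nothing is recomputed here.
        self._valid = False
        for node in self.dependents:
            node.invalidate()

    def value(self):
        # Lazy evaluation: compute only on request, and only if the cache is stale.
        if not self._valid:
            self._cache = self.func(*(node.value() for node in self.inputs))
            self._valid = True
            self.evaluations += 1
        return self._cache


class Parameter(Node):
    """A leaf node holding a value that a fitter could vary (hypothetical helper)."""

    def __init__(self, value):
        super().__init__(None)
        self.set(value)

    def set(self, value):
        self.invalidate()
        self._cache = np.asarray(value, dtype=float)
        self._valid = True


# Made-up example graph: total = sum(matrix @ vector)
matrix = Parameter([[1.0, 2.0], [3.0, 4.0]])
vector = Parameter([1.0, 1.0])
product = Node(np.matmul, matrix, vector)
total = Node(np.sum, product)

first = total.value()    # evaluates product, then total
second = total.value()   # served from the cache, nothing is recomputed
evaluations_after_second = product.evaluations
vector.set([1.0, 0.0])   # marks product and total as stale
third = total.value()    # re-evaluates both nodes
```

    In this sketch, changing a parameter only flags the downstream nodes as stale; the cost of recomputation is paid when a result is next requested, which is one plausible reading of "lazily evaluated" in the abstract.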

    Automatic Derivation of Statistical Data Analysis Algorithms: Planetary Nebulae and Beyond

    AUTOBAYES is a fully automatic program synthesis system for the data analysis domain. Its input is a declarative problem description in the form of a statistical model; its output is documented and optimized C/C++ code. The synthesis process relies on the combination of three key techniques. Bayesian networks are used as a compact internal representation mechanism which enables problem decompositions and guides the algorithm derivation. Program schemas are used as independently composable building blocks for the algorithm construction; they can encapsulate advanced algorithms and data structures. A symbolic-algebraic system is used to find closed-form solutions for problems and emerging subproblems. In this paper, we describe the application of AUTOBAYES to the analysis of planetary nebulae images taken by the Hubble Space Telescope. We explain the system architecture, and present in detail the automatic derivation of the scientists’ original analysis [1] as well as a refined analysis using clustering models. This study demonstrates that AUTOBAYES is now mature enough to be applied to realistic scientific data analysis tasks.

    Statistical Data Analysis for Energy Communities

    The objectives of the European energy transition entail an increasing use of electricity, especially in the residential sector. Member states are invited to promote energy policies that involve stakeholders directly. Energy Communities (ECs) are intended as local institutions that could drive this change, creating local-scale energy entities that cooperate to exchange energy. The purpose of this study is to investigate energy consumption by identifying a linear regression model to forecast electric energy demand at municipal scale for residential end users. This work analyses the electric consumption of 1,201 municipalities in Piedmont (north-west Italy), evaluating the main energy-related variables. Information is obtained from online databases and georeferenced with a GIS tool. The identified model shows that the most influential variables are the population, the number of members per family, the education level, and the income. Regarding building features, the most influential are the dwelling area, the number of occupied dwellings, the age of buildings, and their maintenance condition. The statistical GIS-based methodology proposed in this study is replicable and can be applied to other contexts. A forecasting model that predicts energy demand can support the preliminary decision-making process by defining the scale of ECs and their optimal configuration for balancing energy demand and local production.
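
    As a rough illustration of the kind of model this abstract describes, the sketch below fits an ordinary least-squares linear regression with NumPy. Everything in it is an assumption: the data are synthetic, only two of the listed predictors are used, and the coefficient values are invented, so it shows the technique and not the study's model or results.

```python
import numpy as np

rng = np.random.default_rng(0)
n_municipalities = 200  # synthetic sample size, not the study's 1,201

# Synthetic predictors (invented ranges)
population = rng.uniform(500, 50_000, n_municipalities)
family_size = rng.uniform(1.8, 3.0, n_municipalities)

# Invented "true" coefficients: intercept, per inhabitant, per family member
true_coef = np.array([50.0, 1.1, 200.0])

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n_municipalities), population, family_size])
demand = X @ true_coef + rng.normal(0.0, 100.0, n_municipalities)

# Ordinary least squares: minimize ||X @ coef - demand||^2
coef, _, rank, _ = np.linalg.lstsq(X, demand, rcond=None)

# Coefficient of determination on the fitted data
predicted = X @ coef
ss_res = np.sum((demand - predicted) ** 2)
ss_tot = np.sum((demand - demand.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print("estimated coefficients:", coef)
print("R^2:", r_squared)
```

    With real data, each row of `X` would hold one municipality's variables, and the fitted coefficients would indicate how strongly each variable is associated with demand.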

    HistFitter software framework for statistical data analysis

    We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple data models at once, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication-quality style through a simple command-line interface. Comment: 35 pages (excluding appendix) and 10 figures. Code publicly available at: http://cern.ch/histfitte

    Finding Universal Relations using Statistical Data Analysis

    We present applications of statistical data analysis methods from both bi- and multivariate statistics to find suitable sets of neutron star features that can be leveraged for accurate and EoS-independent (or universal) relations. To this end, we investigate the ability of various correlation measures, such as Distance Correlation and Mutual Information, to identify universally related pairs of neutron star features. We also evaluate relations produced by methods of multivariate statistics, such as Principal Component Analysis, to assess their suitability for producing universal relations with multiple independent variables. As part of our analyses, we also put forward multiple entirely novel relations, including multivariate relations for the f-mode frequency of neutron stars with reduced error when compared to existing bivariate relations. Comment: 19 pages, 29 figures, 6 tables
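
    To make one of the correlation measures named in this abstract concrete, here is a minimal sketch of the sample distance correlation for one-dimensional data, written with NumPy. The function name and the toy data are assumptions for illustration; the sketch does not reproduce the paper's analysis and uses no neutron star data.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation of two 1-D samples (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1-D arrays of equal length")

    # Pairwise absolute distances within each sample
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    # Double centering: subtract row and column means, add back the grand mean
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (A * B).mean()   # squared distance covariance
    dvar2_x = (A * A).mean() # squared distance variance of x
    dvar2_y = (B * B).mean() # squared distance variance of y

    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Toy data: a linear and a purely nonlinear (quadratic) dependence
x = np.linspace(-1.0, 1.0, 201)
linear = 2.0 * x + 1.0
quadratic = x ** 2

dcor_linear = distance_correlation(x, linear)
dcor_quadratic = distance_correlation(x, quadratic)
pearson_quadratic = np.corrcoef(x, quadratic)[0, 1]
```

    On the symmetric toy grid, the Pearson coefficient of `x` and `x ** 2` vanishes up to rounding while the distance correlation stays clearly positive, which illustrates why such measures are useful for detecting nonlinear relations between features.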