27 research outputs found

    Text miner's little helper: scalable self-tuning methodologies for knowledge exploration

    The abstract is in the attachment.

    Characterizing Thermal Energy Consumption through Exploratory Data Mining Algorithms

    Large volumes of energy data are now continuously collected through a variety of meters from different smart-city environments. Such data have great potential to influence the overall energy balance of our communities by optimizing building energy consumption and by raising people's awareness of energy waste. This paper presents FARTEC, a data mining engine based on exploratory and unsupervised data mining algorithms that characterizes building energy consumption together with meteorological conditions. FARTEC exploits a joint approach coupling cluster analysis and association rules. First, a partitional clustering algorithm is applied to weather conditions to discover groups of thermal energy consumption that occurred in similar weather conditions. Each computed cluster is then locally characterized through a set of association rules to ease the manual inspection of the most interesting correlations between thermal consumption and weather conditions. FARTEC also categorizes the rules into a few groups according to their meaning, where each group is determined by the data features appearing in the rule. The experimental evaluation performed on real datasets demonstrates the effectiveness of the proposed approach in discovering interesting knowledge items that raise people's awareness of their energy consumption.

    All in a twitter: Self-tuning strategies for a deeper understanding of a crisis tweet collection

    Natural disasters have become more frequent during the past 20 years due to significant climate changes. These events are hotly debated on social networks such as Twitter, where a huge number of short text messages containing personal opinions and descriptions of the events and their consequences are continuously and promptly exchanged. The analysis of these large and complex data could help policy-makers to better understand an event and to set priorities. However, correctly configuring the tweet mining process is still challenging because of the variable data distribution and the availability of a large number of algorithms, each with its own parameters. Analysts need to perform a large number of experiments to identify the best configuration for the overall knowledge discovery process. Innovative, scalable, and parameter-free solutions need to be explored to streamline the analytics process. This paper presents an enhanced version of PASTA (a distributed self-tuning engine) applied to a crisis tweet collection to group a corpus of tweets into cohesive and well-separated clusters with minimal analyst intervention. Experimental results on real data collected during natural disasters show the effectiveness of PASTA in discovering interesting groups of correlated tweets without requiring the analyst to select either the algorithms or their parameters.

    Useful ToPIC: Self-tuning strategies to enhance Latent Dirichlet Allocation

    ToPIC (Tuning of Parameters for Inference of Concepts) is a distributed self-tuning engine whose aim is to cluster collections of textual data into correlated groups of documents through a topic modeling methodology (i.e., LDA). ToPIC includes automatic strategies to relieve the end-user of the burden of selecting proper parameter values for the overall analytics process. ToPIC's current implementation runs on Apache Spark, a state-of-the-art distributed computing framework. As a case study, ToPIC has been validated on three real collections of textual documents characterized by different distributions. The experimental results show the effectiveness and efficiency of the proposed solution in analyzing collections of documents without tuning algorithm parameters and in discovering cohesive and well-separated groups of documents with a similar topic.

    Search for single production of vector-like quarks decaying into Wb in pp collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector


    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at $\sqrt{s} = 8$ TeV with the ATLAS detector


    ATLAS Run 1 searches for direct pair production of third-generation squarks at the Large Hadron Collider


    Charged-particle distributions at low transverse momentum in $\sqrt{s} = 13$ TeV pp interactions measured with the ATLAS detector at the LHC
