5 research outputs found

    Characterizing Thermal Energy Consumption through Exploratory Data Mining Algorithms

    Nowadays large volumes of energy data are continuously collected through a variety of meters from different smart-city environments. Such data have great potential to influence the overall energy balance of our communities by optimizing building energy consumption and by raising people's awareness of energy waste. This paper presents FARTEC, a data mining engine based on exploratory and unsupervised data mining algorithms to characterize building energy consumption together with meteorological conditions. FARTEC exploits a joint approach coupling cluster analysis and association rules. First, a partitional clustering algorithm is applied to weather conditions to discover groups of thermal energy consumption that occurred in similar weather conditions. Each computed cluster is then locally characterized through a set of association rules to ease the manual inspection of the most interesting correlations between thermal consumption and weather conditions. FARTEC also includes a categorization of the rules into a few groups according to their meaning. Each group is determined by the data features appearing in the rule. The experimental evaluation performed on real datasets demonstrates the effectiveness of the proposed approach in discovering interesting knowledge items to raise people's awareness of their energy consumption.
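
    The two-step idea in this abstract (cluster on weather conditions, then describe each cluster with association rules) can be illustrated with a minimal sketch. It is not FARTEC's implementation: the abstract only says "a partitional clustering algorithm", so plain k-means is an assumption here, and the readings, the discretization thresholds, the support and confidence thresholds, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical readings: (outdoor temperature in °C, relative humidity in %, thermal energy in kWh).
readings = [
    (2, 80, 90), (4, 75, 85), (3, 70, 95), (5, 85, 40),
    (15, 40, 20), (17, 45, 25), (16, 50, 30), (18, 35, 15),
]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at k evenly spaced input points."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def to_items(temp, humidity, kwh):
    """Discretize one reading into categorical items (thresholds are assumptions)."""
    return {
        "temp=cold" if temp < 10 else "temp=mild",
        "humidity=high" if humidity >= 60 else "humidity=low",
        "consumption=high" if kwh >= 50 else "consumption=low",
    }


def rules_for_cluster(transactions, min_support=0.5, min_confidence=0.7):
    """Single-antecedent rules 'weather item -> consumption item' within one cluster."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(pair for t in transactions for pair in combinations(sorted(t), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        for lhs, rhs in ((a, b), (b, a)):
            # Keep only rules that go from a weather item to a consumption item.
            if lhs.startswith("consumption") or not rhs.startswith("consumption"):
                continue
            support = count / n
            confidence = count / item_counts[lhs]
            if support >= min_support and confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Step 1: cluster on the weather features only (consumption is left out).
weather = [(t, h) for t, h, _ in readings]
labels = kmeans(weather, k=2)

# Step 2: characterize each cluster locally with association rules.
rules_by_cluster = {}
for cluster in sorted(set(labels)):
    transactions = [to_items(*r) for r, lab in zip(readings, labels) if lab == cluster]
    rules_by_cluster[cluster] = rules_for_cluster(transactions)

for cluster, rules in rules_by_cluster.items():
    for lhs, rhs, support, confidence in rules:
        print(f"cluster {cluster}: {lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

    The features are left unscaled to keep the sketch short; with real data, temperature and humidity would normally be normalized before computing distances.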

    Predicting large scale fine grain energy consumption

    Today, large volumes of energy-related data are continuously collected. Extracting actionable knowledge from such data is a multi-step process that opens up a variety of interesting and novel research issues across two domains: energy and computer science. The computer science aim is to provide energy scientists with cutting-edge and scalable engines to effectively support them in their daily research activities. This paper presents SPEC, a scalable and distributed predictor of fine grain energy consumption in buildings. SPEC applies a data stream analysis methodology over a sliding time window to train a prediction model tailored to each building. The building model is then used to predict the upcoming energy consumption at a time instant in the near future. SPEC currently integrates artificial neural networks and random forest regression. The SPEC methodology exploits the computational advantages of distributed computing frameworks, as the current implementation runs on Spark. As a case study, real thermal energy consumption data collected in a major city have been used to preliminarily assess the accuracy of SPEC. The initial results are promising and represent a first step towards predicting fine grain energy consumption over a sliding time window.
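
    The sliding-window, per-building training scheme described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: SPEC itself uses artificial neural networks and random forest regression on Spark, whereas this single-machine sketch swaps in a one-lag least-squares linear model so that it stays dependency-free. The class name, the `observe` helper, the window size, and the sample readings are all hypothetical.

```python
from collections import deque


class SlidingWindowPredictor:
    """Keeps the most recent readings of one building and refits on every prediction."""

    def __init__(self, window=24):
        # Old readings fall out automatically once the window is full.
        self.history = deque(maxlen=window)

    def update(self, value):
        self.history.append(float(value))

    def predict_next(self):
        values = list(self.history)
        if len(values) < 3:
            # Too little data to fit a line: fall back to the last reading, if any.
            return values[-1] if values else None
        # Fit y_t = slope * y_(t-1) + intercept by ordinary least squares on the window.
        xs, ys = values[:-1], values[1:]
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        if var_x == 0:
            return values[-1]
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
        intercept = mean_y - slope * mean_x
        return slope * values[-1] + intercept


# One model per building, keyed by a hypothetical building identifier.
models = {}


def observe(building_id, value, window=4):
    """Feed one new reading and return the forecast for the next time instant."""
    model = models.setdefault(building_id, SlidingWindowPredictor(window))
    model.update(value)
    return model.predict_next()


forecast = None
for reading in [10, 12, 14, 16, 18]:
    forecast = observe("building-A", reading)
print(f"next reading forecast for building-A: {forecast:.1f}")
```

    Because each building has its own model and its own window, the per-building work is independent, which is the property that makes this kind of scheme a natural fit for a distributed framework.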

    Exploring energy performance certificates through visualization

    Energy Performance Certificates (EPCs) provide interesting information on the standard-based calculation of the energy performance of a building and on its thermo-physical and geometrical properties. Because of the volume of available data (issued as open data) and the heterogeneity of the attributes, the exploration of these energy-related data collections is challenging. This paper presents INDICE (INformative DynamiC dashboard Engine), a new data visualization framework able to automatically explore large collections of EPCs. INDICE explores EPCs through both querying and analytics tasks, and intuitively presents the output through informative dashboards. The latter include dynamic and interactive maps along with different informative charts allowing different stakeholders (e.g., domain and non-domain expert users) to explore and interpret the extracted knowledge at different spatial granularity levels. The objective of INDICE is to create energy maps useful for the characterization of the energy performance of buildings located in different areas. The experimental evaluation, performed on a real set of EPCs related to a major region in the North West of Italy, demonstrates the effectiveness of INDICE in exploring an EPC dataset through different data and knowledge visualization techniques.

    Frequent Itemsets Mining for Big Data: A Comparative Analysis

    Itemset mining is a well-known exploratory data mining technique used to discover interesting correlations hidden in a data collection. Since it supports different targeted analyses, it is profitably exploited in a wide range of domains, from network traffic data to medical records. With the increasing amount of generated data, different scalable algorithms have been developed, exploiting the advantages of distributed computing frameworks, such as Apache Hadoop and Spark. This paper reviews Hadoop- and Spark-based scalable algorithms addressing the frequent itemset mining problem in the Big Data domain through both theoretical and experimental comparative analyses. Since the itemset mining task is computationally expensive, its distribution and parallelization strategies heavily affect memory usage, load balancing, and communication costs. A detailed discussion of the algorithmic choices of the distributed methods for frequent itemset mining is followed by an experimental analysis comparing the performance of state-of-the-art distributed implementations on both synthetic and real datasets. The strengths and weaknesses of the algorithms are thoroughly discussed with respect to the dataset features (e.g., data distribution, average transaction length, number of records) and specific parameter settings. Finally, based on theoretical and experimental analyses, open research directions for the parallelization of the itemset mining problem are presented.
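
    For readers unfamiliar with the underlying problem, the sketch below shows frequent itemset mining on a single machine with a level-wise, Apriori-style search. It is not one of the Hadoop- or Spark-based algorithms the paper compares; it only illustrates what "frequent itemset" means and why the task is expensive (every level rescans all transactions). The transactions, the function name, and the support threshold are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; min_support is an absolute transaction count."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate (one full scan per level).
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

result = frequent_itemsets(baskets, min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

    The repeated scans and the growth of the candidate set are the costs that distribution and parallelization strategies have to split across workers, which is where the memory usage, load balancing, and communication trade-offs mentioned in the abstract arise.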
