685 research outputs found

    Data-Driven Copy-Paste Imputation for Energy Time Series

    Smart meters are a cornerstone of the worldwide transition to smart grids. Smart meters typically collect and provide energy time series that are vital for various applications, such as grid simulations, fault detection, load forecasting, load analysis, and load management. Unfortunately, these time series often contain missing values that must be handled before the data can be used. A common approach to handling missing values in time series is imputation. However, existing imputation methods are designed for power time series and do not take into account the total energy of gaps, resulting in jumps or constant shifts when imputing energy time series. To overcome these issues, the present paper introduces the new Copy-Paste Imputation (CPI) method for energy time series. The CPI method copies data blocks with similar properties and pastes them into gaps of the time series while preserving the total energy of each gap. The new method is evaluated on a real-world dataset with six shares of artificially inserted missing values ranging from 1% to 30%. It clearly outperforms the three benchmark imputation methods selected for comparison. The comparison furthermore shows that the CPI method uses matching patterns and preserves the total energy of each gap while requiring only a moderate run-time.
    Comment: 8 pages, 7 figures, submitted to IEEE Transactions on Smart Grid; the first two authors contributed equally to this work.
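    The copy-and-paste idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cumulative meter readings with `None` for missing values, a hypothetical `period` parameter (for example, the number of samples per day), and a deliberately simple similarity rule that picks the fully observed block whose total energy is closest to that of the gap. The function name `copy_paste_impute` is invented for illustration.

```python
def copy_paste_impute(readings, period):
    """Fill None runs in a list of cumulative meter readings.

    readings: cumulative energy readings, None where a value is missing.
    period:   assumed seasonal period in samples (e.g. one day).
    """
    filled = list(readings)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first observed reading after the gap
        if start == 0 or end == n:
            continue  # gap touches an edge: its total energy is unknown
        gap_energy = readings[end] - readings[start - 1]
        steps = end - start + 1  # number of increments spanning the gap

        # Candidate donor blocks: same length, shifted by whole periods,
        # fully observed in the original data (an assumed similarity rule).
        best = None
        for k in range(-(n // period), n // period + 1):
            lo, hi = start - 1 + k * period, end + k * period
            if k == 0 or lo < 0 or hi >= n:
                continue
            block = readings[lo:hi + 1]
            if any(v is None for v in block):
                continue
            mismatch = abs((block[-1] - block[0]) - gap_energy)
            if best is None or mismatch < best[0]:
                best = (mismatch, block)

        if best is not None and best[1][-1] > best[1][0]:
            block = best[1]
            # Scale the copied increments so they sum to the gap energy.
            scale = gap_energy / (block[-1] - block[0])
            increments = [(block[j + 1] - block[j]) * scale for j in range(steps)]
        else:
            # Fallback: spread the gap energy evenly (linear interpolation).
            increments = [gap_energy / steps] * steps

        # Paste: the last increment lands on the observed reading at `end`.
        for j in range(steps - 1):
            filled[start + j] = filled[start - 1 + j] + increments[j]
    return filled
```

    Because the pasted increments are rescaled to the difference between the readings on either side of the gap, the filled series joins the next observed reading without a jump or constant shift.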

    Data-Driven Copy-Paste Imputation for Energy Time Series

    Smart meters are a cornerstone of the worldwide transition to smart grids. Smart meters typically collect and provide energy time series that are vital for various applications, such as grid simulations, fault detection, load forecasting, load analysis, and load management. Unfortunately, these time series often contain missing values that must be handled before the data can be used. A common approach to handling missing values in time series is imputation. However, existing imputation methods are designed for power time series and do not take into account the total energy of gaps, resulting in jumps or constant shifts when imputing energy time series. To overcome these issues, the present paper introduces the new Copy-Paste Imputation (CPI) method for energy time series. The CPI method copies data blocks with similar characteristics and pastes them into gaps of the time series while preserving the total energy of each gap. The new method is evaluated on a real-world dataset with six shares of artificially inserted missing values ranging from 1% to 30%. It outperforms the three benchmark imputation methods selected for comparison. The comparison furthermore shows that the CPI method uses matching patterns and preserves the total energy of each gap while requiring only a moderate run-time.

    Real-time data quality monitoring and improvement in energy networks

    Abstract. Data quality monitoring is an important aspect of real-time, data-based operation and is of growing interest. Studying the methods and approaches for real-time data quality monitoring in the context of energy systems can yield highly beneficial improvements, given the ever-growing demand for material efficiency and energy savings. Quality flags, based on appropriate quality dimensions, can improve the decision making of systems in real time. The goal of this study is to find out how this can be applied to the varied and large volumes of energy industry data. The concept of data quality was first examined at a theoretical level to understand which data quality dimensions are meaningful in energy systems, in terms of the possible sources of data and the aspects of the data that matter for the quality of the processes. This theoretical understanding guided the choice of data quality dimensions for this study. After this, potential data quality pre-processing and analysis methods were inspected. The goal was to apply simple methods and see what results could be achieved with them in the data quality flagging algorithm. The seven selected quality dimensions were Accessibility, Interpretability, Completeness, Consistency, Timeliness, Accuracy and Believability. Data was generated with injected errors, and the performance of the data quality flagging algorithm was tested on it by simulating three signals producing sensor readings, one with redundant readings and two without. The data flagging results were correct in all simulated cases, but the accuracy of the estimated values varied. With the chosen simple methods, a highly precise description of data quality relative to the actual value was achieved consistently for the signals that had redundant values. On the other hand, the algorithm produced less accurate estimates for the signals without redundant readings, depending on the error type. Drifting errors were challenging to handle when only one signal was available and no more sophisticated estimation methods were used. Most data quality checks studied in this thesis are applicable in real-time operation, but changes are needed in the estimation methods for individual signals. The selected methods were kept simple to ease the load on real-time data quality monitoring. Further research should concentrate on finding better methods to deal with the errors that caused the most estimation challenges in this study.
    Tiivistelmä (Finnish abstract, translated). Ensuring data quality is an important part of using data in real time and a subject of growing interest. In the context of the energy industry, studying real-time data quality monitoring methods can produce useful results as efficiency requirements keep growing. The decision making of systems that use data can be improved by real-time quality flagging, which reports the quality of the data being processed in terms of its important quality dimensions. The goal of this study was to find out how this can be implemented with the diverse and abundant data of energy systems. The work began by defining data quality at a basic level, so that an understanding of data quality in the context of the energy industry could be formed. This involved identifying data quality dimensions and applying them to energy systems. The choice of quality dimensions is affected by the origin, amount and type of the data. After this, possible pre-processing and analysis methods were assessed from the viewpoint of data quality monitoring, for the real-time algorithm to be developed. The seven data quality dimensions used in this work to define the algorithm were accessibility, interpretability, completeness, consistency, timeliness, accuracy and believability. The developed algorithm was tested with simulated data to which errors had been added over certain time intervals, along with random errors. There were three simulated signals, one of which had redundant data sets. According to the simulation results, the data flag values were correct in all situations, whereas the accuracy of the estimates of the instantaneous value varied. A high explanatory accuracy of the instantaneous data quality relative to the true value was achieved consistently in signals with redundant measurement values when simple methods were applied. Signal drift error caused challenges for estimators based on single measurement values, which points to the need for a more advanced estimation method in future research. Based on the results, most of the data quality checks tested in this work are suitable for real-time monitoring, but improving the accuracy of the estimates requires changes to the estimation methods, especially if only one measurement value is available. Simple methods were chosen to ease the requirements that real-time quality flagging places on data quality monitoring. Further research on improving the estimates of missing and erroneous values is important.
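    The kind of per-sample quality flagging this abstract describes can be sketched as follows. This is an illustrative assumption, not the thesis's algorithm: it covers only three of the seven dimensions (completeness, timeliness, consistency), uses the median of redundant readings as the estimated value, and the function name `flag_sample` and the thresholds `max_age_s` and `tol` are hypothetical.

```python
from statistics import median


def flag_sample(values, age_s, max_age_s=5.0, tol=0.5):
    """Flag one time step of a signal measured by redundant sensors.

    values: readings from the redundant sensors, None where missing.
    age_s:  age of the sample in seconds when it is processed.
    Returns (flags, estimate); estimate is None if nothing was received.
    """
    present = [v for v in values if v is not None]
    flags = {
        # Completeness: every expected reading arrived.
        "completeness": len(present) == len(values),
        # Timeliness: the sample is fresh enough for real-time use.
        "timeliness": age_s <= max_age_s,
    }
    if not present:
        flags["consistency"] = False
        return flags, None
    # With redundant readings, the median is a simple robust estimate.
    estimate = median(present)
    # Consistency: all received readings agree with the estimate.
    flags["consistency"] = all(abs(v - estimate) <= tol for v in present)
    return flags, estimate
```

    With a single sensor the median is just that one reading, so a drifting sensor cannot be detected this way, which matches the difficulty the abstract reports for signals without redundancy.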

    Data Consistency for Data-Driven Smart Energy Assessment

    In the smart grid era, the amount of data available for different applications has increased considerably. However, data may not perfectly represent the phenomenon or process under analysis, so their usability requires a preliminary validation carried out by experts in the specific domain. The process of data gathering and transmission over the communication channels has to be verified to ensure that data are provided in a useful format and that no external effect has corrupted the data received. Consistency of the data coming from different sources (in terms of timings and data resolution) has to be ensured and managed appropriately. Suitable procedures are needed for transforming data into knowledge in an effective way. This contribution addresses these aspects by highlighting a number of potential issues and the solutions in place in different power and energy systems, including the generation, grid and user sides. Recent references, as well as selected historical references, are listed to support the illustration of the conceptual aspects.

    Data-Driven Methods for Managing Anomalies in Energy Time Series

    With the progressing implementation of the smart grid, more and more smart meters record power or energy consumption and generation as time series. The increasing availability of these recorded energy time series enables the goal of the automated operation of smart grid applications such as load analysis, load forecasting, and load management. However, to perform well, these applications usually require clean data that describes the typical behavior of the underlying system well. Unfortunately, recorded energy time series are usually not clean but contain anomalies, i.e., patterns that deviate from what is considered normal. Since anomalies thus potentially contain data points or patterns that represent false or misleading information, they can be problematic for any analysis of this data performed by smart grid applications. Therefore, the present thesis proposes data-driven methods for managing anomalies in energy time series. It introduces an anomaly management whose characteristics correspond to steps in a sequential pipeline, namely anomaly detection, anomaly compensation, and a subsequent application. Using forecasting as an exemplary subsequent application and real-world data with inserted synthetic and labeled anomalies, this thesis answers four research questions along that pipeline for managing anomalies in energy time series. Based on the answers to these four research questions, the anomaly management presented in this thesis exhibits four characteristics. First, the presented anomaly management is guided by well-defined anomalies derived from real-world energy time series. These anomalies serve as a basis for generating synthetic anomalies in energy time series to promote the development of powerful anomaly detection methods. Second, the presented anomaly management applies an anomaly detection approach to energy time series that is capable of providing a high anomaly detection performance. 
Third, the presented anomaly management also compensates detected anomalies in energy time series realistically by considering the characteristics of the respective data. Fourth, the proposed anomaly management applies and evaluates general anomaly management strategies in view of the subsequent forecasting that uses this data. The comparison shows that managing anomalies well is essential: the compensation strategy, which detects and compensates anomalies in the input data before applying a forecasting method, is the most beneficial strategy when the input data contains anomalies.
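    The detect-then-compensate pipeline named in this abstract can be sketched in a few lines. The techniques below are stand-ins chosen for brevity, not the thesis's methods: detection uses a robust z-score based on the median absolute deviation, and compensation uses linear interpolation between the nearest unflagged neighbours. The function names and the `threshold` default are assumptions.

```python
from statistics import median


def detect_anomalies(series, threshold=3.5):
    """Flag points whose robust z-score (median/MAD) exceeds the threshold."""
    med = median(series)
    mad = median([abs(v - med) for v in series])
    if mad == 0:
        return [False] * len(series)
    return [0.6745 * abs(v - med) / mad > threshold for v in series]


def compensate(series, flags):
    """Replace flagged points by interpolating between unflagged neighbours."""
    cleaned = list(series)
    good = [i for i, bad in enumerate(flags) if not bad]
    for i, bad in enumerate(flags):
        if not bad:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None and right is None:
            continue  # nothing trustworthy to interpolate from
        if left is None:
            cleaned[i] = series[right]
        elif right is None:
            cleaned[i] = series[left]
        else:
            w = (i - left) / (right - left)
            cleaned[i] = series[left] + w * (series[right] - series[left])
    return cleaned
```

    In the compensation strategy the abstract favours, the cleaned series returned by `compensate` would then be passed to the forecasting method in place of the raw input.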

    Situational awareness in low-observable distribution grid - exploiting sparsity and multi-timescale data

    Doctor of Philosophy. Department of Electrical and Computer Engineering. Balasubramaniam Natarajan.
    The power distribution grid is typically unobservable due to a lack of real-time measurements. While deploying more sensors can alleviate this issue, it also presents new challenges related to data aggregation and the underlying communication infrastructure. Limited real-time measurements hinder distribution system state estimation (DSSE). DSSE involves estimating the system states (i.e., voltage magnitude and voltage angle) based on available measurements and system model information. To cope with the unobservability issue, sparsity-based DSSE approaches allow us to recover system state information from a small number of measurements, provided the states of the distribution system exhibit sparsity. However, these approaches perform poorly in the presence of outliers in measurements and errors in system model information. In this dissertation, we first develop robust formulations of sparsity-based DSSE to deal with uncertainties in the system model and measurement data in a low-observable distribution grid. We also combine the advantages of two sparsity-based DSSE approaches to estimate grid states with high fidelity in low-observability regions. In practical distribution systems, information from field sensors and meters is unevenly sampled at different time scales and can be lost during transmission. It is critical to effectively aggregate these information sources for DSSE as well as for other tasks related to situational awareness. To address this challenge, the second part of this dissertation proposes a Bayesian framework for multi-timescale data aggregation and matrix completion-based state estimation. Specifically, the multi-scale time-series data aggregated from heterogeneous sources are reconciled using a multitask Gaussian process.
    The resulting consistent time series, along with the confidence bounds on the imputations, are fed into a Bayesian matrix completion method augmented with linearized power-flow constraints for accurate state estimation in a low-observable distribution system. We also develop a computationally efficient recursive Gaussian process approach that is capable of handling batch-wise or real-time measurements while leveraging the network connectivity information of the grid. To further enhance scalability and accuracy, we develop neural network-based approaches (a latent neural ordinary differential equation approach and a stochastic neural differential equation with recurrent neural network approach) to aggregate irregular time-series data in the distribution grid. The stochastic neural differential equation and recurrent neural network also allow us to quantify the uncertainty in a holistic manner. Simulation results on different IEEE unbalanced test systems illustrate the high fidelity of the Bayesian and neural network-based methods in aggregating multi-timescale measurements. Lastly, we develop phase and outage awareness approaches for the power distribution grid. In this regard, we first design a graph signal processing approach that identifies the phase labels in the presence of limited measurements and incorrect phase labeling. The second approach proposes a novel outage detector for identifying all outages in a reconfigurable distribution network. Simulation results on standard IEEE test systems reveal the potential of these methods to improve situational awareness.

    Digital architecture for monitoring and operational analytics of multi-vector microgrids utilizing cloud computing, advanced virtualization techniques, and data analytics methods

    Microgrids are considered a viable solution for achieving net-zero targets and increasing renewable energy integration. However, there is a lack of conceptual work focusing on practical data analytics deployment schemes and case-specific insights. This paper presents a scalable and flexible physical and digital architecture for extracting data-driven insights from microgrids, with a real-world microgrid used as a test-bed. The proposed architecture includes edge monitoring and intelligence, data-processing mechanisms, and edge–cloud communication. Cloud-hosted data analytics have been developed in AWS, considering market arrangements between the microgrid and the utility. The analysis involves time-series data processing, followed by the exploration of statistical relationships using cloud-hosted tools. Insights from one year of operation highlight the potential for significant operational cost reduction through the real-time optimization and control of microgrid assets. By addressing real-world applicability, end-to-end architectures, and the extraction of case-specific insights, this work contributes to advancing microgrid design, operation, and adoption.

    Energy load forecast in smart buildings with deep learning techniques

    Predicting energy load is a growing problem these days. Knowing in advance how electricity consumption will behave is key to resource management. Especially interesting is the case of so-called Smart Buildings, which arise from the trend towards sustainable development and consumption; this trend is increasingly in vogue and is becoming mandatory by law in many countries. Models based on Deep Learning constitute an important part of the state of the art. These models have recently brought great advances in Artificial Intelligence: although they originated in the 20th century, they only re-emerged about 10 years ago, thanks to computational advances that allow them to be trained by the general public. This Final Degree Project presents advanced Deep Learning techniques applied to the problem of load prediction in Smart Buildings, based mainly on data from the Alice Perry building of the National University of Ireland Galway, in collaboration with the Informatics Research Unit for Sustainable Engineering of the same university. The datasets were obtained from the time series of aggregated electricity consumption of the air handling units (AHUs) in the Alice Perry building. Historical weather data were also collected from the weather station in the same building in order to study whether these climatic variables improve the models' predictions. Time series prediction on this energy load data is performed in two ways with hourly granularity: one-step prediction, in which an estimate of the load in the next hour is obtained from the previous observations, and sequence prediction, in which the behaviour of the series over the next hours is predicted from the previous values.
    Resumen (Spanish abstract, translated). Energy load prediction is currently a growing problem. The need to study in advance how electricity consumption will behave is key to resource management. Especially interesting is the case of so-called Smart Buildings, buildings born from the trend towards sustainable development and consumption, which is increasingly in vogue and is becoming mandatory by law in many countries. Models based on Deep Learning constitute an important part of the state of the art. These models have recently brought great advances in Artificial Intelligence: although born in the 20th century, they only re-emerged barely 10 years ago, thanks to computational advances that allow them to be trained by the general public. This final degree project presents advanced Deep Learning techniques applied to the problem of load prediction in Smart Buildings, based mainly on data from the Alice Perry building of the National University of Ireland Galway, in collaboration with the Informatics Research Unit for Sustainable Engineering group of the same university. The datasets were obtained from the time series of aggregated electricity consumption of the air-conditioning units in the Alice Perry building. Together with this information, historical weather data were also collected from the weather station in the same building in order to study whether these climatic variables help the models predict better. Time series prediction on this energy load data is performed in two modes with hourly granularity: one-step prediction, in which an estimate of the load in the next hour is obtained from the previous observations, and sequence prediction, in which the behaviour of the series over the next hours is predicted from the previous values.
    Degree in Computer Engineering (Grado en Ingeniería Informática).
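    The two prediction modes described in this abstract differ only in how training samples are cut from the hourly series. The sketch below shows one common way to build such samples with a sliding window; it is an assumption about the data preparation, not code from the project, and the name `make_windows` and its parameters `n_in` and `n_out` are hypothetical.

```python
def make_windows(series, n_in, n_out=1):
    """Cut a series into (inputs, targets) pairs with a sliding window.

    n_in:  number of past observations given to the model.
    n_out: number of future values to predict
           (1 = one-step prediction, >1 = sequence prediction).
    """
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        inputs = series[start:start + n_in]
        targets = series[start + n_in:start + n_in + n_out]
        samples.append((inputs, targets))
    return samples


hourly_load = [1, 2, 3, 4, 5, 6]
one_step = make_windows(hourly_load, n_in=3)            # predict the next hour
sequence = make_windows(hourly_load, n_in=3, n_out=2)   # predict the next two hours
```

    Weather variables could be added by making each element of `series` a tuple of load and weather readings, while keeping the targets as load values only.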

    Semantic technologies for supporting KDD processes

    209 p.
    Achieving a comfortable thermal situation within buildings with an efficient use of energy still remains an open challenge for most buildings. In this regard, IoT (Internet of Things) and KDD (Knowledge Discovery in Databases) processes may be combined to solve these problems, even though data analysts may feel overwhelmed by the heterogeneity and volume of the data to be considered. Data analysts could benefit from an application assistant that supports them throughout the KDD process. This research work aims at supporting data analysts through the different KDD phases towards the achievement of energy efficiency and thermal comfort in tertiary buildings. To do so, the EEPSA (Energy Efficiency Prediction Semantic Assistant) is proposed, which helps data analysts discover the most relevant variables for the matter at hand and informs them about relationships among relevant data. This assistant leverages Semantic Technologies such as ontologies, ontology-driven rules and ontology-driven data access. More specifically, the EEPSA ontology is the cornerstone of the assistant. This ontology is developed on top of three ODPs (Ontology Design Patterns) and is designed so that its customization to address similar problems in different types of buildings can be approached methodically.