863 research outputs found

    Robust and automatic data cleansing method for short-term load forecasting of distribution feeders

    Distribution networks are undergoing fundamental changes at medium voltage level. To support growing planning and control decision-making, the need for large numbers of short-term load forecasts has emerged. Data-driven modelling of medium voltage feeders can be affected by (1) data quality issues, namely large gross errors and missing observations, and (2) the presence of structural breaks in the data due to occasional network reconfiguration and load transfers. The present work investigates and reports on the effects of advanced data cleansing techniques on forecast accuracy. A hybrid framework to detect and remove outliers in large datasets is proposed; this automatic procedure combines the Tukey labelling rule and the binary segmentation algorithm to cleanse data more efficiently, and it is fast and easy to implement. Various approaches for missing value imputation are investigated, including unconditional mean, Hot Deck via k-nearest neighbour and Kalman smoothing. A combination of the automatic detection/removal of outliers and the imputation methods mentioned above is implemented to cleanse time series of 342 medium-voltage feeders. A nested rolling-origin-validation technique is used to evaluate the feed-forward deep neural network models. The proposed data cleansing framework efficiently removes outliers from the data, and the accuracy of forecasts is improved. It is found that Hot Deck (k-NN) imputation performs best in balancing the bias-variance trade-off for short-term forecasting.
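The Tukey labelling rule named in the abstract above can be illustrated with a minimal sketch. It flags values lying more than k times the interquartile range outside the quartiles. The function name, the conventional multiplier k = 1.5, the quartile method, and the sample readings are all assumptions made for illustration; the paper's own procedure also combines the rule with binary segmentation, which is not reproduced here.

```python
from statistics import quantiles


def tukey_outlier_flags(values, k=1.5):
    """Return a list of booleans, True where a value lies outside Tukey's fences."""
    # Quartiles of the sample; "inclusive" treats the data as the full population.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical feeder readings with two gross errors (95.0 and -40.0).
load = [10.2, 10.8, 11.1, 10.5, 95.0, 10.9, 11.3, 10.4, -40.0, 10.7]
flags = tukey_outlier_flags(load)
cleaned = [v for v, bad in zip(load, flags) if not bad]
```

Applied to a whole series, fixed fences like these would also flag legitimate readings after a level shift, which is presumably why the abstract pairs the rule with a segmentation step.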

    Predictive Data Analytics for Energy Demand Flexibility


    Robust data cleaning procedure for large scale medium voltage distribution networks feeders

    Relatively little attention has been given to the short-term load forecasting problem of primary substations, mainly because load forecasts were not essential to secure the operation of passive distribution networks. With the increasing uptake of intermittent generation, distribution networks are becoming active since power flows can change direction in a somewhat volatile fashion. The volatility of power flows introduces operational constraints on voltage control, system fault levels, thermal constraints, system losses and high reverse power flows. Today, greater observability of the networks is essential to maintain a safe overall system and to maximise the utilisation of existing assets. Hence, to identify and anticipate any forthcoming critical operational conditions, network operators are compelled to broaden their visibility of the networks to time horizons that include not only real-time information but also hour-ahead and day-ahead forecasts. With this change in paradigm, short-term load forecasters are progressively being integrated at large scale as an essential component of distribution networks' control and planning tools. The data acquisition of large scale real-world data is prone to errors; anomalies in data sets can lead to erroneous forecasting outcomes. Hence, data cleansing is an essential first step in data-driven learning techniques. Data cleansing is a labour-intensive and time-consuming task for the following reasons: 1) selecting a suitable cleansing method is not trivial, 2) generalising or automating a cleansing procedure is challenging, and 3) there is a risk of introducing new errors into the data. This thesis attempts to maximise the performance of large scale forecasting models by addressing the quality of the modelling data.
Thus, the objectives of this research are to identify the causes of bad data quality, to design an automatic data cleansing procedure suitable for large scale distribution network datasets, and to propose a rigorous framework for modelling MV distribution network feeder time series with deep learning architecture. The thesis discusses in detail the challenges in handling and modelling real-world distribution feeder time series. It also discusses a robust technique to detect outliers in the presence of level-shifts, and suitable missing values imputation techniques. All the concepts have been demonstrated on large real-world distribution network data.
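To make the notion of a level shift concrete, here is a minimal sketch of one common way to locate a single shift: choose the split point that most reduces the within-segment squared error. This is the single-split step that binary segmentation applies recursively. The abstract above does not state which technique the thesis uses, so the function name, the `min_size` parameter, and the sample series are assumptions for illustration only.

```python
def best_level_shift(values, min_size=2):
    """Return (index, gain) for the split that most reduces squared error.

    `index` is the first position of the right-hand segment, or None when
    no admissible split improves on a single constant mean.
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    total = sse(values)
    best_index, best_gain = None, 0.0
    # Each candidate split must leave at least `min_size` points per side.
    for i in range(min_size, len(values) - min_size + 1):
        gain = total - (sse(values[:i]) + sse(values[i:]))
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index, best_gain


# Hypothetical feeder series whose mean jumps after a load transfer.
series = [5.0, 5.2, 4.9, 5.1, 5.0, 9.0, 9.1, 8.9, 9.2, 9.0]
split, gain = best_level_shift(series)
```

Once a shift is located, an outlier rule can be applied to each segment separately, so that readings after the shift are judged against their own level rather than the old one.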

    Enhancing Landsat time series through multi-sensor fusion and integration of meteorological data

    Over 50 years ago, the United States Interior Secretary, Stewart Udall, directed space agencies to gather "facts about the natural resources of the earth." Today global climate change and human modification make earth observations from a wide variety of sensors essential to understand and adapt to environmental change. The Landsat program has been an invaluable source for understanding the history of the land surface, with consistent observations from the Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) sensors since 1982. This dissertation develops and explores methods for enhancing the TM/ETM+ record by fusing other data sources, specifically, Landsat 8 for future continuity, radar data for tropical forest monitoring, and meteorological data for semi-arid vegetation dynamics. Landsat 8 data may be incorporated into existing time series of Landsat 4-7 data for applications like change detection, but vegetation trend analysis requires calibration, especially when using the near-infrared band. The improvements in radiometric quality and cloud masking provided by Landsat 8 data reduce noise compared to previous sensors. Tropical forests are notoriously difficult to monitor with Landsat alone because of clouds. This dissertation developed and compared two approaches for fusing Synthetic Aperture Radar (SAR) data from the Advanced Land Observing Satellite (ALOS-1) with Landsat in Peru, and found that radar data increased the accuracy of deforestation detection. Simulations indicate that the benefit of using radar data increased with higher cloud cover. Time series analysis of vegetation indices from Landsat in semi-arid environments is complicated by the response of vegetation to high variability in timing and amount of precipitation. We found that quantifying dynamics in precipitation and drought index data improved land cover change detection performance compared to more traditional harmonic modeling for grasslands and shrublands in California.
This dissertation enhances the value of Landsat data by combining it with other data sources, including other optical sensors, SAR data, and meteorological data. The methods developed here show the potential for data fusion and are especially important in light of recent and upcoming missions, like Sentinel-1, Sentinel-2, and NASA-ISRO Synthetic Aperture Radar (NISAR).

    Machine learning for the sustainable energy transition: a data-driven perspective along the value chain from manufacturing to energy conversion

    According to the IPCC special report Global Warming of 1.5 °C, climate action is not only necessary but more urgent than ever. The world is witnessing rising sea levels, heat waves, flooding, droughts, and desertification resulting in the loss of lives and damage to livelihoods, especially in countries of the Global South. To mitigate climate change and commit to the Paris Agreement, it is of the utmost importance to reduce greenhouse gas emissions coming from the most emitting sector, namely the energy sector. To this end, large-scale penetration of renewable energy systems into the energy market is crucial for the energy transition toward a sustainable future by replacing fossil fuels and improving access to energy with socio-economic benefits. With the advent of Industry 4.0, Internet of Things technologies have been increasingly applied to the energy sector, introducing the concept of the smart grid or, more generally, the Internet of Energy. These paradigms are steering the energy sector towards more efficient, reliable, flexible, resilient, safe, and sustainable solutions with huge potential environmental and social benefits. To realize these concepts, new information technologies are required, and among the most promising possibilities are Artificial Intelligence and Machine Learning, which in many countries have already revolutionized the energy industry. This thesis presents different Machine Learning algorithms and methods for the implementation of new strategies to make renewable energy systems more efficient and reliable. It presents various learning algorithms, highlighting their advantages and limits, and evaluating their application for different tasks in the energy context. In addition, different techniques are presented for the preprocessing and cleaning of time series, nowadays collected by sensor networks mounted on every renewable energy system.
With the possibility of installing large numbers of sensors that collect vast amounts of time series, it is vital to detect and remove irrelevant, redundant, or noisy features, and alleviate the curse of dimensionality, thus improving the interpretability of predictive models, speeding up their learning process, and enhancing their generalization properties. Therefore, this thesis discusses the importance of dimensionality reduction in sensor networks mounted on renewable energy systems and, to this end, presents two novel unsupervised algorithms. The first approach maps time series in the network domain through visibility graphs and uses a community detection algorithm to identify clusters of similar time series and select representative parameters. This method can group both homogeneous and heterogeneous physical parameters, even when related to different functional areas of a system. The second approach proposes the Combined Predictive Power Score, a method for feature selection with a multivariate formulation that explores multiple subsets of expanding variables and identifies the combination of features with the highest predictive power over specified target variables. This method proposes a selection algorithm for the optimal combination of variables that converges to the smallest set of predictors with the highest predictive power. Once the combination of variables is identified, the most relevant parameters in a sensor network can be selected to perform dimensionality reduction. Data-driven methods open the possibility to support strategic decision-making, resulting in a reduction of Operation & Maintenance costs, machine faults, repair stops, and spare parts inventory size. Therefore, this thesis presents two approaches in the context of predictive maintenance to improve the lifetime and efficiency of the equipment, based on anomaly detection algorithms.
The first approach proposes an anomaly detection model based on Principal Component Analysis that is robust to false alarms, can isolate anomalous conditions, and can anticipate equipment failures. The second approach has at its core a neural architecture, namely a Graph Convolutional Autoencoder, which models the sensor network as a dynamical functional graph by simultaneously considering the information content of individual sensor measurements (graph node features) and the nonlinear correlations existing between all pairs of sensors (graph edges). The proposed neural architecture can capture hidden anomalies even when the turbine continues to deliver the power requested by the grid and can anticipate equipment failures. Since the model is unsupervised and completely data-driven, this approach can be applied to any wind turbine equipped with a SCADA system. When it comes to renewable energies, the unschedulable uncertainty due to their intermittent nature represents an obstacle to the reliability and stability of energy grids, especially when dealing with large-scale integration. Nevertheless, these challenges can be alleviated if the natural sources or the power output of renewable energy systems can be forecast accurately, allowing power system operators to plan optimal power management strategies to balance the dispatch between intermittent power generation and the load demand. To this end, this thesis proposes a multi-modal spatio-temporal neural network for multi-horizon wind power forecasting. In particular, the model combines high-resolution Numerical Weather Prediction forecast maps with turbine-level SCADA data and explores how meteorological variables on different spatial scales together with the turbines' internal operating conditions impact wind power forecasts. The world is undergoing a third energy transition with the main goal of tackling global climate change through decarbonization of the energy supply and consumption patterns.
This is not only possible thanks to global cooperation and agreements between parties, power generation systems advancements, and Internet of Things and Artificial Intelligence technologies, but also necessary to prevent the severe and irreversible consequences of climate change that are threatening life on the planet as we know it. This thesis is intended as a reference for researchers who want to contribute to the sustainable energy transition and are approaching the field of Artificial Intelligence in the context of renewable energy systems.
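The abstract above mentions mapping time series into the network domain through visibility graphs. As a rough illustration of that mapping only, the sketch below builds a natural visibility graph: two samples are linked when the straight line between them passes strictly above every sample in between. The abstract does not say which visibility-graph variant the thesis uses, so the choice of the natural variant, the function name, and the sample values are assumptions. The community detection step that follows in the thesis is not shown.

```python
def natural_visibility_edges(values):
    """Return the set of index pairs (i, j), i < j, that can 'see' each other."""
    edges = set()
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Every intermediate sample k must lie strictly below the
            # straight line joining (i, values[i]) and (j, values[j]).
            visible = all(
                values[k] < values[i] + (values[j] - values[i]) * (k - i) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


# Hypothetical sensor readings sampled at equal time steps.
edges = natural_visibility_edges([3.0, 1.0, 2.5, 0.5, 4.0])
```

This brute-force construction checks every pair and is cubic in the series length in the worst case, which is acceptable for a short illustration but not for long sensor records.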