    Forecasting wind energy for a data center

    Abstract. Data centers are increasingly using renewable energy sources such as wind and solar. RISE’s ICE data center already has solar panels and is now studying the impact of adding a wind turbine to its microgrid. In this thesis, a machine learning model was developed to forecast wind power production for the data center. The data center in Luleå has several applications for wind power forecasting. Renewable energy sources are intermittent, so accurate forecasting of output power reduces the need for additional energy balancing and reserve power in the electricity grid. Renewable energy can be reserved from the market for the next hour or the next day to maximize its use. Forecasting from 30 minutes to 6 hours ahead allows job scheduling that optimizes the use of renewables and reduces power consumption. The data center may aim to minimize electricity cost or to maximize the use of renewables for lower greenhouse gas emissions. A smart microgrid based on artificial intelligence is the way to implement these applications. Two open data sets, from India and Sweden, were used in the research. The available data supports choosing a statistical model, and random forest regression was the model used. The data from India made it possible to develop a model for one wind turbine, and the developed model forecast output power well. The Swedish data set is from the EEM20 competition; it covers total wind power production in Sweden and had to be adapted to approximate the production of one wind turbine in Luleå. To this end, the output power of the Luleå price region was averaged, and a location near Luleå was chosen for the simulation. As expected, the forecasting accuracy with the Swedish data was reasonable, although the approximations reduced it. The developed model was applied to RISE’s ICE data center. Validation has been done, but final testing will take place in RISE’s simulation environment. In general, data from northern Sweden is not openly available for wind power forecasting.
In addition, no scientific articles covering the geographical area were found during the literature review. The study with the Swedish competition data showed which variables are significant in northern Sweden and what their relative importance is. Wind gust is one such variable. Using two data sets from different geographical locations showed that climate has a major impact on the performance of the trained model. Thus, it is reasonable to use the trained model only in locations with similar weather conditions. Tuulienergian ennustaminen datakeskusta varten (Finnish title). Abstract, translated from the Finnish. Data centers are using renewable energy sources more and more. Such sources include wind and solar energy. RISE’s ICE data center in Luleå already has solar panels in use, and the effect of adding a wind turbine to the microgrid is now being studied. In this work, a machine learning model was developed to forecast wind power production for the data center. The data center has several applications that make use of wind energy forecasting. Renewable energy sources are variable by nature, so accurate forecasting of the produced power reduces the need for additional balancing and reserve power in the electricity grid in general. The data center can reserve renewable energy from the market for the next hour or day to maximize the use of renewable energy. Forecasting from 30 minutes to 6 hours ahead makes it possible to schedule the job queue to optimize the use of renewables, and it reduces power consumption. The data center can aim to minimize the cost of electricity use, or to reduce greenhouse gas emissions by using as much renewable energy as possible. A smart microgrid based on artificial intelligence is the way to implement these applications. Two open data sets, from India and Sweden, were used in the study. The available data supports the choice of a statistical forecasting model. The random forest method was used in this work.
The Indian data was used to develop the model for one wind turbine. The developed model forecast the produced power well. The Swedish data comes from the EEM20 competition, which assessed wind power production for the whole of Sweden. It therefore had to be adapted to estimate the production of one wind turbine in Luleå. The power produced in the Luleå price region was averaged, and a geographical location near Luleå was chosen for forecasting. As expected, the approximations made in the adaptation reduced the forecasting accuracy, which can nevertheless be considered reasonable. The developed model was applied to RISE’s ICE data center. The algorithm has been validated, but final testing will be done in RISE’s simulation environment. In general, data suitable for forecasting is not available from northern Sweden. No scientific articles on this geographical area were found during the literature review. The study with the Swedish data provided an understanding of which variables are significant in northern Sweden and of their relative importance. Using data sets from two different geographical areas showed that climate has a considerable effect on the performance of the trained model. It is therefore sensible to use the trained model only in areas with similar weather conditions.
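
As a rough illustration of how a 30-minute to 6-hour forecast could drive job scheduling, the sketch below picks the start slot for a deferrable job so that the forecast wind energy over the job's runtime is highest. The function name `best_start_slot`, the 30-minute slot length, and the forecast values are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def best_start_slot(forecast_kw: Sequence[float], job_slots: int) -> int:
    """Return the start index of the window with the highest forecast wind energy.

    forecast_kw: forecast wind power per slot (hypothetical 30-minute slots).
    job_slots:   number of consecutive slots the deferrable job needs.
    """
    if job_slots <= 0 or job_slots > len(forecast_kw):
        raise ValueError("job_slots must be between 1 and len(forecast_kw)")

    # Sliding-window sum: update the running total instead of re-summing.
    window = sum(forecast_kw[:job_slots])
    best_sum, best_start = window, 0
    for start in range(1, len(forecast_kw) - job_slots + 1):
        window += forecast_kw[start + job_slots - 1] - forecast_kw[start - 1]
        if window > best_sum:
            best_sum, best_start = window, start
    return best_start


# Hypothetical 6-hour forecast in 30-minute slots (kW), made up for illustration.
forecast = [120, 150, 310, 420, 400, 380, 200, 180, 90, 60, 75, 110]
start = best_start_slot(forecast, job_slots=4)  # a 2-hour job
```

A real scheduler would also have to weigh deadlines, electricity prices, and forecast uncertainty; this sketch only ranks windows by forecast energy.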

    An Ensemble Approach for Multi-Step Ahead Energy Forecasting of Household Communities

    This paper addresses the estimation of household communities' overall energy usage and solar energy production over different prediction horizons. Forecasting the electricity demand and energy generation of communities can enrich the information available to energy grid operators and help them plan their short-term supply. Moreover, households will increasingly need to know more about their usage and generation patterns to make wiser decisions about appliance usage and energy-trading programs. The main issues are the volatility of load induced by consumption behaviour and the variability of solar output, which is influenced by solar cell specifications, several meteorological variables, and contextual factors such as time and calendar information. To address these issues, we propose a prediction approach that first considers the highly influential factors and second benefits from an ensemble learning method in which one Gradient Boosted Regression Tree algorithm is combined with several Sequence-to-Sequence LSTM networks. We conducted experiments on a public dataset, collected over three years, provided by the Australian electricity distributor Ausgrid. The proposed model's prediction performance was compared with that of its contributing learners and of conventional ensembles. The results demonstrate the potential of the proposed predictor to improve short-term multi-step forecasting by providing more stable forecasts and more accurate estimates under different day types and meteorological conditions.

    Solar Irradiance Forecasting Using Dynamic Ensemble Selection

    Solar irradiance forecasting has been an essential topic in renewable energy generation. Forecasting is an important task because it can improve the planning and operation of photovoltaic systems, resulting in economic advantages. Traditionally, single models are employed in this task. However, selecting an inappropriate model, misspecifying it, or the presence of random fluctuations in the solar irradiance series can cause this approach to underperform. This paper proposes a heterogeneous ensemble dynamic selection model, named HetDS, to forecast solar irradiance. For each unseen test pattern, HetDS chooses the most suitable forecasting model from a pool of seven well-known methods from the literature: ARIMA, support vector regression (SVR), multilayer perceptron neural network (MLP), extreme learning machine (ELM), deep belief network (DBN), random forest (RF), and gradient boosting (GB). The experimental evaluation was performed with four data sets of hourly solar irradiance measurements in Brazil. The proposed model attained an overall accuracy superior to that of the single models in terms of five well-known error metrics.
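
The abstract does not state how HetDS decides which model suits a test pattern. The sketch below shows one generic form of dynamic selection, assumed here for illustration only: for each new pattern, find the k nearest validation patterns and use the pool member with the lowest mean absolute error on them. The names `select_and_predict`, `pool`, and `val`, and the two toy models, are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long patterns."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def select_and_predict(
    x: Sequence[float],
    models: Dict[str, Model],
    validation: List[Tuple[Sequence[float], float]],
    k: int = 3,
) -> Tuple[str, float]:
    """Pick the model with the lowest error on the k validation patterns
    nearest to x, then use that model to forecast x."""
    # Region of competence: the k validation patterns closest to x.
    neighbours = sorted(validation, key=lambda pair: euclidean(pair[0], x))[:k]

    def local_error(model: Model) -> float:
        return sum(abs(model(xi) - yi) for xi, yi in neighbours) / len(neighbours)

    best_name = min(models, key=lambda name: local_error(models[name]))
    return best_name, models[best_name](x)


# Two toy stand-ins for pool members (the paper's pool has seven real models).
pool = {
    "persistence": lambda lags: lags[-1],        # repeat the last value
    "mean": lambda lags: sum(lags) / len(lags),  # average of the lags
}

# Made-up validation patterns: (lagged irradiance values, next value).
val = [
    ([100.0, 200.0, 300.0], 310.0),  # rising series: persistence is closer
    ([110.0, 210.0, 290.0], 300.0),
    ([500.0, 100.0, 500.0], 360.0),  # oscillating series: mean is closer
    ([480.0, 120.0, 510.0], 370.0),
]

name, forecast = select_and_predict([105.0, 205.0, 295.0], pool, val, k=2)
```

The point of the design is that no single model has to be best everywhere: the choice is re-made per test pattern from local evidence.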

    Modeling renewable energy production and CO2 emissions in the region of Adrar in Algeria using LSTM neural networks

    This paper addresses the slow-onset crisis of global warming caused by CO2 emissions. Although electricity supply is a major factor in a country’s growth and development, it is also one of the largest sources of greenhouse gases (GHG), CO2 in particular. Therefore, switching to cleaner energy sources is a clear objective, and forecasting electricity load and its environmental cost is a necessary task for electrical energy planning and management. This paper addresses short-term forecasting of renewable energy (RE) production in the region of Adrar in Algeria, using Adrar’s photovoltaic (PV) farm and Kabertene’s wind farm. The forecast is compared to the overall load demand, and the amount of CO2 avoided by using renewable energy instead of fossil fuels is calculated. The forecasting models are long short-term memory (LSTM) neural networks, which were trained and validated using real data provided by the national state-owned company SONALGAZ. The results show good performance for the forecasting models, with the PV and wind models achieving a mean absolute error (MAE) of 0.024 and 0.1 respectively, and show that RE can help reduce CO2 emissions by up to 25% per hour.
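
For reference, the MAE metric quoted above can be computed as in the minimal sketch below. The second function is only one possible way to express an hourly CO2 reduction: it assumes that every unit of renewable output displaces fossil generation with a single uniform emission factor, which the abstract does not confirm. All numbers are made up and are not SONALGAZ data.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE = (1/n) * sum(|y_i - y_hat_i|)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def hourly_co2_reduction_pct(load_mw: float, renewable_mw: float) -> float:
    """Percent of an hour's CO2 avoided, ASSUMING every MW of renewable
    output displaces fossil generation with one uniform emission factor."""
    if load_mw <= 0:
        raise ValueError("load_mw must be positive")
    # Renewable output above the load cannot displace more than the load.
    return 100.0 * min(renewable_mw, load_mw) / load_mw


# Made-up normalised production values and a made-up hour of load.
mae = mean_absolute_error([0.50, 0.75, 0.25], [0.25, 0.75, 0.50])
reduction = hourly_co2_reduction_pct(load_mw=200.0, renewable_mw=40.0)
```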

    Forecasting of Photovoltaic Power Production

    Solar irradiance and temperature are among the weather parameters that affect the amount of power photovoltaic cells can generate. Based on these and on past power production, future production can be predicted. Knowing future generation may help integrate this renewable energy source on an even larger scale than today, as well as optimize its current use. In this thesis, future power generation was forecast with an artificial neural network (ANN) model, a support vector regression (SVR) model, an auto-regressive integrated moving average (ARIMA) model, a quantile regression neural network (QRNN) model, an ensemble model of ANN and SVR, an ANN ensemble model, and an ANN model using only numerical weather predictions (NWPs) as inputs. Correlation techniques and principal component analysis were used for feature reduction in all models. The research questions for this thesis are, "How will the models perform using random train data to predict August 2021, compared to a random test sample? Will the ensemble models perform better than the standalone models, and will the quantile regression neural network make accurate prediction intervals? How well will the predictions be if the ANN model only uses NWP data as inputs, compared to both historical power and NWPs?". In addition to answering these questions, the objective of this thesis is to provide one or more models that can accurately predict future power production for the PV power system in Lillesand. All models can predict future power production, but some with less accuracy than others. As expected, both ensemble models performed best overall on both tests. The SVR model did, however, achieve the lowest MAE on the August test. These results will probably change slightly for different fits, but the ensemble models are still expected to perform best overall.
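
Whether a quantile regression model makes accurate prediction intervals is commonly checked with the pinball (quantile) loss and with the empirical coverage of the interval. The sketch below shows both under stated assumptions: the thesis abstract does not say which checks were used, and the values and the 10%-90% interval are made up.

```python
from typing import Sequence


def pinball_loss(actual: Sequence[float], predicted_q: Sequence[float], tau: float) -> float:
    """Mean pinball (quantile) loss for quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(actual, predicted_q):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(actual)


def interval_coverage(
    actual: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return hits / len(actual)


# Made-up PV power values (kW) and a hypothetical 10%-90% interval.
y = [10.0, 20.0, 30.0, 40.0]
q10 = [8.0, 15.0, 32.0, 35.0]
q90 = [12.0, 25.0, 38.0, 45.0]

coverage = interval_coverage(y, q10, q90)  # 30.0 falls below its lower bound
loss_q10 = pinball_loss(y, q10, tau=0.1)
```

A well-calibrated 10%-90% interval should cover roughly 80% of observations on a large enough test set; four points, as here, are far too few to judge calibration.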

    Analysis, Characterization, Prediction and Attribution of Extreme Atmospheric Events with Machine Learning: a Review

    Atmospheric Extreme Events (EEs) cause severe damage to human societies and ecosystems. The frequency and intensity of EEs and other associated events are increasing under current climate change and global warming. The accurate prediction, characterization, and attribution of atmospheric EEs is therefore a key research field, in which many groups are currently working with different methodologies and computational tools. Machine Learning (ML) methods have emerged in recent years as powerful techniques for tackling many of the problems related to atmospheric EEs. This paper reviews the ML algorithms applied to the analysis, characterization, prediction, and attribution of the most important atmospheric EEs. A summary of the most used ML techniques in this area and a comprehensive critical review of the literature on ML in EEs are provided. A number of examples are discussed, and perspectives and outlooks on the field are drawn.

    Wind Power Forecasting Methods Based on Deep Learning: A Survey

    Accurate wind power forecasting for wind farms can effectively reduce the considerable impact on grid operation safety that arises when a high-penetration intermittent power supply is connected to the power grid. Aiming to provide reference strategies for researchers as well as for practical applications, this paper surveys the literature and analyses the methods of deep learning, reinforcement learning and transfer learning in wind speed and wind power forecasting. Wind speed and wind power forecasting around a wind farm usually requires calculating the state at the next time step, which is typically done from the state of the atmosphere, including nearby atmospheric pressure, temperature, surface roughness, and obstacles. As an effective method of high-dimensional feature extraction, a deep neural network can in theory handle arbitrary nonlinear transformations through proper structural design, such as adding noise to outputs, using evolutionary learning to optimize hidden-layer weights, and optimizing the objective function so as to retain information that improves output accuracy while filtering out information that is irrelevant or has little effect on the forecast. Building high-precision wind speed and wind power forecasting models remains a challenge because of the randomness, instantaneity and seasonal characteristics of wind.

    Predictive modelling for solar thermal energy systems: A comparison of support vector regression, random forest, extra trees and regression trees

    Predictive analytics play an important role in the management of decentralised energy systems. Prediction models of uncontrolled variables (e.g., renewable energy sources generation, building energy consumption) are required to manage electrical and thermal grids optimally, to make informed decisions, and for fault detection and diagnosis. The paper presents a comprehensive study comparing tree-based ensemble machine learning models (random forest – RF and extra trees – ET), decision trees (DT) and support vector regression (SVR) for predicting the useful hourly energy from a solar thermal collector system. The developed models were compared based on their generalisation ability (stability), accuracy and computational cost. It was found that RF and ET have comparable predictive power and are equally applicable for predicting useful solar thermal energy (USTE), with root mean square error (RMSE) values of 6.86 and 7.12 on the testing dataset, respectively. Amongst the studied algorithms, DT is the most computationally efficient method, as it requires significantly less training time. However, it is less accurate (RMSE = 8.76) than RF and ET. The training time of SVR was 1287.80 ms, approximately three times the ET training time.

    Tree-based ensemble methods for predicting PV power generation and their comparison with support vector regression

    The variability of renewable energy resources, due to characteristic weather fluctuations, introduces uncertainties in generation output that exceed the conventional energy reserves the grid uses to deal with the relatively predictable uncertainties in demand. The high variability of renewable generation makes forecasting critical for optimal balancing and dispatch of generation plants in a smarter grid. The challenge is to improve the accuracy and the confidence level of forecasts at a reasonable computational cost. Ensemble methods such as random forest (RF) and extra trees (ET) are well suited for predicting stochastic photovoltaic (PV) generation output, as they reduce variance and bias by combining several machine learning models while improving stability, i.e. generalisation capability. This paper investigated the accuracy, stability and computational cost of RF and ET for predicting hourly PV generation output, and compared their performance with support vector regression (SVR), a supervised machine learning technique. All developed models have comparable predictive power and are equally applicable for predicting hourly PV output. Despite their comparable predictive power, ET outperformed RF and SVR in terms of computational cost. The stability and algorithmic efficiency of ETs make them an ideal candidate for wider deployment in PV output forecasting.
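
One commonly cited reason for the lower computational cost of extra trees is how a node chooses its split: a random-forest tree scans candidate thresholds for the best one, whereas an extra-trees node draws its threshold at random. The single-feature sketch below contrasts the two steps. It is a simplification with made-up data, not the paper's implementation; a full extra-trees node draws one random threshold per candidate feature and keeps the best of those.

```python
import random
from typing import Sequence


def sse(values: Sequence[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_cost(x: Sequence[float], y: Sequence[float], threshold: float) -> float:
    """Total SSE of the two children produced by splitting at threshold."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return sse(left) + sse(right)


def best_split(x: Sequence[float], y: Sequence[float]) -> float:
    """Random-forest style: scan every midpoint between sorted distinct values."""
    xs = sorted(set(x))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: split_cost(x, y, t))


def random_split(x: Sequence[float], rng: random.Random) -> float:
    """Extra-trees style: draw one threshold uniformly between min and max."""
    return rng.uniform(min(x), max(x))


# Made-up data: an irradiance-like feature vs PV output with a step near 0.5.
x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 4.9, 5.1]

t_rf = best_split(x, y)                   # evaluates every candidate threshold
t_et = random_split(x, random.Random(0))  # evaluates none
```

The random threshold is usually worse for a single node, which is why extra trees rely on averaging many trees to recover accuracy.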