597 research outputs found
PRINCIPAL COMPONENT ANALYSIS-VECTOR AUTOREGRESSIVE INTEGRATED (PCA-VARI) MODEL USING DATA MINING APPROACH TO CLIMATE DATA IN THE WEST JAVA REGION
Atmospheric changes have long been caused by natural phenomena. This study combines Principal Component Analysis (PCA) with a Vector Autoregressive Integrated (VARI) model, called the PCA-VARI model, through a data mining approach. PCA reduces ten climate variables to two principal components, using ten years (2001-2020) of climate data from NASA Prediction Of Worldwide Energy Resources. VARI is a multivariate time series model for non-stationary data that uses differencing to model two or more variables that influence each other. The Knowledge Discovery in Databases (KDD) method was used for the empirical analysis. Pre-processing analyzes the raw climate data. The data mining step determines the proportion of each PCA component and selects components as variables for the VARI process. Post-processing visualizes and interprets the PCA-VARI model. The solar radiation and precipitation variables are strongly correlated with the data from each measurement location. A forecast of the interaction of variables between locations is shown in the Impulse Response Function (IRF) visualization: within the West Java region, the Lembang and Bogor areas in particular respond strongly to, and influence, each other's climate.
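The PCA-then-VARI idea in this abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the data are synthetic random walks rather than the NASA climate series, the lag order is fixed at 1, and the variable names are hypothetical. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the climate data (assumption): 240 time steps,
# 10 variables, built as random walks so the series are non-stationary.
X = np.cumsum(rng.normal(size=(240, 10)), axis=0)

# PCA via SVD on standardized columns; keep the first two components.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)   # proportion of variance per component
scores = Z @ Vt[:2].T             # shape (240, 2)

# "Integrated" part: first-difference the component scores.
d = np.diff(scores, axis=0)       # shape (239, 2)

# VAR(1) on the differences, d_t = c + A d_{t-1} + e_t, by least squares.
Y = d[1:]
R = np.hstack([np.ones((len(d) - 1, 1)), d[:-1]])
B, *_ = np.linalg.lstsq(R, Y, rcond=None)
c, A = B[0], B[1:].T

# One-step-ahead forecast of the levels: last level plus predicted difference.
next_scores = scores[-1] + c + A @ d[-1]
print("variance proportions of the first two components:", explained[:2])
print("one-step forecast of the component scores:", next_scores)
```

A real analysis would choose the lag order and the differencing order from the data (for example with information criteria and stationarity tests) rather than fixing them as done here.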
Review of automated time series forecasting pipelines
Time series forecasting is fundamental for various use cases in different
domains such as energy systems and economics. Creating a forecasting model for
a specific use case requires an iterative and complex design process. The
typical design process includes the five sections (1) data pre-processing, (2)
feature engineering, (3) hyperparameter optimization, (4) forecasting method
selection, and (5) forecast ensembling, which are commonly organized in a
pipeline structure. One promising approach to handle the ever-growing demand
for time series forecasts is automating this design process. The present paper,
thus, analyzes the existing literature on automated time series forecasting
pipelines to investigate how to automate the design process of forecasting
models. Thereby, we consider both Automated Machine Learning (AutoML) and
automated statistical forecasting methods in a single forecasting pipeline. For
this purpose, we firstly present and compare the proposed automation methods
for each pipeline section. Secondly, we analyze the automation methods
regarding their interaction, combination, and coverage of the five pipeline
sections. For both, we discuss the literature, identify problems, give
recommendations, and suggest future research. This review reveals that the
majority of papers only cover two or three of the five pipeline sections. We
conclude that future research has to holistically consider the automation of
the forecasting pipeline to enable the large-scale application of time series
forecasting
Probabilistic Wind Power and Electricity Load Forecasting with Echo State Networks
With the introduction of distributed generation and the establishment of smart grids,
several new challenges in energy analytics arose. These challenges can be solved with a
specific type of recurrent neural networks called echo state networks, which can handle
the combination of both weather and power consumption or production depending on the
dataset to make predictions. Echo state networks are particularly suitable for time series
forecasting tasks. Having accurate energy forecasts is paramount to ensure that grid operation
and power provision remain reliable during peak hours when the consumption is high.
The majority of load forecasting algorithms do not produce prediction intervals with
coverage guarantees but rather produce simple point estimates. Information about
uncertainty and prediction intervals is rarely useless. It helps grid operators change strategies
for configuring the grid from conservative to risk-based ones and assess the reliability of
operations.
A popular way of producing prediction intervals in regression tasks is by applying Bayesian
regression as the regression algorithm. As Bayesian regression is done by sampling, it
naturally lends itself to generating intervals. However, Bayesian regression is not guaranteed
to satisfy the designed coverage level for finite samples.
This thesis aims to modify the traditional echo state network model to produce marginally
valid and calibrated prediction intervals. This is done by replacing the standard linear
regression method with Bayesian linear regression while simultaneously reducing the
dimensions to speed up the computation times. Afterward, a novel calibration technique
for time series forecasting is applied in order to obtain said valid prediction intervals.
The experiments are conducted using three different time series, two of them being a time
series of electricity load. One is univariate, and the other is bivariate. The third time series
is a wind power production time series. The proposed method showed promising results
for all three datasets while significantly reducing computation times in the sampling step.
Energy consumption forecasting using machine learning
Forecasting electricity demand and consumption accurately is critical to optimal and cost-effective system operation, providing a competitive advantage to companies. When working with seasonal data and external variables, traditional time-series forecasting methods cannot be applied to electricity consumption data. In energy planning for a generating company, accurate forecasting of electricity consumption, as a technique for understanding and predicting market electricity demand, is of paramount importance. Power production can then be adjusted accordingly in a deregulated market. Because the data are seasonal, Persistence Models (Naïve Models), Seasonal AutoRegressive Integrated Moving Averages with eXogenous regressors (SARIMAX), and a Univariate Long Short-Term Memory (LSTM) neural network are used as a class of time-series forecasting models that deal explicitly with seasonality. The main purpose of this project is to perform exploratory data analysis of the Spanish power data, then use different forecasting models to predict, once a day, the next 24 hours of energy demand and the daily peak demand. The electricity consumption data from 2015 to 2018 were split into training and test sets: the first three years, 2015 to 2017, were used as the training set, while values from 2018 were used as the test set. The obtained results showed that the machine learning algorithms proposed in the recent literature outperformed the tested algorithms. Models are evaluated using root mean squared error (RMSE) so that errors are directly comparable to the energy readings in the data. RMSE was calculated in two ways: first, to represent the error of predicting each hour at a time (i.e., one error per hourly slice); second, to represent the models' overall performance.
The results show that electricity demand can be modeled using machine learning algorithms, which supports deploying renewable energy, planning for high/low load days, reducing wastage from polluting on-reserve standby generation, detecting abnormalities in consumption trends, and quantifying energy and cost-saving measures.
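The two RMSE calculations mentioned in this abstract (one error per hourly slice, and one overall score) can be sketched as below. The array layout (one row per forecast day, one column per hour of the horizon), the function names, and the toy numbers are assumptions for illustration; a 3-hour horizon is used instead of 24 to keep the example short.

```python
import numpy as np

def rmse_per_hour(actual, predicted):
    """RMSE for each forecast hour, averaged over days (one value per column)."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean(err**2, axis=0))

def rmse_overall(actual, predicted):
    """Single RMSE over every day and every hour."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(err**2)))

# Toy data (assumption): 2 forecast days, 3-hour horizon.
actual = np.array([[10.0, 12.0, 14.0],
                   [11.0, 13.0, 15.0]])
predicted = np.array([[10.0, 11.0, 16.0],
                      [11.0, 14.0, 13.0]])

print(rmse_per_hour(actual, predicted))  # per-hour errors: 0, 1 and 2
print(rmse_overall(actual, predicted))   # sqrt(10 / 6)
```

The per-hour view shows how the error grows along the horizon, while the overall value gives one number for comparing models.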
mCity: using smart city monitoring data to characterize and improve urban mobility
The sustainable growth of cities created the need for better informed decisions
based on information and communication technologies to sense the
city and quantify its pulse. An important part of this concept of "smart
cities" is the characterization of the traffic flows.
In this work, we aim at characterizing the urban mobility in two different
cities, Porto and Aveiro. The structure and contents of the corresponding
datasets are very different, enabling two case studies, with distinct use cases
related to traffic analysis and forecasting.
For the Porto use case, we had access to road-mounted traffic sensors and
bus tracking data. The first source was studied to look for
patterns (e.g., weekday behavior). Historical traffic counter data were used
to forecast future flows, using both statistical and deep learning methods.
We found that it was not possible to find a clear relationship between (bus)
speed and traffic intensity; however, when the speed was high, the intensity was
low, and when the intensity was high, the speed was low. There
are daily and weekly patterns in the traffic flow data that enable forecasting.
When anomalies in traffic occur, the methods for short-term
forecasting perform better than those for long-term forecasting.
In the Aveiro use case, the dataset includes bus traces, which were used to
characterize the driving behavior, based on speed and acceleration. These
data were mapped into the city to find problematic areas. Side-by-side visualizations
help with the comparison of the traffic behavior in selected time
periods. We observed that some roads often present the same problems,
independently of the day or time of the day. In other parts of the city, the
problems can be found more often in specific periods.
The datasets for Aveiro and Porto were sampled at different frequencies
(each second and each minute, respectively). We confirmed, with simulations,
that the analysis made for Aveiro was not possible with the granularity
of the Porto dataset (as some information would be lost).
The computational pipeline to run the supporting analyses is fully implemented,
as well as the required integrations to programmatically obtain the
data from the existing data sinks. For the driving behavior analysis, a web
dashboard is deployed, enabling the relevant departments to study potential
problematic areas in the city of Aveiro. A traffic forecasting pipeline was also developed for Porto.
Master's degree in Informatics Engineering
A comparison of wind forecasting methods for Norwegian on-shore wind : a perspective into the nuances in wind speed to power conversion and the economic costs associated with wind forecast accuracy
This paper examines various short-term forecasting methods to forecast hourly wind energy production
in Norway. The performance of the forecasting methods was compared across different months,
through different evaluation metrics, to analyze the uniformity and dependability of methods. More
than a decade of hourly wind speed data spanning across 69 locations along the Norwegian coast
were utilized in the study.
Given the upcoming integration of EU and Nordic intraday electricity markets into the Cross-
Border Intraday market (XBID), the study focuses on one-hour-ahead forecasting to be in alignment
with the operations of intraday electricity market. With the final objective of predicting power
production, a customized loss function, Power Curve Conversion Error with penalty, which takes
into account both wind speed to power conversion and economic cost associated with over and under
forecast, is included as part of the evaluation metrics to capture the true value of each model’s
predictions.
Forecasting methods undertaken consist of a mix of Statistical and Machine-Learning methods,
with Naive forecasting used as the overall benchmark model. Other Statistical methods are
ARIMA, and ARIMAX which includes the use of seasonal and time of day dummies. In terms
of ML methods, Gradient Boosted Trees, Extremely Randomized Forest, and Neural Network
are selected. Finally, a hybrid model of ARIMAX and Extremely Randomized Forest is also
formulated. These methods are then evaluated on multiple evaluation metrics, namely: RMSE,
MAE, Classification Accuracy, and the Power Curve Conversion Error with penalty.
The study reveals that the accuracy of the models is consistent with their
required computational intensity, with ML models outperforming statistical methods in most situations.
The findings also suggest the Hybrid model to be the most suitable forecasting method for
one-hour-ahead forecast under almost all evaluation metrics employed. This conclusion holds true
for wind power forecasting under different seasons of the year as well.
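The naive benchmark and two of the evaluation metrics named in this abstract (RMSE and MAE) can be sketched for one-hour-ahead forecasting as follows. The wind speed values and function names are hypothetical. The study's custom Power Curve Conversion Error with penalty is not reproduced here, because the abstract does not define it.

```python
import numpy as np

def naive_one_step(series):
    """Persistence forecast: the value at hour t is the forecast for hour t + 1."""
    series = np.asarray(series, dtype=float)
    return series[:-1]

def rmse(actual, forecast):
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2)))

def mae(actual, forecast):
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

# Hypothetical hourly wind speeds in m/s (assumption).
wind = np.array([5.0, 6.0, 4.0, 7.0])

forecast = naive_one_step(wind)  # forecasts for hours 1..3
actual = wind[1:]                # observations for hours 1..3

print("MAE :", mae(actual, forecast))
print("RMSE:", rmse(actual, forecast))
```

Any candidate model can be scored with the same two functions, so the naive result serves as the floor that other methods need to beat.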
TIME SERIES IMPUTATION USING VAR-IM (CASE STUDY: WEATHER DATA IN METEOROLOGICAL STATION OF CITEKO)
Univariate imputation methods are defined as imputation methods that only use the information of the target variable to estimate missing values. While univariate imputation methods are convenient and flexible since no other variable is required, multivariate imputation methods can potentially improve imputation accuracy given that the other variables are relevant to the target variable. Many multivariate imputation methods have been proposed, one of which is the Vector Autoregression Imputation Method (VAR-IM). This study aims to compare the imputation results of VAR-IM-based methods and univariate imputation methods on time series data, specifically on long-lag seasonal data such as daily weather data. Three modified VAR-IM methods were studied using simulations with three steps: deletion, imputation, and evaluation. The deletion step was conducted using six different schemes with six missing proportions. The simulations were conducted on secondary daily weather data collected from the meteorological station of Citeko from January 1, 1991, to June 22, 2013. Nine weather variables were examined, namely the minimum, maximum, and average temperatures, average humidity, rainfall rate, duration of solar radiation, maximum and average wind speed, and wind direction at maximum speed. The simulation results show that the three modified VAR-IM methods can improve accuracy in around 75% of cases. The simulation results also show that the imputation results of VAR-IM-based methods tend to be more stable in accuracy as the missing proportion increases, compared with the imputation results of univariate imputation methods.
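The three-step simulation design described in this abstract (deletion, imputation, evaluation) can be sketched as below. This is only an illustration under stated assumptions: the series is synthetic, deletion is completely at random with a 10% missing proportion (the study's six schemes are not specified in the abstract), and the imputer is plain linear interpolation as a univariate baseline, not VAR-IM.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic daily temperature with a yearly cycle (assumption), two years long.
t = np.arange(730)
temp = 20 + 5 * np.sin(2 * np.pi * t / 365) + rng.normal(scale=0.5, size=t.size)

# Step 1, deletion: hide 10% of the values at random, keeping both endpoints.
missing = np.zeros(t.size, dtype=bool)
hidden = rng.choice(np.arange(1, t.size - 1), size=73, replace=False)
missing[hidden] = True
observed = temp.copy()
observed[missing] = np.nan

# Step 2, imputation: univariate linear interpolation over the gaps.
imputed = observed.copy()
imputed[missing] = np.interp(t[missing], t[~missing], observed[~missing])

# Step 3, evaluation: RMSE between imputed and true values at the hidden points.
rmse = float(np.sqrt(np.mean((imputed[missing] - temp[missing]) ** 2)))
print("imputation RMSE:", rmse)
```

Repeating the three steps over several missing proportions and imputers gives the kind of accuracy comparison the abstract reports.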
An Additive Schwarz Preconditioner for the Spectral Element Ocean Model Formulation of the Shallow Water Equations
We discretize the shallow water equations with an Adams-Bashforth scheme combined with the Crank-Nicolson scheme for the time derivatives and spectral elements for the discretization in space. The resulting coupled system of equations is reduced to a Schur complement system with a special structure of the Schur complement. This system can be solved with a preconditioned conjugate gradient method, where the matrix-vector product is only implicitly given. We derive an overlapping block preconditioner based on additive Schwarz methods for preconditioning the reduced system.
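The combination described in this abstract, a preconditioned conjugate gradient solver that only needs a matrix-vector product, with an overlapping additive Schwarz preconditioner, can be sketched on a toy problem. The operator here is a 1D Laplacian standing in for the paper's Schur complement system, and the block size, overlap, and function names are all assumptions for illustration.

```python
import numpy as np

n = 64

def apply_A(x):
    """Matrix-free 1D Laplacian: (A x)_i = 2 x_i - x_{i-1} - x_{i+1}."""
    y = 2.0 * x
    y[1:] -= x[:-1]
    y[:-1] -= x[1:]
    return y

# Overlapping index blocks (assumed: 4 blocks of 16 points, overlap of 4).
size, overlap = 16, 4
blocks = [np.arange(max(0, s - overlap), min(n, s + size + overlap))
          for s in range(0, n, size)]

def local_matrix(m):
    """Restriction of the Laplacian to m contiguous points."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

local_inv = [np.linalg.inv(local_matrix(len(b))) for b in blocks]

def apply_M_inv(r):
    """Additive Schwarz: sum of local solves, z = sum_i R_i^T A_i^{-1} R_i r."""
    z = np.zeros_like(r)
    for b, A_inv in zip(blocks, local_inv):
        z[b] += A_inv @ r[b]
    return z

def pcg(apply_A, apply_M_inv, b, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients using only operator applications."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    z = apply_M_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k + 1
        z = apply_M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

b = np.ones(n)
x, iterations = pcg(apply_A, apply_M_inv, b)
print("iterations:", iterations)
print("residual norm:", np.linalg.norm(apply_A(x) - b))
```

Because the local solves are summed rather than applied in sequence, the preconditioner stays symmetric, which is what conjugate gradients requires.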