96 research outputs found

    An evaluation of time series forecasting models on water consumption data: A case study of Greece

    In recent years, increased urbanization and industrialization have led to rising water demand, widening the gap between demand and supply. Proper water distribution and forecasting of water consumption are key factors in mitigating this imbalance by improving the operation, planning, and management of water resources. To this end, this paper evaluates several well-known forecasting algorithms on time-series water consumption data from Greece, a country with diverse socio-economic and urbanization issues. The algorithms are evaluated on a real-world dataset provided by the Water Supply and Sewerage Company of Greece, revealing key insights about each algorithm and its use.
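
    A minimal sketch of the kind of evaluation the paper describes, assuming a complete daily consumption series with hypothetical file and column names (the authors' dataset and exact model set are not reproduced here):

```python
# Sketch only: compare a seasonal-naive baseline against SARIMA on a
# held-out month. "water_consumption.csv", "date", and "consumption"
# are hypothetical names; the series is assumed complete and daily.
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("water_consumption.csv", parse_dates=["date"], index_col="date")
y = df["consumption"].asfreq("D")

train, test = y[:-30], y[-30:]           # hold out the last 30 days

# Candidate 1: seasonal naive -- repeat the value from one week earlier.
naive = y.shift(7)[-30:]

# Candidate 2: SARIMA with weekly seasonality (orders are illustrative).
model = SARIMAX(train, order=(1, 1, 1), seasonal_order=(1, 0, 1, 7)).fit(disp=False)
sarima = model.forecast(steps=30)

print("seasonal naive MAE:", mean_absolute_error(test, naive))
print("SARIMA MAE:        ", mean_absolute_error(test, sarima))
```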

    A deep learning approach to real-time short-term traffic speed prediction with spatial-temporal features

    In the realm of Intelligent Transportation Systems (ITS), accurate traffic speed prediction plays an important role in traffic control and management, and has attracted considerable research attention over the past three decades. In recent years, deep learning-based methods have demonstrated their competitiveness in time series analysis, an essential part of traffic prediction, as they can efficiently capture the complex spatial dependencies of road networks and non-linear traffic conditions. We adopt a convolutional neural network-based deep learning approach to traffic speed prediction, based on its capability of handling multi-dimensional data efficiently. In practice, traffic data may not be recorded at regular intervals due to factors such as power failures and transmission errors that affect data collection. Given that part of our dataset contains a large proportion of missing values, we study the effectiveness of a multi-view approach to imputing the missing values so that various prediction models can be applied. Experimental results show that the performance of the traffic speed prediction model improved significantly after imputing missing values with the multi-view approach, even where the missing ratio is up to 50%.
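
    An illustrative stand-in for the multi-view imputation idea, assuming speeds arrive as a (sensors x time) matrix; the paper's actual method is more sophisticated than this two-view average:

```python
# Fill each missing speed from two "views" -- the temporal view (same
# sensor, neighbouring time steps) and the spatial view (same time step,
# other sensors) -- and average whatever is available. Illustrative only.
import numpy as np

def impute_two_views(speeds):
    """speeds: (sensors, time) array with np.nan for missing readings."""
    filled = speeds.copy()
    rows, cols = np.where(np.isnan(speeds))
    for i, t in zip(rows, cols):
        window = speeds[i, max(0, t - 2):t + 3]       # temporal neighbours
        temporal = np.nanmean(window) if not np.all(np.isnan(window)) else np.nan
        column = speeds[:, t]                          # spatial neighbours
        spatial = np.nanmean(column) if not np.all(np.isnan(column)) else np.nan
        views = [v for v in (temporal, spatial) if not np.isnan(v)]
        filled[i, t] = np.mean(views) if views else np.nanmean(speeds)
    return filled
```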

    Estimation Of Idle Time Using Machine Learning Models For Vehicle-To-Grid (V2G) Integration And Services

    As the Electric Vehicle (EV) market continues to expand, ensuring access to charging stations remains a significant concern. This work addresses multiple challenges related to EV charging behavior and Vehicle-to-Grid (V2G) services. First, it produces accurate minute-ahead (20- and 30-minute interval) load forecasts for an EV charging station using four years of historical data (2018-2021) recorded from a university campus garage charging station. Machine Learning (ML) models such as Seasonal Auto-Regressive Integrated Moving Average (SARIMA), Random Forest (RF), and Neural Networks (NN) are employed to forecast the kilowatt-hours (kWh) delivered from 54 charging stations. Preliminary results indicate that RF performed better than the other ML approaches, achieving an average Mean Absolute Error (MAE) of 7.26 on historical weekday data. Second, it estimates the probability of aggregated available capacity for V2G connections, which could be sold back to the grid through a V2G system. To achieve this, an Idle Time (IT) parameter was derived from the time EV users spend at the charging station after being fully charged. ML classification methods such as Logistic Regression (LR) and Linear Support Vector Classifier (SVC) were employed to estimate the IT variable; the SVC model performed better, with an accuracy of 85% versus 81% for LR. This work also analyzes the aggregated excess kWh available from the charging stations for V2G services, which benefit both EV owners (through incentives) and the grid (by balancing load). ML models including Support Vector Regressor (SVR), Gradient Boosting Regressor (GBR), Long Short-Term Memory (LSTM), and RF are employed; LSTM performs best for this prediction problem with a Mean Absolute Percentage Error (MAPE) of 3.12, with RF second at 3.59, on historical weekday data. Furthermore, this work estimates the number of users available for V2G services corresponding to 15% and 30% of excess kWh, using ML classification models such as Decision Tree (DT) and K-Nearest Neighbor (KNN); among these, DT performed better, with accuracies of 89% and 84%, respectively. This work also investigates the impact of the COVID-19 pandemic on EV users' charging behavior, analyzing behavior before, during, and after COVID-19 with K-means and hierarchical clustering to identify common charging patterns from vehicle connection and disconnection times. K-means clustering proved more effective in all three scenarios, with a high silhouette index. Finally, collective charging session duration is predicted using RF and XGBoost, which achieved MAPEs of 14.6% and 15.1%, respectively.
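
    A minimal sketch of the minute-ahead kWh forecast with Random Forest; the file name, feature set, and time-ordered split below are assumptions, not the thesis setup:

```python
# Sketch: forecast kWh delivered per interval from calendar and lag
# features with a Random Forest. "charging_kwh.csv", "timestamp", and
# "kwh" are hypothetical names.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("charging_kwh.csv", parse_dates=["timestamp"])
df["hour"] = df["timestamp"].dt.hour
df["weekday"] = df["timestamp"].dt.weekday
for lag in (1, 2, 3):                        # kWh in the preceding intervals
    df[f"kwh_lag{lag}"] = df["kwh"].shift(lag)
df = df.dropna()

X = df[["hour", "weekday", "kwh_lag1", "kwh_lag2", "kwh_lag3"]]
y = df["kwh"]
split = int(len(df) * 0.8)                   # time-ordered split, no shuffling

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X[:split], y[:split])
print("MAE:", mean_absolute_error(y[split:], rf.predict(X[split:])))
```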

    Review of automated time series forecasting pipelines

    Time series forecasting is fundamental for various use cases in different domains such as energy systems and economics. Creating a forecasting model for a specific use case requires an iterative and complex design process. The typical design process comprises five sections, (1) data pre-processing, (2) feature engineering, (3) hyperparameter optimization, (4) forecasting method selection, and (5) forecast ensembling, commonly organized in a pipeline structure. One promising approach to handling the ever-growing demand for time series forecasts is to automate this design process. This paper therefore analyzes the existing literature on automated time series forecasting pipelines to investigate how the design process of forecasting models can be automated, considering both Automated Machine Learning (AutoML) and automated statistical forecasting methods within a single forecasting pipeline. For this purpose, we first present and compare the proposed automation methods for each pipeline section. Second, we analyze the automation methods with regard to their interaction, combination, and coverage of the five pipeline sections. For both, we discuss the literature, identify problems, give recommendations, and suggest future research. The review reveals that the majority of papers cover only two or three of the five pipeline sections. We conclude that future research must consider the automation of the forecasting pipeline holistically to enable the large-scale application of time series forecasting.
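
    The five pipeline sections lend themselves to a plain skeleton; the sketch below fixes only the structure, with every function body a placeholder for an automated method (this is not code from the review):

```python
# The five pipeline sections from the review, as a minimal Python skeleton.
def pre_process(series):            # (1) outlier handling, resampling, scaling
    return series

def engineer_features(series):      # (2) lags, calendar features, decomposition
    return series

def optimize_hyperparameters(X):    # (3) e.g. Bayesian optimization per model
    return {}

def select_method(X, params):       # (4) pick SARIMA, gradient boosting, ...
    return lambda horizon: [0.0] * horizon

def ensemble(forecasts):            # (5) weight and combine member forecasts
    return forecasts[0]

def forecast_pipeline(series, horizon):
    X = engineer_features(pre_process(series))
    params = optimize_hyperparameters(X)
    model = select_method(X, params)
    return ensemble([model(horizon)])
```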

    Towards More Accurate and Explainable Supervised Learning-Based Prediction of Deliverability for Underground Natural Gas Storage

    Numerous subsurface factors, including geology and fluid properties, can affect the connectivity of the storage spaces in depleted reservoirs; hence, fluid flow simulations become more complicated, and predicting deliverability remains challenging. This paper applies Machine Learning (ML) techniques to predict the deliverability of underground natural gas storage (UNGS) in depleted reservoirs. First, three baseline models were developed based on Support Vector Regression (SVR), Artificial Neural Network (ANN), and Random Forest (RF) algorithms. To improve the accuracy of the RF model, the best-performing baseline, a unified framework referred to as SARF was developed. SARF combines the capabilities of a Sparse Autoencoder (SA) with those of a Random Forest: the internal representations of the SA, which constitute extracted features of the input variables, are fed to the RF. The predictive capabilities of the baseline models and the proposed SARF model were validated using 3744 real-world storage data samples from 52 active storage reservoirs in the United States. The experimental results show that SARF achieved an average 5.7% increase in accuracy over the baseline RF model across four separate data partitions. Furthermore, a set of eXplainable Artificial Intelligence (XAI) methods were developed to provide an intuitive explanation of which factors influence the deliverability of reservoir storage; the resulting visualizations offer an easy-to-understand interpretation of how the SARF model predicted the deliverability values for separate reservoirs.
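
    A conceptual sketch of the SA-to-RF coupling, assuming a small PyTorch autoencoder with an L1 sparsity penalty; the layer sizes, penalty weight, and training settings are illustrative, not the paper's:

```python
# Sketch of the SARF idea: learn sparse hidden features of the inputs,
# then fit a Random Forest on those features instead of the raw inputs.
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestRegressor

class SparseAE(nn.Module):
    def __init__(self, n_in, n_hidden=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Sigmoid())
        self.dec = nn.Linear(n_hidden, n_in)

    def forward(self, x):
        h = self.enc(x)
        return self.dec(h), h

def fit_sarf(X, y, epochs=200, l1=1e-3):
    """X: (samples, features) float array; y: targets. Returns (ae, rf)."""
    X_t = torch.tensor(X, dtype=torch.float32)
    ae = SparseAE(X.shape[1])
    opt = torch.optim.Adam(ae.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        recon, h = ae(X_t)
        # Reconstruction loss plus L1 penalty on activations for sparsity.
        loss = nn.functional.mse_loss(recon, X_t) + l1 * h.abs().mean()
        loss.backward()
        opt.step()
    with torch.no_grad():
        features = ae.enc(X_t).numpy()
    rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(features, y)
    return ae, rf
```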

    Stock trend prediction based on graph convolutional networks and LSTMs

    Abstract: As stock markets have developed over decades, the trend and the price of a stock are often used for prediction in stock market analysis. In finance, accurately anticipating a stock's future trend can not only help decision-makers estimate the possibility of profit, but also help them avoid risks. In this research, we present a quantitative approach to predicting stock trends in which a clustering model is employed to mine trend patterns from historical stock price data. Stock series clustering is a special kind of time series clustering: we aim to find the trend types of a stock over time intervals, e.g. rising, falling, and others, and then use the past trends to predict its future trend. The proposed prediction method is based on a Graph Convolutional Neural Network for clustering and a Long Short-Term Memory model for prediction, and is also suitable for clustering data with unbalanced classes. Experiments on real-world stock data demonstrate that our method can yield accurate forecasts. In the long run, the proposed method can be used to explore new possibilities in the research field of time series clustering, such as using other graph neural networks to predict stock trends.
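
    A rough sketch of the prediction half only: the thesis clusters windows with a graph convolutional network, while here a crude slope-sign labeler stands in for those cluster labels and an LSTM consumes the resulting label sequence. All thresholds and sizes are assumptions:

```python
# Stand-in for the cluster-then-predict pipeline: label each window's
# trend by its price change, then let an LSTM predict the next trend
# from a one-hot sequence of past trends.
import numpy as np
import torch
import torch.nn as nn

def trend_labels(prices, window=5, eps=0.5):
    """0 = falling, 1 = flat, 2 = rising, per non-overlapping window."""
    labels = []
    for i in range(0, len(prices) - window, window):
        slope = prices[i + window - 1] - prices[i]
        labels.append(0 if slope < -eps else 2 if slope > eps else 1)
    return np.array(labels)

class TrendLSTM(nn.Module):
    def __init__(self, n_classes=3, hidden=16):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_classes, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (batch, seq, n_classes) one-hot
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # logits for the next window's trend
```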

    Improving hydrologic modeling of runoff processes using data-driven models

    Accurate rainfall-runoff simulation is essential for responding to natural disasters, such as floods and droughts, and for proper water resources management in a wide variety of fields, including hydrology, agriculture, and environmental studies. A hydrologic model aims to analyze the nonlinear and complex relationship between rainfall and runoff based on empirical equations and multiple parameters. To obtain reliable runoff simulations, three tasks must be considered: reasonably diagnosing modeling performance, managing the uncertainties in the modeling outcome, and simulating runoff under various conditions. Recently, with advances in computing systems, technology, resources, and information, data-driven models have become widely used in fields such as language translation, image classification, and time-series analysis. In addition, as the spatial and temporal resolution of observations improves, the applicability of data-driven models, which require massive datasets, is rapidly increasing. In hydrology, rainfall-runoff simulation requires various datasets, including meteorological, topographical, and soil properties, at time steps from sub-hourly to monthly. This research investigates whether data-driven approaches can be effectively applied to runoff analysis; in particular, whether data-driven models can (1) reasonably evaluate hydrologic models, (2) improve modeling performance, and (3) predict hourly runoff using distributed forcing datasets.

    First, this research developed a hydrologic assessment tool using a hybrid framework that combines two data-driven models to evaluate the performance of a hydrologic model for runoff simulation. The National Water Model, a fully distributed hydrologic model, was used as the physics-based model. The assessment tool provides easy-to-understand performance ratings, four in total, for the simulated hydrograph components, namely the rising and recession limbs, as well as for the entire hydrograph, against observed runoff data. This is the first research to apply data-driven models to evaluating the performance of the National Water Model, and the results are expected to reasonably diagnose the model's ability to simulate runoff at short time steps.

    Second, correcting the errors inherent in predicted runoff is essential for efficient water management. Hydrologic models include various parameters that cannot be measured directly but can be adjusted to improve predictive performance; however, even a calibrated model still shows clear errors in predicting runoff. In this research, a data-driven model was applied as a post-processor to correct errors in the runoff predicted by the National Water Model, using historic errors to predict new ones. The results show that data-driven models, which can learn relationships between datasets, have strong potential for correcting errors and improving the predictive performance of hydrologic models.

    Finally, to simulate rainfall-runoff accurately, it is essential to consider various factors such as precipitation, soil properties, and runoff coming from upstream regions. With improvements in observation systems and resources, various types of forcing datasets, including remote-sensing-based data and data-assimilation products, are available for hydrologic analysis. In this research, various data-driven models with distributed forcing datasets were applied to hourly runoff prediction. The forcing datasets included hydrologic factors such as soil moisture, precipitation, land surface temperature, and base flow obtained from a data assimilation system. The predicted results were evaluated in terms of seasonal and event-based performance and compared with those of the National Water Model. The results demonstrate that data-driven models are effective for short-term hourly runoff forecasting and useful for developing flood warning systems during the wet season.
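
    A conceptual sketch of the residual post-processor described in the second research aspect, assuming aligned hourly simulated and observed runoff from a hindcast period; the lagged-error feature set is an assumption, not the dissertation's configuration:

```python
# Learn the hydrologic model's past errors from lagged errors plus the
# current simulation, then subtract the predicted error from new output.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_error_corrector(simulated, observed, n_lags=6):
    """simulated, observed: aligned hourly runoff arrays (hindcast)."""
    errors = simulated - observed
    X = np.column_stack([np.roll(errors, k) for k in range(1, n_lags + 1)]
                        + [simulated])
    X, y = X[n_lags:], errors[n_lags:]       # drop rows with wrapped lags
    return GradientBoostingRegressor().fit(X, y)

def corrected_runoff(model, recent_errors, new_simulated, n_lags=6):
    """recent_errors: the n_lags most recent errors, oldest first."""
    X_new = np.append(recent_errors[-n_lags:][::-1], new_simulated).reshape(1, -1)
    return new_simulated - model.predict(X_new)[0]
```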

    Type-2 fuzzy logic system applications for power systems

    In the move towards ubiquitous information and communications technology, an opportunity has arisen for further optimisation of the power system as a whole. The fast growth of intermittent generation, concurrent with market deregulation, is driving a need for timely algorithms that can derive value from these new data sources. Type-2 fuzzy logic systems can offer approximate solutions to these computationally hard tasks by expressing non-linear relationships in a more flexible fashion. This thesis explores how type-2 fuzzy logic systems can provide solutions to two such challenging power system problems: short-term load forecasting and voltage control in distribution networks. On one hand, time-series forecasting is a key input for an economic, secure power system, as many tasks require a precise determination of the future short-term load (e.g. unit commitment or security assessment, among others), as does treating electricity as a commodity. As a consequence, short-term load forecasting is essential for energy stakeholders, and any inaccuracy translates directly into their financial performance. This is reflected in the current power systems literature, where a significant number of papers cover the subject. Extending the existing literature, this work focuses on how such forecasters should be implemented from beginning to end to bring to light their predictive performance, and introduces a novel framework to automatically design type-2 fuzzy logic systems. On the other hand, the low-carbon economy is pushing the grid ever closer to its operational limits. Distribution networks are becoming active systems, with power flows and voltages defined not only by load but also by generation. Consequently, even if it is not yet clear how power systems will evolve in the long term, all plausible future scenarios call for real-time algorithms that can provide near-optimal solutions to this challenging mixed-integer non-linear problem. Aligned with research and industry efforts, this thesis introduces a scalable implementation that tackles this task in a divide-and-conquer fashion.
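
    A toy illustration of the interval type-2 idea underlying the load-forecasting half: a Gaussian membership function with an uncertain mean yields a membership band (the footprint of uncertainty) rather than a single curve. Full Karnik-Mendel type reduction is omitted; the bounds are simply averaged here for brevity:

```python
# Interval type-2 Gaussian membership with an uncertain mean in
# [m_lo, m_hi]: upper bound saturates at 1 inside the band, lower bound
# is the smaller of the two boundary Gaussians.
import numpy as np

def it2_gaussian(x, m_lo, m_hi, sigma):
    """Return (lower, upper) membership of scalar x."""
    g = lambda m: np.exp(-0.5 * ((x - m) / sigma) ** 2)
    if m_lo <= x <= m_hi:                 # inside the uncertain-mean band
        upper = 1.0
    else:
        upper = g(m_lo) if x < m_lo else g(m_hi)
    lower = min(g(m_lo), g(m_hi))
    return lower, upper

lo, up = it2_gaussian(x=0.7, m_lo=0.4, m_hi=0.6, sigma=0.2)
print("membership interval:", lo, up, "midpoint:", (lo + up) / 2)
```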