1,680 research outputs found

    Gaussian-process-based demand forecasting for predictive control of drinking water networks

    Get PDF
    This paper focuses on water demand forecasting for predictive control of Drinking Water Networks (DWN) in the short term by using Gaussian Process (GP). For the predictive control strategy, system states in a finite horizon are generated by a DWN model and demands are regarded as system disturbances. The goal is to provide a demand estimation within a given confidence interval. For the sake of obtaining a desired forecasting performance, the forecasting process is carried out in two parts: the expected part is forecasted by Double-Seasonal Holt-Winters (DSHW) method and the stochastic part is forecasted by GP method. The mean value of water demand is firstly estimated by DSHW while GP provides estimations within a confidence interval. GP is applied with random inputs to propagate uncertainty at each step. Results of the application of the proposed approach to a real case study based on the Barcelona DWN have shown that the general goal has been successfully reached.Peer ReviewedPostprint (author’s final draft

    Multivariate Bayesian Machine Learning Regression for Operation and Management of Multiple Reservoir, Irrigation Canal, and River Systems

    Get PDF
    The principal objective of this dissertation is to develop Bayesian machine learning models for multiple reservoir, irrigation canal, and river system operation and management. These types of models are derived from the emerging area of machine learning theory; they are characterized by their ability to capture the underlying physics of the system simply by examination of the measured system inputs and outputs. They can be used to provide probabilistic predictions of system behavior using only historical data. The models were developed in the form of a multivariate relevance vector machine (MVRVM) that is based on a sparse Bayesian learning machine approach for regression. Using this Bayesian approach, a predictive confidence interval is obtained from the model that captures the uncertainty of both the model and the data. The models were applied to the multiple reservoir, canal and river system located in the regulated Lower Sevier River Basin in Utah. The models were developed to perform predictions of multi-time-ahead releases of multiple reservoirs, diversions of multiple canals, and streamflow and water loss/gain in a river system. This research represents the first attempt to use a multivariate Bayesian learning regression approach to develop simultaneous multi-step-ahead predictions with predictive confidence intervals for multiple outputs in a regulated river basin system. These predictions will be of potential value to reservoir and canal operators in identifying the best decisions for operation and management of irrigation water supply systems

    Estudio comparativo completo de varios métodos basados en datos para la gestión de los recursos hídricos en ambientes mediterráneos a través de diferentes escalas temporales

    Get PDF
    Since the beginning of time, there has been innovation in the knowledge and technology of water and the hydraulic systems, to achieve an efficient and upgrade management of them. In this project, as an opening hypothesis, we will apply computational techniques and Artificial Intelligence concepts. Given that the primary asset of these studies is data, we have preferred to use the term ”Data-Driven”, as the term Artificial Intelligence can cause confusion in non-experts. This is an expanding field in all aspects of science and life, where the computing and processing powers are increasing periodic, so does the generation of information. There we have 5G technology, or the Internet of things, where the exponential build up in the volume of data utilised, pushes us to set up frameworks for the treatment and analysis of the information.Data-Driven techniques offers enormous potential to transform our perception to understand,monitor and predict the states of hydro-meteorological variables. Its application provides benefits, however, performing these exercises requires practice and explicit knowledge. Therefore, a deeper understanding of the capabilities and limitations of novel computational techniques within our field of knowledge is needed. Hence, it is essential to carry out ”hydro-informatics” experiences under this assumption. For the development of these models, we identify which points are the most relevant and need to be taken into account in regional conditions or frameworks. In consequence, we will work with the time series collected in the different monitoring networks, selecting the hydrological points of interest, in order to further develop hydrological frameworks that are useful for water management and optimisation. Here, we are interested in seeing the practical applicability to hydro-meteorology under Mediterranean conditions, where data are sometimes scarce, by selecting two hydrographic basins in south-east Andalusia: the Guadalhorce river (Málaga) and the Guadalfeo river (Granada). In chapter 1, an introduction to the doctoral thesis is made. Likewise, we establish the general and the specific objectives, and the motivation of the thesis. Afterwards, we describe the three fundamental exercises to be carried out in the research work: Regression, Classification and Optimisation. Ultimately, we carry out a brief review of previous works under Mediterranean climatic conditions and similar assumptions. Chapter 2 presents the study areas, analysing the spatial and temporal characteristics of two Andalusian Mediterranean basins in south-east Spain: Guadalhorce (GH) and Guadalfeo (GF). These are hydrographic basins with highly variable/heterogeneous spacetime patterns. The first hydrological system, GH, contains an area of socio-economic importance, such is the city of M´alaga. The second, GF, to the north has the Sierra Nevada National Park, crowned by the Mulhac´en peak and flowing in a few kilometres into the area of Motril. In this particular water system, we find large gradients of the geophysical agents. Both systems have regulation structures of great interest for the development and study of their optimisation. We also review the monitoring networks available in these basins, and which environmental agents and/or processes should be taken into account to meet the objectives of this work. We carry out a bibliographic review of the most relevant historical floods, listing the factors associated with these extreme events. In the data analysis stage of this chapter, we focus on the spatialtemporal evolution of the risk of flooding in the two mouths of the Guadalhorce and Guadalfeo Rivers into the Albor´an Sea. We quantify that had stepped up in recent years, noting that dangerous practices have increased the risk of flooding because of the intrusion of land uses with high-costs. This chapter also analyses collected data within the monitoring networks, to understand the occurrence of floods in the river GH related to upstream discharges. We found that this basin has limitations in regulation and cannot mitigate costs downstream. The results got, were part of the work presented in Egüen et al. (2015). These analyses allow us to identify in which parts of the flood management of this hydrological system need a more precise optimisation. Finally, a summary of another important hydrological risk is carried out, such as droughts, and how these water deficits can be represented by standardised indices, both in rainfall and the flow rates. The various approaches and methodologies for hydro-meteorological time series modelling are discussed in the chapter 3. The contrasting concepts are exposed antagonistically, to focus on the different design choices that we need to make: black box vs. grey box vs. white box, parametric vs. non-parametric, static vs. dynamic, linear vs. non-linear, frequency vs. Bayesian, single vs. multiple, among others..., detailing the advantages and disadvantages of each approach. We presented some ideas that emerged in this part of the research in Herrero et al. (2014). The partition, management and data transformation steps for the correct application of these experimental methods are also discussed. This is of great importance, since part of the hard work in the application of these methods comes from the transformation of the data. So that, the algorithms and transfer functions work correctly. Finally, we focus on how to test and validate the deterministic and probabilistic behaviours through evaluative coefficients to avoid coefficients that mask the results, and therefore focus on the behaviours of our interest, in our case precision and predictability. We have also taken parsimony into account in models based on neural networks, since they can easily fall into over-parameterisation. In chapter 4, we present the experimental work, where seven short-term, six daily and one hourly rainfall-runoff regressions are performed. The case studies correspond to various points of interest within the study areas with important implications for hydrological management. On an hourly scale, we analyse the efficiency and predictive capacities of the MLR and BNN at ten time horizons for the level of the Guadalhorce River in Cártama. We found that, for closer predictive horizons, a simpler approach such as linear (MLR) can outperform other with a priori higher capabilities, such as non-linear (BNN). This finding could simplify greatly its development and application. At a daily scale, we establish a comparative framework between the two previous models and a complete Bayesian method such as the Gaussian Processes. This DD computational technique, allows us to apply different transfer functions under a single model. This is an advantage over the other two DD models, since the results show that they work well in one domain, but do not work well in the other. During the construction of the models, we do the selection of the input variables in a progressive way, through a trial-and-error method, where the significant improvements with respect to the last predictor structure are taken into account preserving the principle of parsimony. Here, we have used different types of data: real data collected in the monitoring networks, and data generated in parallel from physically based hydrological modelling (WiMMed). The results are robust, where the major limitation is the high computational cost by the recurrent and iterative method used. Some results of this chapter, were presented in Gulliver et al. (2014). In chapter 5 three medium-term time scale prediction experiments are performed. We base the first modelling experiment on a quarterly scale, where a hydrological time scheme determines the cumulative flow for specific time horizons. We start the scheme according to the relevant dates where hydrological planning takes place. It is validated that the forecasts are more prosperous after have been consumed the first six months of the hydrological year. Instead of the three months in which we carry out the evaluations. The observed input variables quantified in the water system are: cumulative stream flow, cumulative rainfall, cumulative snowfall values and atmospheric oscillations (AO). At the level of modelling with DD, this experience has shown the importance of combining mixed regression classification models instead of only regression models within static frameworks. In this manner, we reduce and narrow the space of possible solutions and, therefore, we optimised the predictive behaviour of the DD model. During the development of this exercise, we have also carried out a classification practice comparing three DD classifiers: Probabilistic Neural Network (PNN), K-Nearest Neighbour (KNN) and Support Vector Machine (SVM). We see that the SVM behaves better than the others with our data. However, more research is still needed on classifiers in hydro-meteorological frameworks like ours, because of their variability. We showed this part of the doctoral thesis in Gulliver et al. (2016). In the second section of this chapter (Sec. 5.3), we carry out a rain forecast exercise on a monthly scale. To do so, we use BNN following the same construction method of the SVI model exposed in the previous chapter (Sec. Ref. Chapter 4), thus validating it in another time scale. However, the results in predictive terms are poor for this hydro-meteorological variable. This confirms the difficulty of predicting this variable from historical data and without the incorporation of dynamic tools. Thus, the need for complex hydrodynamic modelling for the prediction of this important variable is confirmed. On the other hand, this case serves to empirically infer the causality of the most relevant atmospheric oscillations in the points of study. From multiple simulations with the model-based approach it has been possible to establish which indices have a greater influence. In the last section of this chapter (Section 5.4), an exercise was carried out to predict the deviation or anomaly of rainfall and runoff indices for four time series representative of different locations within the Guadalfeo BR. In this case, we verified the suitability of seven statistical distributions to characterize the anomalies/deviations under Mediterranean conditions. Under this hypothesis, the indices that passed the Shapiro-Wilk normality test were modelled to analyse the capabilities of BNN to predict these indices at various time horizons. Here, predictions of negative phases (droughts or deficit periods) have been poor, and the behaviour of the models for positive phases (wet periods) has been more successful. Regarding the causal inference of IC and its possible influence on the study area, we found out how NAO and WEMO help forecasts for shorter time horizons, while MOI helps for longer cumulative time horizons/times. We have analysed the relevance of these atmospheric variables in each case where sometimes their introduction was convenient and sometimes not, following the rules of construction and detailing them in each case study. Throughout the work, the usefulness of mixed modelling approaches has been verified, using models based on observed data from the different monitoring networks with physical modelling for the reproduction of essential hydrological processes. With the proposed methodology, a positive influence of atmospheric oscillations has been observed for medium-term prediction within the study regions, finding no evidence for short-term predictions (daily scale). The final conclusions and the most important points for future work are presented in the chapter 6. Applications of this type of methods are currently necessary. They help us to establish relationships based on measured hydro-meteorological data and thus ”based on real data”, without hypothesizing any assumptions. These data-based experiences are very useful for limiting future uncertainty and optimizing water resources. The establishment of temporal relationships between different environmental agents allows us, through supervised methods, to establish causal relationships. From here a physical inference exercise is necessary to add coherence and establish a robust scientific exercise. The results obtained in this work, reaffirm the practicality of implementing this Data- Driven frameworks, in both the public and private spheres, being a good starting point for technology transfer. Most of the routines and models provided in this thesis, could be directly applied in Hydro-meteorological Services, or Decision Support Systems for water officials. This includes potential users as varied as public administrations and basin organisations, reservoir managers, energy companies that manage hydroelectric generation, irrigation communities, water bottling plants,... etc. The establishment of iterative and automatic frameworks for data processing and modelling, needs to be implemented, to make the most of the data collected in the water systems.Desde el inicio de los tiempos, se innova en el conocimiento y la tecnología de los sistemas hídricos e hidráulicos con el fin de conseguir una eficiente y correcta gestión de los mismos. En este proyecto, como hipótesis de partida, se van a aplicar diversas técnicas computacionales y conceptos de Inteligencia Artificial. Dado que el principal activo de estas aplicaciones son los datos, optamos por el término ”Data-Driven” (DD), ya que el término de Inteligencia Artificial puede causar confusión en los no expertos. Este es un campo en expansión en todos los aspectos de la ciencia y de la vida, donde al tiempo que se incrementan las capacidades de computación y de procesamiento, se incrementa la generación de datos. Ahí tenemos la tecnología 5G, o el internet de las cosas, donde el incremento exponencial del volumen de datos que se utilizan nos obliga a desarrollar marcos para el tratamiento y el análisis de los mismos. Los métodos DD tienen un enorme potencial para transformar nuestra habilidad de establecer un seguimiento supervisado y predecir estados de variables hidro-meteorológicas. Su aplicación provee claramente de beneficios, sin embargo realizar estos ejercicios requiere una práctica y un conocimiento específico. Por ello, es necesario un entendimiento más profundo de las capacidades y de las limitaciones de estas técnicas computacionales, dentro de nuestro campo de conocimiento y casos específicos. Por estos motivos, es esencial realizar experiencias ”hidro-informáticas” bajo este supuesto, identificando así que puntos son los más relevantes y a tener en cuenta en el desarrollo y la validación de estos modelos en condiciones o marcos más regionales. Para ello, trabajaremos con las series temporales recogidas en las diferentes redes de monitorización, con series resultantes de modelado hidro-meteorológico y con series de las oscilaciones atmosféricas más relevantes en la zona de estudio. El objetivo principal de este trabajo es el desarrollo y la validación de marcos metodológicos basados en datos. Para ello, se seleccionan puntos de interés, con el fin de desarrollar marcos hidro-meteorológicos ´útiles en la gestión y optimización de los recursos hídricos. En este supuesto, nos interesa ver la aplicabilidad práctica de estas herramientas de aprendizaje automático, machine learning, en condiciones mediterráneas y locales, donde los datos a veces son escasos o de baja calidad. En el primer capítulo (Cap.1) se realiza una introducción a la tesis doctoral, estableciendo los objetivos tanto generales como específicos, y la motivación de la tesis. Seguidamente se realiza a modo introductorio una descripción de los tres ejercicios fundamentales a realizar en el trabajo de investigación: Regresión, Clasificación y Optimización. Finalmente, se realiza una revisión del estado del arte de trabajos previos bajo condiciones climáticas mediterráneas y similares. El capítulo 2 presenta las zonas de estudio, analizando las características espacio-temporales de dos cuencas mediterráneas andaluzas situadas en el sureste español: río Guadalhorce (GH) y río Guadalfeo (GF). Son cuencas hidrográficas con unos patrones espaciotemporales altamente variables/heterogéneos. El primer sistema hidrológico, GH, contiene una zona de gran importancia socio-económica como es la ciudad de Málaga. El segundo, GF, al norte tiene situado el Parque Nacional de Sierra Nevada, coronado por el pico Mulhacén y desemboca a pocos kilómetros en la costa de Motril. Esto hace que este sea un sistema con grandes gradientes geo-morfológicos e hidro-meteorológicos. En ambas cuencas existen estructuras de regulación de gran interés para el desarrollo y estudio de su optimización. También se revisan las redes de monitorización disponibles en estas cuencas, y que agentes deben ser tenidos en cuenta para la consecución de los objetivos del presente trabajo. En la etapa de análisis de datos de este capítulo, nos centramos en la evolución espacio temporal del riesgo frente a las inundaciones en las desembocaduras de ambos sistemas hidrológicos al mar de Alborán. Se cuantifica el aumento del riesgo frente a inundaciones ante la intrusión de usos del suelo con altos costes en las zonas potencialmente inundables en estos ´últimos años, constatando así una mala práctica en la planificación del territorio dentro de la zona de estudio. También, en este capítulo se analizan los datos registrados con el fin de comprender la ocurrencia de avenidas en el río GH y su relación con los desembalses aguas arriba. En este análisis se pudo identificar, como ante algunos eventos pluviométricos extremos (> 100mm/24h), esta cuenca tiene limitaciones en la regulación, no pudiendo así mitigar los costes aguas abajo. Parte de los resultados obtenidos formaron parte del trabajo presentado en Egüen et al. (2015). Estos análisis nos permiten identificar la necesidad de una optimización temporal más precisa en la gestión de avenidas en este sistema hidrológico. Finalmente, realizamos un análisis de otro riesgo hidrológico importante como son las sequías, y cómo podemos representar este déficit hídrico mediante índices estandarizados, tanto para la pluviometría como para la escorrentía. En el capítulo 3 se analizan los diversos enfoques y metodologías para el modelado de series temporales hidro-meteorológicas. Los enfoques se exponen de forma antagonista entre las diferentes opciones de modelado que tenemos: caja negra vs. caja gris vs. caja blanca, paramétricos vs. no-paramétricos, estático vs. dinámico, lineal vs. no-lineal, frecuentista vs. bayesiano, único vs múltiple, entre otros..., enumerando las ventajas e inconvenientes de cada enfoque. Algunas ideas surgidas en esta parte de la investigación fueron expuestas en Herrero et al. (2014). Por otro lado, también se discuten los pasos de partición, gestión y transformación de los datos para una correcta aplicación de este tipo de métodos experimentales. Esto es de gran importancia, ya que parte del trabajo duro en la aplicación de este tipo de metodologías, proviene de la transformación de los datos para que los algoritmos y las funciones de transferencia funcionen correctamente. En la parte final de este capítulo, nos centramos en cómo evaluar y validar el comportamiento determinista y probabilístico mediante coeficientes evaluativos. En este punto, prestamos especial atención en evitar la utilización de coeficientes que enmascaren los resultados o muy generalistas, y por lo tanto nos centramos en aquellos que evalúan las capacidades predictivas y de precisión de los modelos. También se ha tenido en cuenta la parsimonia para los modelos basados en redes neuronales, ya que pueden caer fácilmente en una sobre-parametrización. El capítulo 4 expone trabajo puramente experimental, donde se realizan siete regresiones lluvia escorrentía a corto plazo, seis diarias y una horaria. Los casos de estudio corresponden a diversos puntos de interés dentro de las zonas de estudio, con importantes implicaciones en la gestión hidrológica. A escala horaria se analiza las capacidades de eficiencia y predictivas de la Regresión Lineal Múltiple (MLR) y Redes Neuronales Bayesianas (BNN) a diez horizontes temporales para el nivel del río Guadalhorce en el puente de Cártama. Se encontró que, para horizontes predictivos más cercanos, un enfoque más sencillo como puede ser el lineal (MLR), puede superar a uno con mayores capacidades predictivas a priori, como pueden ser uno no lineal (BNN). Simplificando así, el desarrollo y la implementación de este tipo de técnicas computacionales bajo este tipo de marcos hidrológicos. Por otro lado, a escala diaria se establece un marco comparativo entre los dos modelos anteriores, MLR y BNN, y un método bayesiano completo: Procesos Gaussianos (GP). Esta técnica computacional, nos permite aplicar funciones de transferencia de diferente naturaleza bajo un único modelo. Esto es una ventaja con respecto a los otros dos modelos computacionales, ya que los resultados nos indican que a veces funcionan bien en un dominio, pero no funcionan bien en el contrario. Durante la construcción de los modelos, la selección de las variables de entrada se realiza de forma progresiva, mediante un método de prueba y error, donde se tienen en cuenta las mejoras significativas con respecto a la última estructura de predictores preservando el principio de parsimonia. Se han utilizado datos de diferente naturaleza: datos reales recogidos en las redes de monitorización y datos generados paralelamente de modalización hidrológica con base física (WiMMed). Los resultados son robustos donde la principal limitación es el alto coste computacional por el método recurrente e iterativo. Resultados de este capítulo fueron presentados en Gulliver et al. (2014). En el capítulo 5 se realizan tres

    Demand forecasting for a Mixed-Use Building using an Agent-schedule information Data-Driven Model

    Get PDF
    There is great interest in data-driven modelling for the forecasting of building energy consumption while using machine learning (ML) modelling. However, little research considers classification-based ML models. This paper compares the regression and classification ML models for daily electricity and thermal load modelling in a large, mixed-use, university building. The independent feature variables of the model include outdoor temperature, historical energy consumption data sets, and several types of ‘agent schedules’ that provide proxy information that is based on broad classes of activity undertaken by the building’s inhabitants. The case study compares four different ML models testing three different feature sets with a genetic algorithm (GA) used to optimize the feature sets for those ML models without an embedded feature selection process. The results show that the regression models perform significantly better than classification models for the prediction of electricity demand and slightly better for the prediction of heat demand. The GA feature selection improves the performance of all models and demonstrates that historical heat demand, temperature, and the ‘agent schedules’, which derive from large occupancy fluctuations in the building, are the main factors influencing the heat demand prediction. For electricity demand prediction, feature selection picks almost all ‘agent schedule’ features that are available and the historical electricity demand. Historical heat demand is not picked as a feature for electricity demand prediction by the GA feature selection and vice versa. However, the exclusion of historical heat/electricity demand from the selected features significantly reduces the performance of the demand prediction

    Kalman Filter Bibliography: Agriculture, Biology, and Medicine

    Full text link
    27 pages, 1 article*Kalman Filter Bibliography: Agriculture, Biology, and Medicine* (Federer, Walter T.; Murty, B. R.) 27 page

    State-of-the-Art Using Bibliometric Analysis of Wind-Speed and -Power Forecasting Methods Applied in Power Systems

    Get PDF
    The integration of wind energy into power systems has intensified as a result of the urgency for global energy transition. This requires more accurate forecasting techniques that can capture the variability of the wind resource to achieve better operative performance of power systems. This paper presents an exhaustive review of the state-of-the-art of wind-speed and -power forecasting models for wind turbines located in different segments of power systems, i.e., in large wind farms, distributed generation, microgrids, and micro-wind turbines installed in residences and buildings. This review covers forecasting models based on statistical and physical, artificial intelligence, and hybrid methods, with deterministic or probabilistic approaches. The literature review is carried out through a bibliometric analysis using VOSviewer and Pajek software. A discussion of the results is carried out, taking as the main approach the forecast time horizon of the models to identify their applications. The trends indicate a predominance of hybrid forecast models for the analysis of power systems, especially for those with high penetration of wind power. Finally, it is determined that most of the papers analyzed belong to the very short-term horizon, which indicates that the interest of researchers is in this time horizon

    Multi-model prediction for demand forecast in water distribution networks

    Get PDF
    This paper presents a multi-model predictor called Qualitative Multi-Model Predictor Plus (QMMP+) for demand forecast in water distribution networks. QMMP+ is based on the decomposition of the quantitative and qualitative information of the time-series. The quantitative component (i.e., the daily consumption prediction) is forecasted and the pattern mode estimated using a Nearest Neighbor (NN) classifier and a Calendar. The patterns are updated via a simple Moving Average scheme. The NN classifier and the Calendar are executed simultaneously every period and the most suited model for prediction is selected using a probabilistic approach. The proposed solution for water demand forecast is compared against Radial Basis Function Artificial Neural Networks (RBF-ANN), the statistical Autoregressive Integrated Moving Average (ARIMA), and Double Seasonal Holt-Winters (DSHW) approaches, providing the best results when applied to real demand of the Barcelona Water Distribution Network. QMMP+ has demonstrated that the special modelling treatment of water consumption patterns improves the forecasting accuracyPeer ReviewedPostprint (published version
    corecore