
    Verification tools for probabilistic forecasts of continuous hydrological variables

    In the present paper we describe some methods for verifying and evaluating probabilistic forecasts of hydrological variables. We propose an extension to continuous-valued variables of a verification method that originated in the meteorological literature for the analysis of binary variables and is based on the use of a suitable cost-loss function to evaluate the quality of the forecasts. We find that this procedure is useful and reliable when it is complemented with other verification tools, borrowed from the economic literature, which are designed to verify the statistical correctness of the probabilistic forecast. We illustrate our findings with a detailed application to the evaluation of probabilistic and deterministic forecasts of hourly discharge values.
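
As a rough illustration of the binary cost-loss idea that this abstract says is extended to continuous variables, the sketch below computes the mean expense of a user who pays a protection cost C whenever the forecast probability reaches a threshold, and otherwise suffers a loss L if the event occurs. The function name, the default decision rule (act when p >= C/L) and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
def mean_expense(probs, outcomes, cost, loss, threshold=None):
    """Average expense per forecast under the classic binary cost-loss model.

    probs     : forecast probabilities of the event (0..1)
    outcomes  : 1 if the event occurred, 0 otherwise
    cost      : cost C of taking protective action
    loss      : loss L incurred when the event occurs without protection
    threshold : act when p >= threshold; defaults to C/L (assumed decision rule)
    """
    if threshold is None:
        threshold = cost / loss
    total = 0.0
    for p, occurred in zip(probs, outcomes):
        if p >= threshold:
            total += cost   # protected: pay C whatever happens
        elif occurred:
            total += loss   # unprotected and the event occurred
    return total / len(probs)


# Toy data (hypothetical): four forecasts, the event occurred twice.
probs = [0.9, 0.1, 0.6, 0.2]
outcomes = [1, 0, 1, 0]
expense = mean_expense(probs, outcomes, cost=1.0, loss=4.0)
always_protect = mean_expense(probs, outcomes, 1.0, 4.0, threshold=0.0)
never_protect = mean_expense(probs, outcomes, 1.0, 4.0, threshold=1.1)
```

Comparing `expense` with the always-protect and never-protect baselines gives a simple sense of whether the forecast has value for a user with that cost/loss ratio.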

    Kumaraswamy autoregressive moving average models for double bounded environmental data

    In this paper we introduce the Kumaraswamy autoregressive moving average (KARMA) models, a dynamic class of models for time series taking values in the double bounded interval (a,b) and following the Kumaraswamy distribution. The Kumaraswamy family of distributions is widely applied in many areas, especially hydrology and related fields. Classical examples are time series representing rates and proportions observed over time. In the proposed KARMA model, the median is modeled by a dynamic structure containing autoregressive and moving average terms, time-varying regressors, unknown parameters and a link function. We introduce the new class of models and discuss conditional maximum likelihood estimation, hypothesis testing inference, diagnostic analysis and forecasting. In particular, we provide closed-form expressions for the conditional score vector and conditional Fisher information matrix. An application to real environmental data is presented and discussed. Comment: 25 pages, 4 tables, 4 figures
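
Because the model targets the median, it may help to see why the median of a Kumaraswamy variable has a closed form. The sketch below uses the standard distribution function F(x) = 1 - (1 - x^alpha)^beta on (0,1), rescaled to a bounded interval. The names `alpha`, `beta`, `lo` and `hi` are illustrative choices (the abstract writes the interval as (a,b)); this is not the paper's parameterisation or estimation code.

```python
def kuma_cdf(y, alpha, beta, lo=0.0, hi=1.0):
    """Kumaraswamy distribution function for y in (lo, hi)."""
    x = (y - lo) / (hi - lo)          # rescale to the unit interval
    return 1.0 - (1.0 - x ** alpha) ** beta


def kuma_quantile(u, alpha, beta, lo=0.0, hi=1.0):
    """Inverse of kuma_cdf: solve 1 - (1 - x**alpha)**beta = u for x."""
    x = (1.0 - (1.0 - u) ** (1.0 / beta)) ** (1.0 / alpha)
    return lo + (hi - lo) * x


def kuma_median(alpha, beta, lo=0.0, hi=1.0):
    """Median, i.e. the quantile at u = 0.5."""
    return kuma_quantile(0.5, alpha, beta, lo, hi)


# Hypothetical shape parameters on the interval (10, 20).
median = kuma_median(alpha=2.0, beta=3.0, lo=10.0, hi=20.0)
check = kuma_cdf(median, alpha=2.0, beta=3.0, lo=10.0, hi=20.0)
```

With `alpha = beta = 1` the distribution reduces to the uniform case, so the median is the midpoint of the interval.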

    Seasonal River Discharge Forecasting Using Support Vector Regression: A Case Study in the Italian Alps

    In this contribution we analyze the performance of a monthly river discharge forecasting model based on a Support Vector Regression (SVR) technique in a European alpine area. We considered as predictors the discharges of the antecedent months, snow-covered area (SCA), and meteorological and climatic variables for 14 catchments in South Tyrol (Northern Italy), as well as the long-term average discharge of the month of prediction, also regarded as a benchmark. Forecasts at a six-month lead time tend to perform no better than the benchmark, with an average 33% relative root mean square error (RMSE%) on test samples. However, at one month lead time, RMSE% was 22%, a non-negligible improvement over the benchmark; moreover, the SVR model reduces the frequency of higher errors associated with anomalous months. Predictions with a lead time of three months show an intermediate performance between those at one and six months lead time. Among the considered predictors, SCA alone reduces RMSE% to 6% and 5% compared to using monthly discharges only, for a lead time equal to one and three months, respectively, whereas meteorological parameters bring only minor improvements. The model also outperformed a simpler linear autoregressive model, and yielded the lowest volume error in forecasting with one month lead time, while at longer lead times the differences compared to the benchmarks are negligible. Our results suggest that although an SVR model may deliver better forecasts than its simpler linear alternatives, long lead-time hydrological forecasting in Alpine catchments remains a challenge. Catchment state variables may play a bigger role than catchment input variables; hence a focus on characterizing seasonal catchment storage, rather than seasonal weather forecasting, could be key for improving our predictive capacity. JRC.H.1-Water Resources
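
To make the comparison against the benchmark concrete, here is a minimal sketch of a relative RMSE and of a long-term monthly mean benchmark. The abstract does not define RMSE%, so normalising by the mean observed discharge is an assumption, and the function names and toy numbers are hypothetical.

```python
import math


def rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return 100.0 * math.sqrt(mse) / (sum(observed) / n)


def monthly_mean_benchmark(history_by_month, month):
    """Benchmark forecast: long-term average discharge of the target month."""
    values = history_by_month[month]
    return sum(values) / len(values)


# Toy data (hypothetical): past April discharges and three test months.
history = {4: [18.0, 22.0, 20.0]}
benchmark_april = monthly_mean_benchmark(history, 4)
score = rmse_percent([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

A model is only worth its extra complexity at a given lead time if its score is clearly lower than the score obtained by forecasting the benchmark value for every test month.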

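    Estudio comparativo completo de varios métodos basados en datos para la gestión de los recursos hídricos en ambientes mediterráneos a través de diferentes escalas temporales [Complete comparative study of several data-driven methods for water resources management in Mediterranean environments across different time scales]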

    Since the beginning of time, knowledge and technology relating to water and hydraulic systems have been continually renewed in pursuit of efficient and sound management. In this project, as a starting hypothesis, we apply computational techniques and Artificial Intelligence concepts. Because the primary asset of these studies is data, we prefer the term "Data-Driven" (DD), since the term Artificial Intelligence can confuse non-experts. This is an expanding field in all aspects of science and life: as computing and processing power increases, so does the generation of data. 5G technology and the Internet of Things are examples, where the exponential growth in the volume of data used pushes us to set up frameworks for its treatment and analysis. Data-Driven techniques offer enormous potential to transform our ability to monitor and predict the states of hydro-meteorological variables. Their application provides clear benefits; however, carrying out these exercises requires practice and specific knowledge. A deeper understanding of the capabilities and limitations of these computational techniques within our field of knowledge is therefore needed. Hence, it is essential to carry out "hydro-informatics" experiments under this assumption, identifying which points are most relevant and must be taken into account when developing and validating these models under regional conditions or frameworks. To this end, we work with the time series collected in the different monitoring networks, selecting hydrological points of interest in order to develop hydrological frameworks that are useful for water management and optimisation. 
Here, we are interested in the practical applicability of these tools to hydro-meteorology under Mediterranean conditions, where data are sometimes scarce, and we select two hydrographic basins in south-east Andalusia: the Guadalhorce river (Málaga) and the Guadalfeo river (Granada). Chapter 1 introduces the doctoral thesis and establishes its general and specific objectives and its motivation. We then describe the three fundamental exercises carried out in the research: Regression, Classification and Optimisation. Finally, we briefly review previous work under Mediterranean climatic conditions and similar assumptions. Chapter 2 presents the study areas, analysing the spatial and temporal characteristics of two Andalusian Mediterranean basins in south-east Spain: Guadalhorce (GH) and Guadalfeo (GF). These are hydrographic basins with highly variable and heterogeneous spatio-temporal patterns. The first hydrological system, GH, contains an area of great socio-economic importance, the city of Málaga. The second, GF, has the Sierra Nevada National Park to the north, crowned by the Mulhacén peak, and reaches the sea within a few kilometres on the coast of Motril. This water system therefore has large geomorphological and hydro-meteorological gradients. Both systems have regulation structures of great interest for the development and study of their optimisation. We also review the monitoring networks available in these basins, and which environmental agents and/or processes should be taken into account to meet the objectives of this work. We carry out a bibliographic review of the most relevant historical floods, listing the factors associated with these extreme events. In the data analysis stage of this chapter, we focus on the spatio-temporal evolution of flood risk at the mouths of the Guadalhorce and Guadalfeo rivers on the Alborán Sea. 
We quantify how this risk has increased in recent years, noting that poor land-use planning has raised the flood risk through the encroachment of high-cost land uses into flood-prone areas. This chapter also analyses the data collected by the monitoring networks to understand the occurrence of floods in the river GH in relation to upstream reservoir releases. We found that, for some extreme rainfall events (> 100 mm/24 h), this basin has limited regulation capacity and cannot mitigate costs downstream. Part of these results was included in the work presented in Egüen et al. (2015). These analyses allow us to identify which parts of flood management in this hydrological system need more precise optimisation. Finally, we summarise another important hydrological risk, droughts, and how these water deficits can be represented by standardised indices for both rainfall and flow rates. Chapter 3 discusses the various approaches and methodologies for hydro-meteorological time series modelling. Contrasting concepts are set against each other to highlight the design choices we need to make: black box vs. grey box vs. white box, parametric vs. non-parametric, static vs. dynamic, linear vs. non-linear, frequentist vs. Bayesian, single vs. multiple, among others, detailing the advantages and disadvantages of each approach. We presented some ideas that emerged in this part of the research in Herrero et al. (2014). The data partitioning, management and transformation steps needed to apply these experimental methods correctly are also discussed. This is of great importance, since much of the hard work in applying these methods comes from transforming the data so that the algorithms and transfer functions work correctly. 
Finally, we focus on how to test and validate deterministic and probabilistic behaviour through evaluation coefficients, avoiding coefficients that mask the results or are too general, and concentrating on the behaviours of interest, in our case precision and predictive capacity. We have also taken parsimony into account in models based on neural networks, since they can easily fall into over-parameterisation. In chapter 4, we present the experimental work, in which seven short-term rainfall-runoff regressions are performed, six daily and one hourly. The case studies correspond to various points of interest within the study areas with important implications for hydrological management. On an hourly scale, we analyse the efficiency and predictive capacity of Multiple Linear Regression (MLR) and Bayesian Neural Networks (BNN) at ten time horizons for the level of the Guadalhorce River at Cártama. We found that, for closer predictive horizons, a simpler linear approach (MLR) can outperform a non-linear approach with a priori greater capabilities (BNN). This finding could greatly simplify the development and application of such techniques. At a daily scale, we establish a comparative framework between the two previous models and a fully Bayesian method, Gaussian Processes (GP). This DD computational technique allows us to apply different transfer functions under a single model. This is an advantage over the other two DD models, since the results show that they sometimes work well in one domain but not in the other. During the construction of the models, we select the input variables progressively, through a trial-and-error method in which significant improvements with respect to the last predictor structure are taken into account while preserving the principle of parsimony. Here, we have used different types of data: real data collected in the monitoring networks, and data generated in parallel from physically based hydrological modelling (WiMMed). 
The results are robust, the major limitation being the high computational cost of the recurrent and iterative method used. Some results of this chapter were presented in Gulliver et al. (2014). In chapter 5, three medium-term prediction experiments are performed. The first modelling experiment is on a quarterly scale, where a hydrological time scheme determines the cumulative flow for specific time horizons. We start the scheme according to the relevant dates on which hydrological planning takes place. We confirm that forecasts are more successful once the first six months of the hydrological year have elapsed, rather than the three months at which we carry out the evaluations. The observed input variables quantified in the water system are: cumulative stream flow, cumulative rainfall, cumulative snowfall values and atmospheric oscillations (AO). At the modelling level, this experiment has shown the importance of combining mixed regression-classification models instead of regression-only models within static frameworks. In this way, we narrow the space of possible solutions and therefore improve the predictive behaviour of the DD model. During this exercise, we also carried out a classification study comparing three DD classifiers: Probabilistic Neural Network (PNN), K-Nearest Neighbour (KNN) and Support Vector Machine (SVM). We see that the SVM behaves better than the others with our data. However, more research is still needed on classifiers in hydro-meteorological frameworks like ours, because of their variability. We presented this part of the doctoral thesis in Gulliver et al. (2016). In the second section of this chapter (Sec. 5.3), we carry out a rainfall forecasting exercise on a monthly scale. To do so, we use BNN following the same construction method as the SVI model described in the previous chapter (Chapter 4), thus validating it on another time scale. 
However, the predictive results for this hydro-meteorological variable are poor. This confirms the difficulty of predicting this variable from historical data alone, without incorporating dynamic tools, and hence the need for complex hydrodynamic modelling to predict this important variable. On the other hand, this case serves to empirically infer the causal influence of the most relevant atmospheric oscillations at the points of study. Multiple simulations with the model-based approach made it possible to establish which indices have the greatest influence. In the last section of this chapter (Section 5.4), we predict the deviation or anomaly of rainfall and runoff indices for four time series representative of different locations within the Guadalfeo BR. In this case, we checked the suitability of seven statistical distributions for characterising the anomalies/deviations under Mediterranean conditions. Under this hypothesis, the indices that passed the Shapiro-Wilk normality test were modelled to analyse the capability of BNN to predict these indices at various time horizons. Here, predictions of negative phases (droughts or deficit periods) were poor, whereas the models were more successful for positive phases (wet periods). Regarding the causal inference of IC and its possible influence on the study area, we found that NAO and WEMO help forecasts for shorter time horizons, while MOI helps for longer cumulative time horizons. We analysed the relevance of these atmospheric variables in each case, since their inclusion was sometimes beneficial and sometimes not, following the construction rules and detailing them in each case study. 
Throughout the work, the usefulness of mixed modelling approaches has been verified, combining models based on observed data from the different monitoring networks with physical modelling to reproduce essential hydrological processes. With the proposed methodology, a positive influence of atmospheric oscillations has been observed for medium-term prediction within the study regions, with no such evidence for short-term (daily scale) predictions. The final conclusions and the most important points for future work are presented in chapter 6. Applications of this type of method are currently necessary. They help us to establish relationships based on measured hydro-meteorological data, and thus "based on real data", without imposing prior assumptions. These data-based experiments are very useful for limiting future uncertainty and optimising water resources. The establishment of temporal relationships between different environmental agents allows us, through supervised methods, to establish causal relationships. From there, a physical inference exercise is necessary to add coherence and make the scientific exercise robust. The results obtained in this work reaffirm the practicality of implementing these Data-Driven frameworks in both the public and private spheres, and are a good starting point for technology transfer. Most of the routines and models provided in this thesis could be applied directly in hydro-meteorological services or decision support systems for water officials. Potential users are as varied as public administrations and basin organisations, reservoir managers, energy companies that manage hydroelectric generation, irrigation communities and water bottling plants. 
Iterative, automated frameworks for data processing and modelling need to be established to make the most of the data collected in water systems.
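
The progressive, trial-and-error selection of input variables described in this abstract can be sketched as a greedy forward search that stops when the best remaining predictor no longer improves the score by a meaningful margin. This is only one plausible reading of the procedure: the function name, the `min_gain` threshold and the toy scoring function are assumptions, and in practice `score` would train and validate a model (for example MLR or BNN) on the chosen predictors.

```python
def forward_select(candidates, score, min_gain=0.01):
    """Greedy forward selection that favours parsimony.

    candidates : list of predictor names
    score      : callable taking a list of predictors, higher is better
    min_gain   : smallest improvement that justifies one more predictor
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Try adding each remaining predictor to the current structure.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(trials, key=lambda t: t[0])
        if top_score - best < min_gain:
            break  # no significant improvement: keep the simpler model
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best = top_score
    return selected, best


# Toy score (hypothetical): each predictor adds a fixed amount of skill.
gains = {"rain_t-1": 0.5, "flow_t-1": 0.3, "nao": 0.005}


def toy_score(predictors):
    return sum(gains[p] for p in predictors)


chosen, final_score = forward_select(list(gains), toy_score)
```

The repeated model fitting inside `score` is what makes this kind of search computationally expensive, which matches the limitation the abstract reports.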

    Operational use of machine learning models for sea-level modeling

    Intense offshore activity warrants temporal and accurate prediction of sea-level variability. In addition, sea level plays an important role in the groundwater level and quality of coastal aquifers. Climate change drives considerable change in all hydrological parameters and apparently affects sea-level variability. For prediction, highly complex numerical models are usually generated. To address these challenges, the study proposes the use of machine learning (ML) models with the climate change predictands and sea-level predictors. Three ML models are employed in this study, viz., Relevance Vector Machine (RVM), Extreme Learning Machine (ELM), and Gaussian Process Regression (GPR). The performance of the developed models is evaluated by visual comparison of predicted and observed datasets. Regression error curve plots, frequency of forecasting errors and a Taylor diagram, along with statistical performance metrics, were developed. Overall, it is found that the operational use of the selected ML algorithms was quite appealing for modeling studies. Among the three ML models, GPR performed slightly better than ELM and RVM.

    Identification of Influential Climate Indicators, Prediction of Long-Term Streamflow and Great Salt Lake Elevation Using Machine Learning Approach

    To meet the surging water demand due to rapid population growth and changing climatic conditions around the world, and to reduce the impact of floods and droughts, comprehensive water management and planning is necessary. Climatic variability, hydrologic uncertainty and variability of hydrologic quantities in time and space are inherent to hydrological modeling. Hydrologic modeling using a physically-based model can be very complex and typically requires detailed knowledge of physical processes. The availability of data is an important issue to justify the use of these models. Data-driven models are an alternative choice. This is a relatively new and efficient approach to modeling. Data-driven models bridge the gap between the classical regression and physically-based models. By using a data-driven model that relies on the machine learning approach, it is possible to produce reasonable predictions from a limited data set and limited knowledge of underlying physical processes of the system by just relating input and output. This dissertation uses the Multivariate Relevance Vector Machine (MVRVM) and Support Vector Machine (SVM) for predicting a variety of hydrological quantities. These models are used in this dissertation for identifying influential climate indicators, and for long-term streamflow prediction for multiple lead times at different locations in Utah. They are also used for prediction of the Great Salt Lake (GSL) elevation series. They provide reasonable predictions of hydrological quantities from the available data. The predictions from these models are robust and parsimonious. This research presents the first attempt to identify influential climate indicators and predict long lead-time streamflow in Utah, and to predict lake elevation using machine learning models. The approach presented herein has potential value for water resources planning and management, especially for irrigation and flood management.

    Multivariate Bayesian Machine Learning Regression for Operation and Management of Multiple Reservoir, Irrigation Canal, and River Systems

    The principal objective of this dissertation is to develop Bayesian machine learning models for multiple reservoir, irrigation canal, and river system operation and management. These types of models are derived from the emerging area of machine learning theory; they are characterized by their ability to capture the underlying physics of the system simply by examination of the measured system inputs and outputs. They can be used to provide probabilistic predictions of system behavior using only historical data. The models were developed in the form of a multivariate relevance vector machine (MVRVM) that is based on a sparse Bayesian learning machine approach for regression. Using this Bayesian approach, a predictive confidence interval is obtained from the model that captures the uncertainty of both the model and the data. The models were applied to the multiple reservoir, canal and river system located in the regulated Lower Sevier River Basin in Utah. The models were developed to perform predictions of multi-time-ahead releases of multiple reservoirs, diversions of multiple canals, and streamflow and water loss/gain in a river system. This research represents the first attempt to use a multivariate Bayesian learning regression approach to develop simultaneous multi-step-ahead predictions with predictive confidence intervals for multiple outputs in a regulated river basin system. These predictions will be of potential value to reservoir and canal operators in identifying the best decisions for operation and management of irrigation water supply systems

    Multi-step Ahead Inflow Forecasting for a Norwegian Hydro-Power Use-Case, Based on Spatial-Temporal Attention Mechanism

    Hydrological forecasting has been an ongoing area of research due to its importance for improving decision making in water resource management, flood management, and climate change mitigation. With the increasing availability of hydrological data, Machine Learning (ML) techniques have started to play an important role, enabling us to better understand and predict complex hydrological events. However, some challenges remain. Hydrological processes have spatial and temporal dependencies that are not always easy to capture with traditional ML models, and a thorough understanding of these dependencies is essential when developing accurate predictive models. This thesis explores the use of ML techniques in hydrological forecasting and consists of an introduction, two papers, and an application developed alongside the case study. The motivation for this research is to enhance our understanding of the spatial and temporal dependencies in hydrological processes and to explore how ML techniques, particularly those incorporating attention mechanisms, can aid in hydrological forecasting. The first paper is a chronological literature review that explores the development of data-driven forecasting in hydrology and highlights the potential application of attention mechanisms in hydrological forecasting. These attention mechanisms have proven to be successful in various domains, allowing models to focus on the most relevant parts of the input for making predictions, which is particularly useful when dealing with spatial and temporal data. The second paper is a case study of a specific ML model incorporating these attention mechanisms. The focus is to illustrate the influence of spatial and temporal dependencies in a real-world hydrological forecasting scenario, thereby showcasing the practical application of these techniques. 
In parallel with the case study, an application has been developed, employing the principles and techniques discovered throughout the course of this research. The application aims to provide a practical demonstration of the concepts explored in the thesis, contributing to the field of hydrological forecasting by introducing a tool for hydropower suppliers.Masteroppgave i Programvareutvikling samarbeid med HVLPROG399MAMN-PRO

    Prediction of River Discharge by Using Gaussian Basis Function

    River discharge data are vital for the design of water resources engineering projects, such as hydraulic structures like dams, barrages, and weirs. However, prediction of river discharge is complicated by variations in geometry and boundary roughness. The conventional method of estimating river discharge tends to be inaccurate because river discharge is nonlinear while the method is linear. An alternative method for predicting river discharge is therefore required. Soft computing techniques such as artificial neural networks (ANN) are able to predict nonlinear parameters such as river discharge. In this study, river discharge in the Pari River is predicted using a soft computing technique, specifically the Gaussian basis function. Raw water level data from 2011 to 2012 are used as input. The data were divided into two sets, a training dataset and a testing dataset. Of 314 data points, 200 were allocated as training data and 100 were used as testing data. The data were then processed in Matlab. The three input variables used in this study were the current water level, the 1-antecedent water level, and the 2-antecedent water level. The output variable was river discharge. After a number of trials, 19 hidden neurons with a spread value of 0.69106 gave the best model architecture. Performance evaluation measures, namely root mean square error, mean absolute error, coefficient of efficiency (CE), and coefficient of determination (R²), were used to indicate the overall performance of the selected network. R² for the training dataset was 0.983, which showed that predicted discharge is highly correlated with observed discharge. However, performance declined in the testing stage, where R² was 0.775: the presence of outliers affected the scatter of the testing data and reduced accuracy, giving a much lower R² than for the training dataset. This happened because fewer inputs were loaded for testing than for training. The RMSE and MSE recorded for training were much lower than for testing, indicating better model performance on the training data, since the error was smaller. Comparison with other types of neural network showed that the Gaussian basis function is recommended for river discharge prediction in the Pari River
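    The four evaluation measures named in this abstract can be sketched as below. This is a minimal illustration, not the authors' Matlab code: the abstract does not define CE or R², so the sketch assumes CE is the Nash-Sutcliffe coefficient of efficiency and R² is the squared Pearson correlation between observed and predicted discharge. The function names and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def coefficient_of_efficiency(obs, pred):
    """Nash-Sutcliffe efficiency (assumed meaning of CE): 1 - SSE / SST."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed meaning of R2)."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


# Made-up discharge values for illustration only; not data from the study.
observed = [10.0, 12.0, 15.0, 11.0]
predicted = [11.0, 12.0, 14.0, 11.0]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("CE:", coefficient_of_efficiency(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

    Under these assumed definitions CE and R² can differ for the same predictions, because CE penalises bias while the squared correlation does not; this is one reason to report both alongside RMSE and MAE.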