68 research outputs found

    SAHRA Integrated Modeling Approach Towards Basin-Scale Water Resources Management

    Get PDF
    Water resources decisions in the 21st century will have strong economic and environmental components and can therefore benefit from scenario analyses that make use of integrated river basin models. SAHRA (the National Science Foundation Science and Technology Center for Sustainability of semi-Arid Hydrology and Riparian Areas) is developing an integrated modeling framework based on four hierarchical levels: a physical systems model (including surface, subsurface and atmospheric components where appropriate), an engineering systems model (including agriculture, reservoirs, etc.), a human systems behavioral model (socio-economic components), and an institutional systems model (laws, compacts, etc.). This integrated framework is rooted in a perceptual-conceptual systems model of the river basin and a database support structure. This paper describes the SAHRA approach to linking the various hierarchical levels and discusses how it is being applied to answer the question: under what conditions are water markets and water banking feasible? Integration of the four hierarchical levels will allow water resource managers to consider the trading of water rights and third-party impacts when evaluating the potential for market-based mechanisms to allocate water resources effectively.
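    As a rough, hedged sketch of how such a hierarchy of levels might be chained in a scenario run, the snippet below passes a shared state through four placeholder levels in sequence. Every name, number, and the simple one-way coupling are assumptions made purely for illustration; they do not describe the actual SAHRA software design.

    ```python
    from dataclasses import dataclass
    from typing import Callable, Dict, List

    # Hypothetical sketch of chaining four hierarchical model levels; all names,
    # numbers, and the one-way coupling are illustrative assumptions only.

    State = Dict[str, float]

    @dataclass
    class ModelLevel:
        name: str
        update: Callable[[State], None]   # each level reads and modifies shared state

    def run_scenario(levels: List[ModelLevel], years: int) -> State:
        state: State = {"streamflow": 100.0, "allocated": 0.0, "traded": 0.0}
        for _ in range(years):
            for level in levels:          # physical -> engineering -> human -> institutional
                level.update(state)
        return state

    levels = [
        # Physical system: natural streamflow evolves.
        ModelLevel("physical", lambda s: s.update(streamflow=s["streamflow"] * 0.98)),
        # Engineering system: reservoirs and agriculture allocate a share of flow.
        ModelLevel("engineering", lambda s: s.update(allocated=0.6 * s["streamflow"])),
        # Human system: some fraction of allocations is offered on a water market.
        ModelLevel("human", lambda s: s.update(traded=0.1 * s["allocated"])),
        # Institutional system: laws and compacts cap how much may be traded.
        ModelLevel("institutional", lambda s: s.update(traded=min(s["traded"], 5.0))),
    ]
    print(run_scenario(levels, years=10))
    ```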

    A Mass-Conserving-Perceptron for Machine Learning-Based Modeling of Geoscientific Systems

    Full text link
    Although decades of effort have been devoted to building Physical-Conceptual (PC) models for predicting the time-series evolution of geoscientific systems, recent work shows that Machine Learning (ML) based Gated Recurrent Neural Network (GRNN) technology can be used to develop models that are much more accurate. However, the difficulty of extracting physical understanding from ML-based models complicates their utility for enhancing scientific knowledge regarding system structure and function. Here, we propose a physically interpretable Mass Conserving Perceptron (MCP) as a way to bridge the gap between PC-based and ML-based modeling approaches. The MCP exploits the inherent isomorphism between the directed graph structures underlying both PC models and GRNNs to explicitly represent the mass-conserving nature of physical processes, while enabling the functional nature of such processes to be directly learned (in an interpretable manner) from available data using off-the-shelf ML technology. As a proof of concept, we investigate the functional expressivity (capacity) of the MCP, explore its ability to parsimoniously represent the rainfall-runoff (RR) dynamics of the Leaf River Basin, and demonstrate its utility for scientific hypothesis testing. To conclude, we discuss extensions of the concept to enable ML-based physical-conceptual representation of the coupled nature of mass-energy-information flows through geoscientific systems. (60 pages and 7 figures in the main text; 10 figures and 10 tables in the supplementary material.)
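    The core idea can be conveyed with a minimal, hedged sketch (a toy single-state cell, not the MCP architecture or code of the paper): learned gates allocate fractions of the stored mass to outflow and loss, so the state update conserves mass by construction while the gate parameters remain free to be fitted to data.

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class ToyMassConservingCell:
        """Minimal single-state recurrent cell that conserves mass by construction.

        Each step updates the stored mass S as S <- S + u - q - loss, where the
        outflow q and loss are gated fractions of S whose sum cannot exceed 1,
        so no mass is created or destroyed.  This is a hedged toy illustration,
        not the MCP architecture described in the paper.
        """

        def __init__(self, rng=None):
            rng = rng or np.random.default_rng(0)
            # Two scalar gate parameters (these would normally be learned from data).
            self.w_out = rng.normal()
            self.w_loss = rng.normal()

        def step(self, S, u):
            g_out, g_loss = sigmoid(self.w_out), sigmoid(self.w_loss)
            total = g_out + g_loss
            if total > 1.0:                      # keep the gates a partition of S
                g_out, g_loss = g_out / total, g_loss / total
            q = g_out * S                        # outflow (e.g. streamflow)
            loss = g_loss * S                    # loss term (e.g. evapotranspiration)
            return S + u - q - loss, q, loss

    # Usage: push a short synthetic rainfall series through the cell and verify
    # that the cumulative water balance closes to machine precision.
    cell = ToyMassConservingCell()
    S, rain, q_sum, loss_sum = 0.0, np.array([5.0, 0.0, 2.0, 0.0, 1.0]), 0.0, 0.0
    for u in rain:
        S, q, loss = cell.step(S, u)
        q_sum, loss_sum = q_sum + q, loss_sum + loss
    assert np.isclose(rain.sum(), S + q_sum + loss_sum)
    print("final storage:", S, "total outflow:", q_sum, "total loss:", loss_sum)
    ```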

    Do Nash values have value?

    Get PDF
    How Do We Communicate Model Performance? The process of model performance evaluation is of primary importance, not only in the model development and calibration process, but also when communicating the results to other researchers and to stakeholders. The basic ‘rule’ is that every modelling result should be put into context, for example by indicating model performance using appropriate indicators and by highlighting potential sources of uncertainty; this practice has found its way into the large majority of papers and conference presentations. While the question of how to communicate the performance of a model to potential end-users is currently receiving increasing interest (e.g. Pappenberger and Beven, 2006), we, as well as many other colleagues, regularly observe that researchers take much less care when communicating model performance amongst themselves. We seem to assume that we are speaking about familiar performance concepts and that they have comparable significance for various types of model applications and case studies. In doing so, we do not pay sufficient attention to making clear what the values represented by our performance measures really mean. Even a concept as simple as the bias between an observed and a simulated time series needs to be put into proper context: whereas a 10% bias in simulated discharge may be unacceptable in a climate change impact assessment, it may be of less concern in the context of real-time flood forecasting. While some performance measures can have an absolute meaning, such as the common measure of linear correlation, the vast majority of performance measures, and in particular quadratic-error-based measures, can only be properly interpreted when viewed in the context of a reference value (…
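    As a concrete, hedged illustration of why a reference value matters (the discharge numbers below are made up), the sketch computes percent bias and a quadratic-error skill score against two different benchmarks; the same simulation looks quite skilful when judged against the observed mean (the classical NSE reference) but considerably weaker against a simple persistence benchmark.

    ```python
    import numpy as np

    def pbias(obs, sim):
        """Percent bias: positive values indicate overestimation by the model."""
        return 100.0 * (sim - obs).sum() / obs.sum()

    def skill_score(obs, sim, benchmark):
        """Quadratic-error skill relative to a reference (benchmark) simulation.

        1.0 means a perfect model, 0.0 means no better than the benchmark, and
        negative values mean worse than the benchmark.  Using the observed mean
        as the benchmark recovers the familiar Nash-Sutcliffe efficiency.
        """
        mse_model = np.mean((sim - obs) ** 2)
        mse_bench = np.mean((benchmark - obs) ** 2)
        return 1.0 - mse_model / mse_bench

    obs = np.array([1.0, 1.1, 1.3, 1.6, 2.0, 2.3])   # hypothetical observed discharge
    sim = np.array([1.2, 1.3, 1.4, 1.5, 1.8, 2.0])   # hypothetical simulated discharge

    print(f"percent bias: {pbias(obs, sim):+.1f} %")
    # Benchmark 1: the observed mean flow (the classical NSE reference value).
    print("skill vs. mean flow:", skill_score(obs, sim, np.full_like(obs, obs.mean())))
    # Benchmark 2: yesterday's observation (a tougher reference for forecasting).
    print("skill vs. persistence:", skill_score(obs[1:], sim[1:], obs[:-1]))
    ```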

    On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data

    Get PDF
    Using information-theoretic quantities in practical applications with continuous data is often hindered by the fact that probability density functions need to be estimated in higher dimensions, which can become unreliable or even computationally unfeasible. To make these useful quantities more accessible, alternative approaches such as binned frequencies using histograms and k-nearest neighbors (k-NN) have been proposed. However, a systematic comparison of the applicability of these methods has been lacking. We wish to fill this gap by comparing kernel-density-based estimation (KDE) with these two alternatives in carefully designed synthetic test cases. Specifically, we wish to estimate the information-theoretic quantities entropy, Kullback–Leibler divergence, and mutual information from sample data. As a reference, the results are compared to closed-form solutions or numerical integrals. We generate samples from distributions of various shapes in dimensions ranging from one to ten. We evaluate the estimators’ performance as a function of sample size, distribution characteristics, and chosen hyperparameters. We further compare the required computation time and specific implementation challenges. Notably, k-NN estimation tends to outperform other methods, considering algorithmic implementation, computational efficiency, and estimation accuracy, especially with sufficient data. This study provides valuable insights into the strengths and limitations of the different estimation methods for information-theoretic quantities. It also highlights the significance of considering the characteristics of the data, as well as the targeted information-theoretic quantity, when selecting an appropriate estimation technique. These findings will assist scientists and practitioners in choosing the most suitable method, considering their specific application and available data. We have collected the compared estimation methods in a ready-to-use open-source Python 3 toolbox and thereby hope to promote the use of information-theoretic quantities by researchers and practitioners to evaluate the information in data and models in various disciplines.
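    For orientation, the k-NN approach that performs well in this comparison can be sketched in a few lines; the snippet below is a minimal, unoptimized Kozachenko-Leonenko entropy estimator (it is not the toolbox released with the paper) and checks itself against the closed-form entropy of a standard normal.

    ```python
    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.special import digamma, gammaln

    def knn_entropy(samples, k=3):
        """Kozachenko-Leonenko k-NN estimator of differential entropy (in nats)."""
        x = np.asarray(samples, dtype=float)
        if x.ndim == 1:
            x = x[:, None]
        n, d = x.shape
        # Euclidean distance from each point to its k-th nearest neighbour
        # (query returns the point itself as the first hit, hence k + 1).
        r = cKDTree(x).query(x, k=k + 1)[0][:, -1]
        log_unit_ball = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
        return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(r))

    # Sanity check against the closed-form entropy of a d-dimensional standard
    # normal, H = d/2 * log(2 * pi * e).
    rng = np.random.default_rng(42)
    for d in (1, 3, 10):
        sample = rng.standard_normal((5000, d))
        print(d, round(knn_entropy(sample), 3), round(0.5 * d * np.log(2 * np.pi * np.e), 3))
    ```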

    Model Calibration in Watershed Hydrology

    Get PDF
    Hydrologic models use relatively simple mathematical equations to conceptualize and aggregate the complex, spatially distributed, and highly interrelated water, energy, and vegetation processes in a watershed. A consequence of process aggregation is that the model parameters often do not represent directly measurable entities and must, therefore, be estimated using measurements of the system inputs and outputs. During this process, known as model calibration, the parameters are adjusted so that the behavior of the model approximates, as closely and consistently as possible, the observed response of the hydrologic system over some historical period of time. This chapter reviews the current state of the art of model calibration in watershed hydrology, with special emphasis on our own contributions over the last few decades. We discuss the historical background that has led to current perspectives, and review different approaches for manual and automatic single- and multi-objective parameter estimation. In particular, we highlight recent developments in the calibration of distributed hydrologic models using parameter dimensionality reduction sampling, parameter regularization, and parallel computing.
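    To make the notion of automatic calibration concrete, the hedged sketch below builds a toy one-bucket model with made-up forcing (it is not any of the models or algorithms reviewed in the chapter), generates synthetic "observations" from known parameters, and then recovers those parameters by minimizing a squared-error objective with a global search.

    ```python
    import numpy as np
    from scipy.optimize import differential_evolution

    def bucket_model(params, rain, pet):
        """Toy one-bucket conceptual model; illustrative only, not a real code."""
        cap, k = params                            # storage capacity, outflow coefficient
        storage, runoff = 0.0, []
        for p, e in zip(rain, pet):
            storage = max(storage + p - e, 0.0)    # add rain, remove evaporation
            spill = max(storage - cap, 0.0)        # saturation excess
            storage -= spill
            q = k * storage                        # linear-reservoir drainage
            storage -= q
            runoff.append(q + spill)
        return np.array(runoff)

    def rmse(params, rain, pet, q_obs):
        return np.sqrt(np.mean((bucket_model(params, rain, pet) - q_obs) ** 2))

    # Synthetic "observations": generate with known parameters plus noise, then
    # recover them by automatic calibration (the essence of parameter estimation).
    rng = np.random.default_rng(1)
    rain = rng.gamma(shape=0.5, scale=8.0, size=365)
    pet = np.full(365, 2.0)
    q_obs = bucket_model((60.0, 0.15), rain, pet) + rng.normal(0.0, 0.05, 365)

    result = differential_evolution(rmse, bounds=[(10.0, 200.0), (0.01, 0.9)],
                                    args=(rain, pet, q_obs), seed=1)
    print("calibrated parameters:", np.round(result.x, 2), "RMSE:", round(result.fun, 3))
    ```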

    Decomposition of the Mean Squared Error and NSE Performance Criteria: Implications for Improving Hydrological Modelling

    Get PDF
    The mean squared error (MSE) and the related normalization, the Nash-Sutcliffe efficiency (NSE), are the two criteria most widely used for calibration and evaluation of hydrological models with observed data. Here, we present a diagnostically interesting decomposition of NSE (and hence MSE), which facilitates analysis of the relative importance of its different components in the context of hydrological modelling, and show how model calibration problems can arise due to interactions among these components. The analysis is illustrated by calibrating a simple conceptual precipitation-runoff model to daily data for a number of Austrian basins having a broad range of hydro-meteorological characteristics. Evaluation of the results clearly demonstrates the problems that can be associated with any calibration based on the NSE (or MSE) criterion. While we propose and test an alternative criterion that can help to reduce model calibration problems, the primary purpose of this study is not to present an improved measure of model performance. Instead, we seek to show that there are systematic problems inherent in any optimization based on formulations related to the MSE. The analysis and results have implications for the manner in which we calibrate and evaluate environmental models; we discuss these and suggest possible ways forward that may move us towards an improved and diagnostically meaningful approach to model performance evaluation and identification.
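    For reference, the decomposition at the centre of the analysis can be written as NSE = 2*alpha*r - alpha^2 - beta_n^2, where r is the linear correlation between simulated and observed series, alpha the ratio of their standard deviations, and beta_n the mean error normalized by the observed standard deviation. The sketch below evaluates these components on synthetic data made up for illustration only.

    ```python
    import numpy as np

    def nse_components(obs, sim):
        """Decompose NSE into correlation, variability, and bias terms.

        NSE = 2 * alpha * r - alpha**2 - beta_n**2, with
        alpha  = sigma_sim / sigma_obs,
        beta_n = (mu_sim - mu_obs) / sigma_obs,
        and r the linear correlation between simulation and observation.
        """
        r = np.corrcoef(obs, sim)[0, 1]
        alpha = sim.std() / obs.std()
        beta_n = (sim.mean() - obs.mean()) / obs.std()
        nse = 2.0 * alpha * r - alpha**2 - beta_n**2
        return nse, r, alpha, beta_n

    # Illustration with synthetic data: a simulation with nearly perfect timing
    # (high r) but damped variability and a systematic bias can still score a
    # seemingly good NSE, because the components interact.
    rng = np.random.default_rng(0)
    obs = np.exp(rng.normal(1.0, 0.8, 1000))
    sim = 0.7 * obs + 0.5 + rng.normal(0.0, 0.2, 1000)
    nse, r, alpha, beta_n = nse_components(obs, sim)
    print(f"NSE={nse:.3f}  r={r:.3f}  alpha={alpha:.3f}  beta_n={beta_n:.3f}")
    ```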

    Estimating epistemic and aleatory uncertainties during hydrologic modeling: An information theoretic approach

    Full text link
    Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/98239/1/wrcr20161.pd
