18 research outputs found
Legitimising data-driven models: exemplification of a new data-driven mechanistic modelling framework
In this paper the difficult problem of how to legitimise data-driven hydrological models is addressed using an example of a simple artificial neural network modelling problem. Many data-driven models in hydrology have been criticised for their black-box characteristics, which prohibit adequate understanding of their mechanistic behaviour and restrict their wider heuristic value. In response, presented here is a new generic data-driven mechanistic modelling framework. The framework is significant because it incorporates an evaluation of the legitimacy of a data-driven model’s internal modelling mechanism as a core element in the modelling process. The framework’s value is demonstrated by two simple artificial neural network river forecasting scenarios. We develop a novel adaptation of first-order partial derivative, relative sensitivity analysis to enable each model’s mechanistic legitimacy to be evaluated within the framework. The results demonstrate the limitations of standard, goodness-of-fit validation procedures by highlighting how the internal mechanisms of complex models that produce the best fit scores can have lower mechanistic legitimacy than simpler counterparts whose scores are only slightly inferior. Thus, our study directly tackles one of the key debates in data-driven, hydrological modelling: is it acceptable for our ends (i.e. model fit) to justify our means (i.e. the numerical basis by which that fit is achieved)?
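The kind of first-order relative sensitivity analysis described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: it substitutes a central finite-difference approximation for analytic partial derivatives, and the toy network, weights and input values are hypothetical.

```python
import numpy as np

def relative_sensitivity(model, x, eps=1e-6):
    """First-order relative sensitivity S_i = (dy/dx_i) * (x_i / y),
    estimated by central finite differences as a stand-in for
    analytic partial derivatives."""
    y = model(x)
    sens = np.empty_like(x, dtype=float)
    for i in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        dy_dx = (model(xp) - model(xm)) / (2 * eps)
        sens[i] = dy_dx * x[i] / y
    return sens

# Toy single-hidden-layer network standing in for a river forecasting model.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
W2, b2 = rng.normal(size=3), 0.5

def net(x):
    return W2 @ np.tanh(W1 @ x + b1) + b2

s = relative_sensitivity(net, np.array([1.0, 2.0]))
print(s)  # one relative sensitivity per input
```

Comparing such sensitivity profiles against prior physical expectations is what allows a model's internal mechanism, rather than just its fit, to be judged.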
Ideal point error for model assessment in data-driven river flow forecasting
When analysing the performance of hydrological models in river forecasting, researchers use a number of diverse statistics. Although some statistics appear to be used more regularly in such analyses than others, there is a distinct lack of consistency in evaluation, making studies undertaken by different authors or performed at different locations difficult to compare in a meaningful manner. Moreover, even within individual reported case studies, substantial contradictions are found to occur between one measure of performance and another. In this paper we examine the ideal point error (IPE) metric – a recently introduced measure of model performance that integrates a number of recognised metrics in a logical way. Having a single, integrated measure of performance is appealing as it should permit more straightforward model inter-comparisons. However, this is reliant on a transferable standardisation of the individual metrics that are combined to form the IPE. This paper examines one potential option for standardisation: the use of naive model benchmarking
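As an illustration of the idea, here is a minimal sketch of an IPE-style score. It assumes each metric is standardised to [0, 1] between its ideal value and a reference value before a weighted Euclidean distance from the ideal point is taken; the specific metrics, weights and standardisation used in the paper may differ, and the persistence benchmark below is a hypothetical choice.

```python
import numpy as np

def ipe(metrics, ideal, worst, weights=None):
    """Illustrative ideal point error: standardise each metric to [0, 1]
    between its ideal and reference (worst-case) value, then take the
    weighted Euclidean distance from the ideal point (0 = perfect)."""
    m = np.asarray(metrics, float)
    i = np.asarray(ideal, float)
    w0 = np.asarray(worst, float)
    z = np.clip((m - i) / (w0 - i), 0.0, 1.0)  # 0 at ideal, 1 at reference
    wts = np.ones_like(z) if weights is None else np.asarray(weights, float)
    wts = wts / wts.sum()
    return float(np.sqrt(np.sum(wts * z**2)))

# A naive persistence forecast can supply the reference values, which is
# the benchmarking-based standardisation option the paper examines.
obs = np.array([1.0, 2.0, 3.0, 2.5])
pred = np.array([1.1, 1.9, 2.8, 2.6])
naive = np.roll(obs, 1)  # crude persistence benchmark (wraps at the ends)
rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
score = ipe([rmse(obs, pred)], ideal=[0.0], worst=[rmse(obs, naive)])
```

Standardising against a naive benchmark makes the resulting score comparable across catchments, since each metric is expressed relative to what a trivially simple model achieves at that site.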
Sensitivity analysis for comparison, validation and physical-legitimacy of neural network-based hydrological models
This paper addresses the difficult question of how to perform meaningful comparisons between neural network-based hydrological models and alternative modelling approaches. Standard, goodness-of-fit metric approaches are limited since they only assess numerical performance and not physical legitimacy of the means by which output is achieved. Consequently, the potential for general application or catchment transfer of such models is seldom understood. This paper presents a partial derivative, relative sensitivity analysis method as a consistent means by which the physical legitimacy of models can be evaluated. It is used to compare the behaviour and physical rationality of a generalised linear model and two neural network models for predicting median flood magnitude in rural catchments. The different models perform similarly in terms of goodness-of-fit statistics, but behave quite distinctly when the relative sensitivities of their inputs are evaluated. The neural solutions are seen to offer an encouraging degree of physical legitimacy in their behaviour, over that of a generalised linear modelling counterpart, particularly when overfitting is constrained. This indicates that neural models offer preferable solutions for transfer into ungauged catchments. Thus, the importance of understanding both model performance and physical legitimacy when comparing neural models with alternative modelling approaches is demonstrated
Including spatial distribution in a data-driven rainfall-runoff model to improve reservoir inflow forecasting in Taiwan
Multi-step ahead inflow forecasting has a critical role to play in reservoir operation and management in Taiwan during typhoons, as statutory legislation requires a minimum of 3 hours' warning to be issued before any reservoir releases are made. However, the complex spatial and temporal heterogeneity of typhoon rainfall, coupled with a remote and mountainous physiographic context, makes the development of real-time rainfall-runoff models that can accurately predict reservoir inflow several hours ahead of time challenging. Consequently, there is an urgent, operational requirement for models that can enhance reservoir inflow prediction at forecast horizons of more than 3 hours. In this paper we develop a novel semi-distributed, data-driven, rainfall-runoff model for the Shihmen catchment, north Taiwan. A suite of Adaptive Network-based Fuzzy Inference System solutions is created using various combinations of auto-regressive, spatially-lumped radar and point-based rain gauge predictors. Different levels of spatially-aggregated radar-derived rainfall data are used to generate 4, 8 and 12 sub-catchment input drivers. In general, the semi-distributed radar rainfall models outperform their less complex counterparts in predictions of reservoir inflow at lead times greater than 3 hours. Performance is found to be optimal when spatial aggregation is restricted to 4 sub-catchments, with up to 30% improvements in performance over lumped and point-based models being evident at 5-hour lead times. The potential benefits of applying semi-distributed, data-driven models in reservoir inflow modelling specifically, and hydrological modelling more generally, are thus demonstrated
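The construction of semi-distributed inputs of the kind described above can be sketched as aggregating a gridded radar rainfall field into per-sub-catchment means. This is a toy illustration: the grid, the sub-catchment labelling and the 4-zone split are hypothetical stand-ins for the real radar data and catchment delineation.

```python
import numpy as np

# Hypothetical 4x4 radar grid over the catchment, with a label grid
# assigning each cell to one of 4 sub-catchments.
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])

def subcatchment_means(radar_field, labels, n_sub=4):
    """Aggregate a gridded rainfall field into per-sub-catchment mean
    rainfall: one semi-distributed input driver per sub-catchment."""
    return np.array([radar_field[labels == k].mean() for k in range(n_sub)])

field = np.arange(16, dtype=float).reshape(4, 4)  # toy rainfall field
print(subcatchment_means(field, labels))
```

Repeating this per time step yields one rainfall series per sub-catchment, which can then be lagged and combined with auto-regressive inflow terms as model predictors.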
Neuroemulation: definition and key benefits for water resources research
Neuroemulation is the art and science of using a neural network model to replicate the external behaviour of some other model, and it is an activity that is distinct from neural-network-based simulation. Whilst it has become a recognised and established sub-discipline in many fields of study, it remains poorly defined in the field of water resources and its many potential benefits have not been adequately recognised to date. One reason for the lack of recognition of the field is the difficulty in identifying, collating and synthesising published neuroemulation studies, because simple database searching fails to identify papers concerned with a field of study for which an agreed conceptual and terminological framework does not yet exist. Therefore, in this paper we provide a first attempt at defining this framework for use in water resources. We identify eight key benefits offered by neuroemulation and exemplify these with relevant examples from the literature. The concluding section highlights a number of strategic research directions, related to the identified potential of neuroemulators in water resources modelling
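The basic neuroemulation workflow, training a neural network on input-output pairs generated by another model, can be sketched as follows. This is a hedged illustration, not the method of any particular study: the "original model" is a cheap analytic stand-in for what would in practice be an expensive simulation model, and the fitting scheme (random tanh features with a least-squares output layer, in the style of an extreme learning machine) is just one simple way to train the emulator.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical original model to be emulated; in practice this would be
# an expensive process-based simulation, not a cheap analytic function.
def original_model(x):
    return np.sin(3 * x) + 0.5 * x

# Label a sample of inputs by running the original model.
X = rng.uniform(-1, 1, size=(200, 1))
y = original_model(X)

# Single-hidden-layer emulator: random tanh features plus a
# least-squares output layer.
H = 50
W1 = rng.normal(0, 2, size=(1, H))
b1 = rng.normal(0, 1, size=H)
Phi = np.tanh(X @ W1 + b1)
w2, *_ = np.linalg.lstsq(Phi, y, rcond=None)

def emulate(x):
    return np.tanh(x @ W1 + b1) @ w2

rmse = float(np.sqrt(np.mean((emulate(X) - y) ** 2)))
```

Once trained, the emulator reproduces the original model's external behaviour at a fraction of its run time, which is the source of many of the benefits the paper identifies.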
Improved validation framework and R-package for artificial neural network models
Validation is a critical component of any modelling process. In artificial neural network (ANN) modelling, validation generally consists of the assessment of model predictive performance on an independent validation set (predictive validity). However, this ignores other aspects of model validation considered to be good practice in other areas of environmental modelling, such as residual analysis (replicative validity) and checking the plausibility of the model in relation to a priori system understanding (structural validity). In order to address this shortcoming, a validation framework for ANNs is introduced in this paper that covers all of the above aspects of validation. In addition, the validann R-package is introduced that enables these validation methods to be implemented in a user-friendly and consistent fashion. The benefits of the framework and R-package are demonstrated for two environmental modelling case studies, highlighting the importance of considering replicative and structural validity in addition to predictive validity
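The three-part framework described above can be illustrated with a toy check in Python (the validann package itself is written in R; the function and report keys below are hypothetical analogues, not the package's API):

```python
import numpy as np

def validate(y_obs, y_pred, sens=None):
    """Toy three-part validation report: predictive validity (fit on an
    independent validation set), replicative validity (residual
    diagnostics) and structural validity (input sensitivities matching
    a priori expectations)."""
    resid = y_obs - y_pred
    report = {
        # Predictive: Nash-Sutcliffe efficiency on the validation set.
        "NSE": 1 - np.sum(resid**2) / np.sum((y_obs - y_obs.mean())**2),
        # Replicative: residuals should be unbiased and roughly
        # uncorrelated in time (lag-1 autocorrelation near zero).
        "resid_mean": float(resid.mean()),
        "resid_lag1": float(np.corrcoef(resid[:-1], resid[1:])[0, 1]),
    }
    if sens is not None:
        # Structural: do input sensitivities have the expected sign?
        report["signs_ok"] = bool(
            np.all(np.sign(sens["got"]) == np.sign(sens["expected"]))
        )
    return report

y_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = y_obs - np.array([0.1, -0.1, 0.1, -0.1, 0.1])
report = validate(y_obs, y_pred,
                  sens={"got": np.array([0.4, 0.6]),
                        "expected": np.array([1.0, 1.0])})
```

A model can score well on the first check while failing the other two, which is exactly the shortcoming of purely predictive validation that the framework addresses.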
Data-driven modelling approaches for socio-hydrology: opportunities and challenges within the Panta Rhei Science Plan
“Panta Rhei – Everything Flows” is the science plan for the International Association of Hydrological Sciences scientific decade 2013–2023. It is founded on the need for improved understanding of the mutual, two-way interactions occurring at the interface of hydrology and society, and their role in influencing future hydrologic system change. It calls for strategic research effort focussed on the delivery of coupled, socio-hydrologic models. In this paper we explore and synthesize the opportunities and challenges that socio-hydrology presents for data-driven modelling. We highlight the potential for a new era of collaboration between data-driven and more physically-based modellers that should improve our ability to model and manage socio-hydrologic systems. Crucially, we approach data-driven, conceptual and physical modelling paradigms as being complementary rather than competing; positioning them along a continuum of modelling approaches that reflects the relative extent to which hypotheses and/or data are available to inform the model development process
GeoComputation, Second Edition
A revision of Openshaw and Abrahart's seminal work, 'GeoComputation, Second Edition' retains the influence of its originators while also providing updated, state-of-the-art information on changes in the computational environment. In keeping with the field's development, this new edition takes a broader view and provides comprehensive coverage across the field of GeoComputation