237 research outputs found

    The need for operational reasoning in data-driven rating curve prediction of suspended sediment

    The use of data-driven modelling techniques to deliver improved suspended sediment rating curves has received considerable interest in recent years. Studies indicate an increased level of performance over traditional approaches when such techniques are adopted. However, closer scrutiny reveals that, unlike their traditional counterparts, data-driven solutions commonly include lagged sediment data as model inputs, and this seriously limits their operational application. In this paper we argue the need for a greater degree of operational reasoning underpinning data-driven rating curve solutions and demonstrate how incorrect conclusions about the performance of a data-driven modelling technique can be reached when the model solution is based upon operationally invalid input combinations. We exemplify the problem through the re-analysis and augmentation of a recent and typical published study which uses gene expression programming to model the rating curve. We compare and contrast the previously published solutions, whose inputs negate their operational application, with a range of newly developed and directly comparable traditional and data-driven solutions which do have operational value. Results clearly demonstrate that the performance benefits of the published gene expression programming solutions are dependent on the inclusion of operationally limiting, lagged data inputs. Indeed, when operationally inapplicable input combinations are discounted from the models, and the analysis is repeated, gene expression programming fails to perform as well as many simpler, more standard multiple linear regression, piecewise linear regression and neural network counterparts. The potential for overstatement of the benefits of the data-driven paradigm in rating curve studies is thus highlighted.

    Neural network emulation of a rainfall-runoff model

    The potential of an artificial neural network to perform simple non-linear hydrological transformations is examined. Four neural network models were developed to emulate different facets of a recognised non-linear hydrological transformation equation that possessed a small number of variables and contained no temporal component. The modelling process was based on a set of uniform random distributions. The cloning operation facilitated a direct comparison with the exact equation-based relationship. It also provided broader information about the power of a neural network to emulate existing equations and model non-linear relationships. Several comparisons with least squares multiple linear regression were performed. The first experiment involved a direct emulation of the Xinanjiang Rainfall-Runoff Model. The next two experiments were designed to assess the competencies of two neural solutions that were developed on a reduced number of inputs. This involved the omission and conflation of previous inputs. The final experiment used derived variables to model intrinsic but otherwise concealed internal relationships that are of hydrological interest. Two recent studies have suggested that neural solutions offer no worthwhile improvements in comparison to traditional weighted linear transfer functions for capturing the non-linear nature of hydrological relationships. Yet such fundamental properties are intrinsic aspects of catchment processes that cannot be excluded or ignored. The results from the four experiments that are reported in this paper are used to challenge the interpretations from these two earlier studies and thus further the debate with regard to the appropriateness of neural networks for hydrological modelling.

    Legitimising data-driven models: exemplification of a new data-driven mechanistic modelling framework

    In this paper the difficult problem of how to legitimise data-driven hydrological models is addressed using an example of a simple artificial neural network modelling problem. Many data-driven models in hydrology have been criticised for their black-box characteristics, which prohibit adequate understanding of their mechanistic behaviour and restrict their wider heuristic value. In response, presented here is a new generic data-driven mechanistic modelling framework. The framework is significant because it incorporates an evaluation of the legitimacy of a data-driven model's internal modelling mechanism as a core element in the modelling process. The framework's value is demonstrated by two simple artificial neural network river forecasting scenarios. We develop a novel adaptation of first-order partial derivative, relative sensitivity analysis to enable each model's mechanistic legitimacy to be evaluated within the framework. The results demonstrate the limitations of standard, goodness-of-fit validation procedures by highlighting how the internal mechanisms of complex models that produce the best fit scores can have lower mechanistic legitimacy than simpler counterparts whose scores are only slightly inferior. Thus, our study directly tackles one of the key debates in data-driven hydrological modelling: is it acceptable for our ends (i.e. model fit) to justify our means (i.e. the numerical basis by which that fit is achieved)?

    Ideal point error for model assessment in data-driven river flow forecasting

    When analysing the performance of hydrological models in river forecasting, researchers use a number of diverse statistics. Although some statistics appear to be used more regularly in such analyses than others, there is a distinct lack of consistency in evaluation, making studies undertaken by different authors or performed at different locations difficult to compare in a meaningful manner. Moreover, even within individual reported case studies, substantial contradictions are found to occur between one measure of performance and another. In this paper we examine the ideal point error (IPE) metric, a recently introduced measure of model performance that integrates a number of recognised metrics in a logical way. Having a single, integrated measure of performance is appealing as it should permit more straightforward model inter-comparisons. However, this is reliant on a transferable standardisation of the individual metrics that are combined to form the IPE. This paper examines one potential option for standardisation: the use of naive model benchmarking.
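
    The idea of standardising individual metrics against a naive benchmark and then combining them into one score can be sketched as follows. This is only an illustration under stated assumptions: the choice of RMSE and MAE, the persistence (last-observed-value) benchmark, the equal weighting, and the function names are all hypothetical and are not taken from the published IPE definition.

```python
import math

def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def persistence_forecast(obs):
    """Naive benchmark (assumed here): the forecast for step t is the observation at t-1."""
    return obs[:-1]

def ideal_point_error(obs, sim, metrics=(rmse, mae)):
    """Combine benchmark-standardised error metrics into a single score.

    Each metric is divided by the same metric for the naive forecast, so a
    ratio of 0 is a perfect model and a ratio of 1 matches the benchmark.
    The score is the root mean square of those ratios, i.e. a scaled
    Euclidean distance from the ideal point (all ratios equal to 0).
    Raises ZeroDivisionError if the naive forecast is itself perfect.
    """
    target = obs[1:]   # the first step has no naive forecast, so drop it
    model = sim[1:]
    naive = persistence_forecast(obs)
    ratios = [m(target, model) / m(target, naive) for m in metrics]
    return math.sqrt(sum(r ** 2 for r in ratios) / len(ratios))
```

    Under these assumptions a score below 1 means the model beats the persistence benchmark on balance, and a score above 1 means it does worse.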

    Sensitivity analysis for comparison, validation and physical-legitimacy of neural network-based hydrological models

    This paper addresses the difficult question of how to perform meaningful comparisons between neural network-based hydrological models and alternative modelling approaches. Standard, goodness-of-fit metric approaches are limited since they only assess numerical performance and not the physical legitimacy of the means by which output is achieved. Consequently, the potential for general application or catchment transfer of such models is seldom understood. This paper presents a partial derivative, relative sensitivity analysis method as a consistent means by which the physical legitimacy of models can be evaluated. It is used to compare the behaviour and physical rationality of a generalised linear model and two neural network models for predicting median flood magnitude in rural catchments. The different models perform similarly in terms of goodness-of-fit statistics, but behave quite distinctly when the relative sensitivities of their inputs are evaluated. The neural solutions are seen to offer an encouraging degree of physical legitimacy in their behaviour, over that of a generalised linear modelling counterpart, particularly when overfitting is constrained. This indicates that neural models offer preferable solutions for transfer into ungauged catchments. Thus, the importance of understanding both model performance and physical legitimacy when comparing neural models with alternative modelling approaches is demonstrated.
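
    A relative sensitivity of the kind described above is commonly written as RS_i = (dy/dx_i) * (x_i / y), the percentage change in output per percentage change in input i. The sketch below approximates the partial derivative with a central finite difference so that it works for any callable model; that numerical shortcut, the function name and the step size are assumptions made for illustration, and the paper's own method may derive the derivatives analytically from the model.

```python
def relative_sensitivity(model, x, rel_step=1e-6):
    """Approximate the relative sensitivity of model(x) to each input.

    model: callable taking a list of floats and returning a non-zero float.
    x: list of input values at which sensitivities are evaluated.
    Returns one value per input: (dy/dx_i) * (x_i / y).
    """
    y = model(x)
    sensitivities = []
    for i, xi in enumerate(x):
        # Step proportional to the input magnitude; fall back to rel_step at zero
        h = rel_step * abs(xi) if xi != 0 else rel_step
        up = list(x)
        up[i] = xi + h
        down = list(x)
        down[i] = xi - h
        dy_dx = (model(up) - model(down)) / (2 * h)  # central difference
        sensitivities.append(dy_dx * xi / y)
    return sensitivities

# Hypothetical power-law model: its relative sensitivities equal the exponents
def example_model(x):
    return 2.0 * x[0] ** 1.5 * x[1] ** -0.5
```

    For a power-law model such as `example_model`, the relative sensitivities are constant and equal to the exponents, which makes it a convenient check before applying the function to a trained network.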

    DAMP: a protocol for contextualising goodness-of-fit statistics in sediment-discharge data-driven modelling

    The decision sequence which guides the selection of a preferred data-driven modelling solution is usually based solely on statistical assessment of fit to a test dataset, and lacks the incorporation of essential contextual knowledge and understanding included in the evaluation of conventional empirical models. This paper demonstrates how hydrological insight and knowledge of data quality issues can be better incorporated into the sediment-discharge data-driven model assessment procedure: by the plotting of datasets and modelled relationships; and from an understanding and appreciation of the hydrological context of the catchment being modelled. DAMP, a four-point protocol for evaluating the hydrological soundness of data-driven single-input single-output sediment rating curve solutions, is presented. The approach is adopted and exemplified in an evaluation of seven explicit sediment-discharge models that are used to predict daily suspended sediment concentration values for a small tropical catchment on the island of Puerto Rico. Four neurocomputing counterparts are compared and contrasted against a set of traditional log-log linear sediment rating curve solutions and a simple linear regression model. The statistical assessment procedure provides one indication of the best model, whilst graphical and hydrological interpretation of the depicted datasets and models challenges this overly simplistic interpretation. Traditional log-log sediment rating curves, in terms of soundness and robustness, are found to deliver a superior overall product, irrespective of their poorer global goodness-of-fit statistics.
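
    For readers unfamiliar with the traditional approach, a log-log linear sediment rating curve is usually written C = a * Q^b and fitted by ordinary least squares on log-transformed discharge Q and concentration C. The sketch below is a minimal version of that general technique; the function names are hypothetical, no retransformation bias correction is applied, and it is not claimed to reproduce the specific models evaluated in the paper.

```python
import math

def fit_rating_curve(discharge, concentration):
    """Fit C = a * Q**b by least squares on log(Q) and log(C).

    Both inputs must contain strictly positive values.
    Returns the coefficient a and exponent b.
    """
    xs = [math.log(q) for q in discharge]
    ys = [math.log(c) for c in concentration]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept transformed back
    return a, b

def predict_concentration(a, b, q):
    """Predict sediment concentration for discharge q from the fitted curve."""
    return a * q ** b
```

    Plotting the fitted curve over the observed points in log-log space, rather than reading the fit statistics alone, is in the spirit of the graphical checks the abstract describes.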

    Including spatial distribution in a data-driven rainfall-runoff model to improve reservoir inflow forecasting in Taiwan

    Multi-step ahead inflow forecasting has a critical role to play in reservoir operation and management in Taiwan during typhoons, as statutory legislation requires a minimum of 3 hours' warning to be issued before any reservoir releases are made. However, the complex spatial and temporal heterogeneity of typhoon rainfall, coupled with a remote and mountainous physiographic context, makes the development of real-time rainfall-runoff models that can accurately predict reservoir inflow several hours ahead of time challenging. Consequently, there is an urgent operational requirement for models that can enhance reservoir inflow prediction at forecast horizons of more than 3 hours. In this paper we develop a novel semi-distributed, data-driven, rainfall-runoff model for the Shihmen catchment, north Taiwan. A suite of Adaptive Network-based Fuzzy Inference System solutions is created using various combinations of auto-regressive, spatially lumped radar and point-based rain gauge predictors. Different levels of spatially aggregated radar-derived rainfall data are used to generate 4, 8 and 12 sub-catchment input drivers. In general, the semi-distributed radar rainfall models outperform their less complex counterparts in predictions of reservoir inflow at lead times greater than 3 hours. Performance is found to be optimal when spatial aggregation is restricted to 4 sub-catchments, with improvements of up to 30% over lumped and point-based models being evident at 5-hour lead times. The potential benefits of applying semi-distributed, data-driven models in reservoir inflow modelling specifically, and hydrological modelling more generally, are thus demonstrated.

    Lösliche Perylen-Fluoreszenzfarbstoffe mit hoher Photostabilität (Soluble perylene fluorescent dyes with high photostability)

    The synthesis of a series of 3,4,9,10-perylenetetracarboxylic diimides 1 is described, and their lightfastness is quantitatively investigated and discussed. It can be shown that the introduction of tert-butyl substituents makes the perylene pigment dyes, which are known to be very poorly soluble, readily soluble in organic solvents, where they fluoresce with high quantum yields.

    HydroTest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts

    This paper presents details of an open access web site that can be used by hydrologists and other scientists to evaluate time series models. There is at present a general lack of consistency in the way in which hydrological models are assessed, which handicaps the comparison of reported studies and hinders the development of superior models. The HydroTest web site provides a wide range of objective metrics and consistent tests of model performance to assess forecasting skill. This resource is designed to promote future transparency and consistency between reported models and includes an open forum that is intended to encourage further discussion and debate on the topic of hydrological performance evaluation metrics. It is envisaged that the provision of such facilities will lead to the creation of superior forecasting metrics and the development of international benchmark time series datasets.
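
    As an illustration of the kind of evaluation metric such a toolbox standardises, the sketch below computes the Nash-Sutcliffe efficiency and the mean error for a pair of observed and forecast series. These are widely used hydrological measures, but whether the definitions and sign conventions here match those implemented on the HydroTest site is an assumption, and the function names are hypothetical.

```python
def nash_sutcliffe_efficiency(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean.

    Undefined (ZeroDivisionError) when all observations are identical.
    """
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def mean_error(obs, sim):
    """Mean of (simulated - observed); positive values indicate over-prediction.

    The sign convention is an assumption; some sources use observed - simulated.
    """
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)
```

    Reporting both together shows why a single statistic can mislead: a forecast with a constant offset can keep a moderate efficiency score while carrying a clear systematic bias.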

    A typology of different development and testing options for symbolic regression modelling of measured and calculated datasets

    Data-driven modelling is used to develop two alternative types of predictive environmental model: a simulator, a model of a real-world process developed from either a conceptual understanding of physical relations and/or using measured records, and an emulator, an imitator of some other model developed on predicted outputs calculated by that source model. A simple four-way typology called Emulation Simulation Typology (EST) is proposed that distinguishes between (i) model type and (ii) different uses of model development period and model test period datasets. To address the question of to what extent simulator and emulator solutions might be considered interchangeable, i.e. provide similar levels of output accuracy when tested on data different from that used in their development, a pair of counterpart pan evaporation models was created using symbolic regression. Each model type delivered similar levels of predictive skill to that of other published solutions. Input-output sensitivity analysis of the two different model types likewise confirmed two very similar underlying response functions. This study demonstrates that the type and quality of data on which a model is tested have a greater influence on model accuracy assessment than the type and quality of data on which a model is developed, provided that the development record is sufficiently representative of the conceptual underpinnings of the system being examined. Thus, previously reported substantial disparities occurring in goodness-of-fit statistics for pan evaporation models are most likely explained by the use of either measured or calculated data to test particular models, where lower scores do not necessarily represent major deficiencies in the solution itself.