    On how data are partitioned in model development and evaluation: Confronting the elephant in the room to enhance model generalization

    This is the final version. Available on open access from Elsevier via the DOI in this record. Data availability: No data was used for the research described in the article. Models play a pivotal role in advancing our understanding of Earth's physical nature and environmental systems, aiding in their efficient planning and management. The accuracy and reliability of these models heavily rely on data, which are generally partitioned into subsets for model development and evaluation. Surprisingly, how this partitioning is done is often not justified, even though it determines what model we end up with, how we assess its performance and what decisions we make based on the resulting model outputs. In this study, we shed light on the paramount importance of meticulously considering data partitioning in the model development and evaluation process, and its significant impact on model generalization. We identify flaws in existing data-splitting approaches and propose a forward-looking strategy to effectively confront the “elephant in the room”, leading to improved model generalization capabilities. Funding: National Natural Science Foundation of China; Australian Research Council (ARC).

    Gaussian Process emulation of spatiotemporal outputs of a 2D inland flood model

    The computational limitations of complex numerical models have led to the adoption of statistical emulators across a variety of problems in science and engineering disciplines to circumvent the high computational costs associated with numerical simulations. In flood modelling, many hydraulic and hydrodynamic numerical models, especially when operating at high spatiotemporal resolutions, have prohibitively high computational costs for tasks requiring the instantaneous generation of very large numbers of simulation results. This study examines the appropriateness and robustness of Gaussian Process (GP) models to emulate the results from a hydraulic inundation model. The developed GPs produce real-time predictions based on the simulation output from the LISFLOOD-FP numerical model. An efficient dimensionality reduction scheme is developed to tackle the high dimensionality of the output space and is combined with the GPs to investigate the predictive performance of the proposed emulator for estimation of the inundation depth. The developed GP-based framework is capable of robust and straightforward quantification of the uncertainty associated with the predictions, without requiring additional model evaluations and simulations. Further, this study explores the computational advantages of using a GP-based emulator over alternative methodologies such as neural networks, by undertaking a comparative analysis. For the case study data presented in this paper, the GP model was found to accurately reproduce water depths and inundation extent by classification and produce computational speedups of approximately 10,000 times compared with the original simulator, and 80 times for a neural network-based emulator.

    Recent insights on uncertainties present in integrated catchment water quality modelling

    This paper aims to stimulate discussion based on the experiences derived from the QUICS project (Quantifying Uncertainty in Integrated Catchment Studies). First, it briefly discusses the current state of knowledge on uncertainties in sub-models of integrated catchment models and the existing frameworks for analysing uncertainty. Furthermore, it compares the relative approaches of building and calibrating fully integrated models versus linking separate sub-models. It also discusses the implications of model linkage on overall uncertainty and how to define an acceptable level of model complexity. This discussion includes whether we should shift our attention from uncertainties due to linkage, when using linked models, to uncertainties in model structure by necessary simplification or by using more parameters. This discussion attempts to address the question as to whether there is an increase in uncertainty by linking these models or whether a compensation effect could take place such that overall uncertainty in key water quality parameters actually decreases. Finally, challenges in the application of uncertainty analysis in integrated catchment water quality modelling, as encountered in this project, are discussed and recommendations for future research areas are highlighted.

    Effective modeling for integrated water resource management: a guide to contextual practices by phases and steps and future opportunities

    The effectiveness of Integrated Water Resource Management (IWRM) modeling hinges on the quality of practices employed throughout the process, starting from early problem definition all the way through to using the model in a way that serves its intended purpose. The adoption and implementation of effective modeling practices need to be guided by a practical understanding of the variety of decisions that modelers make, and the information considered in making these choices. There is still limited documented knowledge on the modeling workflow, and the role of contextual factors in determining this workflow and which practices to employ. This paper attempts to address this knowledge gap by providing systematic guidance on the modeling practices through the phases (Planning, Development, Application, and Perpetuation) and steps that comprise the modeling process, positing questions that should be addressed. Practice-focused guidance helps explain the detailed process of conducting IWRM modeling, including the role of contextual factors in shaping practices. We draw on findings from literature and the authors’ collective experience to articulate which contextual factors play out, and how, in employing those practices. In order to accelerate our learning about how to improve IWRM modeling, the paper concludes with five key areas for future practice-related research: knowledge sharing, overcoming data limitations, informed stakeholder involvement, social equity and uncertainty management. © 2019 Elsevier Ltd

    Eight grand challenges in socio-environmental systems modeling

    Modeling is essential to characterize and explore complex societal and environmental issues in systematic and collaborative ways. Socio-environmental systems (SES) modeling integrates knowledge and perspectives into conceptual and computational tools that explicitly recognize how human decisions affect the environment. Depending on the modeling purpose, many SES modelers also realize that involvement of stakeholders and experts is fundamental to support social learning and decision-making processes for achieving improved environmental and social outcomes. The contribution of this paper lies in identifying and formulating grand challenges that need to be overcome to accelerate the development and adaptation of SES modeling. Eight challenges are delineated: bridging epistemologies across disciplines; multi-dimensional uncertainty assessment and management; scales and scaling issues; combining qualitative and quantitative methods and data; furthering the adoption and impacts of SES modeling on policy; capturing structural changes; representing human dimensions in SES; and leveraging new data types and sources. These challenges limit our ability to effectively use SES modeling to provide the knowledge and information essential for supporting decision making. Whereas some of these challenges are not unique to SES modeling and may be pervasive in other scientific fields, they still act as barriers as well as research opportunities for the SES modeling community. For each challenge, we outline basic steps that can be taken to surmount the underpinning barriers. Thus, the paper identifies priority research areas in SES modeling, chiefly related to progressing modeling products, processes and practices

    Uncertainty analysis of a semi-distributed hydrologic model based on a Gaussian Process emulator

    Despite various criticisms of GLUE (Generalized Likelihood Uncertainty Estimation), it is still a widely-used uncertainty analysis technique in hydrologic modelling that can give an appreciation of the level and sources of uncertainty. We introduce an augmented GLUE approach based on a Gaussian Process (GP) emulator, involving GP to conduct a Bayesian sensitivity analysis to narrow down the influential factor space, and then performing a standard GLUE uncertainty analysis. This approach is demonstrated for a SWAT (Soil and Water Assessment Tool) application in a watershed in China using a calibration and two validation periods. Results show: 1) the augmented approach led to the screening out of 14–18 unimportant factors, effectively narrowing factor space; 2) compared to the more standard GLUE, it substantially improved the sampling efficiency, and located the optimal factor region at lower computational cost. This approach can be used for other uncertainty analysis techniques in hydrologic and non-hydrologic models. The research was supported by National Natural Science Foundation of China (41361140361), and State Key Laboratory of Desert and Oasis Ecology Project (Y471161).

    The Future of Sensitivity Analysis: An essential discipline for systems modeling and policy support

    Sensitivity analysis (SA) is en route to becoming an integral part of mathematical modeling. The tremendous potential benefits of SA are, however, yet to be fully realized, both for advancing mechanistic and data-driven modeling of human and natural systems, and in support of decision making. In this perspective paper, a multidisciplinary group of researchers and practitioners revisit the current status of SA, and outline research challenges in regard to both theoretical frameworks and their applications to solve real-world problems. Six areas are discussed that warrant further attention, including (1) structuring and standardizing SA as a discipline, (2) realizing the untapped potential of SA for systems modeling, (3) addressing the computational burden of SA, (4) progressing SA in the context of machine learning, (5) clarifying the relationship and role of SA to uncertainty quantification, and (6) evolving the use of SA in support of decision making. An outlook for the future of SA is provided that underlines how SA must underpin a wide variety of activities to better serve science and society. John Jakeman’s work was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program. Joseph Guillaume received funding from an Australian Research Council Discovery Early Career Award (project no. DE190100317). Arnald Puy worked on this paper on a Marie Sklodowska-Curie Global Fellowship, grant number 792178. Takuya Iwanaga is supported through an Australian Government Research Training Program (AGRTP) Scholarship and the ANU Hilda-John Endowment Fund.