
    New Insights into History Matching via Sequential Monte Carlo

    The aim of the history matching method is to locate non-implausible regions of the parameter space of complex deterministic or stochastic models by matching model outputs with data. It does this via a series of waves where, at each wave, an emulator is fitted to a small number of training samples. An implausibility measure is defined which takes into account the closeness of simulated and observed outputs as well as emulator uncertainty. As the waves progress, the emulator becomes more accurate, training samples concentrate on promising regions of the space, and poorer parts of the space are rejected with greater confidence. Whilst history matching has proved useful, existing implementations are not fully automated: ad-hoc choices are made during the process, requiring user intervention and considerable time. This occurs especially when the non-implausible region becomes small and it is difficult to sample it uniformly to generate new training points. In this article we develop a semi-automated sequential Monte Carlo (SMC) implementation. Our novel SMC approach reveals that history matching yields a non-implausible distribution that can be multi-modal, highly irregular and very difficult to sample uniformly. Our SMC approach offers a much more reliable sampling of the non-implausible space, at the cost of additional computation compared to other approaches used in the literature.
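    The implausibility calculation at the heart of each wave is compact enough to sketch. Below is a minimal one-dimensional illustration using a scikit-learn Gaussian process as the emulator; the toy simulator, the variance values and the conventional cut-off of 3 are illustrative assumptions, not settings from the paper.

```python
# A minimal sketch of one history-matching wave, assuming a single model
# output and a GP emulator; all data and thresholds are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def simulator(x):                      # stand-in for an expensive model
    return np.sin(3 * x) + 0.1 * x**2

z_obs, var_obs, var_disc = 0.5, 0.01, 0.02   # observation / discrepancy variances

# Wave 1: fit an emulator to a small number of training runs.
x_train = rng.uniform(-2, 2, size=(20, 1))
gp = GaussianProcessRegressor().fit(x_train, simulator(x_train).ravel())

# Implausibility over candidate points: distance to the data scaled by all
# uncertainty sources (emulator, observation, model discrepancy).
x_cand = np.linspace(-2, 2, 500).reshape(-1, 1)
mean, sd = gp.predict(x_cand, return_std=True)
implaus = np.abs(z_obs - mean) / np.sqrt(sd**2 + var_obs + var_disc)

non_implausible = x_cand[implaus < 3.0]      # region kept for the next wave
print(f"{len(non_implausible)} of {len(x_cand)} candidates retained")
```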

    Exploiting Field Dependencies for Learning on Categorical Data

    Traditional approaches for learning on categorical data underexploit the dependencies between columns (also known as fields) in a dataset because they rely on an embedding of data points driven solely by the classification/regression loss. In contrast, we propose a novel method for learning on categorical data with the goal of exploiting dependencies between fields. Instead of modelling statistics of features globally (i.e., by the covariance matrix of features), we learn a global field dependency matrix that captures dependencies between fields, and we then refine this global matrix at the instance-wise level with different weights (so-called local dependency modelling) with respect to each field to improve the modelling of the field dependencies. Our algorithm exploits the meta-learning paradigm: the dependency matrices are refined in the inner loop of the meta-learning algorithm without the use of labels, whereas the outer loop intertwines the updates of the embedding matrix (the matrix performing projection) and the global dependency matrix in a supervised fashion (with the use of labels). Our method is simple yet outperforms several state-of-the-art methods on six popular dataset benchmarks. Detailed ablation studies provide additional insights into our method.
    Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (submitted June 2022, accepted July 2023).
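    The two-loop structure described above can be sketched schematically. The toy PyTorch fragment below only mirrors the shape of the method (an instance-wise refinement of a global field dependency matrix, wrapped in a supervised update of the embedding); the actual losses, refinement rule and parameterisation are those of the paper, not this sketch.

```python
# A schematic sketch of the two-loop idea, assuming one-hot-like fields,
# a linear embedding, and a toy unlabelled inner refinement.
import torch

n_fields, emb_dim, n_classes = 4, 8, 2
W = torch.randn(n_fields, emb_dim, requires_grad=True)       # embedding matrix
D_global = torch.eye(n_fields, requires_grad=True)           # global field dependency
head = torch.nn.Linear(emb_dim, n_classes)
opt = torch.optim.Adam([W, D_global, *head.parameters()], lr=1e-2)

x = torch.randn(32, n_fields)          # toy batch of field values
y = torch.randint(0, n_classes, (32,))

# Inner loop: refine the dependency matrix per instance without labels,
# here via a weight vector derived from each instance's field activations.
local_w = torch.softmax(x.abs(), dim=1)                      # (32, n_fields)
D_local = local_w.unsqueeze(2) * D_global                    # instance-wise refinement

# Outer loop: supervised update of the embedding and global matrix.
emb = torch.einsum('bf,bfg,ge->be', x, D_local, W)           # mix fields, then embed
loss = torch.nn.functional.cross_entropy(head(emb), y)
opt.zero_grad(); loss.backward(); opt.step()
```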

    Modelling the Wolbachia incompatible insect technique: strategies for effective mosquito population elimination

    Background: The Wolbachia incompatible insect technique (IIT) shows promise as a method for eliminating populations of invasive mosquitoes such as Aedes aegypti (Linnaeus) (Diptera: Culicidae) and reducing the incidence of vector-borne diseases such as dengue, chikungunya and Zika. Successful implementation of this biological control strategy relies on high-fidelity separation of male from female insects in mass production systems for inundative release into landscapes. Processes for sex-separating mosquitoes are typically error-prone and laborious, and IIT programmes run the risk of releasing Wolbachia-infected females and replacing wild mosquito populations. Results: We introduce a simple Markov population process model for studying mosquito populations subjected to a Wolbachia-IIT programme; these populations exhibit an unstable equilibrium threshold. The model is used to study, in silico, scenarios that are likely to yield a successful elimination result. Our results suggest that elimination is best achieved by releasing males at rates that adapt to the ever-decreasing wild population, reducing both the risk of releasing Wolbachia-infected females and costs. Conclusions: While very high-fidelity sex separation is required to avoid establishment, release programmes tend to be robust to the release of a small number of Wolbachia-infected females. These findings will inform and enhance the next generation of Wolbachia-IIT population control strategies that are already showing great promise in field trials.
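    The adaptive-release idea is easy to illustrate with a crude discrete-time caricature of such a population process. In the sketch below, all rates, the 5:1 overflooding ratio and the 1:1 wild sex ratio are illustrative assumptions, not the paper's calibrated values.

```python
# A minimal sketch of an adaptive-release IIT simulation: matings with
# incompatible (Wolbachia) males produce no offspring, and the release
# rate tracks the shrinking wild population.
import numpy as np

rng = np.random.default_rng(1)
wild_f, steps = 500, 200
birth, death, ratio = 0.30, 0.25, 5.0   # per-step rates; release ratio

for t in range(steps):
    released_m = ratio * wild_f                 # adapt releases to wild size
    wild_m = wild_f                             # assume a 1:1 wild sex ratio
    p_compatible = wild_m / (wild_m + released_m)
    births = rng.poisson(birth * wild_f * p_compatible)
    deaths = rng.binomial(wild_f, death)
    wild_f = max(wild_f + births - deaths, 0)
    if wild_f == 0:
        break

print(f"eliminated after {t} steps" if wild_f == 0 else f"persisting: {wild_f}")
```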

    Field evaluation of tolerance to Tobacco streak virus in sunflower germplasm, and observations of seasonal disease spread

    Strong statistical evidence was found for differences in tolerance to natural infections of Tobacco streak virus (TSV) in sunflower hybrids. Data from 470 plots involving 23 different sunflower hybrids tested in multiple trials over 5 years in Australia were analysed. Using a Bayesian hierarchical logistic regression model for the analysis provided: (i) a rigorous method for investigating the relative effects of hybrid, seasonal rainfall and proximity to inoculum source on the incidence of severe TSV disease; (ii) a natural method for estimating the probability distributions of disease incidence in different hybrids under historical rainfall conditions; and (iii) a method for undertaking all pairwise comparisons of disease incidence between hybrids whilst controlling the familywise error rate without any drastic reduction in statistical power. The tolerance identified in field trials was effective against the main TSV strain associated with disease outbreaks, TSV-parthenium. Glasshouse tests indicate that this tolerance is also effective against the other TSV strain found in central Queensland, TSV-crownbeard. The use of tolerant germplasm is critical to minimise the risk of TSV epidemics in sunflower in this region. We found strong statistical evidence that rainfall during the early growing months of March and April had a negative effect on the incidence of severe infection, with greatly reduced disease incidence in years that had high rainfall during this period.
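    The general shape of such a model is sketched below in PyMC: plot-level counts of diseased plants, partially pooled hybrid effects, and rainfall and inoculum-proximity covariates. The data are simulated placeholders and the priors are illustrative assumptions, not the paper's specification.

```python
# A minimal sketch of a Bayesian hierarchical logistic regression for
# plot-level disease incidence, assuming simulated data throughout.
import numpy as np
import pymc as pm

rng = np.random.default_rng(2)
n_plots, n_hybrids = 120, 23
hybrid = rng.integers(0, n_hybrids, n_plots)
rainfall = rng.normal(0, 1, n_plots)       # standardised early-season rainfall
proximity = rng.normal(0, 1, n_plots)      # standardised proximity to inoculum
n_plants = rng.integers(40, 80, n_plots)
diseased = rng.binomial(n_plants, 0.2)     # placeholder outcome counts

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    a_hybrid = pm.Normal("a_hybrid", mu, sigma, shape=n_hybrids)  # pooled hybrids
    b_rain = pm.Normal("b_rain", 0, 1)
    b_prox = pm.Normal("b_prox", 0, 1)
    logit_p = a_hybrid[hybrid] + b_rain * rainfall + b_prox * proximity
    pm.Binomial("y", n=n_plants, p=pm.math.sigmoid(logit_p), observed=diseased)
    idata = pm.sample(1000, tune=1000)     # posterior for all pairwise comparisons
```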

    Re-thinking soil carbon modelling: a stochastic approach to quantify uncertainties

    The benefits of sequestering carbon are many, including improved crop productivity, reductions in greenhouse gases, and financial gains through the sale of carbon credits. Achieving a better understanding of the sequestration process has motivated many deterministic models of soil carbon dynamics, but none of these models addresses uncertainty in a comprehensive manner. Uncertainty arises in many ways: in the model inputs, parameters, and dynamics, and consequently in model predictions. In this paper, these uncertainties are addressed in concert by incorporating a physical-statistical model for carbon dynamics within a Bayesian hierarchical modelling framework. This comprehensive approach to accounting for uncertainty in soil carbon modelling has not been attempted previously. This paper demonstrates proof-of-concept based on a one-pool model and identifies requirements for extension to multi-pool carbon modelling. Our model is based on the soil carbon dynamics in Tarlee, South Australia. We specify the model conditionally through its parameters, soil carbon input and decay processes, and observations of those processes. We use a particle marginal Metropolis-Hastings approach specified using the LibBi modelling language. We highlight how samples from the posterior distribution can be used to summarise our knowledge about model parameters, to estimate the probabilities of sequestering carbon, and to forecast changes in carbon stocks under crop rotations not represented explicitly in the original field trials.
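    Particle marginal Metropolis-Hastings can be sketched compactly for a one-pool model: a bootstrap particle filter supplies an unbiased likelihood estimate, which drives a Metropolis-Hastings chain over the decay rate. The sketch below is plain Python rather than LibBi, and its dynamics, noise levels and data are all illustrative assumptions.

```python
# A sketch of PMMH for a one-pool carbon model with dynamics
# C_t = C_{t-1} + u - k*C_{t-1} + noise; everything here is simulated.
import numpy as np

rng = np.random.default_rng(3)
T, u, sd_proc, sd_obs = 30, 1.0, 0.3, 0.5
y = 10 + rng.normal(0, sd_obs, T)                    # placeholder observations

def log_lik_hat(k, n_part=200):
    """Bootstrap particle filter estimate of the log-likelihood of decay rate k."""
    C = rng.normal(10, 1, n_part)
    ll = 0.0
    for t in range(T):
        C = C + u - k * C + rng.normal(0, sd_proc, n_part)   # propagate pool
        logw = -0.5 * ((y[t] - C) / sd_obs) ** 2
        ll += np.log(np.mean(np.exp(logw))) - 0.5 * np.log(2 * np.pi * sd_obs**2)
        w = np.exp(logw - logw.max()); w /= w.sum()
        C = C[rng.choice(n_part, n_part, p=w)]               # resample particles
    return ll

k, ll, chain = 0.1, log_lik_hat(0.1), []
for _ in range(500):                                  # Metropolis-Hastings on k
    k_new = abs(k + rng.normal(0, 0.02))
    ll_new = log_lik_hat(k_new)
    if np.log(rng.uniform()) < ll_new - ll:           # flat prior on k > 0
        k, ll = k_new, ll_new
    chain.append(k)
print("posterior mean decay rate:", np.mean(chain[100:]))
```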

    Bayesian Physics Informed Neural Networks for Data Assimilation and Spatio-Temporal Modelling of Wildfires

    We apply the Physics Informed Neural Network (PINN) to the problem of wildfire fire-front modelling. We use the PINN to solve the level-set equation, a partial differential equation that models a fire-front through the zero level set of a level-set function. The result is a PINN that simulates a fire-front as it propagates through the spatio-temporal domain. We show that popular optimisation cost functions used in the literature can result in PINNs that fail to maintain temporal continuity in modelled fire-fronts when there are extreme changes in exogenous forcing variables such as wind direction. We thus propose novel additions to the optimisation cost function that improve temporal continuity under these extreme changes. Furthermore, we develop an approach to perform data assimilation within the PINN such that the PINN predictions are drawn towards observations of the fire-front. Finally, we incorporate our novel approaches into a Bayesian PINN (B-PINN) to provide uncertainty quantification in the fire-front predictions. This is significant as the standard solver, the level-set method, does not naturally offer the capability for data assimilation and uncertainty quantification. Our results show that, with our novel approaches, the B-PINN can produce accurate predictions with high-quality uncertainty quantification on real-world data.
    Comment: Accepted for publication in Spatial Statistics.
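    The core PINN construction, minimising the residual of the level-set equation phi_t + s|grad phi| = 0 at collocation points, can be sketched as below. The constant spread rate s, the network size and the training loop are illustrative assumptions; the paper's continuity terms, data assimilation and Bayesian treatment are not reproduced here.

```python
# A minimal PyTorch sketch of the level-set PDE residual used to train a PINN.
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
s = 0.5                                      # assumed constant fire spread rate

def pde_residual(xyt):
    """Residual of phi_t + s * |grad_xy phi| = 0 at collocation points."""
    xyt = xyt.requires_grad_(True)
    phi = net(xyt)
    grads = torch.autograd.grad(phi.sum(), xyt, create_graph=True)[0]
    phi_x, phi_y, phi_t = grads[:, 0], grads[:, 1], grads[:, 2]
    return phi_t + s * torch.sqrt(phi_x**2 + phi_y**2 + 1e-8)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(1000):                     # minimise the squared PDE residual
    pts = torch.rand(256, 3)                 # (x, y, t) collocation samples
    loss = pde_residual(pts).pow(2).mean()   # + data / continuity terms in full
    opt.zero_grad(); loss.backward(); opt.step()
```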

    Geostatistical based optimization of groundwater monitoring well network design

    Monitoring groundwater quality in economically important and other aquifers is carried out regularly as part of regulatory processes for water and other resource development. Many water quality parameters are measured as part of baseline monitoring around mining and onshore gas resource development regions, both to develop an improved understanding of the hydrogeological system and to inform managerial decisions that assess and manage contamination risks and health hazards. The water quality distribution in an aquifer is most often inferred from point measurements from a limited number of bores drilled at arbitrary locations. Estimating the distribution of water quality parameters in the aquifer from these point measurements is often challenging and yields highly uncertain estimates due to limited data availability. Uncertainty can be minimized by drilling more bores to collect water quality data, and several approaches are available to identify optimal borehole locations that minimize estimation uncertainty. However, optimization of borehole locations is difficult when multiple water quality parameters are of interest and have different spatial distributions in the aquifer. In this study we use geostatistical kriging to interpolate a large number of groundwater quality parameters. We then integrate these predicted values and use the Differential Evolution algorithm to determine optimal locations for bores that simultaneously reduce the spatial prediction uncertainty of all parameters. The method is applied to the design of a groundwater monitoring network in the Namoi region of Australia, for monitoring groundwater quality in an economically important aquifer of the Great Artesian Basin. Optimal locations for 10 new monitoring bores are identified using this approach.
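    The kriging-plus-Differential-Evolution loop can be illustrated at small scale: a Gaussian process stands in for the kriging model of each parameter, and the objective is the summed prediction uncertainty over a grid after placing candidate bores. Two parameters, two new bores, synthetic data and plug-in predicted values at the candidate bores are all simplifying assumptions of this sketch.

```python
# A minimal sketch of choosing new bore locations by minimizing summed
# kriging prediction uncertainty across multiple water quality parameters.
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(4)
X_obs = rng.uniform(0, 10, (15, 2))                    # existing bores
grid = np.stack(np.meshgrid(np.linspace(0, 10, 25),
                            np.linspace(0, 10, 25)), -1).reshape(-1, 2)

# One GP (kriging) surrogate per water quality parameter (synthetic values).
gps = [GaussianProcessRegressor().fit(X_obs, rng.normal(size=15))
       for _ in range(2)]

def total_uncertainty(flat_xy):
    """Summed prediction sd over the grid, given candidate new bore locations."""
    new = flat_xy.reshape(-1, 2)
    score = 0.0
    for gp in gps:
        X_aug = np.vstack([X_obs, new])
        y_aug = np.concatenate([gp.y_train_, gp.predict(new)])  # plug-in values
        gp_aug = GaussianProcessRegressor().fit(X_aug, y_aug)
        score += gp_aug.predict(grid, return_std=True)[1].sum()
    return score

result = differential_evolution(total_uncertainty,
                                bounds=[(0, 10)] * 4,  # 2 bores x (x, y)
                                maxiter=20, seed=4)
print("new bore locations:", result.x.reshape(-1, 2))
```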