52 research outputs found

    Metropolis-Hastings within Partially Collapsed Gibbs Samplers

    The Partially Collapsed Gibbs (PCG) sampler offers a new strategy for improving the convergence of a Gibbs sampler. PCG achieves faster convergence by reducing the conditioning in some of the draws of its parent Gibbs sampler. Although this can significantly improve convergence, care must be taken to ensure that the stationary distribution is preserved. The conditional distributions sampled in a PCG sampler may be incompatible, and permuting their order may upset the stationary distribution of the chain. Extra care must be taken when Metropolis-Hastings (MH) updates are used in some or all of the updates. Reducing the conditioning in an MH within Gibbs sampler can change the stationary distribution, even when the PCG sampler would work perfectly if MH were not used. In fact, a number of samplers of this sort that have been advocated in the literature do not actually have the target stationary distributions. In this article, we illustrate the challenges that may arise when using MH within a PCG sampler and develop a general strategy for using such updates while maintaining the desired stationary distribution. Theoretical arguments provide guidance when choosing between different MH within PCG sampling schemes. Finally, we illustrate the MH within PCG sampler and its computational advantage using several examples from our applied work.

    Retrieving rice (Oryza sativa L.) net photosynthetic rate from UAV multispectral images based on machine learning methods

    Photosynthesis is the key physiological activity in the process of crop growth and plays an irreplaceable role in carbon assimilation and yield formation. This study extracted rice (Oryza sativa L.) canopy reflectance from UAV multispectral images and analyzed the correlation between 25 vegetation indices (VIs), three textural indices (TIs), and net photosynthetic rate (Pn) at different growth stages. Linear regression (LR), support vector regression (SVR), gradient boosting decision tree (GBDT), random forest (RF), and multilayer perceptron neural network (MLP) models were employed for Pn estimation, and the modeling accuracy was compared under three input conditions: VIs alone, VIs combined with TIs, and fusion of VIs and TIs with plant height (PH) and SPAD. The results showed that VIs and TIs generally had the strongest correlation with Pn at the jointing–booting stage, where the number of VIs with significant correlation (p < 0.05) was also the largest. Accordingly, the employed models achieved their highest overall accuracy at this stage [coefficient of determination (R²) of 0.383–0.938]. However, as the growth stage progressed, the correlation gradually weakened and accuracy decreased (R² of 0.258–0.928 and 0.125–0.863 at the heading–flowering and ripening stages, respectively). Among the tested models, the GBDT and RF models attained the best performance based on only VIs input (with R² ranging from 0.863 to 0.938 and from 0.815 to 0.872, respectively). Furthermore, the fusion input of VIs and TIs with PH and SPAD improved the model accuracy more effectively (R² increased by 0.049–0.249, 0.063–0.470, and 0.113–0.471, respectively, for the three growth stages) than the input combination of VIs and TIs (R² increased by 0.015–0.090, 0.001–0.139, and 0.023–0.114). Therefore, the GBDT and RF models with fused input can be highly recommended for rice Pn estimation, and the methods could also provide a reference for Pn monitoring and further yield prediction at field scale.

    Bayesian phylogenetic inference using relaxed-clocks and the multispecies coalescent

    The multispecies coalescent (MSC) model accommodates both species divergences and within-species coalescent and provides a natural framework for phylogenetic analysis of genomic data when the gene trees vary across the genome. The MSC model implemented in the program BPP assumes a molecular clock and the Jukes-Cantor model, and is suitable for analyzing genomic data from closely related species. Here we extend our implementation to more general substitution models and relaxed clocks to allow the rate to vary among species. The MSC-with-relaxed-clock model allows the estimation of species divergence times and ancestral population sizes using genomic sequences sampled from contemporary species when the strict clock assumption is violated, and provides a simulation framework for evaluating species tree estimation methods. We conducted simulations and analyzed two real datasets to evaluate the utility of the new models. We confirm that the clock-JC model is adequate for inference of shallow trees with closely related species, but it is important to account for clock violation for distant species. Our simulation suggests that there is valuable phylogenetic information in the gene-tree branch lengths even if the molecular clock assumption is seriously violated, and the relaxed-clock models implemented in BPP are able to extract such information. Our Markov chain Monte Carlo (MCMC) algorithms suffer from mixing problems when used for species tree estimation under the relaxed clock, and we discuss possible improvements. We conclude that the new models are currently most effective for estimating population parameters such as species divergence times when the species tree is fixed.

    Estimation of species divergence times in presence of cross-species gene flow

    Cross-species introgression can have significant impacts on phylogenomic reconstruction of species divergence events. Here, we used simulations to show how the presence of even a small amount of introgression can bias divergence time estimates when gene flow is ignored in the analysis. Using advances in analytical methods under the multispecies coalescent (MSC) model, we demonstrate that by accounting for incomplete lineage sorting and introgression using large phylogenomic data sets this problem can be avoided. The multispecies-coalescent-with-introgression (MSci) model is capable of accurately estimating both divergence times and ancestral effective population sizes, even when only a single diploid individual per species is sampled. We characterize some general expectations for biases in divergence time estimation under three different scenarios: 1) introgression between sister species, 2) introgression between non-sister species, and 3) introgression from an unsampled (i.e., ghost) outgroup lineage. We also conducted simulations under the isolation-with-migration (IM) model, and found that the MSci model assuming episodic gene flow was able to accurately estimate species divergence times despite high levels of continuous gene flow. We estimated divergence times under the MSC and MSci models from two published empirical datasets with previous evidence of introgression, one of 372 target-enrichment loci from baobabs (Adansonia), and another of 1,000 transcriptome loci from fourteen species of the tomato relative, Jaltomata. The empirical analyses not only confirm our findings from simulations, demonstrating that the MSci model can reliably estimate divergence times, but also show that divergence time estimation under the MSC can be robust to the presence of small amounts of introgression in empirical datasets with extensive taxon sampling

    Using surrogate distributions to improve the convergence properties of Gibbs-type samplers

    Gibbs-type samplers are widely used tools for obtaining Monte Carlo samples from posterior distributions under complicated Bayesian models. Standard Gibbs samplers update component quantities of the parameter by sequentially sampling their conditional distributions under the target joint distribution. However, this strategy can be slow to converge if the components are highly correlated. We formalize a general strategy to construct more efficient samplers by replacing some of the conditional distributions with conditionals of a surrogate distribution. The surrogate distribution is designed to share certain marginal distributions with the target, but with lower correlations among its components. Although not necessarily recognized when they were introduced, a number of existing strategies for improving Gibbs can be formulated in this way (e.g., Marginal Data Augmentation, Partially Collapsed Gibbs sampling, and the Ancillarity-Sufficiency Interweaving Strategy). The use of surrogate distributions in Gibbs-type samplers may lead to incompatible conditional distributions and thus sensitivity to the order of the component draws. We propose a framework to combine different strategies involving surrogate distributions into a single coherent sampler that maintains the target stationary distribution and outperforms any of its component algorithms in terms of convergence. We use both theoretical arguments and numerical examples to illustrate the implementation and efficiency of our strategy. A problem in supernova cosmology has motivated our work and serves as a realistic testing ground for our methods. Finally, we correct two errors in the related Marginal Data Augmentation algorithms of Imai and van Dyk (2005) that are quite popular for fitting multinomial probit models.

    Zeolite as a Tool to Recycle Nitrogen and Phosphorus in Paddy Fields under Straw Returning Conditions

    Excess nitrogen (N) caused by straw returning to paddy fields undergoing flooding irrigation deteriorates the water quality. The purpose of this research was to use both simulated field and pot experiments to explore a new approach using zeolite to recycle this excess N. The results from simulated field experiments in stagnant water showed N adsorption with different zeolite applications (25, 50, 75, 100, 125, and 150 g L−1). Pot experiments revealed how straw and reused zeolite applications affected the concentrations of ammonia N (NH4+-N), nitrate N (NO3−-N), total N (TN), and total phosphorus (TP) in the surface water and soil layers of the paddy field. Zeolite showed a strong ability to adsorb NH4+-N in wastewater, even in a simulated drainage ditch (100 g L−1 zeolite adsorbed 74% NH4+-N). The zeolite recycled from the drainage ditch was still able to reduce N concentration caused by straw decomposition in the surface water. Zeolite adsorption reduced the peak values of NH4+-N, TN, and TP by 30%, 19%, and 5%, respectively. Based on these findings and conventional field designs, the use of 20 t ha−1 zeolite in the field is effective for recycling N and P. This research provides a sustainable development method to mitigate the water quality deterioration caused by straw returning to the field