
    Uncertainty and sensitivity analysis in quantitative pest risk assessments : practical rules for risk assessors

    Quantitative models have several advantages over qualitative methods for pest risk assessment (PRA). They do not require the definition of categorical ratings and can be used to compute numerical probabilities of entry and establishment and to quantify spread and impact. These models are powerful tools, but they include several sources of uncertainty that need to be taken into account by risk assessors and communicated to decision makers. Uncertainty analysis (UA) and sensitivity analysis (SA) are useful for analyzing uncertainty in models used in PRA, and are becoming more popular. However, these techniques should be applied with caution because several factors may influence their results. In this paper, a brief overview of UA and SA methods is given, and a series of practical rules is defined that risk assessors can follow to improve the reliability of UA and SA results. These rules are illustrated in a case study based on the infection model of Magarey et al. (2005), where the results of UA and SA are shown to be highly dependent on the assumptions made about the probability distribution of the model inputs.
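The abstract's core workflow, propagating assumed input distributions through a model (UA) and then ranking inputs by influence (SA), can be sketched with plain Monte Carlo. The two-input `response` function below is an illustrative stand-in, not the Magarey et al. infection model; the input distributions are assumptions of exactly the kind the paper warns can drive the results.

```python
import numpy as np

# Hypothetical two-input risk model (illustrative stand-in only).
def response(temperature, wetness_hours):
    logit = 0.2 * (temperature - 15.0) + 0.1 * (wetness_hours - 8.0)
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(42)
n = 10_000

# Uncertainty analysis: propagate assumed input distributions through the model.
temperature = rng.normal(15.0, 3.0, n)   # assumed N(15, 3)
wetness = rng.uniform(0.0, 24.0, n)      # assumed U(0, 24)
risk = response(temperature, wetness)

print(f"mean risk = {risk.mean():.3f}, 90% interval = "
      f"({np.quantile(risk, 0.05):.3f}, {np.quantile(risk, 0.95):.3f})")

# Simple sensitivity analysis: correlation between each input and the output.
for name, x in [("temperature", temperature), ("wetness", wetness)]:
    print(f"corr({name}, risk) = {np.corrcoef(x, risk)[0, 1]:.2f}")
```

Re-running this sketch with a different assumed distribution for `wetness` changes both the risk interval and the input ranking, which is the paper's point about documenting distributional assumptions.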

    Assessment of the potential impacts of plant traits across environments by combining global sensitivity analysis and dynamic modeling in wheat

    A crop can be viewed as a complex system whose outputs (e.g. yield) are affected by genetic, physiological, pedo-climatic and management inputs. Numerical methods for model exploration help identify the most influential inputs, provided the simulation model is a credible description of the biological system. A sensitivity analysis was used to assess the simulated impact on yield of a suite of traits involved in major processes of crop growth and development, and to evaluate how the simulated value of such traits varies across environments and in relation to other traits (which can be interpreted as a virtual change in genetic background). The study focused on wheat in Australia, with an emphasis on adaptation to low rainfall conditions. A large set of traits (90) was evaluated in a wide target population of environments (4 sites x 125 years), management practices (3 sowing dates x 2 N fertilization levels) and CO2 (2 levels). The Morris sensitivity analysis method was used to sample the parameter space and reduce computational requirements, while maintaining a realistic representation of the targeted trait x environment x management landscape (~82 million individual simulations in total). The patterns of parameter x environment x management interactions were investigated for the most influential parameters, considering a potential genetic range of +/-20% compared to a reference. Main (i.e. linear) and interaction (i.e. non-linear and interaction) sensitivity indices calculated for most APSIM-Wheat parameters allowed the identification of 42 parameters substantially impacting yield in most target environments. Among these, a subset of parameters related to phenology, resource acquisition, resource use efficiency and biomass allocation were identified as potential candidates for crop (and model) improvement.
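The Morris method the abstract relies on screens many parameters cheaply by averaging one-at-a-time "elementary effects". A minimal numpy-only sketch of that idea, using a made-up three-parameter yield function rather than APSIM-Wheat:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a crop model: yield response to three "trait" parameters
# on the unit cube (illustrative assumption, not APSIM-Wheat).
def toy_yield(p):
    return 2.0 * p[0] + p[1] ** 2 + 0.1 * p[2]

def morris_mu_star(f, dim, n_traj=50, delta=0.1):
    """One-at-a-time elementary effects; returns mu* (mean absolute
    elementary effect) per parameter, the Morris screening statistic."""
    effects = np.zeros((n_traj, dim))
    for t in range(n_traj):
        base = rng.uniform(0.0, 1.0 - delta, size=dim)
        f0 = f(base)
        for i in range(dim):
            x = base.copy()
            x[i] += delta                       # perturb one parameter
            effects[t, i] = (f(x) - f0) / delta # elementary effect
    return np.abs(effects).mean(axis=0)

mu_star = morris_mu_star(toy_yield, dim=3)
print(mu_star)  # parameter 0 (linear, slope 2) dominates parameter 2
```

With only `n_traj * (dim + 1)` model runs, this scales to parameter counts where variance-based indices would be infeasible, which is why the study used it for 90 traits.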

    Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models

    The emergence and development of cancer is a consequence of the accumulation over time of genomic mutations involving a specific set of genes, which provides the cancer clones with a functional selective advantage. In this work, we model the order of accumulation of such mutations during the progression that eventually leads to the disease by means of probabilistic graphical models, i.e., Bayesian Networks (BNs). We investigate how to learn the structure of such BNs from experimental evidence by adopting global optimization meta-heuristics. In particular, we rely on Genetic Algorithms and, to strongly reduce the execution time of the inference -- which may also involve multiple repetitions to collect statistically significant assessments of the data -- we distribute the calculations using both multi-threading and a multi-node architecture. The results show that our approach achieves good accuracy and specificity; we also demonstrate its feasibility, thanks to an 84x reduction in overall execution time with respect to a traditional sequential implementation.
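The search the abstract describes, a genetic algorithm over BN structures, can be sketched compactly. Everything below is an illustrative assumption: the `precedence` scores are random stand-ins for evidence, and the fitness is a toy edge-reward score rather than the paper's likelihood-based criterion; only the evolutionary loop and the acyclicity constraint reflect the described approach.

```python
import numpy as np

rng = np.random.default_rng(1)
N_GENES = 4

# Toy "evidence": how strongly mutation i appears to precede mutation j
# (random here; a real run scores networks against genomic data).
precedence = rng.uniform(size=(N_GENES, N_GENES))
np.fill_diagonal(precedence, 0.0)

def is_dag(adj):
    # Acyclic iff no power of the adjacency matrix has a positive trace
    # (i.e. no cycle of any length 1..N_GENES).
    m = np.eye(N_GENES, dtype=int)
    for _ in range(N_GENES):
        m = m @ adj
        if np.trace(m) > 0:
            return False
    return True

def fitness(adj):
    if not is_dag(adj):
        return -np.inf  # BNs must be acyclic
    # Reward edges consistent with the precedence signal, penalize size.
    return float((precedence * adj).sum() - 0.3 * adj.sum())

def mutate(adj):
    child = adj.copy()
    i, j = rng.integers(N_GENES, size=2)
    if i != j:
        child[i, j] ^= 1  # flip one candidate edge
    return child

# Minimal (mu + lambda) evolutionary loop; the paper's implementation adds
# crossover and distributes fitness evaluations over threads and nodes.
pop = [np.zeros((N_GENES, N_GENES), dtype=int) for _ in range(20)]
for _ in range(200):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:10] + [mutate(p) for p in pop[:10]]

best = max(pop, key=fitness)
print(fitness(best))
```

The fitness evaluations are independent across individuals, which is what makes the multi-threaded and multi-node distribution mentioned in the abstract straightforward.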

    Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications

    Dropout events in single-cell RNA sequencing (scRNA-seq) cause many transcripts to go undetected and induce an excess of zero read counts, leading to power issues in differential expression (DE) analysis. This has triggered the development of bespoke scRNA-seq DE methods to cope with zero inflation. Recent evaluations, however, have shown that dedicated scRNA-seq tools provide no advantage compared to traditional bulk RNA-seq tools. We introduce a weighting strategy, based on a zero-inflated negative binomial model, that identifies excess zero counts and generates gene- and cell-specific weights to unlock bulk RNA-seq DE pipelines for zero-inflated data, boosting performance for scRNA-seq.
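The weighting idea can be sketched as a posterior probability that each observed zero came from the negative binomial (biological) component rather than the excess-zero component. In the sketch below the ZINB parameters are assumed known; the paper's method estimates them from the data, so this only illustrates the weight formula, not the full procedure.

```python
import numpy as np

def zinb_zero_weights(counts, pi0, mu, size):
    """Posterior probability that each count arose from the NB component.
    pi0: zero-inflation probability; mu, size: NB mean and dispersion
    (assumed known here; in practice estimated per gene and cell)."""
    counts = np.asarray(counts)
    nb_zero = (size / (size + mu)) ** size  # NB pmf at zero
    # Bayes: P(NB | y=0) = (1-pi0)*f_NB(0) / (pi0 + (1-pi0)*f_NB(0))
    w_zero = (1 - pi0) * nb_zero / (pi0 + (1 - pi0) * nb_zero)
    return np.where(counts > 0, 1.0, w_zero)

w = zinb_zero_weights([0, 0, 3, 10], pi0=0.4, mu=5.0, size=2.0)
print(w)  # zeros are down-weighted; positive counts keep weight 1
```

Passing such weights to a weighted bulk DE fit discounts likely dropout zeros without discarding them, which is how the strategy "unlocks" existing pipelines.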

    MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME

    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools, a complex software stack, and large, scalable compute and data analysis resources due to the high computational cost of Monte Carlo workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This creates a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for the development of sharable and reproducible distributed parallel computational experiments.
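The Monte Carlo cost the abstract refers to comes from stochastic simulation algorithm (SSA) sampling. A minimal well-mixed Gillespie sketch for a birth-death process is shown below; the spatial RDME that PyURDME targets adds diffusion jumps between mesh voxels, but the core sampling loop has the same shape, and each trajectory is independent, which is why the workload distributes well across a cloud cluster.

```python
import numpy as np

rng = np.random.default_rng(7)

# Birth-death process: 0 -> A at rate k_birth, A -> 0 at rate k_death * x.
k_birth, k_death = 10.0, 0.1

def ssa(t_end=100.0, x0=0):
    """One Gillespie SSA trajectory; returns copy number at t_end."""
    t, x = 0.0, x0
    while t < t_end:
        a = np.array([k_birth, k_death * x])  # reaction propensities
        a0 = a.sum()
        t += rng.exponential(1.0 / a0)        # time to next reaction
        if rng.uniform() * a0 < a[0]:
            x += 1                            # birth fired
        else:
            x -= 1                            # death fired
    return x

# Independent realizations -- the embarrassingly parallel part.
samples = [ssa() for _ in range(50)]
print(np.mean(samples))  # fluctuates around k_birth / k_death = 100
```

Tens of thousands of such trajectories per parameter point are typical, which motivates the kind of managed distributed environment MOLNs provides.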

    aFold – using polynomial uncertainty modelling for differential gene expression estimation from RNA sequencing data

    Data normalization and identification of significant differential expression are crucial steps in RNA-Seq analysis. Many available tools rely on assumptions that are often not met by real data, including the common assumptions of a symmetrical distribution of up- and down-regulated genes, the presence of only a few differentially expressed genes, and/or few outliers. Moreover, the cut-off for selecting significantly differentially expressed genes for further downstream analysis often depends on arbitrary choices.
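To make the normalization assumptions concrete, here is the widely used DESeq-style median-of-ratios scheme (shown for context only; it is not aFold's polynomial uncertainty model). Its implicit assumption, that the median gene is not differentially expressed, is exactly the kind of assumption the abstract says real data can violate.

```python
import numpy as np

def median_of_ratios(counts):
    """DESeq-style size factors. counts: genes x samples, all positive
    (zeros must be filtered first since the method works in log space)."""
    log_counts = np.log(counts)
    # Per-gene reference: geometric mean across samples.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per-sample size factor: median ratio to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

counts = np.array([[10.0, 20.0],
                   [100.0, 200.0],
                   [5.0, 10.0]])
sf = median_of_ratios(counts)
print(sf)  # second sample gets a size factor twice the first
```

If many genes change in one direction, the median ratio shifts and every gene's normalized value is biased, which motivates methods that model the uncertainty instead of assuming symmetry.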

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbating the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability.
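One of the five challenges, class imbalance, has a standard baseline remedy worth showing: inverse-frequency reweighting of the training loss. The labels below are made up for illustration; the weighting formula is the common "balanced" scheme, one option among the approaches such a review would survey.

```python
import numpy as np

# Made-up binary labels with an 80/20 imbalance (e.g. controls vs cases).
labels = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

classes, counts = np.unique(labels, return_counts=True)
# "Balanced" weights: n_samples / (n_classes * n_samples_in_class),
# so the rare class contributes as much total loss as the common one.
weights = len(labels) / (len(classes) * counts)
sample_w = weights[labels]

print(dict(zip(classes.tolist(), weights.round(3).tolist())))
# minority class receives a proportionally larger per-sample weight
```

Each class then contributes equally to a weighted loss, preventing a classifier from scoring well by always predicting the majority class.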