Uncertainty and sensitivity analysis in quantitative pest risk assessments : practical rules for risk assessors
Quantitative models have several advantages over qualitative methods for pest risk assessment (PRA). They do not require the definition of categorical ratings, and they can be used to compute numerical probabilities of entry and establishment and to quantify spread and impact. These models are powerful tools, but they include several sources of uncertainty that need to be taken into account by risk assessors and communicated to decision makers. Uncertainty analysis (UA) and sensitivity analysis (SA) are useful for analyzing uncertainty in models used in PRA, and are becoming more popular. However, these techniques should be applied with caution because several factors may influence their results. In this paper, a brief overview of UA and SA methods is given, and a series of practical rules is defined that risk assessors can follow to improve the reliability of UA and SA results. These rules are illustrated in a case study based on the infection model of Magarey et al. (2005), where the results of UA and SA are shown to be highly dependent on the assumptions made about the probability distributions of the model inputs.
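The dependence of UA and SA results on assumed input distributions can be sketched with a minimal Monte Carlo example. The risk response below is a made-up toy function, not the Magarey et al. (2005) infection model, and the input distributions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

def establishment_prob(temp, wetness):
    # Toy infection-risk response (illustrative only):
    # risk peaks near an optimal temperature and saturates with wetness duration.
    t_resp = np.exp(-((temp - 20.0) / 5.0) ** 2)   # optimum near 20 C (assumed)
    w_resp = 1.0 - np.exp(-wetness / 12.0)         # saturating in hours of wetness
    return t_resp * w_resp

# Uncertainty analysis: propagate assumed input distributions through the model.
temp = rng.normal(18.0, 3.0, n)      # assumed temperature distribution
wet = rng.uniform(4.0, 24.0, n)      # assumed leaf-wetness duration (hours)
risk = establishment_prob(temp, wet)

mean_risk = risk.mean()
lo, hi = np.percentile(risk, [2.5, 97.5])   # 95% uncertainty interval

# Sensitivity analysis: Spearman rank correlation between each input and the output.
def spearman(x, y):
    rx = x.argsort().argsort().astype(float)
    ry = y.argsort().argsort().astype(float)
    return np.corrcoef(rx, ry)[0, 1]

s_temp = spearman(temp, risk)
s_wet = spearman(wet, risk)
```

Swapping the assumed distributions (e.g. a wider temperature spread) changes both the uncertainty interval and the ranking of `s_temp` versus `s_wet`, which is the sensitivity of UA/SA results that the paper's practical rules address.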
Assessment of the potential impacts of plant traits across environments by combining global sensitivity analysis and dynamic modeling in wheat
A crop can be viewed as a complex system with outputs (e.g. yield) that are
affected by genetic, physiological, pedo-climatic and management inputs.
Application of numerical methods for model exploration assists in identifying
the most influential inputs, provided the simulation model is
a credible description of the biological system. A sensitivity analysis was
used to assess the simulated impact on yield of a suite of traits involved in
major processes of crop growth and development, and to evaluate how the
simulated value of such traits varies across environments and in relation to
other traits (which can be interpreted as a virtual change in genetic
background). The study focused on wheat in Australia, with an emphasis on
adaptation to low rainfall conditions. A large set of traits (90) was evaluated
in a wide target population of environments (4 sites x 125 years), management
practices (3 sowing dates x 2 N fertilization) and (2 levels). The
Morris sensitivity analysis method was used to sample the parameter space and
reduce computational requirements, while maintaining a realistic representation
of the targeted trait x environment x management landscape (82 million
individual simulations in total). The patterns of parameter x environment x
management interactions were investigated for the most influential parameters,
considering a potential genetic range of +/- 20% compared to a reference. Main
(i.e. linear) and interaction (i.e. non-linear and interaction) sensitivity
indices calculated for most of the APSIM-Wheat parameters allowed the
identification of 42 parameters substantially impacting yield in most target
environments.
Among these, a subset of parameters related to phenology, resource acquisition,
resource use efficiency and biomass allocation were identified as potential
candidates for crop (and model) improvement.
Comment: 22 pages, 8 figures. This work has been submitted to PLoS One.
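The elementary-effects idea behind the Morris method can be sketched with a toy yield surrogate. The model and parameter names below are hypothetical stand-ins, not APSIM-Wheat, and a simple radial one-at-a-time design is used in place of full Morris trajectories:

```python
import numpy as np

rng = np.random.default_rng(0)

def yield_model(x):
    # Toy crop-yield surrogate (hypothetical, not APSIM-Wheat):
    # x[0] ~ phenology rate, x[1] ~ radiation-use efficiency, x[2] ~ allocation share
    return 2.0 * x[1] + x[0] * x[2] + 0.5 * x[2] ** 2

k = 3          # number of parameters
r = 50         # number of sampled base points (trajectories)
delta = 0.1    # elementary-effect step on the unit hypercube

ee = np.zeros((r, k))
for t in range(r):
    base = rng.uniform(0.0, 1.0 - delta, k)
    f0 = yield_model(base)
    for i in range(k):
        step = base.copy()
        step[i] += delta
        ee[t, i] = (yield_model(step) - f0) / delta   # elementary effect of input i

mu_star = np.abs(ee).mean(axis=0)   # mean absolute effect: overall influence
sigma = ee.std(axis=0)              # spread: non-linearity and interactions
```

Here `mu_star` corresponds to the main (linear) influence and `sigma` to the non-linear/interaction component mentioned in the abstract: the purely linear input shows a constant elementary effect (high `mu_star`, near-zero `sigma`), while inputs entering through products or squares show variable effects.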
Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models
The emergence and development of cancer is a consequence of the accumulation
over time of genomic mutations involving a specific set of genes, which
provides the cancer clones with a functional selective advantage. In this work,
we model the order of accumulation of such mutations during the progression,
which eventually leads to the disease, by means of probabilistic graphical
models, i.e., Bayesian Networks (BNs). We investigate how to perform the task
of learning the structure of such BNs, according to experimental evidence,
adopting a global optimization meta-heuristic. In particular, in this work we
rely on Genetic Algorithms, and to strongly reduce the execution time of the
inference -- which can also involve multiple repetitions to collect
statistically significant assessments of the data -- we distribute the
calculations using both multi-threading and a multi-node architecture. The
results show that our approach is characterized by good accuracy and
specificity; we also demonstrate its feasibility, thanks to an 84x reduction
in overall execution time with respect to a traditional sequential
implementation.
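A genetic algorithm for structure search with multi-threaded fitness evaluation can be sketched as follows. This is a deliberately simplified stand-in, not the paper's method: structures are encoded as edges over a fixed node ordering (which guarantees acyclicity), and the per-edge scores are random placeholders for a data-driven BN score such as BIC:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(1)

n_nodes = 5
# Hypothetical per-edge scores (stand-in for a likelihood-based structure score).
edge_score = rng.normal(0.0, 1.0, (n_nodes, n_nodes))
# Fixed node ordering: only upper-triangular edges allowed, so every genome is a DAG.
mask = np.triu(np.ones((n_nodes, n_nodes), dtype=bool), k=1)
n_genes = int(mask.sum())

def fitness(genome):
    # genome: boolean vector over the allowed edge slots.
    adj = np.zeros((n_nodes, n_nodes), dtype=bool)
    adj[mask] = genome
    return edge_score[adj].sum() - 0.1 * adj.sum()   # score minus sparsity penalty

pop = rng.random((40, n_genes)) < 0.2   # sparse initial population

with ThreadPoolExecutor(max_workers=4) as pool:
    for gen in range(60):
        # Distribute fitness evaluations across threads.
        fits = np.fromiter(pool.map(fitness, pop), dtype=float)
        elite = pop[np.argsort(fits)[::-1][:10]]          # elitist selection
        parents = elite[rng.integers(0, 10, (30, 2))]     # random parent pairs
        cross = rng.random((30, n_genes)) < 0.5           # uniform crossover
        children = np.where(cross, parents[:, 0], parents[:, 1])
        children ^= rng.random((30, n_genes)) < 0.02      # bit-flip mutation
        pop = np.vstack([elite, children])

best = pop[np.argmax(np.fromiter(map(fitness, pop), dtype=float))]
best_fit = fitness(best)
```

A thread pool suffices here because the NumPy-heavy fitness releases little Python-level work; for the scale described in the paper, process- or node-level distribution of the fitness evaluations would replace the `ThreadPoolExecutor`.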
Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications
Dropout events in single-cell RNA sequencing (scRNA-seq) cause many transcripts to go undetected and induce an excess of zero read counts, leading to power issues in differential expression (DE) analysis. This has triggered the development of bespoke scRNA-seq DE methods to cope with zero inflation. Recent evaluations, however, have shown that dedicated scRNA-seq tools provide no advantage compared to traditional bulk RNA-seq tools. We introduce a weighting strategy, based on a zero-inflated negative binomial model, that identifies excess zero counts and generates gene- and cell-specific weights to unlock bulk RNA-seq DE pipelines for zero-inflated data, boosting performance for scRNA-seq.
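The core of such a weighting strategy is the posterior probability that a count was generated by the negative binomial (count) component rather than the zero-inflation component. A minimal sketch with fixed, hypothetical parameters follows; in practice the mean `mu`, dispersion `theta` and zero-inflation probability `pi` are estimated per gene and cell rather than supplied as constants:

```python
import numpy as np
from math import lgamma, log, exp

def nb_logpmf(y, mu, theta):
    # Log pmf of the negative binomial with mean mu and dispersion theta.
    return (lgamma(y + theta) - lgamma(theta) - lgamma(y + 1)
            + theta * (log(theta) - log(theta + mu))
            + y * (log(mu) - log(theta + mu)))

def zinb_weights(counts, mu, theta, pi):
    # Posterior probability that each observation comes from the NB component:
    #   w = (1 - pi) * f_NB(y) / [ pi * 1{y = 0} + (1 - pi) * f_NB(y) ]
    # Positive counts cannot come from the zero component, so they get weight 1.
    counts = np.asarray(counts)
    w = np.ones_like(counts, dtype=float)
    f0 = exp(nb_logpmf(0, mu, theta))
    w[counts == 0] = (1 - pi) * f0 / (pi + (1 - pi) * f0)
    return w

weights = zinb_weights([0, 0, 3, 10], mu=5.0, theta=2.0, pi=0.3)
```

Zeros that are implausible under the fitted NB (low `f0`) receive small weights, downweighting likely dropout events when the weights are passed to a weighted bulk DE pipeline.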
MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME
Computational experiments using spatial stochastic simulations have led to
important new biological insights, but they require specialized tools, a
complex software stack, as well as large and scalable compute and data analysis
resources due to the large computational cost associated with Monte Carlo
computational workflows. The complexity of setting up and managing a
large-scale distributed computation environment to support productive and
reproducible modeling can be prohibitive for practitioners in systems biology.
This results in a barrier to the adoption of spatial stochastic simulation
tools, effectively limiting the type of biological questions addressed by
quantitative modeling. In this paper, we present PyURDME, a new, user-friendly
spatial modeling and simulation package, and MOLNs, a cloud computing appliance
for distributed simulation of stochastic reaction-diffusion models. MOLNs is
based on IPython and provides an interactive programming platform for
development of sharable and reproducible distributed parallel computational
experiments.
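The Monte Carlo character of such workflows comes from repeated stochastic simulations. As a minimal, non-spatial illustration (spatial RDME simulation as in PyURDME additionally tracks subvolumes and diffusion events), here is a Gillespie direct-method sketch of a birth-death process with made-up rate constants:

```python
import numpy as np

rng = np.random.default_rng(7)

def ssa_birth_death(k_birth=10.0, k_death=0.5, x0=0, t_end=50.0):
    # Gillespie direct method: draw an exponential waiting time from the total
    # propensity, then pick which reaction fired proportionally to its propensity.
    t, x = 0.0, x0
    while t < t_end:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        t += rng.exponential(1.0 / a_total)
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
    return x

# Each realization is independent, which is why such ensembles parallelize well
# across the distributed workers a platform like MOLNs provisions.
samples = [ssa_birth_death() for _ in range(200)]
mean_x = float(np.mean(samples))
```

For these rates the stationary copy-number distribution is Poisson with mean `k_birth / k_death = 20`, so the ensemble average should settle near 20.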
aFold – using polynomial uncertainty modelling for differential gene expression estimation from RNA sequencing data
Data normalization and identification of significant differential expression represent crucial steps in RNA-Seq analysis. Many available tools rely on assumptions that are often not met by real data, including the common assumptions of a symmetrical distribution of up- and down-regulated genes, the presence of only a few differentially expressed genes, and/or few outliers. Moreover, the cut-off for selecting significantly differentially expressed genes for further downstream analysis often depends on arbitrary choices.
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
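Two of these challenges, data heterogeneity and the curse of dimensionality, can be illustrated with the simplest integration scheme: per-feature standardization followed by concatenation ("early integration") and dimensionality reduction. The matrices below are hypothetical random stand-ins for real omics data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical multi-omics matrices for the same 30 samples (rows).
gene_expr = rng.normal(size=(30, 200))   # e.g. transcriptome features
methyl = rng.normal(size=(30, 500))      # e.g. epigenome features

def zscore(m):
    # Standardize each feature so no modality dominates due to scale differences
    # (one way to address data heterogeneity before integration).
    return (m - m.mean(axis=0)) / (m.std(axis=0) + 1e-12)

# Early integration: concatenate standardized modalities feature-wise.
combined = np.hstack([zscore(gene_expr), zscore(methyl)])

# PCA via SVD to mitigate the curse of dimensionality
# (700 features for only 30 samples).
centered = combined - combined.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pcs = u[:, :5] * s[:5]   # low-dimensional representation for downstream ML
```

Early integration is only one of the strategies covered by such reviews; intermediate (shared latent space) and late (per-modality model fusion) integration make different trade-offs against the same challenges.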