Cancer Biomarker Discovery: The Entropic Hallmark
Background: It is a commonly accepted belief that cancer cells modify their transcriptional state during the progression of the disease. We propose that the progression of cancer cells towards malignant phenotypes can be efficiently tracked using high-throughput technologies that follow the gradual changes observed in the gene expression profiles by employing Shannon's mathematical theory of communication. Methods based on Information Theory can then quantify the divergence of cancer cells' transcriptional profiles from those of normal-appearing cells of the originating tissues. The relevance of the proposed methods can be evaluated using microarray datasets available in the public domain, but the method is in principle applicable to other high-throughput methods. Methodology/Principal Findings: Using melanoma and prostate cancer datasets, we illustrate how Shannon Entropy and the Jensen-Shannon divergence can be employed to trace the transcriptional changes that accompany the progression of the disease. We establish how the variations of these two measures correlate with established biomarkers of cancer progression. The Information Theory measures allow us to identify novel biomarkers for both progressive and relatively more sudden transcriptional changes leading to malignant phenotypes. At the same time, the methodology was able to validate a large number of genes and processes that seem to be implicated in the progression of melanoma and prostate cancer. Conclusions/Significance: We thus present a quantitative guiding rule, a new unifying hallmark of cancer: the cancer cell's transcriptome changes lead to measurable observed transitions of Normalized Shannon Entropy values (as measured by high-throughput technologies). At the same time, tumor cells increase their divergence from the normal tissue profile, increasing their disorder via the creation of states that we might not directly measure.
This unifying hallmark makes it possible, via the Jensen-Shannon divergence, to identify the arrow of time of the processes from the gene expression profiles, and helps to map the phenotypical and molecular hallmarks of specific cancer subtypes. The deep mathematical basis of the approach suggests that this principle may be generally applicable to other diseases.
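The two measures named in this abstract can be illustrated with a minimal sketch. It assumes that an expression profile is turned into a probability distribution by dividing by its sum, and that the entropy is normalized by its maximum, log2(n); the abstract does not state the exact normalization used, so both choices are assumptions, and all function names are hypothetical.

```python
import math

def normalize(expression):
    """Turn non-negative expression values into a probability distribution."""
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression values must have a positive sum")
    return [x / total for x in expression]

def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def normalized_entropy(p):
    """Entropy divided by its maximum, log2(n), so the result lies in [0, 1].
    (Assumed normalization; not taken from the abstract.)"""
    n = len(p)
    return shannon_entropy(p) / math.log2(n) if n > 1 else 0.0

def jensen_shannon_divergence(p, q):
    """JSD(P, Q) = H(M) - (H(P) + H(Q)) / 2, with M the midpoint of P and Q."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genes")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return shannon_entropy(m) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))

# Made-up toy profiles over four genes (not data from the study).
normal_profile = normalize([10, 10, 10, 10])
tumor_profile = normalize([37, 1, 1, 1])

entropy_normal = normalized_entropy(normal_profile)
entropy_tumor = normalized_entropy(tumor_profile)
divergence = jensen_shannon_divergence(normal_profile, tumor_profile)
```

With base-2 logarithms the divergence is bounded between 0 (identical profiles) and 1 (profiles with no overlapping support), which makes values comparable across samples.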
A model to account for data dependency when estimating floral cover in different land use types over a season
We propose a model to consider data dependencies and assess spatial and temporal variability in land use specific floral coverage across landscapes. Data dependence arising from repeated measurements across the flowering season is taken into account using hierarchical Archimedean copulas, where the correlation is assumed to be stronger within seasonal periods than between periods. For each seasonal period, a bounded probability distribution is assigned to capture spatial variability in floral cover. The model uses a Bayesian approach and can assess land-use-specific floral covers by integrating expert judgments and field data. The model is applied to assess floral covers in four land use types in southern Sweden, where seasonal variability is captured by dividing the season into two periods according to winter oilseed rape flowering. Floral cover is updated using Markov Chain Monte Carlo sampling based on data from 16 landscapes and 2 years, with repeated measures available from each of the two seasonal periods. Our results indicate that considering data dependence improved the estimation of floral cover based on data observed during a season. Different copula families specifying multivariate probability distributions were tested, and no family performed consistently better across the four tested land use types. Uncertainty in both mode and variability of floral cover was higher when data dependence was accounted for. Posterior modes of floral covers in semi-natural grassland were higher than in field edges, but the experts’ best guesses for both were higher than these estimates. This confirms previous findings in expert elicitation processes that experts may fail to discriminate extreme values on a bounded range. Floral cover in flower strips was estimated to be lower than in semi-natural grasslands early in the season and higher late in the season. The mode of floral cover in oilseed rape was estimated to be close to 100%, and higher than estimates provided by expert judgment.
Floral covers for different land use classes are key parameters when quantifying floral resources at a landscape level, and their assessment relies on both expert judgment and field measurements.
Pollinator population size and pollination ecosystem service responses to enhancing floral and nesting resources
Modeling pollination ecosystem services requires a spatially explicit, process-based approach because they depend on both the behavioral responses of pollinators to the amount and spatial arrangement of habitat and on the within- and between-season dynamics of pollinator populations in response to land use. We describe a novel pollinator model predicting flower visitation rates by wild central-place foragers (e.g., nesting bees) in spatially explicit landscapes. The model goes beyond existing approaches by (1) integrating preferential use of more rewarding floral and nesting resources; (2) considering population growth over time; (3) allowing different dispersal distances for workers and reproductives; (4) providing visitation rates for use in crop pollination models. We use the model to estimate the effect of establishing grassy field margins offering nesting resources and a low quantity of flower resources, and/or late-flowering flower strips offering no nesting resources but abundant flowers, on bumble bee populations and visitation rates to flowers in landscapes that differ in amounts of linear seminatural habitats and early mass-flowering crops. Flower strips were three times more effective in increasing pollinator populations and visitation rates than field margins, and this effect increased over time. Late-blooming flower strips increased early-season visitation rates, but decreased visitation rates in other late-season flowers. Increases in population size over time in response to flower strips and amounts of linear seminatural habitats reduced this apparent competition for pollinators. Our spatially explicit, process-based model generates emergent patterns reflecting empirical observations, such that adding flower resources may have contrasting short- and long-term effects due to apparent competition for pollinators and pollinator population size increase. 
It allows these effects to be explored and effect sizes to be compared in ways not possible with other existing models. Future applications include species comparisons, analysis of the sensitivity of predictions to life-history traits, as well as large-scale management intervention and policy assessment.
Calibration of a bumble bee foraging model using Approximate Bayesian Computation
1. Challenging calibration of complex models can be approached by using prior
knowledge on the parameters. However, the natural choice of Bayesian inference
can be computationally heavy when relying on Markov Chain Monte Carlo (MCMC)
sampling. When the likelihood of the data is intractable, alternative Bayesian
methods have been proposed. Approximate Bayesian Computation (ABC) only
requires sampling from the data generative model, but may be problematic when
the dimension of the data is high.
2. We studied alternative strategies to handle high dimensional data in ABC
applied to the calibration of a spatially explicit foraging model for
Bombus terrestris. The first step consisted of building a set of
summary statistics carrying enough biological meaning, i.e. as much as the
original data, and then applying ABC on this set. Two ABC strategies, the use
of regression adjustment leading to the production of ABC posterior samples,
and the use of machine learning approaches to approximate ABC posterior
quantiles, were compared with respect to coverage of model estimates and true
parameter values. The comparison was made on simulated data as well as on data
from two field studies.
3. Results from simulated data showed that some model parameters were easier
to calibrate than others. Approaches based on random forests in general
performed better on simulated data. They also performed well on field data,
even though the posterior predictive distribution exhibited a higher variance.
Nonlinear regression adjustment performed better than linear adjustment, and
the classical ABC rejection algorithm performed poorly.
4. ABC is an interesting and appealing approach for the calibration of
complex models in biology, such as spatially explicit foraging models. However,
while ABC methods are easy to implement, they require considerable tuning.
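As an illustration of the basic idea, here is a minimal sketch of the classical ABC rejection algorithm, the baseline that this abstract reports as performing poorly compared with regression adjustment and random forests. It does not use the foraging model: the generative model (a Gaussian with unknown mean), the summary statistic, the uniform prior, and the tolerance are all hypothetical stand-ins chosen so that the example is self-contained.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Hypothetical generative model: n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Reduce the data to a low-dimensional summary statistic (here, the mean)."""
    return statistics.fmean(data)

def abc_rejection(observed, prior_sampler, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within `tolerance`
    of the observed summary. No likelihood is evaluated at any point."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted

rng = random.Random(0)                     # fixed seed for reproducibility
observed = simulate(2.0, 50, rng)          # synthetic "field data", true mean 2.0
posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),  # assumed flat prior
    n_draws=5000,
    tolerance=0.1,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

The tolerance and the choice of summary statistic are the tuning knobs: a smaller tolerance gives a better approximation of the posterior but accepts fewer draws, which is one reason the plain rejection form scales badly to high-dimensional data.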