
    An Evolutionary Algorithm for the Estimation of Threshold Vector Error Correction Models

    We develop an evolutionary algorithm to estimate Threshold Vector Error Correction models (TVECM) with more than two cointegrated variables. Since disregarding a threshold in cointegration models renders standard approaches to the estimation of the cointegration vectors inefficient, TVECMs necessitate a simultaneous estimation of the cointegration vector(s) and the threshold. When only two cointegrated variables are considered, this is commonly achieved by a grid search. However, grid search quickly becomes computationally infeasible if more than two variables are cointegrated. Therefore, the likelihood function has to be maximized using heuristic approaches. Depending on the precise problem structure, the evolutionary approach developed in the present paper for this purpose saves 90 to 99 per cent of the computation time of a grid search.
    Keywords: evolutionary strategy, genetic algorithm, TVECM
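    The grid search mentioned in this abstract can be illustrated with a deliberately simplified sketch. It is not the TVECM likelihood from the paper: it assumes a toy two-regime model in which each regime is described only by its mean, and it scans a one-dimensional grid of candidate thresholds. The function names and the data are hypothetical. With more parameters searched jointly, the number of grid points grows multiplicatively, which is the cost the abstract refers to.

```python
def ssr(values):
    """Sum of squared residuals around the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def grid_search_threshold(x, y, candidates):
    """Return (threshold, total_ssr) minimizing the two-regime SSR.

    Toy stand-in for a likelihood-based search: observations with
    x <= threshold form one regime, the rest form the other.
    """
    best = None
    for t in candidates:
        low = [yi for xi, yi in zip(x, y) if xi <= t]
        high = [yi for xi, yi in zip(x, y) if xi > t]
        if not low or not high:
            continue  # skip thresholds that leave a regime empty
        total = ssr(low) + ssr(high)
        if best is None or total < best[1]:
            best = (t, total)
    return best


# Hypothetical data with a regime change between x = 0.4 and x = 0.6.
x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
candidates = [0.15, 0.25, 0.35, 0.5, 0.65, 0.75, 0.85]

best_t, best_ssr = grid_search_threshold(x, y, candidates)
print(best_t, best_ssr)
```

    The search evaluates every candidate once, so its cost is linear in the grid size here; a joint grid over a threshold and several cointegration coefficients would multiply the grid sizes together.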

    Directional genetic differentiation and asymmetric migration

    Understanding the population structure and patterns of gene flow within species is of fundamental importance to the study of evolution. In the fields of population and evolutionary genetics, measures of genetic differentiation are commonly used to gather this information. One potential caveat is that these measures assume gene flow to be symmetric. However, asymmetric gene flow is common in nature, especially in systems driven by physical processes such as wind or water currents. Since information about levels of asymmetric gene flow among populations is essential for the correct interpretation of the distribution of contemporary genetic diversity within species, this should not be overlooked. To obtain information on asymmetric migration patterns from genetic data, complex models based on maximum likelihood or Bayesian approaches generally need to be employed, often at great computational cost. Here, a new simpler and more efficient approach for understanding gene flow patterns is presented. This approach allows the estimation of directional components of genetic divergence between pairs of populations at low computational effort, using any of the classical or modern measures of genetic differentiation. These directional measures of genetic differentiation can further be used to calculate directional relative migration and to detect asymmetries in gene flow patterns. This can be done in a user-friendly web application called divMigrate-online introduced in this paper. Using simulated data sets with known gene flow regimes, we demonstrate that the method is capable of resolving complex migration patterns under a range of study designs.
    Comment: 25 pages, 8 (+3) figures, 1 table

    An integrative computational model for intestinal tissue renewal

    Objectives: The luminal surface of the gut is lined with a monolayer of epithelial cells that acts as a nutrient absorptive engine and protective barrier. To maintain its integrity and functionality, the epithelium is renewed every few days. Theoretical models are powerful tools that can be used to test hypotheses concerning the regulation of this renewal process, to investigate how its dysfunction can lead to loss of homeostasis and neoplasia, and to identify potential therapeutic interventions. Here we propose a new multiscale model for crypt dynamics that links phenomena occurring at the subcellular, cellular and tissue levels of organisation.
    Methods: At the subcellular level, deterministic models characterise molecular networks, such as cell-cycle control and Wnt signalling. The output of these models determines the behaviour of each epithelial cell in response to intra-, inter- and extracellular cues. The modular nature of the model enables us to easily modify individual assumptions and analyse their effects on the system as a whole.
    Results: We perform virtual microdissection and labelling-index experiments, evaluate the impact of various model extensions, obtain new insight into clonal expansion in the crypt, and compare our predictions with recent mitochondrial DNA mutation data.
    Conclusions: We demonstrate that relaxing the assumption that stem-cell positions are fixed enables clonal expansion and niche succession to occur. We also predict that the presence of extracellular factors near the base of the crypt alone suffices to explain the observed spatial variation in nuclear beta-catenin levels along the crypt axis.

    Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

    The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is given to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research.

    A population Monte Carlo scheme with transformed weights and its application to stochastic kinetic models

    This paper addresses the problem of Monte Carlo approximation of posterior probability distributions. In particular, we have considered a recently proposed technique known as population Monte Carlo (PMC), which is based on an iterative importance sampling approach. An important drawback of this methodology is the degeneracy of the importance weights when the dimension of either the observations or the variables of interest is high. To alleviate this difficulty, we propose a novel method that performs a nonlinear transformation on the importance weights. This operation reduces the weight variation, hence it avoids their degeneracy and increases the efficiency of the importance sampling scheme, especially when drawing from proposal functions that are poorly adapted to the true posterior. For the sake of illustration, we have applied the proposed algorithm to the estimation of the parameters of a Gaussian mixture model. This is a very simple problem that enables us to clearly show and discuss the main features of the proposed technique. As a practical application, we have also considered the popular (and challenging) problem of estimating the rate parameters of stochastic kinetic models (SKM). SKMs are highly multivariate systems that model molecular interactions in biological and chemical problems. We introduce a particularization of the proposed algorithm to SKMs and present numerical results.
    Comment: 35 pages, 8 figures
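    The effect of a nonlinear weight transformation on degeneracy can be sketched as follows. The abstract does not say which transformation the authors use, so this example assumes clipping (capping the largest log-weights at the value of the n_clip-th largest) as one possible choice; the function names, the `n_clip` parameter and the log-weights are hypothetical. Degeneracy is measured with the usual effective sample size, 1 / sum(w_i^2) for normalized weights.

```python
import math


def normalize(log_weights):
    """Turn unnormalized log-weights into weights that sum to one."""
    m = max(log_weights)  # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return [wi / total for wi in w]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2); equals len(weights) for uniform weights."""
    return 1.0 / sum(w * w for w in weights)


def clip_log_weights(log_weights, n_clip):
    """Cap every log-weight at the n_clip-th largest value (assumed scheme)."""
    threshold = sorted(log_weights, reverse=True)[n_clip - 1]
    return [min(lw, threshold) for lw in log_weights]


# Hypothetical degenerate case: one sample dominates all others.
log_w = [0.0, -20.0, -21.0, -22.0, -23.0, -24.0]

plain = normalize(log_w)
clipped = normalize(clip_log_weights(log_w, n_clip=3))

ess_plain = effective_sample_size(plain)
ess_clipped = effective_sample_size(clipped)
print(ess_plain, ess_clipped)
```

    Without the transformation nearly all mass sits on a single sample, so the effective sample size is close to one; after clipping, several samples contribute, at the price of some bias in the weighted estimate.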

    Two-Locus Likelihoods under Variable Population Size and Fine-Scale Recombination Rate Estimation

    Two-locus sampling probabilities have played a central role in devising an efficient composite likelihood method for estimating fine-scale recombination rates. Due to mathematical and computational challenges, these sampling probabilities are typically computed under the unrealistic assumption of a constant population size, and simulation studies have shown that resulting recombination rate estimates can be severely biased in certain cases of historical population size changes. To alleviate this problem, we develop here new methods to compute the sampling probability for variable population size functions that are piecewise constant. Our main theoretical result, implemented in a new software package called LDpop, is a novel formula for the sampling probability that can be evaluated by numerically exponentiating a large but sparse matrix. This formula can handle moderate sample sizes ($n \leq 50$) and demographic size histories with a large number of epochs ($\mathcal{D} \geq 64$). In addition, LDpop implements an approximate formula for the sampling probability that is reasonably accurate and scales to hundreds in sample size ($n \geq 256$). Finally, LDpop includes an importance sampler for the posterior distribution of two-locus genealogies, based on a new result for the optimal proposal distribution in the variable-size setting. Using our methods, we study how a sharp population bottleneck followed by rapid growth affects the correlation between partially linked sites. Then, through an extensive simulation study, we show that accounting for population size changes under such a demographic model leads to substantial improvements in fine-scale recombination rate estimation. LDpop is freely available for download at https://github.com/popgenmethods/ldpop
    Comment: 32 pages, 13 figures