33 research outputs found

    Mlcoalsim: Multilocus Coalescent Simulations

    Coalescent theory is a powerful tool for population geneticists as well as molecular biologists interested in understanding the patterns and levels of DNA variation. Using coalescent Monte Carlo simulations it is possible to obtain the empirical distributions for a number of statistics across a wide range of evolutionary models; these distributions can be used to test evolutionary hypotheses using experimental data. The mlcoalsim application presented here (based on a version of the ms program, Hudson, 2002) adds important new features to improve methodology (uncertainty and conditional methods for mutation and recombination), models (including strong positive selection, finite sites and heterogeneity in mutation and recombination rates) and analyses (calculating a number of statistics used in population genetics and P-values for observed data). One of the most important features of mlcoalsim is the analysis of multilocus data in linked and independent regions. In summary, mlcoalsim is an integrated software application aimed at researchers interested in molecular evolution. mlcoalsim is written in ANSI C and is available at: http://www.ub.es/softevol/mlcoalsim
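The idea of obtaining an empirical distribution of a statistic by coalescent Monte Carlo simulation can be illustrated with a minimal sketch. It is not mlcoalsim or ms code: it assumes the standard neutral model for a single non-recombining locus, infinite sites, and time measured in units of 2N generations, and every function name below is hypothetical. It simulates the number of segregating sites S and computes a one-sided empirical P-value for an observed value.

```python
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by counting unit-rate exponential arrivals before time lam."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(n, theta, rng):
    """One neutral coalescent replicate for n sequences; returns the number of segregating sites."""
    total_length = 0.0
    for k in range(n, 1, -1):
        # Waiting time while k lineages exist: exponential with rate k(k-1)/2.
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        total_length += k * t_k
    # Mutations fall on the tree as a Poisson process with rate theta/2 per unit of branch length.
    return poisson(rng, theta * total_length / 2.0)


def empirical_p_value(observed, simulated):
    """Fraction of simulated values at least as large as the observed one (one-sided)."""
    return sum(1 for s in simulated if s >= observed) / len(simulated)


# Assumed example values: 10 sequences, theta = 5, 20000 replicates.
rng = random.Random(1)
sims = [simulate_segregating_sites(10, 5.0, rng) for _ in range(20000)]
mean_s = sum(sims) / len(sims)
print("mean S:", mean_s)
print("P(S >= 25):", empirical_p_value(25, sims))
```

Under these assumptions the mean of the simulated S should be close to θ times the harmonic sum 1 + 1/2 + ... + 1/(n-1), which gives a quick sanity check on the simulator.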

    Decomposing the site frequency spectrum: the impact of tree topology on neutrality tests

    We investigate the dependence of the site frequency spectrum (SFS) on the topological structure of genealogical trees. We show that basic population genetic statistics, for instance estimators of θ or neutrality tests such as Tajima's D, can be decomposed into components of waiting times between coalescent events and of tree topology. Our results clarify the relative impact of the two components on these statistics. We provide a rigorous interpretation of positive or negative values of an important class of neutrality tests in terms of the underlying tree shape. In particular, we show that values of Tajima's D and Fay and Wu's H depend in a direct way on a peculiar measure of tree balance which is mostly determined by the root balance of the tree. We present a new test for selection in the same class as Fay and Wu's H and discuss its interpretation and power. Finally, we determine the trees corresponding to extreme expected values of these neutrality tests and present formulae for these extreme values as a function of sample size and number of segregating sites. Comment: 23 pages, 8 figures
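As a point of reference for how such a neutrality test is built from the SFS, the sketch below computes Tajima's D from an unfolded spectrum using the standard formula of Tajima (1989). It does not implement the decomposition into waiting times and topology described in the abstract; the function name and the example spectra are assumptions made for illustration.

```python
import math


def tajimas_d(sfs):
    """Tajima's D from an unfolded SFS.

    sfs[i - 1] is the number of sites at which the derived allele
    appears i times in a sample of n sequences, for i = 1..n-1.
    """
    n = len(sfs) + 1
    if n < 4:
        # The variance term is zero for n = 2 and n = 3.
        raise ValueError("need at least 4 sequences")
    s = sum(sfs)
    if s == 0:
        raise ValueError("no segregating sites")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    # Mean pairwise differences (pi) and Watterson's estimator.
    pairs = n * (n - 1) / 2.0
    pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs
    theta_w = s / a1

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical spectra for n = 5 sequences.
print(tajimas_d([10, 0, 0, 0]))   # excess of singletons
print(tajimas_d([0, 10, 10, 0]))  # excess of intermediate frequencies
```

An excess of rare variants lowers the mean pairwise difference relative to Watterson's estimator and gives a negative D, while an excess of intermediate-frequency variants gives a positive D.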

    The expected neutral frequency spectrum of linked sites

    We present an exact, closed expression for the expected neutral Site Frequency Spectrum for two neutral sites, 2-SFS, without recombination. This spectrum is the immediate extension of the well-known single-site θ/f neutral SFS. Similar formulae are also provided for the case of the expected SFS of sites that are linked to a focal neutral mutation of known frequency. Formulae for finite samples are obtained by coalescent methods and remarkably simple expressions are derived for the SFS of a large population, which are also solutions of the multi-allelic Kolmogorov equations. Besides the general interest of these new spectra, they relate to interesting biological cases such as structural variants and introgressions. As an example, we present the expected neutral frequency spectrum of regions with a chromosomal inversion. Comment: 26 pages, 5 figures
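For context, the single-site result that the abstract refers to can be written out as follows. This is the standard neutral expectation under the infinite-sites model, not the new two-site formula of the paper; the symbols ξ_i (number of sites with derived allele count i in a sample of n sequences), S (number of segregating sites), and a_n are notation assumed here.

```latex
\mathrm{E}[\xi_i] = \frac{\theta}{i}, \qquad i = 1, \dots, n-1,
\qquad\text{so that}\qquad
\mathrm{E}[S] = \sum_{i=1}^{n-1} \mathrm{E}[\xi_i] = \theta\, a_n,
\quad a_n = \sum_{i=1}^{n-1} \frac{1}{i}.
```

In the large-population limit the same spectrum is proportional to θ/f as a function of the derived allele frequency f.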

    The Site Frequency/Dosage Spectrum of Autopolyploid Populations

    The Site Frequency Spectrum (SFS) and the heterozygosity of allelic variants are among the most important summary statistics for population genetic analysis of diploid organisms. We discuss the generalization of these statistics to populations of autopolyploid organisms in terms of the joint Site Frequency/Dosage Spectrum and its expected value for autopolyploid populations that follow the standard neutral model. Based on these results, we present estimators of nucleotide variability from High-Throughput Sequencing (HTS) data of autopolyploids and discuss potential issues related to sequencing errors and variant calling. We use these estimators to generalize Tajima's D and other SFS-based neutrality tests to HTS data from autopolyploid organisms. Finally, we discuss how these approaches fail when the number of individuals is small. In fact, in autopolyploids there are many possible deviations from the Hardy–Weinberg equilibrium, each reflected in a different shape of the individual dosage distribution. The SFS from small samples is often dominated by the shape of these deviations of the dosage distribution from its Hardy–Weinberg expectations.

    The Identification of Runs of Homozygosity Gives a Focus on the Genetic Diversity and Adaptation of the “Charolais de Cuba” Cattle

    Inbreeding and effective population size (Ne) are fundamental indicators for the management and conservation of genetic diversity in populations. Genomic inbreeding gives accurate estimates of inbreeding, and the Ne determines the rate of the loss of genetic variation. The objective of this work was to study the distribution of runs of homozygosity (ROHs) in order to estimate genomic inbreeding (FROH) and the effective population size using 38,789 Single Nucleotide Polymorphisms (SNPs) from the Illumina Bovine 50K BeadChip in 86 samples from populations of Charolais de Cuba (n = 40) cattle and to compare this information with French (n = 20) and British Charolais (n = 26) populations. In the Cuban, French, and British Charolais populations, the average estimated genomic inbreeding values using the FROH statistics were 5.7%, 3.4%, and 4%, respectively. The dispersion measured by the coefficient of variation was high at 43.9%, 37.0%, and 54.2%, respectively. The effective population size experienced a very similar decline over the last ~20 generations in Charolais de Cuba (from 139 to 23 individuals), in French Charolais (from 142 to 12), and in British Charolais (from 145 to 14). However, the high variability found in the ROH indicators and FROH reveals an opportunity for maintaining the genetic diversity of this breed with an adequate mating strategy, which can be favored with the use of molecular markers. Moreover, the detected ROHs were compared to previous results obtained on the detection of signatures of selection in the same breed. Some of the observed signatures were confirmed by the ROHs, emphasizing the process of adaptation to tropical climate experienced by the Charolais de Cuba population.
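The two summary quantities quoted above can be sketched as follows, assuming the usual definition of FROH as the fraction of the analysed genome covered by ROH segments. The abstract does not give the ROH detection settings or the genome length used, so the function names, segment coordinates, and genome length below are all hypothetical.

```python
import statistics


def genomic_inbreeding(roh_segments, genome_length):
    """FROH for one animal: summed ROH length divided by the analysed genome length.

    roh_segments is a list of (start, end) positions in the same units as
    genome_length and is assumed to contain no overlapping segments.
    """
    covered = sum(end - start for start, end in roh_segments)
    return covered / genome_length


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical animals: ROH segments in megabases on a 2500 Mb autosomal genome.
animals = [
    [(10.0, 60.0), (300.0, 380.0)],
    [(5.0, 45.0)],
    [(100.0, 190.0), (400.0, 470.0), (900.0, 940.0)],
]
froh_values = [genomic_inbreeding(segments, 2500.0) for segments in animals]
print("FROH per animal:", froh_values)
print("CV (%):", coefficient_of_variation(froh_values))
```

A breed-level figure such as those reported in the abstract would then be the mean of the per-animal FROH values, with the coefficient of variation describing their dispersion.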

    Worldwide genetic relationships of pigs as inferred from X chromosome SNPs

    The phylogeography of the porcine X chromosome has not been studied despite the unique characteristics of this chromosome. Here, we genotyped 59 single nucleotide polymorphisms (SNPs) in 312 pigs from around the world, representing 39 domestic breeds and wild boars in 30 countries. Overall, widespread commercial breeds showed the highest heterozygosity values, followed by African and American populations. Structuring, as inferred from FST and analysis of molecular variance, was consistently larger in the non-pseudoautosomal (NPAR) than in the pseudoautosomal regions (PAR). Our results show that genetic relationships between populations can vary widely between the NPAR and the PAR, underscoring the fact that their genetic trajectories can be quite different. NPAR showed an increased commercial-like genetic component relative to the PAR, probably because human selection processes to obtain individuals with high productive parameters were mediated by introgressing boars rather than sows. WBP is funded by COLCIENCIAS (Departamento Administrativo de Ciencia, Tecnología e Innovación, Francisco José de Caldas fellowship 497/2009, Colombia). CAS was funded by a PhD grant from CAPES (Coordenaçao de Aperfeiçoamento de Pessoal de Nível Superior, Brazil) and Universidade Catolica de Brasilia, Brazil. This work is funded by grants AGL2010-14822 (MICINN, Spain) to MPE, CGL2009-09346 (MICINN, Spain) to SERO, and Consolider project (MICINN, Spain) to Centre of Research in Agricultural Genomics. Peer reviewed

    Whole genome scanning of a Mediterranean basin hotspot collection provides new insights into olive tree biodiversity and biology

    Olive tree (Olea europaea L. subsp. europaea var. europaea) is one of the most important species of the Mediterranean region and one of the most ancient species domesticated. The availability of whole genome assemblies and annotations of olive tree cultivars and oleaster (O. europaea subsp. europaea var. sylvestris) has contributed to a better understanding of genetic and genomic differences between olive tree cultivars. However, compared to other plant species there is still a lack of genomic resources for olive tree populations that span the entire Mediterranean region. In the present study we developed the most complete genomic variation map and the most comprehensive catalog/resource of molecular variation to date for 89 olive tree genotypes originating from the entire Mediterranean basin, revealing the genetic diversity of this commercially significant crop tree and explaining the divergence/similarity among different variants. Additionally, the monumental ancient tree ‘Throuba Naxos’ was studied to characterize the potential origin or routes of olive tree domestication. Several candidate genes known to be associated with key agronomic traits, including olive oil quality and fruit yield, were uncovered by a selective sweep scan to be under selection pressure on all olive tree chromosomes. To further exploit the genomic and phenotypic resources obtained from the current work, genome-wide association analyses were performed for 23 morphological and two agronomic traits. Significant associations were detected for eight traits that provide valuable candidates for fruit tree breeding and for deeper understanding of olive tree biology. This research was financed by Greek Public Investments Program (PIP) of General Secretariat for Research & Technology (GSRT), under the Emblematic Action ‘The Olive Road’ (project code: 2018ΣE01300000).
    Sebastián Ramos-Onsins is supported by the grant PID2020-119255GB-I00 (MICINN, Spain) and the CERCA Programme/Generalitat de Catalunya and acknowledges financial support from the Spanish Ministry of Economy and Competitiveness, through the Severo Ochoa Programme for Centres of Excellence in R&D 2016–2019 and 2020–2023 (SEV-2015-0533, CEX2019-000917) and the European Regional Development Fund (ERDF). The publication of the article in OA mode was financially supported by HEAL-Link. With funding from the Spanish government through the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000917). Peer reviewed

    Porcine Y-chromosome variation is consistent with the occurrence of paternal gene flow from non-Asian to Asian populations

    Other funding: CERCA Programme/Generalitat de Catalunya. Pigs (Sus scrofa) originated in Southeast Asia and expanded to Europe and North Africa approximately 1 MYA. Analyses of porcine Y-chromosome variation have shown the existence of two main haplogroups that are highly divergent, a result that is consistent with previous mitochondrial and autosomal data showing that the Asian and non-Asian pig populations remained geographically isolated until recently. Paradoxically, one of these Y-chromosome haplogroups is extensively shared by pigs and wild boars from Asia and Europe, an observation that is difficult to reconcile with a scenario of prolonged geographic isolation. To shed light on this issue, we genotyped 33 Y-linked SNPs and one indel in a worldwide sample of pigs and wild boars and sequenced a total of 9903 nucleotide sites from seven loci distributed along the Y-chromosome. Notably, the nucleotide diversity per site at the Y-linked loci (0.0015 in Asian pigs) displayed the same order of magnitude as that described for autosomal loci (~0.0023), a finding compatible with a process of sustained and intense isolation. We performed an approximate Bayesian computation analysis focused on the paternal diversity of wild boars and local pig breeds in which we compared three demographic models: two isolation models (I models) differing in the time of isolation and a model of isolation with recent unidirectional migration (IM model). Our results suggest that the most likely explanation for the extensive sharing of one Y-chromosome haplogroup between non-Asian and Asian populations is a recent and unidirectional (non-Asian > Asian) paternal migration event.

    eSPiGA: a population genomic analyses package with graphical interface

    Work presented at the Bioinformatics Community Conference (BCC), held online from 17 to 25 July 2020. Peer reviewed