
    Removing exogenous information using pedigree data

    Management of certain populations requires the preservation of their pure genetic background. When, for different reasons, undesired alleles are introduced, the original genetic conformation must be recovered. The present study tested, through computer simulations, the power of recovery (the ability to remove the foreign information) from genealogical data. Simulated scenarios comprised different numbers of exogenous individuals forming part of the founder population and different numbers of unmanaged generations before the removal program started. Strategies were based on variables arising from classical pedigree analyses, such as founders' contributions and partial coancestry. The efficiency of the different strategies was measured as the proportion of native genetic information remaining in the population. Consequences for the inbreeding and coancestry levels of the population were also evaluated. Minimisation of the exogenous founders' contributions was the most powerful method, removing the largest amount of genetic information in just one generation. However, as a side effect, it led to the highest values of inbreeding. Scenarios with a large amount of initial exogenous alleles (i.e. a high percentage of non-native founders), or many generations of mixing, became very difficult to recover, pointing out the importance of being careful about introgression events in populatio
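The founders' contributions mentioned in this abstract can be illustrated with a minimal sketch. It assumes the classical definition (each parent passes on half of its own founder contributions) and that every non-founder has both parents recorded. The toy pedigree, the individual labels, and the function names are hypothetical and are not taken from the study.

```python
# Hypothetical toy pedigree: individual -> (sire, dam).
# Founders are recorded with (None, None).
PEDIGREE = {
    "N1": (None, None),   # native founder
    "N2": (None, None),   # native founder
    "X1": (None, None),   # exogenous founder
    "A": ("N1", "N2"),
    "B": ("N1", "X1"),
    "C": ("A", "B"),
}
EXOGENOUS = {"X1"}


def founder_contributions(pedigree, individual):
    """Return {founder: expected genetic contribution} for one individual."""
    sire, dam = pedigree[individual]
    if sire is None and dam is None:
        # A founder contributes fully to itself.
        return {individual: 1.0}
    contributions = {}
    for parent in (sire, dam):
        # Each parent transmits half of its own founder contributions.
        for founder, share in founder_contributions(pedigree, parent).items():
            contributions[founder] = contributions.get(founder, 0.0) + 0.5 * share
    return contributions


def exogenous_fraction(pedigree, individual, exogenous):
    """Sum the contributions that trace back to exogenous founders."""
    contributions = founder_contributions(pedigree, individual)
    return sum(share for founder, share in contributions.items() if founder in exogenous)


if __name__ == "__main__":
    for name in ("A", "B", "C"):
        print(name, exogenous_fraction(PEDIGREE, name, EXOGENOUS))
```

A removal strategy of the kind described would favour parents with a low exogenous fraction; how the study weights this against inbreeding is not specified here.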

    A hierarchical Dirichlet process mixture model for haplotype reconstruction from multi-population data

    The perennial problem of "how many clusters?" remains an issue of substantial interest in the data mining and machine learning communities, and becomes particularly salient in large data sets such as population genomic data, where the number of clusters needs to be relatively large and open-ended. This problem gets further complicated in a co-clustering scenario in which one needs to solve multiple clustering problems simultaneously because of the presence of common centroids (e.g., ancestors) shared by clusters (e.g., possible descendants of a certain ancestor) from different multiple-cluster samples (e.g., different human subpopulations). In this paper we present a hierarchical nonparametric Bayesian model to address this problem in the context of multi-population haplotype inference. Uncovering the haplotypes of single nucleotide polymorphisms is essential for many biological and medical applications. While it is not uncommon for genotype data to be pooled from multiple ethnically distinct populations, few existing programs have explicitly leveraged the individual ethnic information for haplotype inference. In this paper we present a new haplotype inference program, Haploi, which makes use of such information and is readily applicable to genotype sequences with thousands of SNPs from heterogeneous populations, with competent and sometimes superior speed and accuracy compared to the state-of-the-art programs. Underlying Haploi is a new haplotype distribution model based on a nonparametric Bayesian formalism known as the hierarchical Dirichlet process, which represents a tractable surrogate to the coalescent process. The proposed model is exchangeable, unbounded, and capable of coupling demographic information of different populations. Comment: Published at http://dx.doi.org/10.1214/08-AOAS225 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Genetic Stratigraphy of Key Demographic Events in Arabia

    The issue of admixture in human populations is normally addressed by genome-wide (GW) studies, and several approaches have been developed to date admixture events [1,2,3,4,5]. Admixed populations bear chromosomes with segments of DNA from all contributing source groups, the size of which decreases over successive generations until recombination renders them undetectably short. Several algorithms attempt to date admixture events by inferring the size of the nuclear ancestry segments, and these can work well when dating recent episodes in human history, such as the sub-Saharan African input into the New World [6], but they fail to detect several known episodes that took place at earlier times, such as the African input into Iberia [1] and genetic exchanges across the Red Sea [7]. Simulations with the suite of methods available in the ADMIXTOOLS package indicated that these methods could detect admixture events as early as 500 generations ago, but real data did not allow the tracing of such old events [8]. A recent improved algorithm, called GLOBETROTTER, has been used to tackle the detection of the co-occurrence of several mixture events by decomposing each chromosome into a series of haplotypic chunks and then analysing each chunk independently [3], but the problem of detecting ancient events remains. Its application to the systematic screening of worldwide admixture events was able to reveal around 100 events, but all occurring over only the past 4,000 years [3

    Genome-wide signatures of population bottlenecks and diversifying selection in European wolves

    Genomic resources developed for domesticated species provide powerful tools for studying the evolutionary history of their wild relatives. Here we use 61K single-nucleotide polymorphisms (SNPs) evenly spaced throughout the canine nuclear genome to analyse evolutionary relationships among the three largest European populations of grey wolves in comparison with other populations worldwide, and investigate genome-wide effects of demographic bottlenecks and signatures of selection. European wolves have a discontinuous range, with large and connected populations in Eastern Europe and relatively smaller, isolated populations in Italy and the Iberian Peninsula. Our results suggest a continuous decline in wolf numbers in Europe since the Late Pleistocene, and long-term isolation and bottlenecks in the Italian and Iberian populations following their divergence from the Eastern European population. The Italian and Iberian populations have low genetic variability and high linkage disequilibrium, but relatively few autozygous segments across the genome. This last characteristic clearly distinguishes them from populations that underwent recent drastic demographic declines or founder events, and implies long-term bottlenecks in these two populations. Although genetic drift due to spatial isolation and bottlenecks seems to be a major evolutionary force diversifying the European populations, we detected 35 loci that are putatively under diversifying selection. Two of these loci flank the canine platelet-derived growth factor gene, which affects bone growth and may influence differences in body size between wolf populations. This study demonstrates the power of population genomics for identifying genetic signals of demographic bottlenecks and detecting signatures of directional selection in bottlenecked populations, despite their low background variability. Heredity advance online publication, 18 December 2013; doi:10.1038/hdy.2013.122

    FRANz: reconstruction of wild multi-generation pedigrees

    Summary: We present a software package for pedigree reconstruction in natural populations using co-dominant genomic markers such as microsatellites and single nucleotide polymorphisms (SNPs). If available, the algorithm makes use of prior information such as known relationships (sub-pedigrees) or the age and sex of individuals. Statistical confidence is estimated by Markov Chain Monte Carlo (MCMC) sampling. The accuracy of the algorithm is demonstrated for simulated data as well as an empirical dataset with known pedigree. The parentage inference is robust even in the presence of genotyping errors

    The French Canadian founder population: lessons and insights for genetic epidemiological research

    The French Canadian founder population has a demographic history that makes it an important population for epidemiology and genetics. This work aims to explain what features can be used to improve the design and analysis of genetic epidemiological studies in the Quebec population. First, we take advantage of the presence of extended genealogical records among French Canadians to estimate relatedness from those records and compare it to the genetic kinship. The kinship based on identical-by-descent sharing correlates well with the genealogical kinship, further demonstrating the usefulness of genomic identical-by-descent detection to capture complex relatedness involving inbreeding, and our findings can guide the interpretation of results in other populations without genealogical data. Second, to optimally exploit the full potential of these well-preserved, exhaustive and detailed French Canadian genealogical data, we present the GENLIB R package, developed to study large genealogies. We also investigate identical-by-descent sharing with simulations and highlight the fact that regional population structure can facilitate the identification of notable founders that could have introduced disease mutations, opening the door to prevention and screening of founder-related diseases. Third, knowing that French Canadians have accumulated segments of homozygous genotypes as a result of inbreeding due to distant ancestors, we estimate the inbreeding in French Canadian individuals and investigate its impact on multiple health traits. We show how inbreeding depression influences complex traits such as height and blood-related traits. These results are a few examples of what we can learn from the French Canadian population; they will help to gain insight into other populations' characteristics and support genetic epidemiological research within the French Canadian population

    Genetic structure of the Utah Mormons: comparison of results based on RFLPs, blood groups, migration matrices, isonymy, and pedigrees

    The genetic structure of the Utah Mormon population is examined using 25 blood group and 47 RFLP alleles obtained from 442 subjects living in 8 geographic subdivisions. Nei's Gst was 0.013 (p 0.4) for the blood group data, showing that only 1% of the genetic variance in this population can be attributed to subdivision effects. A comparison of intersubdivision distance matrices based on blood groups, RFLPs, migration matrices, isonymy, and pedigrees shows that genetic distances have relatively low and nonsignificant correlations with the other three types of data
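To make the Gst figure in this abstract concrete, the following is a minimal sketch of Nei's Gst for a single locus, using the textbook form Gst = (Ht - Hs) / Ht with equally weighted subdivisions. The allele frequencies are hypothetical and are not data from the study, and the function names are illustrative only.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(subpop_freqs):
    """Nei's Gst for one locus, assuming equally weighted subpopulations.

    subpop_freqs: list of allele-frequency lists, one per subpopulation,
    all listing the same alleles in the same order.
    """
    n_subpops = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # Hs: mean within-subpopulation expected heterozygosity.
    hs = sum(expected_heterozygosity(f) for f in subpop_freqs) / n_subpops
    # Ht: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [
        sum(f[i] for f in subpop_freqs) / n_subpops for i in range(n_alleles)
    ]
    ht = expected_heterozygosity(mean_freqs)
    return (ht - hs) / ht


if __name__ == "__main__":
    # Hypothetical two-allele locus in two subdivisions.
    example = [[0.5, 0.5], [0.6, 0.4]]
    print(round(nei_gst(example), 4))
```

A value near 0.01, as reported in the abstract, means that about 1% of the total gene diversity lies between subdivisions rather than within them.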