231 research outputs found

Demography and the age of rare variants

Authors: Iain Mathieson, Gil McVean
Publication date: 06/06/2014

Large whole-genome sequencing projects have provided access to much of the rare variation in human populations, which is highly informative about population structure and recent demography. Here, we show how the age of rare variants can be estimated from patterns of haplotype sharing and how these ages can be related to historical relationships between populations. We investigate the distribution of the age of variants occurring exactly twice (f2 variants) in a worldwide sample sequenced by the 1000 Genomes Project, revealing enormous variation across populations. The median age of haplotypes carrying f2 variants is 50 to 160 generations across populations within Europe or Asia, and 170 to 320 generations within Africa. Haplotypes shared between continents are much older with median ages for haplotypes shared between Europe and Asia ranging from 320 to 670 generations. The distribution of the ages of f2 haplotypes is informative about their demography, revealing recent bottlenecks, ancient splits, and more modern connections between populations. We see the signature of selection in the observation that functional variants are significantly younger than nonfunctional variants of the same frequency. This approach is relatively insensitive to mutation rate and complements other nonparametric methods for demographic inference.

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

PubMed Central

FigShare

Integrating genealogical and dynamical modelling to infer escape and reversion rates in HIV epitopes

Authors: John Frater, Angela McLean, Gil McVean, Duncan Palmer, Rodney Philips
Publication date: 01/01/2013

The rates of escape and reversion in response to selection pressure arising from the host immune system, notably the cytotoxic T-lymphocyte (CTL) response, are key factors determining the evolution of HIV. Existing methods for estimating these parameters from cross-sectional population data using ordinary differential equations (ODE) ignore information about the genealogy of sampled HIV sequences, which has the potential to cause systematic bias and over-estimate certainty. Here, we describe an integrated approach, validated through extensive simulations, which combines genealogical inference and epidemiological modelling, to estimate rates of CTL escape and reversion in HIV epitopes. We show that there is substantial uncertainty about rates of viral escape and reversion from cross-sectional data, which arises from the inherent stochasticity in the evolutionary process. By application to empirical data, we find that point estimates of rates from a previously published ODE model and the integrated approach presented here are often similar, but can also differ several-fold depending on the structure of the genealogy. The model-based approach we apply provides a framework for the statistical analysis of escape and reversion in population data and highlights the need for longitudinal and denser cross-sectional sampling to enable accurate estimation of these key parameters.

arXiv.org e-Print Archive

PubMed Central

Oxford University Research Archive

Optimal strategies for learning multi-ancestry polygenic scores vary across traits

Authors: Chris Holmes, Brieuc Lehmann, Maxine Mackintosh, Gil McVean
Publication venue: Springer Science and Business Media LLC
Publication date: 07/07/2023

Polygenic scores (PGSs) are individual-level measures that aggregate the genome-wide genetic predisposition to a given trait. As PGS have predominantly been developed using European-ancestry samples, trait prediction using such European ancestry-derived PGS is less accurate in non-European ancestry individuals. Although there has been recent progress in combining multiple PGS trained on distinct populations, the problem of how to maximize performance given a multiple-ancestry cohort is largely unexplored. Here, we investigate the effect of sample size and ancestry composition on PGS performance for fifteen traits in UK Biobank. For some traits, PGS estimated using a relatively small African-ancestry training set outperformed, on an African-ancestry test set, PGS estimated using a much larger European-ancestry only training set. We observe similar, but not identical, results when considering other minority-ancestry groups within UK Biobank. Our results emphasise the importance of targeted data collection from underrepresented groups in order to address existing disparities in PGS performance.

UCL Discovery

Positive Selection and Increased Antiviral Activity Associated with the PARP-Containing Isoform of Human Zinc-Finger Antiviral Protein

Authors: Gil McVean, Harmit S Malik, Julie A Kerns, Michael Emerman
Publication venue: Public Library of Science
Publication date: 01/01/2008

Intrinsic immunity relies on specific recognition of viral epitopes to mount a cell-autonomous defense against viral infections. Viral recognition determinants in intrinsic immunity genes are expected to evolve rapidly as host genes adapt to changing viruses, resulting in a signature of adaptive evolution. Zinc-finger antiviral protein (ZAP) from rats was discovered to be an intrinsic immunity gene that can restrict murine leukemia virus, and certain alphaviruses and filoviruses. Here, we used an approach combining molecular evolution and cellular infectivity assays to address whether ZAP also acts as a restriction factor in primates, and to pinpoint which protein domains may directly interact with the virus. We find that ZAP has evolved under positive selection throughout primate evolution. Recurrent positive selection is only found in the poly(ADP-ribose) polymerase (PARP)–like domain present in a longer human ZAP isoform. This PARP-like domain was not present in the previously identified and tested rat ZAP gene. Using infectivity assays, we found that the longer isoform of ZAP that contains the PARP-like domain is a stronger suppressor of murine leukemia virus expression and Semliki forest virus infection. Our study thus finds that human ZAP encodes a potent antiviral activity against alphaviruses. The striking congruence between our evolutionary predictions and cellular infectivity assays strongly validates such a combined approach to study intrinsic immunity genes.

Crossref

Directory of Open Access Journals

PubMed Central

Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank

Authors: Xilin Jiang, Gerton Lunter, Gil McVean, Alexander J. Mentzer, Yidong Zhang
Publication date: 09/08/2023

Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone.

Dissertations of the University of Groningen

Gene Family Evolution across 12 Drosophila Genomes

Authors: Gil McVean, Matthew W Hahn, Mira V Han, Sang-Gook Han
Publication venue: Public Library of Science
Publication date: 01/01/2007

Comparison of whole genomes has revealed large and frequent changes in the size of gene families. These changes occur because of high rates of both gene gain (via duplication) and loss (via deletion or pseudogenization), as well as the evolution of entirely new genes. Here we use the genomes of 12 fully sequenced Drosophila species to study the gain and loss of genes at unprecedented resolution. We find large numbers of both gains and losses, with over 40% of all gene families differing in size among the Drosophila. Approximately 17 genes are estimated to be duplicated and fixed in a genome every million years, a rate on par with that previously found in both yeast and mammals. We find many instances of extreme expansions or contractions in the size of gene families, including the expansion of several sex- and spermatogenesis-related families in D. melanogaster that also evolve under positive selection at the nucleotide level. Newly evolved gene families in our dataset are associated with a class of testes-expressed genes known to have evolved de novo in a number of cases. Gene family comparisons also allow us to identify a number of annotated D. melanogaster genes that are unlikely to encode functional proteins, as well as to identify dozens of previously unannotated D. melanogaster genes with conserved homologs in the other Drosophila. Taken together, our results demonstrate that the apparent stasis in total gene number among species has masked rapid turnover in individual gene gain and loss. It is likely that this genomic revolving door has played a large role in shaping the morphological, physiological, and metabolic differences among species.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Perspectives on Human Genetic Variation from the HapMap Project

Authors: Chris C. A Spencer, Gil McVean, Raphaelle Chaix, The International HapMap Consortium
Publication venue: Public Library of Science
Publication date: 01/01/2005

The completion of the International HapMap Project marks the start of a new phase in human genetics. The aim of the project was to provide a resource that facilitates the design of efficient genome-wide association studies, through characterising patterns of genetic variation and linkage disequilibrium in a sample of 270 individuals across four geographical populations. In total, over one million SNPs have been typed across these genomes, providing an unprecedented view of human genetic diversity. In this review we focus on what the HapMap project has taught us about the structure of human genetic variation and the fundamental molecular and evolutionary processes that shape it.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

Hal-Diderot