Phenotypic Importance And Population Dynamics Of Genomic Structural Variation
The fundamental goals of molecular genetic studies are 1) understanding how variation within an organism's genome translates into phenotypic differences and 2) understanding how this variation segregates and becomes fixed within populations. An enormous amount of work has been invested in looking at point mutations, or single nucleotide polymorphisms (SNPs). However, far less work has been devoted to studying larger structural variation (SV). Advances in molecular techniques (e.g., PCR, Sanger sequencing, genotyping arrays and comparative hybridization arrays) have greatly expanded our ability to detect these larger mutations. Here, I focus on developing methods for the analysis of data generated using these techniques and apply these methods to several data sets to broaden our understanding of the role of SV in genome evolution as well as the phenotypic consequences of this variation. Specifically, I first develop a method for the detection of a particular type of SV, known as copy number variation (CNV). This method is applied to genotyping array data collected from domestic dogs. I analyze these data to detect the extent of CNV in domestic dogs and their close relatives and to better understand the population genetics and evolution of CNV. The analysis reveals nearly 10,000 CNVs segregating in domestic dogs and covering nearly 400 Mb of the dog genome. Next, using a retrospective study design, I investigate the role of CCL3L CNV in the progression rate to simian AIDS in rhesus macaque. I find strong evidence that reduced copy number of CCL3L increases progression rates in rhesus macaques. This finding is similar to one reported in humans in an earlier study. Therefore, rhesus macaque is a promising model organism for understanding how CCL3L CNV affects HIV progression in humans. Characterizing the role of CCL3L CNV will allow researchers to increase power in vaccine trials by controlling for this natural variation.
Finally, I introduce a novel method for mapping the pseudoautosomal region (PAR) in mammalian genomes. I apply this method to data collected from domestic and wild canids as well as rhesus macaque and use the results of this study to further the understanding of PAR evolution across the mammalian tree.
PCAdmix: Principal Components-Based Assignment of Ancestry along Each Chromosome in Individuals with Admixed Ancestry from Two or More Populations
Identifying ancestry along each chromosome in admixed individuals provides a wealth of information for understanding the population genetic history of admixture events and is valuable for admixture mapping and identifying recent targets of selection. We present PCAdmix (available at https://sites.google.com/site/pcadmix/home), a Principal Components-based algorithm for determining ancestry along each chromosome from a high-density, genome-wide set of phased single-nucleotide polymorphism (SNP) genotypes of admixed individuals. We compare our method to HAPMIX on simulated data from two ancestral populations, and we find high concordance between the methods. Our method also has better accuracy than LAMP when applied to three-population admixture, a situation as yet unaddressed by HAPMIX. Finally, we apply our method to a data set of four Latino populations with European, African, and Native American ancestry. We find evidence of assortative mating in each of the four populations, and we identify regions of shared ancestry that may be recent targets of selection and could serve as candidate regions for admixture-based association mapping.
Population genomic analysis reveals a rich speciation and demographic history of orang-utans (Pongo pygmaeus and Pongo abelii)
To gain insights into evolutionary forces that have shaped the history of Bornean and Sumatran populations of orang-utans, we compare patterns of variation across more than 11 million single nucleotide polymorphisms found by previous mitochondrial and autosomal genome sequencing of 10 wild-caught orang-utans. Our analysis of the mitochondrial data yields a far more ancient split time between the two populations (~3.4 million years ago) than estimates based on autosomal data (0.4 million years ago), suggesting a complex speciation process with moderate levels of primarily male migration. We find that the distribution of selection coefficients consistent with the observed frequency spectrum of autosomal non-synonymous polymorphisms in orang-utans is similar to the distribution in humans. Our analysis indicates that 35% of genes have evolved under detectable negative selection. Overall, our findings suggest that purifying natural selection, genetic drift, and a complex demographic history are the dominant drivers of genome evolution for the two orang-utan populations.
Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome
Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid-replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid-changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.
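As a rough illustration of how a gamma-shaped distribution of selection coefficients maps onto the three |s| classes quoted in the abstract above, the following sketch samples |s| from a gamma distribution and tallies the share falling in each class. The function name, the shape and mean values, and the Monte Carlo approach are all assumptions made for illustration; they are not the parameters or the inference method of the study, and the resulting shares are not expected to reproduce its estimates.

```python
import random


def dfe_bin_fractions(shape, mean_s, n_draws=200_000, seed=0):
    """Estimate the share of mutations in each |s| class by sampling a gamma DFE.

    shape and mean_s are hypothetical inputs; |s| is expressed as a fraction,
    so 0.01% = 1e-4 and 1% = 1e-2.
    """
    rng = random.Random(seed)
    scale = mean_s / shape  # gamma mean = shape * scale
    counts = {"nearly_neutral": 0, "moderate": 0, "strong": 0}
    for _ in range(n_draws):
        s = rng.gammavariate(shape, scale)  # |s| for one new mutation
        if s < 1e-4:
            counts["nearly_neutral"] += 1   # |s| < 0.01%
        elif s < 1e-2:
            counts["moderate"] += 1         # 0.01% <= |s| < 1%
        else:
            counts["strong"] += 1           # |s| >= 1%
    return {name: count / n_draws for name, count in counts.items()}


# Hypothetical parameters chosen only to give a strongly right-skewed gamma.
fractions = dfe_bin_fractions(shape=0.2, mean_s=0.05)
for name, frac in fractions.items():
    print(f"{name}: {frac:.3f}")
```

A small shape parameter (below 1) concentrates mass near zero while leaving a long tail, which is the qualitative behaviour the abstract describes as peaked near neutrality.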
Copy Number Variation of CCL3-like Genes Affects Rate of Progression to Simian-AIDS in Rhesus Macaques (Macaca mulatta)
Variation in genes underlying host immunity can lead to marked differences in susceptibility to HIV infection among humans. Despite heavy reliance on non-human primates as models for HIV/AIDS, little is known about which host factors are shared and which are unique to a given primate lineage. Here, we investigate whether copy number variation (CNV) at CCL3-like genes (CCL3L), a key genetic host factor for HIV/AIDS susceptibility and cell-mediated immune response in humans, is also a determinant of time until onset of simian-AIDS in rhesus macaques. Using a retrospective study of 57 rhesus macaques experimentally infected with SIVmac, we find that CCL3L CNV explains approximately 18% of the variance in time to simian-AIDS (p<0.001) with lower CCL3L copy number associating with more rapid disease course. We also find that CCL3L copy number varies significantly (p<10^-6) among rhesus subpopulations, with Indian-origin macaques having, on average, half as many CCL3L gene copies as Chinese-origin macaques. Lastly, we confirm that CCL3L shows variable copy number in humans and chimpanzees and report on CCL3L CNV within and among three additional primate species. On the basis of our findings we suggest that (1) the difference in population level copy number may explain previously reported observations of longer post-infection survivorship of Chinese-origin rhesus macaques, (2) stratification by CCL3L copy number in rhesus SIV vaccine trials will increase power and reduce noise due to non-vaccine-related differences in survival, and (3) CCL3L CNV is an ancestral component of the primate immune response and, therefore, copy number variation has not been driven by HIV or SIV per se.
A Simple Genetic Architecture Underlies Morphological Variation in Dogs
The largest genetic study to date of morphology in domestic dogs identifies genes controlling nearly 100 morphological traits and identifies important trends in phenotypic variation within this species.
Analysis of shared heritability in common disorders of the brain
INTRODUCTION Brain disorders may exhibit shared symptoms and substantial epidemiological comorbidity, inciting debate about their etiologic overlap. However, detailed study of phenotypes with different ages of onset, severity, and presentation poses a considerable challenge. Recently developed heritability methods allow us to accurately measure correlation of genome-wide common variant risk between two phenotypes from pools of different individuals and assess how connected they, or at least their genetic risks, are on the genomic level. We used genome-wide association data for 265,218 patients and 784,643 control participants, as well as 17 phenotypes from a total of 1,191,588 individuals, to quantify the degree of overlap for genetic risk factors of 25 common brain disorders. RATIONALE Over the past century, the classification of brain disorders has evolved to reflect the medical and scientific communities' assessments of the presumed root causes of clinical phenomena such as behavioral change, loss of motor function, or alterations of consciousness. Directly observable phenomena (such as the presence of emboli, protein tangles, or unusual electrical activity patterns) generally define and separate neurological disorders from psychiatric disorders. Understanding the genetic underpinnings and categorical distinctions for brain disorders and related phenotypes may inform the search for their biological mechanisms. RESULTS Common variant risk for psychiatric disorders was shown to correlate significantly, especially among attention deficit hyperactivity disorder (ADHD), bipolar disorder, major depressive disorder (MDD), and schizophrenia. By contrast, neurological disorders appear more distinct from one another and from the psychiatric disorders, except for migraine, which was significantly correlated to ADHD, MDD, and Tourette syndrome.
We demonstrate that, in the general population, the personality trait neuroticism is significantly correlated with almost every psychiatric disorder and migraine. We also identify significant genetic sharing between disorders and early life cognitive measures (e.g., years of education and college attainment) in the general population, demonstrating positive correlation with several psychiatric disorders (e.g., anorexia nervosa and bipolar disorder) and negative correlation with several neurological phenotypes (e.g., Alzheimer's disease and ischemic stroke), even though the latter are considered to result from specific processes that occur later in life. Extensive simulations were also performed to inform how statistical power, diagnostic misclassification, and phenotypic heterogeneity influence genetic correlations. CONCLUSION The high degree of genetic correlation among many of the psychiatric disorders adds further evidence that their current clinical boundaries do not reflect distinct underlying pathogenic processes, at least on the genetic level. This suggests a deeply interconnected nature for psychiatric disorders, in contrast to neurological disorders, and underscores the need to refine psychiatric diagnostics. Genetically informed analyses may provide important "scaffolding" to support such restructuring of psychiatric nosology, which likely requires incorporating many levels of information. By contrast, we find limited evidence for widespread common genetic risk sharing among neurological disorders or across neurological and psychiatric disorders. We show that both psychiatric and neurological disorders have robust correlations with cognitive and personality measures. Further study is needed to evaluate whether overlapping genetic contributions to psychiatric pathology may influence treatment choices. Ultimately, such developments may pave the way toward reduced heterogeneity and improved diagnosis and treatment of psychiatric disorders.
Genomic Relationships, Novel Loci, and Pleiotropic Mechanisms across Eight Psychiatric Disorders
Genetic influences on psychiatric disorders transcend diagnostic boundaries, suggesting substantial pleiotropy of contributing loci. However, the nature and mechanisms of these pleiotropic effects remain unclear. We performed analyses of 232,964 cases and 494,162 controls from genome-wide studies of anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, schizophrenia, and Tourette syndrome. Genetic correlation analyses revealed a meaningful structure within the eight disorders, identifying three groups of inter-related disorders. Meta-analysis across these eight disorders detected 109 loci associated with at least two psychiatric disorders, including 23 loci with pleiotropic effects on four or more disorders and 11 loci with antagonistic effects on multiple disorders. The pleiotropic loci are located within genes that show heightened expression in the brain throughout the lifespan, beginning prenatally in the second trimester, and play prominent roles in neurodevelopmental processes. These findings have important implications for psychiatric nosology, drug development, and risk prediction.