22 research outputs found

    A Genome-Wide Survey of Genetic Variation in Gorillas Using Reduced Representation Sequencing

    <div><p>All non-human great apes are endangered in the wild, and it is therefore important to gain an understanding of their demography and genetic diversity. Whole genome assembly projects have provided an invaluable foundation for understanding genetics in all four genera, but to date genetic studies of multiple individuals within great ape species have largely been confined to mitochondrial DNA and a small number of other loci. Here, we present a genome-wide survey of genetic variation in gorillas using a reduced representation sequencing approach, focusing on the two lowland subspecies. We identify 3,006,670 polymorphic sites in 14 individuals: 12 western lowland gorillas (<i>Gorilla gorilla gorilla</i>) and 2 eastern lowland gorillas (<i>Gorilla beringei graueri</i>). We find that the two species are genetically distinct, based on levels of heterozygosity and patterns of allele sharing. Focusing on the western lowland population, we observe evidence for population substructure, and a deficit of rare genetic variants suggesting a recent episode of population contraction. In western lowland gorillas, there is an elevation of variation towards telomeres and centromeres on the chromosomal scale. On a finer scale, we find substantial variation in genetic diversity, including a marked reduction close to the major histocompatibility locus, perhaps indicative of recent strong selection there. These findings suggest that despite their maintaining an overall level of genetic diversity equal to or greater than that of humans, population decline, perhaps associated with disease, has been a significant factor in recent and long-term pressures on wild gorilla populations.</p></div>

    Rates and ratios of heterozygous and homozygous variants in each of the gorillas sampled.

    <p>Rates of heterozygous (light blue) and homozygous (dark blue) variants were called from sequence alignments against the gorilla reference genome and expressed as percentage rates. Note Kamilah’s low rate of homozygous variants, due to her providing the DNA from which the reference genome was assembled. The corresponding hom/het ratios (ratios of homozygous to heterozygous variant rates) show that eastern lowland gorillas (black) have higher ratios than western lowland gorillas (red). Additional data for Mukisi and EB(JC) (Mukisi_PvuII and EB_JC_PvuII) were taken from a previously published study <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0065066#pone.0065066-Scally1" target="_blank">[3]</a>.</p>

    Principal components analysis (PCA).

    <p>A. PCA based on 123,591 polymorphic sites in 12 western lowland gorillas and two eastern lowland gorillas. Here, PC1 separates western gorillas from eastern gorillas. B. PCA based on 110,971 polymorphic sites in the 12 western lowland gorillas only.</p>
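
The kind of analysis summarised in this caption can be illustrated with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1 or 2) in an individuals-by-sites matrix and uses a plain mean-centred SVD. The function name `genotype_pca` and the toy matrix are hypothetical, and this is a generic PCA, not necessarily the software or scaling used in the study.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project individuals onto principal components.

    genotypes: (individuals x sites) array of alternate-allele counts
    (0, 1 or 2). This coding is an assumption made for illustration.
    """
    g = np.asarray(genotypes, dtype=float)
    # Centre each site (column) on its mean allele count.
    centred = g - g.mean(axis=0)
    # SVD of the centred matrix; U * S gives the individual scores.
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Toy data (made up): two individuals from each of two groups.
toy = [
    [0, 0, 2, 2, 1],
    [0, 1, 2, 2, 0],
    [2, 2, 0, 0, 1],
    [2, 2, 0, 1, 0],
]
scores = genotype_pca(toy)
```

In this toy example the first two and the last two individuals fall on opposite sides of PC1, analogous to the separation of western from eastern gorillas described in panel A.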

    Segregating sites in nine western lowland gorillas.

    <p>A. Density of segregating sites in 1 Mbp bins. Sites passing quality and depth filtering thresholds in all nine gorillas (BiKira, EB(JC), Fubu, Guy, Kamilah, Kesho, Matadi, Murphy and Ruby) were grouped into 1 Mbp bins and the density of segregating sites was calculated for each bin. The resulting densities are plotted on an ideogram, with the scale expressed as number of sites per kbp. B. Profile of mean segregating site density as a function of chromosomal position on both long and short arms, averaged over all chromosomes. Position is normalised by chromosome length, with the centromere at 0.0 and the telomere at 1.0 on both arms. An increase in genetic diversity is evident towards the centromere and telomere on both arms.</p>
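
The binning and position normalisation described in this caption could be sketched roughly as follows. The function names are hypothetical, positions are assumed to be 0-based, the density denominator is assumed to be the bin width (the study may have used the number of sites passing filters instead), and each arm is scaled by its own length so that the centromere maps to 0.0 and the telomere to 1.0, which is one reading of the caption.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # 1 Mbp bins, as in the caption


def segregating_site_density(positions, chrom_length, bin_size=BIN_SIZE):
    """Return segregating sites per kbp for each bin along one chromosome."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = Counter(pos // bin_size for pos in positions)
    densities = []
    for b in range(n_bins):
        start = b * bin_size
        # The last bin may be shorter than bin_size.
        width = min(bin_size, chrom_length - start)
        densities.append(counts.get(b, 0) / (width / 1000))
    return densities


def normalised_arm_position(pos, centromere, chrom_length):
    """Map a position to [0, 1]: 0.0 at the centromere, 1.0 at the telomere."""
    if pos <= centromere:
        return (centromere - pos) / centromere
    return (pos - centromere) / (chrom_length - centromere)


# Made-up positions on a hypothetical 2.5 Mbp chromosome.
densities = segregating_site_density([10, 500_000, 1_500_000], 2_500_000)
```

Averaging the per-bin densities at matching normalised positions across chromosomes would then give a profile of the kind shown in panel B.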

    Allele frequency spectra.

    <p>Conditional allele frequency spectra are shown for eight western lowland gorillas (red) and eight samples from each of three human populations: African (black), European (blue) and Asian (green). Spectra are conditioned on sites ascertained in one individual (and, for gorilla, averaged over all samples). The dashed line is the theoretical expectation for a constant population size. Error bars for the gorilla samples represent standard deviations.</p>

    Positive selection at the <i>CXXC1</i> locus.

    <p>A ~6 kb region on chromosome 18 that spans <i>CXXC1</i> showing GENCODE (Version 19) transcript annotation. The three short-listed candidate regulatory variants driving the selection signal in East Asians are all located in ENCODE annotated regions of open chromatin, depicted in the DNase I Hypersensitivity Clusters in 125 cell lines track, and show ENCODE chromatin state segmentation associated with an active promoter site in nine human cell lines. The latter include lymphoblastoids [GM12878]; embryonic stem cells [H1-hESC]; chronic myelogenous leukemia [K562]; hepatocellular carcinoma [HepG2]; umbilical vein endothelial [HUVEC]; mammary epithelial [HMEC]; skeletal muscle myoblast [HSMM]; skin epidermal keratinocytes [NHEK] and lung fibroblasts [NHLF]. Positions of histone modifications in osteoblasts are indicated by shaded bands and the black shade signifies enrichment. In osteoblasts the positions of the histone sequence variant, H2A.Z, that determines accessibility of the transcription start site (TSS) and of histone modifications like H3K4me3 that are enriched around the TSS (dark bands) encompass the candidate regulatory variant site and show binding for many transcription factors. H3K4me1 and H3K27ac modifications and p300 marks are enriched around active enhancers and CTCF indicates insulator regions. The lower part of the figure shows median joining haplotype networks in this region that is in high LD (r<sup>2</sup> ≥ 0.95) in CHB. Phased haplotypes generated by the 1000 Genomes Project were used to construct this network. The derived C allele for the regulatory variant <i>rs59393148</i> lies on the branch leading towards the most frequent haplotype found in East Asians, and shows a star-like expansion typical of a selection signal. Note the proximity of archaic human haplotypes with a subset of East Asian (ASN) and European Finnish samples. These samples lie on a divergent branch that is closer to the Neanderthal (Nea) and Denisovan (Den) haplotype when compared with the rest of the modern human population samples.</p>

    Positive selection at the <i>LRP5</i> locus.

    <p>(A). A 140 kb region on chromosome 6 that spans <i>LRP5</i> showing GENCODE (Version 19) transcript annotation. Positions of 11 candidate regulatory variants and DNase I Hypersensitivity Clusters are shown along with the −log<sub>10</sub> of the combined p-values from frequency-spectrum-based analysis in three continental populations. The significance threshold is indicated by the dashed line and two non-overlapping 10 kb windows have a significant combined p-value in CHB. (B). A closer look at the 3’ selected region in East Asians (highlighted in blue). The region contains both variants with the highest derived allele frequency in East Asians (<i>rs649772</i> and <i>rs671494</i>), which lie in a DNase I hypersensitivity cluster and show ENCODE chromatin state segmentation associated with enhancer binding in several cell lines. In osteoblasts the variants lie within the histone sequence variant, H2A.Z, that determines accessibility of the transcription start site (dark bands) and there are additional H3K4me1 and H3K27ac histone modifications upstream of the variants. The candidate regulatory variant site also shows binding for many transcription factors. The lower part of the panel shows median joining haplotype networks in a ~20 kb region that is in high LD (r<sup>2</sup> ≥ 0.95) in CHB. Phased haplotypes generated by the 1000 Genomes Project were used to construct this network. The derived alleles for the regulatory variants <i>rs649772</i> and <i>rs671494</i> lie on the branch leading towards the most frequent haplotype found in East Asians and show a star-like expansion typical of a selection signal. The non-synonymous variant <i>rs3736228</i> (red line) that is associated with bone mineral density in genome wide association studies lies on a separate branch.</p>

    Vitamin D and folate acquisition, metabolism and gene sets analyzed in this study.

    <p>The upper part shows metabolism of vitamin D (yellow arrows) and folate (black arrows). Vitamin D<sub>3</sub> can be obtained from the diet, but it is mainly synthesised in the skin from 7-dehydrocholesterol (7-DHC) in response to light. It is then transported to the liver where it is hydroxylated to produce 25-hydroxyvitamin D<sub>3</sub>, which is subsequently converted into its active form 1α,25-dihydroxyvitamin D<sub>3</sub>. This is transported in blood by vitamin D binding protein (DBP) and binds vitamin D receptor (VDR). The lower part shows gene sets analyzed in this study. The circles are proportional to the number of genes in each set. The numbers in blue or pink circles indicate the number of genes in each set that were present in AmiGO using the search terms “Vitamin D” (blue) or “Folate” (pink). Additional vitamin D (Vit D or VD) and folate (FA) gene sets are shown in shades of yellow and grey, respectively. The vitamin D gene sets that were generated included vitamin D targets identified by ChIP-Seq (VDR targets), genes involved in vitamin D action in bones, kidneys and intestines and all proteins involved in the VDR activation complex, including those directly interacting with VDR (VDRIP) and RXR (RXRIP) receptors. Folate gene sets include enzymes and receptors involved in dietary folate uptake and transport (FAU), proteins involved in nucleic acid synthesis (NAS) and methylation. The latter were sub-divided into genes involved in metabolism of methionine (Met), homocysteine (HCV) and S-adenosyl methionine methylation (SAM). The small blue and pink circles indicate the number of genes in the manually curated vitamin D and folate gene sets, respectively, that were also identified by AmiGO.</p>