53 research outputs found

    A Macaque's-Eye View of Human Insertions and Deletions: Differences in Mechanisms

    Insertions and deletions (indels) cause numerous genetic diseases and lead to pronounced evolutionary differences among genomes. The macaque sequences provide an opportunity to gain insights into the mechanisms generating these mutations on a genome-wide scale by establishing the polarity of indels occurring in the human lineage since its divergence from the chimpanzee. Here we apply novel regression techniques and multiscale analyses to demonstrate an extensive regional indel rate variation stemming from local fluctuations in divergence, GC content, male and female recombination rates, proximity to telomeres, and other genomic factors. We find that both replication and, surprisingly, recombination are significantly associated with the occurrence of small indels. Intriguingly, the relative inputs of replication versus recombination differ between insertions and deletions, thus the two types of mutations are likely guided in part by distinct mechanisms. Namely, insertions are more strongly associated with factors linked to recombination, while deletions are mostly associated with replication-related features. Indel as a term misleadingly groups the two types of mutations together by their effect on a sequence alignment. However, here we establish that the correct identification of a small gap as an insertion or a deletion (by use of an outgroup) is crucial to determining its mechanism of origin. In addition to providing novel insights into insertion and deletion mutagenesis, these results will assist in gap penalty modeling and eventually lead to more reliable genomic alignments

    Alu Recombination-Mediated Structural Deletions in the Chimpanzee Genome

    With more than 1.2 million copies, Alu elements are one of the most important sources of structural variation in primate genomes. Here, we compare the chimpanzee and human genomes to determine the extent of Alu recombination-mediated deletion (ARMD) in the chimpanzee genome since the divergence of the chimpanzee and human lineages (∼6 million y ago). Combining computational data analysis and experimental verification, we have identified 663 chimpanzee lineage-specific deletions (involving a total of ∼771 kb of genomic sequence) attributable to this process. The ARMD events essentially counteract the genomic expansion caused by chimpanzee-specific Alu inserts. The RefSeq databases indicate that 13 exons in six genes, annotated as either demonstrably or putatively functional in the human genome, and 299 intronic regions have been deleted through ARMDs in the chimpanzee lineage. Therefore, our data suggest that this process may contribute to the genomic and phenotypic diversity between chimpanzees and humans. In addition, we found four independent ARMD events at orthologous loci in the gorilla or orangutan genomes. This suggests that human orthologs of loci at which ARMD events have already occurred in other nonhuman primate genomes may be “at-risk” motifs for future deletions, which may subsequently contribute to human lineage-specific genetic rearrangements and disorders

    On the Origin and Evolution of Vertebrate Olfactory Receptor Genes: Comparative Genome Analysis Among 23 Chordate Species

    Olfaction is a primitive sense in organisms. Both vertebrates and insects have receptors for detecting odor molecules in the environment, but the evolutionary origins of these genes are different. Among studied vertebrates, mammals have ∼1,000 olfactory receptor (OR) genes, whereas teleost fishes have much smaller (∼100) numbers of OR genes. To investigate the origin and evolution of vertebrate OR genes, I attempted to determine near-complete OR gene repertoires by searching whole-genome sequences of 14 nonmammalian chordates, including cephalochordates (amphioxus), urochordates (ascidian and larvacean), and vertebrates (sea lamprey, elephant shark, five teleost fishes, frog, lizard, and chicken), followed by a large-scale phylogenetic analysis in conjunction with mammalian OR genes identified from nine species. This analysis showed that the amphioxus has >30 vertebrate-type OR genes though it lacks distinctive olfactory organs, whereas all OR genes appear to have been lost in the urochordate lineage. Some groups of genes (θ, κ, and λ) that are phylogenetically nested within vertebrate OR genes showed few gene gains and losses, which is in sharp contrast to the evolutionary pattern of OR genes, suggesting that they are actually non-OR genes. Moreover, the analysis demonstrated a great difference in OR gene repertoires between aquatic and terrestrial vertebrates, reflecting the necessity for the detection of water-soluble and airborne odorants, respectively. However, a minor group (β) of genes that are atypically present in both aquatic and terrestrial vertebrates was also found. These findings should provide a critical foundation for further physiological, behavioral, and evolutionary studies of olfaction in various organisms

    Comparative studies of glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1: evidence for a eutherian mammalian origin for the GPIHBP1 gene from an LY6-like gene

    Glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1 (GPIHBP1) functions as a platform and transport agent for lipoprotein lipase (LPL) which functions in the hydrolysis of chylomicrons, principally in heart, skeletal muscle and adipose tissue capillary endothelial cells. Previous reports of genetic deficiency for this protein have described severe chylomicronemia. Comparative GPIHBP1 amino acid sequences and structures and GPIHBP1 gene locations were examined using data from several mammalian genome projects. Mammalian GPIHBP1 genes usually contain four coding exons on the positive strand. Mammalian GPIHBP1 sequences shared 41–96% identities as compared with 9–32% sequence identities with other LY6-domain-containing human proteins (LY6-like). The human N-glycosylation site was predominantly conserved among other mammalian GPIHBP1 proteins except cow, dog and pig. Sequence alignments, key amino acid residues and conserved predicted secondary structures were also examined, including the N-terminal signal peptide, the acidic amino acid sequence region which binds LPL, the glycosylphosphatidylinositol linkage group, the Ly6 domain and the C-terminal α-helix. Comparative and phylogenetic studies of mammalian GPIHBP1 suggested that it originated in eutherian mammals from a gene duplication event of an ancestral LY6-like gene and subsequent integration of exon 2, which may have been derived from BCL11A (B-cell CLL/lymphoma 11A gene) encoding an extended acidic amino acid sequence

    Genomics and proteomics of vertebrate cholesterol ester lipase (LIPA) and cholesterol 25-hydroxylase (CH25H)

    Cholesterol ester lipase (LIPA; EC 3.1.1.13) and cholesterol 25-hydroxylase (CH25H; EC 1.14.99.48) play essential roles in cholesterol metabolism in the body by hydrolysing cholesteryl esters and triglycerides within lysosomes (LIPA) and catalysing the formation of 25-hydroxycholesterol from cholesterol (CH25H), which acts to repress cholesterol biosynthesis. Bioinformatic methods were used to predict the amino acid sequences, structures and genomic features of several vertebrate LIPA and CH25H genes and proteins, and to examine the phylogeny of vertebrate LIPA. Amino acid sequence alignments and predicted subunit structures enabled the identification of key sequences previously reported for human LIPA and CH25H and transmembrane structures for vertebrate CH25H sequences. Vertebrate LIPA and CH25H genes were located in tandem on all vertebrate genomes examined and showed several predicted transcription factor binding sites and CpG islands located within the 5′ regions of the human genes. Vertebrate LIPA genes contained nine coding exons, while all vertebrate CH25H genes were without introns. Phylogenetic analysis demonstrated the distinct nature of the vertebrate LIPA gene and protein family in comparison with other vertebrate acid lipases; the family has apparently evolved from an ancestral LIPA gene which predated the appearance of vertebrates

    On Characterizing Adaptive Events Unique to Modern Humans

    Ever since the first draft of the human genome was completed in 2001, there has been increased interest in identifying genetic changes that are uniquely human, which could account for our distinct morphological and cognitive capabilities with respect to other apes. Recently, draft sequences of two extinct hominin genomes, a Neanderthal and a Denisovan, have been released. These two genomes provide a much greater resolution to identify human-specific genetic differences than the chimpanzee, our closest extant relative. The Neanderthal genome paper presented a list of regions putatively targeted by positive selection around the time of the human–Neanderthal split. We here seek to characterize the evolutionary history of these candidate regions, examining evidence for selective sweeps in modern human populations as well as for accelerated adaptive evolution across apes. Results indicate that 3 of the top 20 candidate regions show evidence of selection in at least one modern human population (P < 5 × 10⁻⁵). Additionally, four genes within the top 20 regions show accelerated amino acid substitutions across multiple apes (P < 0.01), suggesting importance across deeper evolutionary time. These results highlight the importance of evaluating evolutionary processes across both recent and ancient evolutionary timescales and intriguingly suggest a list of candidate genes that may have been uniquely important around the time of the human–Neanderthal split

    Long Conserved Fragments Upstream of Mammalian Polyadenylation Sites

    Polyadenylation is a cotranscriptional nuclear RNA processing event involving endonucleolytic cleavage of the nascent, emerging pre-messenger RNA (pre-mRNA) from the RNA polymerase, immediately followed by the polymerization of adenine ribonucleotides, called the poly(A) tail, to the cleaved 3′ end of the polyadenylation site (PAS). This apparently simple molecular processing step has been discovered to be connected to transcription and splicing, therefore increasing its potential for regulation of gene expression. Here, through a bioinformatic analysis of cis-PAS–regulatory elements in mammals that includes taking advantage of multiple evolutionary time scales, we find unexpected selection pressure much further upstream, up to 200 nt, from the PAS than previously thought. Strikingly, close to 3,000 long (30–500 nt) noncoding conserved fragments (CFs) were discovered in the PAS flanking region of three remotely related mammalian species: human, mouse, and cow. When an even more remote transitional mammal, platypus, was included, still over a thousand CFs were found in the proximity of the PAS. Even though the biological function of these CFs remains unknown, their considerable size makes them unlikely to serve as protein recognition sites, which are typically ≤15 nt. By harnessing genome-wide DNaseI hypersensitivity data, we have discovered that the presence of CFs correlates with chromatin accessibility. Our study is important in highlighting novel experimental targets, which may provide new understanding about the regulatory aspects of polyadenylation

    Alu pair exclusions in the human genome

    Background: The human genome contains approximately one million Alu elements which comprise more than 10% of human DNA by mass. Alu elements possess direction, and are distributed almost equally in positive and negative strand orientations throughout the genome. Previously, it has been shown that closely spaced Alu pairs in opposing orientation (inverted pairs) are found less frequently than Alu pairs having the same orientation (direct pairs). However, this imbalance has only been investigated for Alu pairs separated by 650 or fewer base pairs (bp) in a study conducted prior to the completion of the draft human genome sequence. Results: We performed a comprehensive analysis of all (>800,000) full-length Alu elements in the human genome. This large sample size permits detection of small differences in the ratio between inverted and direct Alu pairs (I:D). We have discovered a significant depression in the full-length Alu pair I:D ratio that extends to repeat pairs separated by ≤350,000 bp. Within this imbalance bubble (those Alu pairs separated by ≤350,000 bp), direct pairs outnumber inverted pairs. Using PCR, we experimentally verified several examples of inverted Alu pair exclusions that were caused by deletions. Conclusions: Over 50 million full-length Alu pairs reside within the I:D imbalance bubble. Their collective impact may represent one source of Alu element-related human genomic instability that has not been previously characterized