2,214 research outputs found

    Identification and Analysis of Genes and Pseudogenes within Duplicated Regions in the Human and Mouse Genomes

    The identification and classification of genes and pseudogenes in duplicated regions still constitutes a challenge for standard automated genome annotation procedures. Using an integrated homology and orthology analysis independent of current gene annotation, we have identified 9,484 and 9,017 gene duplicates in human and mouse, respectively. On the basis of the integrity of their coding regions, we have classified them into functional and inactive duplicates, allowing us to define the first consistent and comprehensive collection of 1,811 human and 1,581 mouse unprocessed pseudogenes. Furthermore, of the total of 14,172 human and mouse duplicates predicted to be functional genes, as many as 420 are not included in current reference gene databases and therefore correspond to likely novel mammalian genes. Some of these correspond to partial duplicates with less than half of the length of the original source genes, yet they are conserved and syntenic among different mammalian lineages. The genes and unprocessed pseudogenes obtained here will enable further studies on the mechanisms involved in gene duplication as well as of the fate of duplicated genes

    Rapid Bursts of Androgen-Binding Protein (Abp) Gene Duplication Occurred Independently in Diverse Mammals

    Background: The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) α, β and γ subunits. Further investigation of 14 α-like (Abpa) and 13 β- or γ-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Results: Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. Conclusion: We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification.

    The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation.

    Natural killer (NK) cells are a diverse population of lymphocytes with a range of biological roles including essential immune functions. NK cell diversity is in part created by the differential expression of cell surface receptors which modulate activation and function, including multiple subfamilies of C-type lectin receptors encoded within the NK complex (NKC). Little is known about the gene content of the NKC beyond rodent and primate lineages, other than it appears to be extremely variable between mammalian groups. We compared the NKC structure between mammalian species using new high-quality draft genome assemblies for cattle and goat; re-annotated sheep, pig, and horse genome assemblies; and the published human, rat, and mouse lemur NKC. The major NKC genes are largely in the equivalent positions in all eight species, with significant independent expansions and deletions between species, allowing us to propose a model for NKC evolution during mammalian radiation. The ruminant species, cattle and goats, have independently evolved a second KLRC locus flanked by KLRA and KLRJ, and a novel KLRH-like gene has acquired an activating tail. This novel gene has duplicated several times within cattle, while other activating receptor genes have been selectively disrupted. Targeted genome enrichment in cattle identified varying levels of allelic polymorphism between the NKC genes concentrated in the predicted extracellular ligand-binding domains. This novel recombination and allelic polymorphism is consistent with NKC evolution under balancing selection, suggesting that this diversity influences individual immune responses and may impact on differential outcomes of pathogen infection and vaccination

    Comprehensive analysis of the pseudogenes of glycolytic enzymes in vertebrates: the anomalously high number of GAPDH pseudogenes highlights a recent burst of retrotranspositional activity

    Background: Pseudogenes provide a record of the molecular evolution of genes. As glycolysis is such a highly conserved and fundamental metabolic pathway, the pseudogenes of glycolytic enzymes comprise a standardized genomic measuring stick and an ideal platform for studying molecular evolution. One of the glycolytic enzymes, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), has already been noted to have one of the largest numbers of associated pseudogenes, among all proteins. Results: We assembled the first comprehensive catalog of the processed and duplicated pseudogenes of glycolytic enzymes in many vertebrate model-organism genomes, including human, chimpanzee, mouse, rat, chicken, zebrafish, pufferfish, fruitfly, and worm (available at http://pseudogene.org/glycolysis/). We found that glycolytic pseudogenes are predominantly processed, i.e. retrotransposed from the mRNA of their parent genes. Although each glycolytic enzyme plays a unique role, GAPDH has by far the most pseudogenes, perhaps reflecting its large number of non-glycolytic functions or its possession of a particularly retrotranspositionally active sub-sequence. Furthermore, the number of GAPDH pseudogenes varies significantly among the genomes we studied: none in zebrafish, pufferfish, fruitfly, and worm, 1 in chicken, 50 in chimpanzee, 62 in human, 331 in mouse, and 364 in rat. Next, we developed a simple method of identifying conserved syntenic blocks (consistently applicable to the wide range of organisms in the study) by using orthologous genes as anchors delimiting a conserved block between a pair of genomes. This approach showed that few glycolytic pseudogenes are shared between primate and rodent lineages. Finally, by estimating pseudogene ages using Kimura's two-parameter model of nucleotide substitution, we found evidence for bursts of retrotranspositional activity approximately 42, 36, and 26 million years ago in the human, mouse, and rat lineages, respectively. Conclusion: Overall, we performed a consistent analysis of one group of pseudogenes across multiple genomes, finding evidence that most of them were created within the last 50 million years, subsequent to the divergence of rodent and primate lineages.
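The abstract above names Kimura's two-parameter model as the basis for its pseudogene age estimates. As a rough illustration only, and not the authors' pipeline, the standard K2P distance can be sketched as follows. The function name `k2p_distance` and the gap handling are assumptions made for this example; converting a distance into an age additionally requires a substitution rate, which the abstract does not give.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two pre-aligned DNA sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P is the fraction of
    sites showing a transition and Q the fraction showing a transversion.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption for this sketch: skip gaps and ambiguous bases.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * (1 - 2 * q) ** 0.5 if 1 - 2 * q > 0 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

For example, `k2p_distance(parent_cds, pseudogene_seq)` on an aligned parent gene and pseudogene (both names hypothetical) gives a corrected substitution distance that is slightly larger than the raw mismatch fraction, because the model accounts for multiple hits at the same site.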

    The Role of Retrotransposons in Gene Family Expansions: Insights from the Mouse Abp Gene Family

    Background: Retrotransposons have been suggested to provide a substrate for non-allelic homologous recombination (NAHR) and thereby promote gene family expansion. Their precise role, however, is controversial. Here we ask whether retrotransposons contributed to the recent expansions of the Androgen-binding protein (Abp) gene families that occurred independently in the mouse and rat genomes. Results: Using dot plot analysis, we found that the most recent duplication in the Abp region of the mouse genome is flanked by L1Md_T elements. Analysis of the sequence of these elements revealed breakpoints that are the relicts of the recombination that caused the duplication, confirming that the duplication arose as a result of NAHR using L1 elements as substrates. L1 and ERVII retrotransposons are considerably denser in the Abp regions than in one Mb flanking regions, while other repeat types are depleted in the Abp regions compared to flanking regions. L1 retrotransposons preferentially accumulated in the Abp gene regions after lineage separation and roughly followed the pattern of Abp gene expansion. By contrast, the proportion of shared vs. lineage-specific ERVII repeats in the Abp region resembles the rest of the genome. Conclusions: We confirmed the role of L1 repeats in Abp gene duplication with the identification of recombinant L1Md_T elements at the edges of the most recent mouse Abp gene duplication. High densities of L1 and ERVII repeats were found in the Abp gene region with abrupt transitions at the region boundaries, suggesting that their higher densities are tightly associated with Abp gene duplication. We observed that the major accumulation of L1 elements occurred after the split of the mouse and rat lineages and that there is a striking overlap between the timing of L1 accumulation and expansion of the Abp gene family in the mouse genome. 
Establishing a link between the accumulation of L1 elements and the expansion of the Abp gene family and identification of an NAHR-related breakpoint in the most recent duplication are the main contributions of our study

    The Mechanism of Expansion and the Volatility It Created in Three Pheromone Gene Clusters in the Mouse (Mus musculus) Genome

    Three families of proteinaceous pheromones have been described in the house mouse: androgen-binding proteins (ABPs), exocrine gland–secreting peptides (ESPs), and major urinary proteins (MUPs), each of which is thought to communicate different information. All three are encoded by large gene clusters in different regions of the mouse genome, clusters that have expanded dramatically during mouse evolutionary history. We report copy number variation among the most recently duplicated Abp genes, which suggests substantial volatility in this gene region. It appears that groups of these genes behave as low copy repeats (LCRs), duplicating as relatively large blocks of genes by nonallelic homologous recombination. An analysis of gene conversion suggested that it did not contribute to the very low or absent divergence among the paralogs duplicated in this way. We evaluated the ESP and MUP gene regions for signs of the LCR pattern but could find no compelling evidence for duplication of gene blocks of any significant size. Assessment of the entire Abp gene region with the Mouse Paralogy Browser supported the conclusion that substantial volatility has occurred there. This was especially evident when comparing strains with all or part of the Mus musculus musculus or Mus musculus castaneus Abp region. No particularly remarkable volatility was observed in the other two gene families, and we discuss the significance of this in light of the various roles proposed for the three families of mouse proteinaceous pheromones

    Comparative analysis of processed ribosomal protein pseudogenes in four mammalian genomes

    An analysis of ribosomal protein pseudogenes in the four mammalian genomes reveals no correlation between number of pseudogenes and mRNA abundance

    Duplication and relocation of the functional DPY19L2 gene within low copy repeats

    BACKGROUND: Low copy repeats (LCRs) are thought to play an important role in recent gene evolution, especially when they facilitate gene duplications. Duplicate genes are fundamental to adaptive evolution, providing substrates for the development of new or shared gene functions. Moreover, silencing of duplicate genes can have an indirect effect on adaptive evolution by causing genomic relocation of functional genes. These changes are theorized to have been a major factor in speciation. RESULTS: Here we present a novel example showing functional gene relocation within a LCR. We characterize the genomic structure and gene content of eight related LCRs on human Chromosomes 7 and 12. Two members of a novel transmembrane gene family, DPY19L, were identified in these regions, along with six transcribed pseudogenes. One of these genes, DPY19L2, is found on Chromosome 12 and is not syntenic with its mouse orthologue. Instead, the human locus syntenic to mouse Dpy19l2 contains a pseudogene, DPY19L2P1. This indicates that the ancestral copy of this gene has been silenced, while the descendant copy has remained active. Thus, the functional copy of this gene has been relocated to a new genomic locus. We then describe the expansion and evolution of the DPY19L gene family from a single gene found in invertebrate animals. Ancient duplications have led to multiple homologues in different lineages, with three in fish, frogs and birds and four in mammals. CONCLUSION: Our results show that the DPY19L family has expanded throughout the vertebrate lineage and has undergone recent primate-specific evolution within LCRs

    Species Specificity in Major Urinary Proteins by Parallel Evolution

    Species-specific chemosignals, pheromones, regulate social behaviors such as aggression, mating, pup-suckling, territory establishment, and dominance. The identity of these cues remains mostly undetermined and few mammalian pheromones have been identified. Genetically-encoded pheromones are expected to exhibit several different mechanisms for coding 1) diversity, to enable the signaling of multiple behaviors, 2) dynamic regulation, to indicate age and dominance, and 3) species-specificity. Recently, the major urinary proteins (Mups) have been shown to function themselves as genetically-encoded pheromones to regulate species-specific behavior. Mups are multiple highly related proteins expressed in combinatorial patterns that differ between individuals, gender, and age, and these patterns are sufficient to fulfill the first two criteria. We have now characterized and fully annotated the mouse Mup gene content in detail. This has enabled us to further analyze the extent of Mup coding diversity and determine their potential to encode species-specific cues.