    Evolution of C2H2-zinc finger genes and subfamilies in mammals: Species-specific duplication and loss of clusters, genes and effector domains

    Abstract
    Background: C2H2 zinc finger genes (C2H2-ZNF) constitute the largest class of transcription factors in humans and one of the largest gene families in mammals. Often arranged in clusters in the genome, these genes are thought to have undergone a massive expansion in vertebrates, primarily by tandem duplication. However, this view is based on limited datasets restricted to a single chromosome or a specific subset of genes belonging to the large KRAB domain-containing C2H2-ZNF subfamily.
    Results: Here, we present the first comprehensive study of the evolution of the C2H2-ZNF family in mammals. We assembled the complete repertoire of human C2H2-ZNF genes (718 in total), about 70% of which are organized into 81 clusters across all chromosomes. Based on an analysis of their N-terminal effector domains, we identified two new C2H2-ZNF subfamilies encoding genes with a SET or a HOMEO domain. We searched for the syntenic counterparts of the human clusters in other mammals for which complete gene data are available: chimpanzee, mouse, rat and dog. Cross-species comparisons show a large variation in the numbers of C2H2-ZNF genes within homologous mammalian clusters, suggesting differential patterns of evolution. Phylogenetic analysis of selected clusters reveals that the disparity in C2H2-ZNF gene repertoires across mammals originates not only from differential gene duplication but also from gene loss. Further, we discovered variations among orthologs in the number of zinc finger motifs and association of the effector domains, the latter often undergoing sequence degeneration. Combined with phylogenetic studies, physical maps and an analysis of the exon-intron organization of genes from the SCAN and KRAB domain-containing subfamilies, this result suggests that the SCAN subfamily emerged first, followed by the SCAN-KRAB and finally by the KRAB subfamily.
    Conclusion: Our results are in agreement with the "birth and death hypothesis" for the evolution of C2H2-ZNF genes, but also show that this hypothesis alone cannot explain the considerable evolutionary variation within the subfamilies of these genes in mammals. We, therefore, propose a new model involving the interdependent evolution of C2H2-ZNF gene subfamilies.

    The evolution of protostome GATA factors: Molecular phylogenetics, synteny, and intron/exon structure reveal orthologous relationships

    Abstract
    Background: Invertebrate and vertebrate GATA transcription factors play important roles in ectoderm and mesendoderm development, as well as in cardiovascular and blood cell fate specification. However, the assignment of evolutionarily conserved roles to GATA homologs requires a detailed framework of orthologous relationships. Although two distinct classes, GATA123 and GATA456, have been unambiguously recognized among deuterostome GATA genes, it has been difficult to resolve exact orthologous relationships among protostome homologs. Protostome GATA genes are often present in multiple copies within any one genome, and rapidly evolving gene sequences have obscured orthology among arthropod and nematode GATA homologs. In addition, a lack of taxonomic sampling has prevented a stepwise reconstruction of protostome GATA gene family evolution.
    Results: We have identified the complete GATA complement (53 genes) from a diverse sampling of protostome genomes, including six arthropods, three lophotrochozoans, and two nematodes. Reciprocal best hit BLAST analysis suggested orthology of these GATA genes to either the ancestral bilaterian GATA123 or the GATA456 class. Using molecular phylogenetic analyses of gene sequences, together with conserved synteny and comparisons of intron/exon structure, we inferred the evolutionary relationships among these 53 protostome GATA homologs. In particular, we resolved the orthology and evolutionary birth order of all arthropod GATA homologs including the highly divergent Drosophila GATA genes.
    Conclusion: Our combined analyses confirm that all protostome GATA transcription factor genes are members of either the GATA123 or GATA456 class, and indicate that there have been multiple protostome-specific duplications of GATA456 homologs. Three GATA456 genes exhibit linkage in multiple protostome species, suggesting that this gene cluster arose by tandem duplications from an ancestral GATA456 gene. Within arthropods this GATA456 cluster appears orthologous and widely conserved. Furthermore, the intron/exon structures of the arthropod GATA456 orthologs suggest a distinct order of gene duplication events. At present, however, the evolutionary relationship to similarly linked GATA456 paralogs in lophotrochozoans remains unclear. Our study shows how sampling of additional genomic data, especially from less derived and interspersed protostome taxa, can be used to resolve the orthologous relationships within more divergent gene families.

    Comparative analysis of function and interaction of transcription factors in nematodes: Extensive conservation of orthology coupled to rapid sequence evolution

    Abstract
    Background: Much of the morphological diversity in eukaryotes results from differential regulation of gene expression in which transcription factors (TFs) play a central role. The nematode Caenorhabditis elegans is an established model organism for the study of the roles of TFs in controlling the spatiotemporal pattern of gene expression. Using the fully sequenced genomes of three Caenorhabditid nematode species as well as genome information from additional more distantly related organisms (fruit fly, mouse, and human), we sought to identify orthologous TFs and characterize their patterns of evolution.
    Results: We identified 988 TF genes in C. elegans, and inferred corresponding sets in C. briggsae and C. remanei, containing 995 and 1093 TF genes, respectively. Analysis of the three gene sets revealed 652 3-way reciprocal 'best hit' orthologs (nematode TF set), approximately half of which are zinc finger (ZF-C2H2 and ZF-C4/NHR types) and HOX family members. Examination of the TF genes in C. elegans and C. briggsae identified significant tandem clustering on chromosome V, with the majority of the clustered genes belonging to the ZF-C4/NHR family. We also found evidence for lineage-specific duplications and rapid evolution of many of the TF genes in the two species. A search of the TFs conserved among nematodes in Drosophila melanogaster, Mus musculus and Homo sapiens revealed 150 reciprocal orthologs, many of which are associated with important biological processes and human diseases. Finally, a comparison of the sequence, gene interactions and function indicates that nematode TFs conserved across phyla exhibit significantly more interactions and are enriched in genes with annotated mutant phenotypes compared to those that lack orthologs in other species.
    Conclusion: Our study represents the first comprehensive genome-wide analysis of TFs across three nematode species and other organisms. The findings indicate substantial conservation of transcription factors even across distant evolutionary lineages and form the basis for future experiments to examine TF gene function in nematodes and other divergent phyla.
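
    Note: the "3-way reciprocal best hit" criterion mentioned in the Results can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes the best-scoring hit for each gene in each direction has already been computed (for example from BLAST output), and the gene identifiers and the function names `reciprocal_pairs` and `three_way_rbh` are hypothetical. A triple is kept only when all three species pairs agree reciprocally.

```python
from typing import Dict, List, Set, Tuple

# Maps a query gene to its single best-scoring hit in the other species.
BestHits = Dict[str, str]


def reciprocal_pairs(x_to_y: BestHits, y_to_x: BestHits) -> Set[Tuple[str, str]]:
    """Return (x, y) pairs where x's best hit is y and y's best hit is x."""
    return {(x, y) for x, y in x_to_y.items() if y_to_x.get(y) == x}


def three_way_rbh(ab: BestHits, ba: BestHits,
                  ac: BestHits, ca: BestHits,
                  bc: BestHits, cb: BestHits) -> List[Tuple[str, str, str]]:
    """Return (a, b, c) triples that are reciprocal best hits in all three pairings."""
    pairs_ab = dict(reciprocal_pairs(ab, ba))
    pairs_ac = dict(reciprocal_pairs(ac, ca))
    pairs_bc = reciprocal_pairs(bc, cb)
    triples = []
    for a, b in sorted(pairs_ab.items()):
        c = pairs_ac.get(a)
        # Keep the triple only if b and c are also reciprocal best hits.
        if c is not None and (b, c) in pairs_bc:
            triples.append((a, b, c))
    return triples


# Toy data with made-up identifiers (ce = species A, cb = species B, cr = species C).
ab = {"ce1": "cb1", "ce2": "cb2"}
ba = {"cb1": "ce1", "cb2": "ce2"}
ac = {"ce1": "cr1", "ce2": "cr2"}
ca = {"cr1": "ce1", "cr2": "ce2"}
bc = {"cb1": "cr1", "cb2": "cr9"}  # cb2 disagrees with ce2 about the species C partner
cb = {"cr1": "cb1", "cr9": "cb2"}

triples = three_way_rbh(ab, ba, ac, ca, bc, cb)
print(triples)
```

    In the toy data only the first gene forms a consistent triple; the second is dropped because its species B and species C partners are not each other's best hit.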

    Targeted Deletion and Inversion of Tandemly Arrayed Genes in Arabidopsis thaliana Using Zinc Finger Nucleases

    Tandemly arrayed genes (TAGs) or gene clusters are prevalent in higher eukaryotic genomes. For example, approximately 17% of genes are organized in tandem in the model plant Arabidopsis thaliana. The genetic redundancy created by TAGs presents a challenge for reverse genetics. As molecular scissors, engineered zinc finger nucleases (ZFNs) make DNA double-strand breaks in a sequence-specific manner. ZFNs thus provide a means to delete TAGs by creating two double-strand breaks in the gene cluster. Using engineered ZFNs, we successfully targeted seven genes from three TAGs on two Arabidopsis chromosomes, including the well-known RPP4 gene cluster, which contains eight resistance (R) genes. The resulting gene cluster deletions ranged from a few kb to 55 kb with frequencies approximating 1% in somatic cells. We also obtained large chromosomal deletions of ~9 Mb at approximately one tenth the frequency, and gene cluster inversions and duplications were also achieved. This study demonstrates the ability to use sequence-specific nucleases in plants to make targeted chromosome rearrangements and create novel chimeric genes for reverse genetics and biotechnology.

    Classification and nomenclature of all human homeobox genes

    Abstract
    Background: The homeobox genes are a large and diverse group of genes, many of which play important roles in the embryonic development of animals. Increasingly, homeobox genes are being compared between genomes in an attempt to understand the evolution of animal development. Despite their importance, the full diversity of human homeobox genes has not previously been described.
    Results: We have identified all homeobox genes and pseudogenes in the euchromatic regions of the human genome, finding many unannotated, incorrectly annotated, unnamed, misnamed or misclassified genes and pseudogenes. We describe 300 human homeobox loci, which we divide into 235 probable functional genes and 65 probable pseudogenes. These totals include 3 genes with partial homeoboxes and 13 pseudogenes that lack homeoboxes but are clearly derived from homeobox genes. These figures exclude the repetitive DUX1 to DUX5 homeobox sequences of which we identified 35 probable pseudogenes, with many more expected in heterochromatic regions. Nomenclature is established for approximately 40 formerly unnamed loci, reflecting their evolutionary relationships to other loci in human and other species, and nomenclature revisions are proposed for around 30 other loci. We use a classification that recognizes 11 homeobox gene 'classes' subdivided into 102 homeobox gene 'families'.
    Conclusion: We have conducted a comprehensive survey of homeobox genes and pseudogenes in the human genome, described many new loci, and revised the classification and nomenclature of homeobox genes. The classification scheme may be widely applicable to homeobox genes in other animal genomes and will facilitate comparative genomics of this important gene superclass.