Genome-Wide Identification and Evolutionary Analysis of the Animal Specific ETS Transcription Factor Family

Abstract

The ETS proteins are a family of transcription factors (TFs) that regulate a variety of biological processes. We performed genome-wide analyses to explore the classification of the ETS gene family. We identified 207 ETS genes, which encode 321 ETS TFs, from ten animal species. Of the 321 ETS TFs, 155 contain only an ETS domain, about 50% contain an ETS_PEA3_N or a SAM_PNT domain in addition to an ETS domain, and the rest (only four) contain a second ETS domain, a second ETS_PEA3_N domain, or another domain (AT_hook or DNA_pol_B). A Neighbor-Joining phylogenetic tree was constructed using the amino acid sequences of the ETS domains of the ETS TFs. The results revealed that the ETS genes of the ten species can be divided into two distinct groups. Group I contains one nematode ETS gene and 18 vertebrate ETS genes. Group II contains the majority of the ETS TFs and can be further divided into eleven subgroups. The sequence motifs outside the DNA-binding domain and the conserved exon-intron structural patterns of the ETS TFs in human, cattle, and chicken further support this phylogenetic classification. Extensive duplication of the ETS genes was found in the genome of each species; the duplicated ETS genes account for ~69% of all ETS genes. Furthermore, we found ETS gene clusters in all ten animal species. Statistical analysis of the Gene Ontology (GO) annotations of the ETS genes showed that the ETS proteins tend to be related to RNA biosynthetic process, biopolymer metabolic process, and macromolecule metabolic process, as expected from the common GO categories of transcription factors. We also discuss the functional conservation and diversification of the ETS TFs.
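
The domain-composition figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the three categories (ETS domain only, ETS plus ETS_PEA3_N or SAM_PNT, and the four remaining TFs) are disjoint and together cover all 321 TFs, which the abstract implies but does not state outright. The variable names and the `share` helper are hypothetical.

```python
# Counts taken from the abstract. Treating the three categories as a
# disjoint partition of the 321 TFs is an assumption of this sketch.
total_tfs = 321
ets_only = 155
other_extra_domain = 4  # second ETS, second ETS_PEA3_N, AT_hook or DNA_pol_B

# Remaining TFs: an ETS domain plus an ETS_PEA3_N or a SAM_PNT domain.
ets_plus_pea3_or_pnt = total_tfs - ets_only - other_extra_domain


def share(count, total=total_tfs):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


breakdown = {
    "ETS only": ets_only,
    "ETS + ETS_PEA3_N or SAM_PNT": ets_plus_pea3_or_pnt,
    "other additional domain": other_extra_domain,
}

for label, count in breakdown.items():
    print(f"{label}: {count} TFs ({share(count)}%)")
```

Under that assumption, 162 TFs fall into the second category, which is consistent with the "about 50%" stated in the abstract.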
