Multiple sequence alignment of Twist1 and Twist2 vertebrate protein sequences.
Glycine residues first appear in fish (T1_Cavfish), mainly in the second glycine-rich region (right red box), whereas glycine residues in the first glycine-rich region first appear among the reptiles (left red box). Amphibian Twist1 proteins lack both glycine-rich regions, as do all Twist2 proteins. Sequence comparison of Twist proteins across vertebrate species reveals a conserved sequence found in the majority of Twist1 and Twist2 proteins, particularly among mammals: SSSPVSPADDSLSNSEEE (the motif sequence for Twist1 in mammals) or SSSPVSPVDSLGTSEEE (the motif sequence for Twist2), depicted by black rectangles. Residues in bold form sub-motifs that are 100% conserved within the mammalian class (highlighted in yellow in the MSA). A red arrow above the alignment marks the residues (underlined threonine in Twist2 sequences and asparagine in Twist1 sequences) that are key to differentiating between Twist1 and Twist2. The alignment is colored by level of amino acid conservation: pink represents 100% conservation, cyan 75%, and green 50%. The Twist1 and Twist2 proteins are grouped (highlighted) by the vertebrate class to which they belong: fish (grey), reptiles (green), amphibians (purple), birds (blue) and mammals (red). Gaps are indicated by hyphens. Sequence names are the common names of the species to which the sequences belong. The MSA was performed with T-COFFEE.
Evolution of the Twist Subfamily Vertebrate Proteins: Discovery of a Signature Motif and Origin of the Twist1 Glycine-Rich Motifs in the Amino-Terminus Disordered Domain
Twist proteins belong to the basic helix-loop-helix (bHLH) family of multifunctional transcription factors. These factors are known to use domains other than the common bHLH domain in protein-protein interactions. Much work has characterized the role of the bHLH domain and the C-terminus in protein-protein interactions, but despite a few attempts the N-terminus has received far less attention. Since the N-terminus is the region of highest diversity in Twist proteins, we analyzed the conservation of this region in different vertebrate Twist proteins and studied the sequence differences between Twist1 and Twist2, with emphasis on the glycine-rich regions found in Twist1. We found a highly conserved sequence motif of unknown function in all mammalian Twist1 (SSSPVSPADDSLSNSEEE) and Twist2 (SSSPVSPVDSLGTSEEE) sequences. Through sequence comparison we demonstrate that the Twist protein family ancestor was “Twist2-like” and that the two glycine-rich regions found in Twist1 sequences were acquired late in evolution, apparently not at the same time. The second glycine-rich region started developing first, in the fish vertebrate group, while the first glycine-rich region arose afterwards, within the reptiles. Disordered domain and secondary structure predictions showed that the amino acid sequence and the disorder feature found at the N-terminus are highly evolutionarily conserved, so this region could be a functional site that interacts with other proteins. Detailed examination of the glycine-rich regions in the N-terminus of Twist1 demonstrates that the first region is completely aliphatic, while the second region contains some polar residues that could be subject to post-translational modification. Phylogenetic and sequence space analysis showed that the Twist1 subfamily arose from a gene duplication during the evolution of Twist2 in vertebrate fish, and that it has undergone more evolutionary drift than Twist2. We identified a new signature motif that is characteristic of each Twist paralog, and identified important residues within this motif that can be used to distinguish between the two paralogs, which will help reduce Twist1 and Twist2 sequence annotation errors in public databases.
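As an illustration of how the signature motif might be used to flag annotation errors, the following minimal Python sketch labels a protein sequence by exact match to one of the two mammalian motifs quoted above. It is not the authors' method: the function name `classify_twist` and the constant names are hypothetical, and exact matching is an assumption that is only reasonable for mammalian sequences, since the motif is described as conserved in the majority, not all, of the vertebrate proteins.

```python
import re

# Mammalian signature motifs as quoted in the abstract.
TWIST1_MOTIF = "SSSPVSPADDSLSNSEEE"
TWIST2_MOTIF = "SSSPVSPVDSLGTSEEE"


def classify_twist(sequence: str) -> str:
    """Label a protein sequence by exact match to a mammalian signature motif.

    Hypothetical helper for illustration only. Returns "Twist1", "Twist2",
    "ambiguous" (both motifs present) or "unknown" (neither present).
    """
    # Drop whitespace and alignment gap characters, then normalize case.
    seq = re.sub(r"[\s-]", "", sequence).upper()
    has_twist1 = TWIST1_MOTIF in seq
    has_twist2 = TWIST2_MOTIF in seq
    if has_twist1 and has_twist2:
        return "ambiguous"
    if has_twist1:
        return "Twist1"
    if has_twist2:
        return "Twist2"
    return "unknown"
```

A sequence labeled "unknown" is not necessarily a non-Twist protein; non-mammalian orthologs may carry a diverged form of the motif, in which case an alignment-based comparison would be more appropriate than a substring test.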
