Comparison of unaligned full-length sequences in CLANS.
The full set of unaligned sequences from all 17 families was submitted to CLANS for an all-against-all BLAST search. Each sequence is represented by one dot (circle). The sequences were clustered in 3D based on pairwise similarity. Most sequences clustered into families consistent with their Pfam assignment. Each Pfam family is shown in a unique color; the same colors are used in the profile-profile analysis (Fig 5A and S3 Fig). The largest family, PF00999 (also unknotted), was clustered into seven groups, all colored blue. The sequences of the slipknotted family PF03390 (orange) clustered into one group. The IDs of families with unknotted topologies are highlighted in blue font (PF00999, PF06965, PF01758). The family PF05145 from the clan CL0142 is colored red. The IDs of the one-domain families (LrgA: PF03788; LrgB: PF04172) are highlighted in magenta font. The LrgA/LrgB fusion, which partially co-localizes with PF04172 (LrgB), is shown in magenta. Some one-domain LrgA (PF03788) sequences are located between the fusion protein and LrgA; they are marked with a star sign “*”. (TIFF)
Bayesian phylogenetic tree 4.
The tree was generated from a multiple sequence alignment of the extended conserved core (S4 Fig), which comprises the conserved 3-TM helical core plus the next TM helix (a hairpin in the slipknotted family). As in the trees in S5 and S6 Figs, the characteristics matrix based on the profile-profile connections was used. Additionally, the removed N- and C-terminal regions of the domains were introduced into the tree calculation as N/C matrices. The characteristics matrix and the N/C matrices were multiplied 10 times. For the tree calculation, 10 representative sequences of the slipknotted family PF03390 and of the one-domain families (PF03788, PF04172), and three representative sequences of each remaining family, were used. The tree shows three main branches: 1) the slipknotted family PF03390 is placed together with both one-domain families, PF03788 and PF04172; 2) as in the previous trees, a separate branch joins the closely related families PF01758, PF13593 and PF03547; 3) domains A and B of the unknotted family PF00999 are separated on the tree. However, the evolution of the other families is not resolved in this tree. PF04172 is placed on the same branch as PF03788; according to the profile analysis, these families are distantly related. Also, PF03812(B) and PF03601(A) are placed together with PF00999(B), which likewise agrees with the profile analysis. (TIFF)
Known protein families from the monovalent cation-proton antiporter superfamily investigated to identify the possible evolution of the slipknotted topology.
Family and clan IDs are from the Pfam database. ND: not determined.
Methods.
Sequence search. Sequence analysis. Procedure to identify the domains. Sequence profiles. Multiple sequence alignment. Phylogenetic tree reconstruction. Visualization and figure preparation. (PDF)
Projection of the phylogenetic tree of all 17 families studied here onto the Tree of Life.
The phylogenetic tree (Fig 5B) is projected onto the Tree of Life. Each protein family is shown in a unique color, as in CLANS (Fig 5A). The presence of a protein family in a particular group of organisms is indicated by a cross sign “X”. (TIFF)
The linkers connecting two domains in slipknotted and unknotted structures.
The figure shows that slipknotted and unknotted proteins are composed of two inverted domains connected by a linker. Panel A shows a slipknotted structure (PDB ID: 5a1s); from left to right: domain A, the linker, and domain B. Similarly, panels B and C show the linkers between the domains in unknotted structures. (TIFF)
Bayesian phylogenetic tree 2.
The tree was generated from a multiple sequence alignment of the domains, using the characteristics matrix multiplied 10 times. The characteristics matrix (S11 Fig) was generated based on the profile-profile connections (S3 Fig). For the tree calculation, three representative sequences of each family were used. The tree shows several main branches: 1) both domains of the slipknotted family PF03390 and the one-domain family PF03788 are located on the same branch; 2) another separate branch joins the closely related families PF01758, PF13593 and PF03547; 3) domains A and B of the unknotted family PF00999 are separated into two branches. The B domains of PF00999 are grouped together with domain B of the unknotted family PF06965 and with domain A of PF05684 (unknown topology). Domains A and B of the families PF03956, PF06826 and PF05145, and domain B of PF05982, are located on one branch. (TIFF)
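The caption does not state how the characteristics matrix is encoded or what "multiplied 10 times" means in practice. As a minimal sketch of one possible reading, the Python below builds a binary matrix in which each domain gets a 1 for every domain it is connected to, then repeats the columns 10 times so that these characters carry more weight next to the alignment. The function names, the binary encoding, the domain labels, and the interpretation of the weighting as repetition are all assumptions made for illustration.

```python
def characteristics_matrix(domains, connections):
    """Binary matrix: entry [i][j] is 1 if domains i and j are connected (or i == j).

    ASSUMPTION: this encoding is illustrative; the source does not specify it.
    """
    linked = {frozenset(pair) for pair in connections}
    return [
        [1 if a == b or frozenset((a, b)) in linked else 0 for b in domains]
        for a in domains
    ]


def weight_by_repetition(matrix, times=10):
    """Repeat every row's characters `times` times (one reading of 'multiplied 10 times')."""
    return [row * times for row in matrix]


# Hypothetical domain labels and connections, not data from the study.
toy_domains = ["D1", "D2", "D3"]
toy_connections = [("D1", "D2")]

toy_matrix = characteristics_matrix(toy_domains, toy_connections)
toy_weighted = weight_by_repetition(toy_matrix, times=10)
print(toy_matrix)
print(len(toy_weighted[0]))
```

Repeating columns is a common way to up-weight a small character block relative to a long sequence alignment, but whether it matches the procedure used here cannot be confirmed from the caption alone.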
Profile-profile comparison of the domains.
Comparison of domain profiles, shown at a cut-off of 1e-5. Every domain profile is shown as one point with one of four shapes: star, circle, triangle, or square. The connections between domains are shown as straight lines. A connection indicates that the profile-profile alignment of the two domains has a significance value of 1e-5 or less. Every family is shown in a unique color, the same as in the main Fig 5A. The two domains of the slipknotted family are shown as orange stars. All families from CL0062 are shown as circles colored according to the family. The IDs of families with known unknotted topology are highlighted in blue font. The one-domain proteins (PF03788 and PF04172) are shown as triangles, and their family IDs are in magenta font. The two domains of PF05145 (CL0142) are shown as red squares. The families PF00999 (blue), PF01758 (yellow), PF03547 (pink), PF03616 (olive) and PF06826 (dark cyan) were divided into several subgroups based on full-sequence clustering (S10 Fig); therefore, more than two domains are shown for these families. Domains A and B of PF00999 separate into two clear clusters. (TIFF)
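The thresholding described in this caption can be sketched in a few lines of Python: a pair of domain profiles is kept as a connection only if its significance value is 1e-5 or less. The function name, the dictionary layout, the domain labels, and the example values are hypothetical and are not taken from the study.

```python
CUTOFF = 1e-5  # cut-off stated in the caption


def connected_pairs(significance, cutoff=CUTOFF):
    """Return sorted, undirected domain pairs whose significance value is <= cutoff."""
    pairs = set()
    for (a, b), value in significance.items():
        if a != b and value <= cutoff:
            pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


# Hypothetical significance values for profile-profile alignments.
toy_significance = {
    ("FAM1_A", "FAM1_B"): 1e-12,  # kept
    ("FAM1_A", "FAM2"): 3e-7,     # kept
    ("FAM2", "FAM3"): 2e-3,       # dropped: above the cut-off
    ("FAM4_A", "FAM4_B"): 1e-5,   # kept: exactly at the cut-off
}

toy_pairs = connected_pairs(toy_significance)
print(toy_pairs)
```

Each surviving pair corresponds to one straight line in the figure; pairs above the cut-off are simply not drawn.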
Sequence logos of both slipknotted and unknotted protein families.
The sequence logo of the slipknotted family PF03390 shows that multiple glycines are highly conserved across the whole family. The logos were generated with WebLogo3 from the families' multiple sequence alignments (available in Pfam). (TIFF)
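To make concrete what "highly conserved" means in a logo, the sketch below computes the information content of a single alignment column in bits, the quantity that sets the height of a logo stack. It is a simplified illustration: it ignores gaps and omits the small-sample correction that WebLogo applies, and the function name and the toy columns are hypothetical rather than taken from the Pfam alignments.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gap characters.

    Simplification: no small-sample correction is applied.
    """
    residues = [c for c in column if c not in "-."]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


# Hypothetical columns: a fully conserved glycine and a two-residue mix.
conserved_bits = column_information("GGGGGG")
mixed_bits = column_information("GGGAAA")
print(round(conserved_bits, 3), round(mixed_bits, 3))
```

A column containing only glycine reaches the maximum of log2(20) bits for a protein alignment, whereas an even mix of two residues scores exactly one bit less.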