Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction.
Background: One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to the sensitivity of the resulting profile. The inclusion of highly diverse sequences will presumably produce a more powerful profile, but distantly related sequences can be difficult to align accurately using only sequence information. Therefore, it would be expected that the use of protein structure alignments to improve the selection and alignment of diverse sequence homologs might yield improved profiles. However, the actual utility of such an approach has remained unclear.
Results: We explored several iterative protocols for the generation of profile hidden Markov models. These protocols were tailored to allow the inclusion of protein structure alignments in the process, and were used for large-scale creation and benchmarking of structure alignment-enhanced models. We found that models using structure alignments did not provide an overall improvement over sequence-only models for superfamily-level structure predictions. However, the results also revealed that the structure alignment-enhanced models were complementary to the sequence-only models, particularly at the edge of the "twilight zone". When the two sets of models were combined, they provided improved results over sequence-only models alone. In addition, we found that the beneficial effects of the structure alignment-enhanced models could not be realized if the structure-based alignments were replaced with sequence-based alignments. Our experiments with different iterative protocols for sequence-only models also suggested that simple protocol modifications were unable to yield improvements equivalent to those provided by the structure alignment-enhanced models. Finally, we found that models using structure alignments provided fold-level structure assignments that were superior to those produced by sequence-only models.
Conclusion: When attempting to predict the structure of remote homologs, we advocate a combined approach in which both traditional models and models incorporating structure alignments are used.
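The abstract does not say how the two model sets were combined. As a minimal sketch only, one plausible scheme keeps, for each candidate superfamily, the better (lower) E-value reported by either model set. The function names, the dictionary layout, and the 0.01 cutoff are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional


def combine_hits(seq_only: Dict[str, float],
                 struct_enhanced: Dict[str, float]) -> Dict[str, float]:
    """Merge two {superfamily: E-value} maps, keeping the lower E-value.

    Assumption: both model sets score the same query and report
    comparable E-values. The paper may combine results differently.
    """
    combined = dict(seq_only)
    for target, e_value in struct_enhanced.items():
        if target not in combined or e_value < combined[target]:
            combined[target] = e_value
    return combined


def best_assignment(hits: Dict[str, float],
                    e_cutoff: float = 0.01) -> Optional[str]:
    """Return the best-scoring superfamily, or None if nothing passes the cutoff."""
    if not hits:
        return None
    target = min(hits, key=hits.get)
    return target if hits[target] <= e_cutoff else None


# Hypothetical scores for one query sequence.
seq_hits = {"a.1.1": 1e-3, "b.2.1": 0.5}
struct_hits = {"b.2.1": 1e-5, "c.3.1": 2.0}
merged = combine_hits(seq_hits, struct_hits)
```

In this toy input, the structure alignment-enhanced set rescues a weak sequence-only hit, which is the kind of complementarity the abstract describes near the "twilight zone".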
Structural Evolution of the Protein Kinase-Like Superfamily
The protein kinase family is large and important, but it is only one family in a larger superfamily of homologous kinases that phosphorylate a variety of substrates and play important roles in all three superkingdoms of life. We used a carefully constructed structural alignment of selected kinases as the basis for a study of the structural evolution of the protein kinase-like superfamily. The comparison of structures revealed a "universal core" domain consisting only of regions required for ATP binding and the phosphotransfer reaction. Remarkably, even within the universal core some kinase structures display notable changes, while still retaining essential activity. Hence, the protein kinase-like superfamily has undergone substantial structural and sequence revision over long evolutionary timescales. We constructed a phylogenetic tree for the superfamily using a novel approach that allowed for the combination of sequence and structure information into a unified quantitative analysis. When considered against the backdrop of species distribution and other metrics, our tree provides a compelling scenario for the development of the various kinase families from a shared common ancestor. We propose that most of the so-called "atypical kinases" are not intermittently derived from protein kinases, but rather diverged early in evolution to form a distinct phyletic group. Within the atypical kinases, the aminoglycoside and choline kinase families appear to share the closest relationship. These two families in turn appear to be the most closely related to the protein kinase family. In addition, our analysis suggests that the actin-fragmin kinase, an atypical protein kinase, is more closely related to the phosphoinositide-3 kinase family than to the protein kinase family. The two most divergent families, α-kinases and phosphatidylinositol phosphate kinases (PIPKs), appear to have distinct evolutionary histories.
While the PIPKs probably have an evolutionary relationship with the rest of the kinase superfamily, the relationship appears to be very distant (and perhaps indirect). Conversely, the α-kinases appear to be an exception to the scenario of early divergence for the atypical kinases: they apparently arose relatively recently in eukaryotes. We present possible scenarios for the derivation of the α-kinases from an extant kinase fold.
Short Promoters in Viral Vectors Drive Selective Expression in Mammalian Inhibitory Neurons, but do not Restrict Activity to Specific Inhibitory Cell-Types
Short cell-type-specific promoter sequences are important for targeted gene therapy and studies of brain circuitry. We report on the ability of short promoter sequences to drive fluorescent protein expression in specific types of mammalian cortical inhibitory neurons using adeno-associated virus (AAV) and lentivirus (LV) vectors. We tested many gene regulatory sequences derived from fugu (Takifugu rubripes), mouse, and human, as well as synthetic composite regulatory elements. All fugu compact promoters expressed in mouse cortex, with only the somatostatin (SST) and the neuropeptide Y (NPY) promoters largely restricting expression to GABAergic neurons. However, these promoters did not control expression in inhibitory cells in a subtype-specific manner. We also tested mammalian promoter sequences derived from genes putatively coexpressed or coregulated within three major inhibitory interneuron classes (PV, SST, VIP). In contrast to the fugu promoters, many of the mammalian sequences failed to express, and only the promoter from gene A930038C07Rik conferred restricted expression, although, as in the case of the fugu sequences, this too was not inhibitory neuron subtype-specific. Lastly, and more promisingly, a synthetic sequence consisting of a composite regulatory element assembled with PAX6 E1.1 binding sites, NRSE, and a minimal CMV promoter showed expression markedly restricted to a small subset of mostly inhibitory neurons whose commonalities are unknown.
Enhanced Sequence Alignment Derived from the Structural Alignment of Kinase Representatives
<p>Enhanced Sequence Alignment Derived from the Structural Alignment of
Kinase Representatives</p>
Views of Structural Representatives from Six Families in the Kinase-Like Superfamily Other Than the TPKs
<p>Structures are shown in an open-face view, and using the same conventions as
used for PKA in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-g001" target="_blank">Figure
1</a>. ATP and metal ions are shown in mirror image where available in
the structure. Similar to <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-g001" target="_blank">Figure 1</a>, secondary structural elements are colored according to
their conservation status in the overall superfamily as follows: yellow,
elements are part of the "universal core" seen in all
kinases in the superfamily; orange, elements are present in more than two,
but not all, of the kinases in the superfamily; red, elements shared between
only two families; purple, elements seen only in this family, but inserted
within the portion of the chain forming the universal core; blue,
elements seen only in this family, and connected to the N- or C-terminal
ends of the universal core. Secondary structural elements are labeled
according to the standard conventions for the individual structure. As in
<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-g001" target="_blank">Figure 1</a>, the
glycine-rich loop is rendered in green and the loop forming the linker
region is rendered in red. For clarity, the conserved residues shown in
<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-g001" target="_blank">Figure 1</a> are not
rendered in these structures, though in most cases they are similar.
Structures shown are as follows: (A) aminoglycoside phosphotransferase
(APH(3′)-IIIa [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b24" target="_blank">24</a>]); (B) CK (CKA-2
[<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b23" target="_blank">23</a>]); (C) ChaK [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b20" target="_blank">20</a>]; (D) PI3K [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b21" target="_blank">21</a>]; (E) AFK
[<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b22" target="_blank">22</a>]; and (F) PIPKIIβ [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b19" target="_blank">19</a>]. Molecular
renderings in this figure were created with MOLSCRIPT [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b90" target="_blank">90</a>].</p>
Conventional Distance-Based Phylogenetic Tree of the Kinase-Like Superfamily, Based Only on the Sequence Alignment from Figure 3
<p>This tree did not explicitly incorporate structural information, and is
provided for purposes of comparison with the Bayesian tree presented in
<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-g004" target="_blank">Figure 4</a>.
Structures are labeled by their PDB IDs, followed by the abbreviated
name of the structure. The AKs are highlighted by orange ovals.
Bootstrap values are provided for major branches. Some branches are too
short for values to fit; these are marked with red letters that
correspond to the following values: a, 199; b, 170; c, 101; d, 141.
Branches highlighted in gray were not supported by bootstrap values
above 500, and should be considered speculative (if based only on this
tree data) [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b57" target="_blank">57</a>,<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0010049#pcbi-0010049-b58" target="_blank">58</a>]. Many of the core relationships within the
superfamily cannot be resolved with confidence using the conventional
sequence-based approach.</p>
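As a small illustration of the support criterion in this caption, the sketch below flags branches whose bootstrap value is not above 500. The four lettered values are copied from the caption. The function name and dictionary layout are hypothetical, and the number of bootstrap replicates is not stated in the caption, so none is assumed here.

```python
# Threshold from the caption: branches need a bootstrap value above 500.
BOOTSTRAP_THRESHOLD = 500

# Values for the short branches marked with red letters in the figure.
short_branch_support = {"a": 199, "b": 170, "c": 101, "d": 141}


def is_speculative(bootstrap_value: int,
                   threshold: int = BOOTSTRAP_THRESHOLD) -> bool:
    """True when a branch is not supported by a value above the threshold."""
    return bootstrap_value <= threshold


speculative = sorted(
    label for label, value in short_branch_support.items()
    if is_speculative(value)
)
```

All four lettered branches fall below the threshold, so by the caption's own criterion each would be treated as speculative when judged on this tree alone.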