Dimerization of FIR upon FUSE DNA binding suggests a mechanism of c-myc inhibition
c-myc is essential for cell homeostasis and growth but lethal if improperly regulated. Transcription of this oncogene is governed by the counterbalancing forces of two proteins on TFIIH: the FUSE binding protein (FBP) and the FBP-interacting repressor (FIR). FBP and FIR recognize single-stranded DNA upstream of the P1 promoter, known as FUSE, and influence transcription by oppositely regulating TFIIH at the promoter site. Size exclusion chromatography coupled with light scattering reveals that an FIR dimer binds one molecule of single-stranded DNA. The crystal structure confirms that FIR binds FUSE as a dimer, and only the N-terminal RRM domain participates in nucleic acid recognition. Site-directed mutations of conserved residues in the first RRM domain reduce FIR's affinity for FUSE, while analogous mutations in the second RRM domain either destabilize the protein or have no effect on DNA binding. Oppositely oriented DNA on parallel binding sites of the FIR dimer results in spooling of a single strand of bound DNA, and suggests a mechanism for c-myc transcriptional control.
Phylogeny of Echinoderm Hemoglobins
Recent genomic information has revealed that neuroglobin and cytoglobin are the two principal lineages of vertebrate hemoglobins, with the latter encompassing the familiar myoglobin and α-globin/β-globin tetramer hemoglobin, and several minor groups. In contrast, very little is known about hemoglobins in echinoderms, a phylum of exclusively marine organisms closely related to vertebrates, beyond the presence of coelomic hemoglobins in sea cucumbers and brittle stars. We identified about 50 hemoglobins in sea urchin, starfish and sea cucumber genomes and transcriptomes, and used Bayesian inference to carry out a molecular phylogenetic analysis of their relationship to vertebrate sequences, specifically, to assess the hypothesis that the neuroglobin and cytoglobin lineages are also present in echinoderms. The genome of the sea urchin Strongylocentrotus purpuratus encodes several hemoglobins, including a unique chimeric 14-domain globin, 2 androglobin isoforms and a unique single androglobin domain protein. Other strongylocentrotid genomes appear to have similar repertoires of globin genes. We carried out molecular phylogenetic analyses of 52 hemoglobins identified in sea urchin, brittle star and sea cucumber genomes and transcriptomes, using different multiple sequence alignment methods coupled with Bayesian and maximum likelihood approaches. The results demonstrate that there are two major globin lineages in echinoderms, which are related to the vertebrate neuroglobin and cytoglobin lineages. Furthermore, the brittle star and sea cucumber coelomic hemoglobins appear to have evolved independently from the cytoglobin lineage, similar to the evolution of erythroid oxygen binding globins in cyclostomes and vertebrates. The presence of echinoderm globins related to the vertebrate neuroglobin and cytoglobin lineages suggests that the split between neuroglobins and cytoglobins occurred in the deuterostome ancestor shared by echinoderms and vertebrates.
Towards a comprehensive structural coverage of completed genomes: a structural genomics viewpoint
BACKGROUND: Structural genomics initiatives were established with the aim of solving protein structures on a large scale. For many initiatives, such as the Protein Structure Initiative (PSI), target selection is focussed primarily on structurally characterising protein families which, so far, lack a structural representative. It is therefore of considerable interest to gain insights into the number and distribution of these families, and what efforts may be required to achieve a comprehensive structural coverage across all protein families. RESULTS: In this analysis we have derived a comprehensive domain annotation of the genomes using CATH, Pfam-A and Newfam domain families. We consider what proportions of structurally uncharacterised families are accessible to high-throughput structural genomics pipelines, specifically those targeting families containing multiple prokaryotic orthologues. In measuring the domain coverage of the genomes, we show the benefits of selecting targets from structurally uncharacterised domain families whilst also pursuing additional targets from large structurally characterised protein superfamilies. CONCLUSION: This work suggests that such a combined approach to target selection is essential if structural genomics is to achieve a comprehensive structural coverage of the genomes, leading to greater insights into structure and the mechanisms that underlie protein evolution.
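The notion of domain coverage used in this abstract can be illustrated with a small sketch. The abstract does not give the exact measure, so the code below assumes the simplest one: the fraction of annotated domains whose family already has a structural representative. The function name `domain_coverage`, the family identifiers and the toy annotation are all hypothetical.

```python
from typing import Dict, List, Set


def domain_coverage(genome_domains: Dict[str, List[str]],
                    characterised: Set[str]) -> float:
    """Fraction of annotated domains whose family has a solved structure.

    genome_domains maps a protein id to the domain families annotated on it.
    characterised is the set of families with a structural representative.
    """
    total = 0
    covered = 0
    for families in genome_domains.values():
        for family in families:
            total += 1
            if family in characterised:
                covered += 1
    return covered / total if total else 0.0


# Hypothetical toy annotation: protein id -> domain family ids.
toy_genome = {
    "protA": ["CATH_1", "Pfam_X"],
    "protB": ["Pfam_X", "Newfam_7"],
    "protC": ["CATH_2"],
}
solved = {"CATH_1", "CATH_2"}

# Coverage before and after solving one structure for the family Pfam_X.
baseline = domain_coverage(toy_genome, solved)
after_target = domain_coverage(toy_genome, solved | {"Pfam_X"})
```

In this toy case a single new structure for a family that recurs across proteins raises coverage more than one for a family seen only once, which is the kind of trade-off that target selection has to weigh.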
The Crystal Structure of the Human Co-Chaperone P58IPK
P58IPK is one of the endoplasmic reticulum- (ER-) localised DnaJ (ERdj) proteins which interact with the chaperone BiP, the mammalian ER ortholog of Hsp70, and are thought to contribute to the specificity and regulation of its diverse functions. P58IPK, expression of which is upregulated in response to ER stress, has been suggested to act as a co-chaperone, binding un- or misfolded proteins and delivering them to BiP. In order to give further insights into the functions of P58IPK, and the regulation of BiP by ERdj proteins, we have determined the crystal structure of human P58IPK to 3.0 Å resolution using a combination of molecular replacement and single wavelength anomalous diffraction. The structure shows the human P58IPK monomer to have a very elongated overall shape. In addition to the conserved J domain, P58IPK contains nine N-terminal tetratricopeptide repeat motifs, divided into three subdomains of three motifs each. The J domain is attached to the C-terminal end via a flexible linker, and the structure shows the conserved Hsp70-binding histidine-proline-aspartate (HPD) motif to be situated on the very edge of the elongated protein, 100 Å from the putative binding site for unfolded protein substrates. The residues that comprise the surface surrounding the HPD motif are highly conserved in P58IPK from other organisms but more varied between the human ERdj proteins, supporting the view that their regulation of different BiP functions is facilitated by differences in BiP-binding.
HemeBIND: a novel method for heme binding residue prediction by combining structural and sequence information
BACKGROUND: Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, there is an urgent need to develop a computational method for recognizing these important residues. RESULTS: Here we introduce HemeBIND, an efficient algorithm for predicting heme binding residues by integrating structural and sequence information. We systematically investigated the characteristics of binding interfaces based on a non-redundant dataset of heme-protein complexes. It was found that several sequence and structural attributes such as evolutionary conservation, solvent accessibility, depth and protrusion clearly illustrate the differences between heme binding and non-binding residues. These features can then be used separately or combined to build structure-based classifiers using a support vector machine (SVM). The results showed that the information contained in these features is largely complementary and that their combination achieved the best performance. To further improve the performance, we attempted to develop a post-processing procedure to reduce the number of false positives. In addition, we built a sequence-based classifier based on SVM and sequence profile as an alternative when only sequence information can be used. Finally, we employed a voting method to combine the outputs of the structure-based and sequence-based classifiers, which demonstrated remarkably better performance than either individual classifier alone. CONCLUSIONS: HemeBIND is the first specialized algorithm used to predict binding residues in protein structures for heme ligands. Extensive experiments indicated that both the structure-based and sequence-based methods effectively identified heme binding residues, while the complementary relationship between them can result in a significant improvement in prediction performance. The value of our method is highlighted through the development of the HemeBIND web server, which is freely accessible at http://mleg.cse.sc.edu/hemeBIND/.
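The voting step that combines the two classifiers can be sketched as follows. The abstract does not specify the voting rule, so this is only an illustration under stated assumptions: each classifier is taken to emit one binary call per residue, and the calls are combined by a weighted vote. The function name `vote`, the weights and the threshold are hypothetical and are not taken from HemeBIND.

```python
from typing import List, Sequence


def vote(structure_calls: Sequence[bool],
         sequence_calls: Sequence[bool],
         w_structure: float = 0.5,
         w_sequence: float = 0.5,
         threshold: float = 0.5) -> List[bool]:
    """Combine per-residue binary calls from two classifiers by weighted vote.

    A residue is called heme-binding when the weighted sum of the two calls
    reaches the threshold. Weights and threshold are illustrative defaults.
    """
    if len(structure_calls) != len(sequence_calls):
        raise ValueError("both classifiers must score the same residues")
    combined = []
    for s_call, q_call in zip(structure_calls, sequence_calls):
        score = w_structure * int(s_call) + w_sequence * int(q_call)
        combined.append(score >= threshold)
    return combined


# Hypothetical calls for four residues from each classifier.
structure_based = [True, False, False, True]
sequence_based = [True, True, False, False]

either = vote(structure_based, sequence_based)               # one vote suffices
both = vote(structure_based, sequence_based, threshold=1.0)  # both must agree
```

With equal weights, a threshold of 0.5 accepts a residue flagged by either classifier, while a threshold of 1.0 requires agreement; the former favours sensitivity and the latter favours fewer false positives.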
Of Bits and Bugs — On the Use of Bioinformatics and a Bacterial Crystal Structure to Solve a Eukaryotic Repeat-Protein Structure
Pur-α is a nucleic acid-binding protein involved in cell cycle control, transcription, and neuronal function. Initially no prediction of the three-dimensional structure of Pur-α was possible. However, recently we solved the X-ray structure of Pur-α from the fruit fly Drosophila melanogaster and showed that it contains a so-called PUR domain. Here we explain how we exploited bioinformatics tools in combination with X-ray structure determination of a bacterial homolog to obtain diffracting crystals and the high-resolution structure of Drosophila Pur-α. First, we used sensitive methods for remote-homology detection to find three repetitive regions in Pur-α. We realized that our lack of understanding of how these repeats interact to form a globular domain was a major problem for crystallization and structure determination. With our information on the repeat motifs we then identified a distant bacterial homolog that contains only one repeat. We determined the bacterial crystal structure and found that two of the repeats interact to form a globular domain. Based on this bacterial structure, we calculated a computational model of the eukaryotic protein. The model allowed us to design a crystallizable fragment and to determine the structure of Drosophila Pur-α. Key to success was the fact that single repeats of the bacterial protein self-assembled into a globular domain, instructing us on the number and boundaries of repeats to be included for crystallization trials with the eukaryotic protein. This study demonstrates that the simpler structural domain arrangement of a distant prokaryotic protein can guide the design of eukaryotic crystallization constructs. Since many eukaryotic proteins contain multiple repeats or repeating domains, this approach might be instructive for structural studies of a range of proteins.
Trends in template/fragment-free protein structure prediction
Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.
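The pairing of an energy function with a sampling technique can be illustrated generically. The sketch below is a textbook Metropolis Monte Carlo search over a one-dimensional toy energy; it is not any of the methods reviewed in the article, and the names `metropolis_accept` and `sample`, the toy energy and all parameter values are assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, Tuple


def metropolis_accept(delta_e: float, temperature: float, u: float) -> bool:
    """Metropolis criterion: always accept downhill moves; accept uphill
    moves when the uniform draw u falls below exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return u < math.exp(-delta_e / temperature)


def sample(energy: Callable[[float], float], x0: float, steps: int,
           step_size: float, temperature: float,
           seed: int = 0) -> Tuple[float, float]:
    """Random-walk Metropolis search; returns the lowest-energy state seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        e_candidate = energy(candidate)
        if metropolis_accept(e_candidate - e, temperature, rng.random()):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def toy_energy(x: float) -> float:
    """Toy stand-in for an energy function, with a single minimum at x = 2."""
    return (x - 2.0) ** 2


best_x, best_e = sample(toy_energy, x0=0.0, steps=2000,
                        step_size=0.5, temperature=0.1)
```

In a real structure prediction setting the state would be a full conformation and the energy a physical or knowledge-based function; the acceptance rule is the only part of this sketch that carries over unchanged.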