
    Development of a suite of bioinformatics tools for the analysis and prediction of membrane protein structure

    This thesis describes the development of a novel approach for predicting the three-dimensional structure of transmembrane regions of membrane proteins directly from amino acid sequence and basic transmembrane region topology. The development followed a knowledge-based approach. Based on determined membrane protein structures, 20x20 association matrices were generated to summarise the distance associations between amino acid side chains on different alpha-helical transmembrane regions of membrane proteins. Using these association matrices, combined with a knowledge-based scale for propensity for residue orientation in transmembrane segments (kPROT) (Pilpel et al., 1999), the software predicts the optimal orientations and associations of transmembrane regions and generates a 3D structural model of a given membrane protein, based on the amino acid sequence composition of its transmembrane regions. During the development, several structural and biostatistical analyses of determined membrane protein structures were undertaken with the aim of ensuring a consistent and reliable association matrix upon which to base the predictions. Evaluation of the model structures obtained for the protein sequences of a dataset of 17 membrane proteins of determined structure, based on cross-validated leave-one-out testing, revealed generally high accuracy of prediction, with over 80% of associations between transmembrane regions being correctly predicted. These results provide a promising basis for future development and refinement of the algorithm, and to this end, work is underway using evolutionary computing approaches. As it stands, the approach offers significant immediate benefit to researchers as a valuable starting point in the prediction of structure for membrane proteins of hitherto unknown structure.
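The 20x20 association matrix described in this abstract can be illustrated with a minimal sketch. The input format (a list of residue pairs with an inter-side-chain distance), the 6.0 angstrom cutoff, and the function name `association_matrix` are assumptions made for illustration; the abstract does not specify how the thesis computed or normalised its matrices.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def association_matrix(contacts, cutoff=6.0):
    """Count residue pairs on different helices that lie within `cutoff` angstroms.

    `contacts` is an iterable of (residue_a, residue_b, distance) tuples.
    This input format and the default cutoff are hypothetical choices for
    the sketch, not values taken from the thesis.
    """
    matrix = [[0] * 20 for _ in range(20)]
    for res_a, res_b, distance in contacts:
        if distance > cutoff:
            continue  # pair is too far apart to count as an association
        i, j = INDEX[res_a], INDEX[res_b]
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror the count so the matrix stays symmetric
    return matrix


# Toy usage with made-up distances between residues on two helices.
example_contacts = [("L", "F", 4.5), ("F", "L", 5.0), ("A", "A", 3.0), ("G", "W", 9.0)]
example_matrix = association_matrix(example_contacts)
```

Raw counts like these would normally be converted to propensities before being used for scoring; that step is omitted here because the abstract gives no detail on it.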

    STING Millennium Suite: integrated software for extensive analyses of 3d structures of proteins and their complexes

    BACKGROUND: Integrating the many aspects of protein/DNA structure analysis is an important requirement for software in structural bioinformatics, yet few packages available on the internet succeed in this respect. What is still missing is publicly available, web-based software for interactive analysis of the sequence/structure/function of proteins and their complexes with DNA and ligands. Some existing packages offer a certain level of integration and analyze several structure-related parameters, but not to the extent users generally demand. RESULTS: We report a new version of the Sting Millennium Suite (SMS), a fully accessible, web-based package for molecular structure and sequence/structure/function analysis that also handles local files at the client end. The new SMS client now also runs on Linux and works with non-public PDB-formatted files (structures not deposited at the RCSB/PDB), eliminating the earlier requirement to register before using SMS components with a user's local files. The new SMS also offers important additions and improvements, such as a link to ProTherm and a significant re-engineering of the SMS component ConSSeq. In addition, we have added 3 new SMS mirror sites, in Argentina, Japan and Spain, to the existing network of global SMS servers. CONCLUSION: SMS is an established software package, and many key database and software servers worldwide either link to or host it. SMS is web-based, publicly available software developed to aid researchers in translating information about the structures of macromolecules into knowledge. It allows a user to interactively analyze molecular structures, cross-referencing the visualized information with correlated information available across the internet. 
SMS is already used as a didactic tool by some universities. SMS analysis is now possible on Linux and requires no registration when using local files.

    Analysis of non-TIR NBS-LRR resistance gene analogs in Musa acuminata Colla: Isolation, RFLP marker development, and physical mapping

    Background: Many commercial banana varieties lack sources of resistance to pests and diseases, as a consequence of sterility and narrow genetic background. Fertile wild relatives, by contrast, possess greater variability and represent potential sources of disease resistance genes (R-genes). The largest known family of plant R-genes encode proteins with nucleotide-binding site (NBS) and C-terminal leucine-rich repeat (LRR) domains. Conserved motifs in such genes in diverse plant species offer a means for isolation of candidate genes in banana which may be involved in plant defence. Results: A computational strategy was developed for unbiased conserved motif discovery in NBS and LRR domains in R-genes and homologues in monocotyledonous plant species. Degenerate PCR primers targeting conserved motifs were tested on the wild cultivar Musa acuminata subsp. burmannicoides, var. Calcutta 4, which is resistant to a number of fungal pathogens and nematodes. One hundred and seventy four resistance gene analogs (RGAs) were amplified and assembled into 52 contiguous sequences. Motifs present were typical of the non-TIR NBS-LRR RGA subfamily. A phylogenetic analysis of deduced amino-acid sequences for 33 RGAs with contiguous open reading frames (ORFs), together with RGAs from Arabidopsis thaliana and Oryza sativa, grouped most Musa RGAs within monocotyledon-specific clades. RFLP-RGA markers were developed, with 12 displaying distinct polymorphisms in parentals and F1 progeny of a diploid M. acuminata mapping population. Eighty eight BAC clones were identified in M. acuminata Calcutta 4, M. acuminata Grande Naine, and M. balbisiana Pisang Klutuk Wulung BAC libraries when hybridized to two RGA probes. 
Multiple copy RGAs were common within BAC clones, potentially representing variation reservoirs for evolution of new R-gene specificities. Conclusion: This is the first large scale analysis of NBS-LRR RGAs in M. acuminata Calcutta 4. Contig sequences were deposited in GenBank and assigned numbers ER935972 – ER936023. RGA sequences and isolated BACs are a valuable resource for R-gene discovery, and in future applications will provide insight into the organization and evolution of NBS-LRR R-genes in the Musa A and B genome. The developed RFLP-RGA markers are applicable for genetic map development and marker assisted selection for defined traits such as pest and disease resistance.

    The genome sequence of Pseudoplusia includens single nucleopolyhedrovirus and an analysis of p26 gene evolution in the baculoviruses

    Background: Pseudoplusia includens single nucleopolyhedrovirus (PsinSNPV-IE) is a baculovirus recently identified in our laboratory, with high pathogenicity to the soybean looper, Chrysodeixis includens (Lepidoptera: Noctuidae) (Walker, 1858). In Brazil, the C. includens caterpillar is an emerging pest and has caused significant losses in soybean and cotton crops. The PsinSNPV genome was determined and the phylogeny of the p26 gene within the family Baculoviridae was investigated. Results: The complete genome of PsinSNPV was sequenced (Roche 454 GS FLX – Titanium platform), annotated and compared with other Alphabaculoviruses, displaying a genome apparently different from other baculoviruses so far sequenced. The circular double stranded DNA genome is 139,132 bp in length, with a GC content of 39.3 % and contains 141 open reading frames (ORFs). PsinSNPV possesses the 37 conserved baculovirus core genes, 102 genes found in other baculoviruses and 2 unique ORFs. Two baculovirus repeat ORFs (bro) homologs, bro-a (Psin33) and bro-b (Psin69), were identified and compared with Chrysodeixis chalcites nucleopolyhedrovirus (ChchNPV) and Trichoplusia ni single nucleopolyhedrovirus (TnSNPV) bro genes and showed high similarity, suggesting that these genes may be derived from an ancestor common to these viruses. The homologous repeats (hrs) are absent from the PsinSNPV genome, which is also the case in ChchNPV and TnSNPV. Two p26 gene homologs (p26a and p26b) were found in the PsinSNPV genome. P26 is thought to be required for optimal virion occlusion in the occlusion bodies (OBs), but its function is not well characterized. The P26 phylogenetic tree suggests that this gene was obtained from three independent acquisition events within the Baculoviridae family. The presence of a signal peptide only in the PsinSNPV p26a/ORF-20 homolog indicates distinct function between the two P26 proteins. 
Conclusions: PsinSNPV has a genomic sequence apparently different from other baculoviruses sequenced so far. The complete genome sequence of PsinSNPV will provide a valuable resource, contributing to studies on its molecular biology and functional genomics, and will promote the development of this virus as an effective bioinsecticide

    Analysis of binding properties and specificity through identification of the interface forming residues (IFR) for serine proteases in silico docked to different inhibitors

    Background: Enzymes belonging to the same super family of proteins in general operate on variety of substrates and are inhibited by wide selection of inhibitors. In this work our main objective was to expand the scope of studies that consider only the catalytic and binding pocket amino acids while analyzing enzyme specificity and instead, include a wider category which we have named the Interface Forming Residues (IFR). We were motivated to identify those amino acids with decreased accessibility to solvent after docking of different types of inhibitors to sub classes of serine proteases and then create a table (matrix) of all amino acid positions at the interface as well as their respective occupancies. Our goal is to establish a platform for analysis of the relationship between IFR characteristics and binding properties/specificity for bi-molecular complexes. Results: We propose a novel method for describing binding properties and delineating serine proteases specificity by compiling an exhaustive table of interface forming residues (IFR) for serine proteases and their inhibitors. Currently, the Protein Data Bank (PDB) does not contain all the data that our analysis would require. Therefore, an in silico approach was designed for building corresponding complexes. The IFRs are obtained by "rigid body docking" among 70 structurally aligned, sequence wise non-redundant, serine protease structures with 3 inhibitors: bovine pancreatic trypsin inhibitor (BPTI), ecotine and ovomucoid third domain inhibitor. The table (matrix) of all amino acid positions at the interface and their respective occupancy is created. We also developed a new computational protocol for predicting IFRs for those complexes which were not deciphered experimentally so far, achieving accuracy of at least 0.97. Conclusions: The serine proteases interfaces prefer polar (including glycine) residues (with some exceptions). 
Charged residues were found to be uniquely prevalent at the interfaces between the "miscellaneous-virus" subfamily and the three inhibitors. This prompts speculation about how important this difference in IFR characteristics is for maintaining virulence of those organisms. Our work here provides a unique tool for both structure/function relationship analysis as well as a compilation of indicators detailing how the specificity of various serine proteases may have been achieved and/or could be altered. It also indicates that the interface forming residues which also determine specificity of serine protease subfamily can not be presented in a canonical way but rather as a matrix of alternative populations of amino acids occupying variety of IFR positions.
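The two steps this abstract describes, selecting residues whose solvent accessibility decreases on docking and tabulating which amino acids occupy each interface position, can be sketched as follows. This is an illustration under stated assumptions, not the authors' protocol: the per-residue accessibility values are taken as precomputed dictionaries, and the function names, the `min_loss` parameter, and the toy numbers are all hypothetical.

```python
from collections import Counter, defaultdict


def interface_residues(asa_free, asa_complex, min_loss=0.0):
    """Return positions whose solvent accessibility drops on complex formation.

    `asa_free` and `asa_complex` map residue position -> accessible surface
    area (square angstroms) for the isolated protease and for the docked
    complex. Both inputs and `min_loss` are assumptions of this sketch.
    """
    return sorted(
        pos
        for pos, free in asa_free.items()
        if free - asa_complex.get(pos, free) > min_loss
    )


def occupancy_table(structures):
    """Tabulate which amino acids occupy each interface position.

    `structures` is an iterable of (residue_by_position, ifr_positions)
    pairs, one per structurally aligned protease.
    """
    table = defaultdict(Counter)
    for residue_by_position, ifr_positions in structures:
        for pos in ifr_positions:
            table[pos][residue_by_position[pos]] += 1
    return table


# Toy usage with invented accessibility values for three aligned positions.
asa_free = {57: 30.0, 102: 12.0, 195: 25.0}
asa_complex = {57: 5.0, 102: 12.0, 195: 0.0}
ifr = interface_residues(asa_free, asa_complex)

table = occupancy_table([
    ({57: "H", 102: "D", 195: "S"}, ifr),
    ({57: "H", 102: "D", 195: "T"}, ifr),
])
```

Keying the table by aligned position rather than by sequence number in each structure reflects the abstract's use of structurally aligned proteases, so that the same column refers to the same spatial site across the set.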

    The Diamond STING server

    Diamond STING is a new version of the STING suite of programs for a comprehensive analysis of a relationship between protein sequence, structure, function and stability. We have added a number of new functionalities by both providing more structure parameters to the STING Database and by improving/expanding the interface for enhanced data handling. The integration among the STING components has also been improved. A new key feature is the ability of the STING server to handle local files containing protein structures (either modeled or not yet deposited to the Protein Data Bank) so that they can be used by the principal STING components: (Java)Protein Dossier ((J)PD) and STING Report. The current capabilities of the new STING version and a couple of biologically relevant applications are described here. We have provided an example where Diamond STING identifies the active site amino acids and folding essential amino acids (both previously determined by experiments) by filtering out all but those residues by selecting the numerical values/ranges for a set of corresponding parameters. This is the fundamental step toward a more interesting endeavor—the prediction of such residues. Diamond STING is freely accessible at and

    Transcriptome and gene expression analysis of three developmental stages of the coffee berry borer, Hypothenemus hampei

    Coffee production is a global industry valued at approximately 173 billion US dollars. One of the main challenges facing coffee production is the management of the coffee berry borer (CBB), Hypothenemus hampei, which is considered the primary arthropod pest of coffee worldwide. Current control strategies are inefficient for CBB management. Although biotechnological alternatives, including RNA interference (RNAi), have been proposed in recent years to control insect pests, characterizing the genetics of the target pest is essential for the successful application of these emerging technologies. In this study, we employed RNA-seq to obtain the transcriptome of three developmental stages of the CBB (larva, female and male) to increase our understanding of the CBB life cycle in relation to molecular features. The CBB transcriptome was sequenced using Illumina Hiseq and assembled de novo. Differential gene expression analysis was performed across the developmental stages. The final assembly produced 29,434 unigenes, of which 4,664 transcripts were differentially expressed. Genes linked to crucial physiological functions, such as digestion and detoxification, were determined to be tightly regulated between the reproductive and nonreproductive stages of CBB. The data obtained in this study help to elucidate the critical roles that several genes play as regulatory elements in CBB development