37 research outputs found

    MRFalign: Protein Homology Detection through Alignment of Markov Random Fields

    Sequence-based protein homology detection has been extensively studied, and so far the most sensitive method is based upon comparison of protein sequence profiles, which are derived from multiple sequence alignment (MSA) of sequence homologs in a protein family. A sequence profile is usually represented as a position-specific scoring matrix (PSSM) or an HMM (Hidden Markov Model), and accordingly PSSM-PSSM or HMM-HMM comparison is used for homolog detection. This paper presents a new homology detection method, MRFalign, consisting of three key components: 1) a Markov Random Fields (MRF) representation of a protein family; 2) a scoring function measuring similarity of two MRFs; and 3) an efficient ADMM (Alternating Direction Method of Multipliers) algorithm aligning two MRFs. Compared to HMMs, which can only model very short-range residue correlation, MRFs can model long-range residue interaction patterns and thus encode information for the global 3D structure of a protein family. Consequently, MRF-MRF comparison for remote homology detection should be much more sensitive than HMM-HMM or PSSM-PSSM comparison. Experiments confirm that MRFalign outperforms several popular HMM or PSSM-based methods in terms of both alignment accuracy and remote homology detection and that MRFalign works particularly well for mainly beta proteins. For example, tested on the benchmark SCOP40 (8353 proteins) for homology detection, PSSM-PSSM and HMM-HMM succeed on 48% and 52% of proteins, respectively, at superfamily level, and on 15% and 27% of proteins, respectively, at fold level. In contrast, MRFalign succeeds on 57.3% and 42.5% of proteins at superfamily and fold level, respectively. This study implies that long-range residue interaction patterns are very helpful for sequence-based homology detection. The software is available for download at http://raptorx.uchicago.edu/download/. Comment: Accepted by both RECOMB 2014 and PLOS Computational Biology.

    eRepo-ORP: Exploring the Opportunity Space to Combat Orphan Diseases with Existing Drugs

    © 2017. About 7000 rare, or orphan, diseases affect more than 350 million people worldwide. Although these conditions collectively pose significant health care problems, drug companies seldom develop drugs for orphan diseases due to extremely limited individual markets. Consequently, developing new treatments for often life-threatening orphan diseases is primarily contingent on financial incentives from governments, special research grants, and private philanthropy. Computer-aided drug repositioning is a cheaper and faster alternative to traditional drug discovery, offering a promising avenue for orphan drug research. Here, we present eRepo-ORP, a comprehensive resource constructed by a large-scale repositioning of existing drugs to orphan diseases with a collection of structural bioinformatics tools, including eThread, eFindSite, and eMatchSite. Specifically, a systematic exploration of 320,856 possible links between known drugs in DrugBank and orphan proteins obtained from Orphanet reveals as many as 18,145 candidates for repurposing. To illustrate how potential therapeutics for rare diseases can be identified with eRepo-ORP, we discuss the repositioning of a kinase inhibitor for Ras-associated autoimmune leukoproliferative disease. The eRepo-ORP data set is available through the Open Science Framework at https://osf.io/qdjup/

    TMFoldRec: a statistical potential-based transmembrane protein fold recognition tool.

    BACKGROUND: Transmembrane proteins (TMPs) are key components of signal transduction, cell-cell adhesion, and energy and material transport into and out of cells. For a deep understanding of these processes, structure determination of transmembrane proteins is indispensable. However, due to technical difficulties, only a few transmembrane protein structures have been determined experimentally. Large-scale genomic sequencing provides increasing amounts of sequence information on the proteins and whole proteomes of living organisms, which poses a challenge for bioinformatics: how structural information should be gained from a sequence. RESULTS: Here, we present a novel method, TMFoldRec, for fold prediction of membrane segments in transmembrane proteins. TMFoldRec, which is based on statistical potentials, was tested on a benchmark set containing 124 TMP chains from the PDBTM database. Using a 10-fold jackknife method, the native folds were correctly identified in 77% of the cases. This accuracy exceeds that of state-of-the-art methods. In addition, a key feature of the TMFoldRec algorithm is the ability to estimate the reliability of the prediction and to decide, with an accuracy of 70%, whether the obtained lowest-energy structure is the native one. CONCLUSION: These results imply that the membrane-embedded parts of TMPs, rather than the soluble parts, dictate the TM structures. Moreover, predictions with reliability scores make our algorithm applicable for proteome-wide analyses. AVAILABILITY: The program is available upon request for academic use.

    Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning

    Direct prediction of protein structure from sequence is a challenging problem. An effective approach is to break it up into independent sub-problems. These sub-problems, such as prediction of protein secondary structure, can then be solved independently. In a previous study, we found that an iterative use of predicted secondary structure and backbone torsion angles can further improve secondary structure and torsion angle prediction. In this study, we expand the iterative features to include solvent accessible surface area and backbone angles and dihedrals based on Cα atoms. By using a deep learning neural network in three iterations, we achieved 82% accuracy for secondary structure prediction, 0.76 for the correlation coefficient between predicted and actual solvent accessible surface area, 19° and 30° for mean absolute errors of backbone φ and ψ angles, respectively, and 8° and 32° for mean absolute errors of Cα-based θ and τ angles, respectively, for an independent test dataset of 1199 proteins. The accuracy of the method is slightly lower for 72 CASP 11 targets but much higher than those of model structures from current state-of-the-art techniques. This suggests the potentially beneficial use of these predicted properties for model assessment and ranking.

    3D structure of a Brucella melitensis porin: molecular modelling in lipid membranes

    Brucella melitensis is a pathogenic bacterium responsible for brucellosis in mammals and humans. Its outer membrane proteins (Omp) control the diffusion of solutes through the membrane, and they consequently have a crucial role in the design of diagnostics and vaccines. Moreover, such proteins have recently revealed their potential for protein-based biomaterials. In the present contribution, the structure of the B. melitensis porin Omp2a is built using the RaptorX threading method. This is a 16-stranded β-barrel with an α-helix on the third loop folding inside the barrel and forming the constriction zone of the channel, a typical feature of general porins such as PhoE and OmpF. The preferential diffusion of cations over anions experimentally observed in previous studies is evidenced by the presence of distinct clusters of charges in the extracellular loops and in the inner pore. Docking studies support the previously reported hypothesis of Omp2a's ability to aid maltotetraose diffusion. The monomer model is then assembled into a homotrimer, stabilized by the L2 loop involved in most of the interface interactions. The stability of the trimer is evaluated in three bilayers: pure 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC), pure 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoethanolamine (POPE) and a 1:1 mixture of POPC/POPE. All-atom molecular dynamics simulations demonstrate the structural stability of the β-barrel over time, even though a breathing-like motion is observed. Compared to the pure bilayers, the POPC/POPE mixture better preserves the integrity of the protein and its channel. Overall, this work demonstrates the relevance of the Omp2a model and will help to design new therapeutic agents and bioinspired nanomaterials. Peer Reviewed. Postprint (author's final draft).

    In silico proteomic and phylogenetic analysis of the outer membrane protein repertoire of gastric Helicobacter species

    Helicobacter (H.) pylori is an important risk factor for gastric malignancies worldwide. Its outer membrane proteome plays an important role in colonization of the human gastric mucosa. However, in zoonotic non-H. pylori helicobacters (NHPHs) also associated with human gastric disease, the composition of the outer membrane (OM) proteome and its relative contribution to disease remain largely unknown. By means of a comprehensive survey of the diversity and distribution of predicted outer membrane proteins (OMPs) identified in all known gastric Helicobacter species with fully annotated genome sequences, we found genus- and species-specific families known or thought to be implicated in virulence. Hop adhesins, part of the Helicobacter-specific family 13 (Hop, Hor and Hom), were restricted to the gastric species H. pylori, H. cetorum and H. acinonychis. Hof proteins (family 33) were putative adhesins with predicted Occ- or MOMP-family like 18-stranded beta-barrels. They were found to be widespread amongst all gastric Helicobacter species and only sporadically detected in enterohepatic Helicobacter species. The latter are other members of the genus Helicobacter, although ecologically and genetically distinct. LpxR, a lipopolysaccharide remodeling factor, was also detected in all gastric Helicobacter species but was likewise lacking from the enterohepatic species H. cinaedi, H. equorum and H. hepaticus. In conclusion, our systemic survey of Helicobacter OMPs points to species and infection-site specific members that are interesting candidates for future virulence and colonization studies. Peer reviewed.

    Functional distinctions between two isoforms of WNT5A

    WNT5A is a secreted protein ligand with important roles in development, adult tissue homeostasis, and many basic cellular functions. In cancers, aberrant WNT5A signaling is often observed. This study focused on potential functional differences between two isoforms of this protein, on potential signaling differences between these two isoforms, and on differential expression of these isoforms in the context of mouse embryonic development. This study found that neither isoform had an effect on cellular migration or invasion in HCT116, contrary to reports published in the literature. This study also found that neither isoform had a detectable effect on activation of terminal effector molecules in the PCP/CE pathway or in the Wnt/Ca2+ pathway. In a murine model of embryonic development, we observed that WNT5A isoforms were differentially regulated. Overall, the findings of this study suggest that WNT5A encodes functionally distinct protein isoforms which may play a role in enabling the orchestration of a complex series of biological events during embryonic development.