
    Protein (Multi-)Location Prediction: Using Location Inter-Dependencies in a Probabilistic Framework

    Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict a single location for each protein, on the assumption that proteins localize to only one location. However, it has been shown that proteins localize to multiple locations. While a few recent systems have attempted to predict multiple locations of proteins, they typically treat locations as independent or capture inter-dependencies by treating each location combination present in the training set as an individual location class. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the multiple-location-prediction process, using a collection of Bayesian network classifiers. We evaluate our system on a dataset of single- and multi-localized proteins. Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top-performing system (YLoc+), without restricting predictions to be based only on location combinations present in the training set. Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013).
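
The two baseline encodings this abstract contrasts can be illustrated with a minimal sketch. It does not implement the authors' Bayesian network classifiers; it only shows why treating each location combination as its own class cannot express a combination absent from the training set, while per-location indicators can. The protein identifiers, the toy annotations, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy annotations (protein id -> observed locations); not real data.
TRAIN: Dict[str, Set[str]] = {
    "P1": {"nucleus"},
    "P2": {"cytoplasm"},
    "P3": {"nucleus", "cytoplasm"},
    "P4": {"mitochondrion"},
}


def location_vocabulary(annotations: Dict[str, Set[str]]) -> List[str]:
    """Sorted list of every individual location seen in the annotations."""
    return sorted(set().union(*annotations.values()))


def powerset_classes(annotations: Dict[str, Set[str]]) -> Set[FrozenSet[str]]:
    """Each distinct location combination in the data becomes one class."""
    return {frozenset(locs) for locs in annotations.values()}


def indicator_vector(locations: Set[str], vocabulary: List[str]) -> List[int]:
    """One 0/1 entry per location, as used when locations are treated separately."""
    return [1 if loc in locations else 0 for loc in vocabulary]


def expressible_as_powerset_class(
    candidate: Set[str], annotations: Dict[str, Set[str]]
) -> bool:
    """True only if this exact combination occurred in the training annotations."""
    return frozenset(candidate) in powerset_classes(annotations)


if __name__ == "__main__":
    vocab = location_vocabulary(TRAIN)
    unseen = {"nucleus", "mitochondrion"}  # combination absent from TRAIN
    print(vocab)
    print(indicator_vector(unseen, vocab))
    print(expressible_as_powerset_class(unseen, TRAIN))
```

In this toy setting the unseen combination has a valid indicator vector but no matching combination class, which is the restriction the abstract says its method avoids.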

    Computational and Experimental Approaches to Reveal the Effects of Single Nucleotide Polymorphisms with Respect to Disease Diagnostics

    DNA mutations are the cause of many human diseases, and they are the reason for natural differences among individuals because they affect the structure, function, interactions, and other properties of DNA and expressed proteins. The ability to predict whether a given mutation is disease-causing or harmless is of great importance for the early detection of patients with a high risk of developing a particular disease and would pave the way for personalized medicine and diagnostics. Here we review existing methods and techniques to study and predict the effects of DNA mutations from three different perspectives: in silico, in vitro and in vivo. It is emphasized that the problem is complicated and that successful detection of a pathogenic mutation frequently requires a combination of several methods and a knowledge of the biological phenomena associated with the corresponding macromolecules.

    Development of Tools to Characterize Protein-Protein Interactions

    Protein-protein interactions (PPIs) are crucial to most biological processes and activities. Large-scale PPI screening has been applied to model organisms as well as to human cells. Two approaches have been used extensively in high-throughput PPI studies: (i) the Yeast Two-Hybrid (Y2H) assay (a bottom-up method), and (ii) tandem affinity purification (TAP) (a top-down method). However, a close examination of both techniques revealed issues that limit their effectiveness. Thus, it is important to develop new methods that can bridge the gap between Y2H and TAP. In this thesis, two approaches were developed to meet this need. The first approach was a photoaffinity labeling tool based on a photo-caged reactive intermediate, para-quinone methide (pQM), to study protein-peptide associations. This system was developed and optimized by using the interaction between catPTP1Bm and the EGFR peptide as a test case. Highly specific protein labeling was achieved, and mass spectrometry (MS) was used to identify the crosslinked site on the target protein. Interestingly, two peptides from catPTP1Bm detected by MS were found close to the enzyme-substrate binding interface in the three-dimensional structure of the complex, which demonstrated that this method might be useful for the analysis of protein complex conformation. The second approach, named "PCA plus", took advantage of a technique referred to as the "Protein-fragment Complementation Assay (PCA)". A hydrolysis-deficient mutant β-lactamase (E166N) was used, which enabled labeling of interacting proteins in live cells. With this modification, the PCA plus method achieved live-cell imaging with subcellular resolution. Fluorescence microscopy and flow cytometry analysis demonstrated its potential applications. In addition, a new β-lactamase substrate was developed for the PCA plus method and was applied to enable purification, from living cells, of prey protein interacting with a bait protein. The observed enrichment of interacting partners suggested the system could be used for high-throughput PPI screening. Moreover, this method could also be useful for the characterization of low-affinity and transient PPIs because of its capacity to label interacting proteins inside cells.

    Molecular Science for Drug Development and Biomedicine

    With the avalanche of biological sequences generated in the postgenomic age, molecular science is facing an unprecedented challenge: how to make timely use of the huge amount of data to benefit human beings. Stimulated by this challenge, molecular science has developed rapidly, particularly in the areas associated with drug development and biomedicine, both experimental and theoretical. The current thematic issue was launched with a focus on the topic of “Molecular Science for Drug Development and Biomedicine”, in the hope of stimulating more useful techniques and findings from various approaches of molecular science for drug development and biomedicine.

    PEvoLM: Protein Sequence Evolutionary Information Language Model

    With the exponential increase of the protein sequence databases over time, multiple-sequence alignment (MSA) methods, like PSI-BLAST, perform exhaustive and time-consuming database searches to retrieve evolutionary information. The resulting position-specific scoring matrices (PSSMs) of such search engines represent a crucial input to many machine learning (ML) models in the field of bioinformatics and computational biology. A protein sequence is a collection of contiguous tokens or characters called amino acids (AAs). The analogy to natural language allowed us to exploit the recent advancements in the field of Natural Language Processing (NLP) and therefore transfer NLP state-of-the-art algorithms to bioinformatics. This research presents an Embedding Language Model (ELMo) that converts a protein sequence to a numerical vector representation. The original ELMo trained a 2-layer bidirectional Long Short-Term Memory (LSTM) network following a two-path architecture, one path for the forward and the second for the backward pass. By merging the idea of PSSMs with the concept of transfer learning, this work introduces a novel bidirectional language model (bi-LM) with four times fewer free parameters that uses a single path for both passes. The model was trained simultaneously, as multi-task learning, not only on predicting the next AA but also on the probability distribution of the next AA derived from similar, yet different, sequences as summarized in a PSSM, hence learning evolutionary information of protein sequences as well. The network architecture and the pre-trained model are made available as open source under the permissive MIT license on GitHub at https://github.com/issararab/PEvoLM.
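
The multi-task objective described in this abstract (predict the next amino acid and, at the same time, the PSSM-derived distribution over the next amino acid) could be combined into a single loss roughly as sketched below. The abstract does not state which loss functions or weighting the authors use, so the cross-entropy term, the KL-divergence term, the `profile_weight` parameter, and the assumption of two separate predicted distributions are all illustrative choices, not the PEvoLM implementation.

```python
import math
from typing import Sequence

EPS = 1e-12  # guards against log(0); an illustrative choice


def cross_entropy(pred: Sequence[float], target_index: int) -> float:
    """Negative log-probability assigned to the true next residue."""
    return -math.log(max(pred[target_index], EPS))


def kl_divergence(target: Sequence[float], pred: Sequence[float]) -> float:
    """KL(target || pred); terms with zero target probability contribute nothing."""
    return sum(
        t * math.log(t / max(p, EPS)) for t, p in zip(target, pred) if t > 0.0
    )


def multitask_loss(
    pred_next: Sequence[float],
    true_index: int,
    pred_profile: Sequence[float],
    pssm_profile: Sequence[float],
    profile_weight: float = 1.0,  # hypothetical weighting between the two tasks
) -> float:
    """Sum of the next-residue loss and the weighted profile-matching loss."""
    token_loss = cross_entropy(pred_next, true_index)
    profile_loss = kl_divergence(pssm_profile, pred_profile)
    return token_loss + profile_weight * profile_loss


if __name__ == "__main__":
    # Toy 3-letter alphabet for brevity; real protein models use 20 or more symbols.
    pred_next = [0.5, 0.25, 0.25]
    pssm_profile = [0.2, 0.3, 0.5]
    print(multitask_loss(pred_next, 0, pssm_profile, pssm_profile))
    print(multitask_loss(pred_next, 0, [0.6, 0.2, 0.2], pssm_profile))
```

When the predicted profile matches the PSSM-derived profile exactly, the second term vanishes and only the next-residue loss remains; any mismatch adds a positive penalty.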

    A novel design of multi-epitope based vaccine against Escherichia coli

    Background: Multi-valent vaccines have an advantage over conventional vaccines because of their multi-faceted action targeted at the antigen, thereby raising hope of more sustained action against allergens. Escherichia coli (E. coli) is a bacterium that is commonly found in the gut of humans and warm-blooded animals. An increasing number of outbreaks are associated with the consumption of fruits and vegetables (including sprouts, spinach, lettuce, coleslaw, and salad), where contamination may be due to contact with faeces from domestic or wild animals at some stage during cultivation or handling. Because of the reported increase in resistance to the antibiotics used for Escherichia coli control, an effective vaccine is a prospective alternative of proven interest. Hence, there is a need for a rational, strategic, and efficient vaccine candidate against E. coli, designed with the most current bioinformatics tools. Method: In this study, immunoinformatics tools mined from diverse molecular databases were used to design a novel putative epitope-based oral vaccine against E. coli. The prospective vaccine proteins were carefully screened and validated to achieve a high-throughput three-dimensional protein structure. The eventual prospective vaccine candidate proteins were evaluated for non-allergenicity, antigenicity, solubility, appropriate molecular weight, and isoelectric point. Conclusion: The resultant vaccine candidate could serve as a promising anti-E. coli vaccine candidate. Immunoinformatics is a newer field than pharmacotherapeutics, and this technology should continue to offer an alternative to the long-standing traditional approach to vaccine development.

    Structure of a highly conserved domain of Rock1 required for Shroom-mediated regulation of cell morphology

    Rho-associated coiled coil containing protein kinase (Rho-kinase or Rock) is a well-defined determinant of actin organization and dynamics in most animal cells characterized to date. One of the primary effectors of Rock is non-muscle myosin II. Activation of Rock results in increased contractility of myosin II and subsequent changes in actin architecture and cell morphology. The regulation of Rock is thought to occur via autoinhibition of the kinase domain via intramolecular interactions between the N-terminus and the C-terminus of the kinase. This autoinhibited state can be relieved via proteolytic cleavage, binding of lipids to a Pleckstrin Homology domain near the C-terminus, or binding of GTP-bound RhoA to the central coiled-coil region of Rock. Recent work has identified the Shroom family of proteins as an additional regulator of Rock either at the level of cellular distribution or catalytic activity or both. The Shroom-Rock complex is conserved in most animals and is essential for the formation of the neural tube, eye, and gut in vertebrates. To address the mechanism by which Shroom and Rock interact, we have solved the structure of the coiled-coil region of Rock that binds to Shroom proteins. Consistent with other observations, the Shroom binding domain is a parallel coiled-coil dimer. Using biochemical approaches, we have identified a large patch of residues that contribute to Shrm binding. Their orientation suggests that there may be two independent Shrm binding sites on opposing faces of the coiled-coil region of Rock. Finally, we show that the binding surface is essential for Rock colocalization with Shroom and for Shroom-mediated changes in cell morphology. © 2013 Mohan et al