24 research outputs found

    Exploiting physico-chemical properties in string kernels

    <p>Abstract</p> <p>Background</p> <p>String kernels are commonly used for the classification of biological sequences, both nucleotide and amino acid sequences. Although string kernels are already very powerful, they have a major shortcoming when applied to amino acids: they ignore the physico-chemical properties of the residues being compared, such as size, hydrophobicity, or charge. This information is very valuable, especially when training data is scarce. So far, only very few approaches have aimed at combining these two ideas.</p> <p>Results</p> <p>We propose new string kernels that combine the benefits of physico-chemical descriptors for amino acids with those of string kernels. The benefits of the proposed kernels are assessed on two problems: MHC-peptide binding classification using position-specific kernels, and protein classification based on the substring spectrum of the sequences. Our experiments demonstrate that incorporating amino acid properties into string kernels yields improved performance compared to standard string kernels and to previously proposed non-substring kernels.</p> <p>Conclusions</p> <p>In summary, the proposed modifications, in particular the combination with the RBF substring kernel, consistently yield improvements without affecting the computational complexity. The proposed kernels therefore appear to be the kernels of choice for any protein sequence-based inference.</p> <p>Availability</p> <p>Data sets, code and additional information are available from <url>http://www.fml.tuebingen.mpg.de/raetsch/suppl/aask</url>. Implementations of the developed kernels are available as part of the Shogun toolbox.</p>
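    The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the Shogun API: it contrasts a plain k-mer spectrum kernel, which only counts exact substring matches, with a variant that compares every pair of k-mers through an RBF kernel on per-residue property vectors. The `PROPS` table (a hydropathy-like value and a charge for five residues), the function names, and the default `sigma` are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical two-dimensional descriptors (hydropathy-like value, charge).
# Illustrative values for a five-letter toy alphabet, not taken from the paper.
PROPS = {
    "A": (1.8, 0.0),
    "L": (3.8, 0.0),
    "I": (4.5, 0.0),
    "K": (-3.9, 1.0),
    "D": (-3.5, -1.0),
}


def kmer_counts(seq, k):
    """Count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=2):
    """Standard spectrum kernel: only identical k-mers contribute."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(cs[u] * ct[u] for u in cs)


def rbf_substring(u, v, sigma=1.0):
    """RBF similarity between two equal-length k-mers in property space."""
    d2 = sum(
        (a - b) ** 2
        for x, y in zip(u, v)
        for a, b in zip(PROPS[x], PROPS[y])
    )
    return math.exp(-d2 / (2.0 * sigma ** 2))


def property_spectrum_kernel(s, t, k=2, sigma=1.0):
    """Spectrum kernel in which non-identical k-mers contribute
    according to their physico-chemical similarity."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(
        cs[u] * ct[v] * rbf_substring(u, v, sigma)
        for u in cs
        for v in ct
    )


if __name__ == "__main__":
    print(spectrum_kernel("ALKA", "ALKD"))
    print(property_spectrum_kernel("ALKA", "ALKD"))
```

    In this sketch the pairwise comparison is quadratic in the number of distinct k-mers, which is a simplification; the abstract states that the proposed kernels do not change the computational complexity of standard string kernels.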

    PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions

    BACKGROUND: Many different aspects of cellular signalling, trafficking and targeting mechanisms are mediated by interactions between proteins and peptides. Representative examples are MHC-peptide complexes in the immune system. Developing computational methods for protein-peptide binding prediction is therefore an important task with applications to vaccine and drug design. METHODS: Previous learning approaches address the binding prediction problem using traditional margin-based binary classifiers. In this paper we propose PepDist: a novel approach for predicting binding affinity. Our approach is based on learning peptide-peptide distance functions. Moreover, we suggest learning a single peptide-peptide distance function over an entire family of proteins (e.g. MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. In order to learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1,2], which is a semi-supervised distance learning algorithm. RESULTS: We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our approach is that it can also learn an affinity function over proteins for which only small amounts of labeled peptides exist. In these cases, our method's performance gain over other computational methods is even more pronounced. We have recently uploaded the PepDist webserver, which provides binding prediction of peptides to 35 different MHC class I alleles. The webserver is powered by a prediction engine that was trained using the framework presented in this paper. CONCLUSION: The results obtained suggest that learning a single distance function over an entire family of proteins achieves higher prediction accuracy than learning a set of binary classifiers for each of the proteins separately. We also show the importance of obtaining information on experimentally determined non-binders. Learning with real non-binders generalizes better than learning with randomly generated peptides that are assumed to be non-binders. This suggests that information about non-binding peptides should also be published and made publicly available.

    A Search for Energy Minimized Sequences of Proteins

    In this paper, we present numerical evidence that supports the notion of energy minimization in the sequence space of proteins for a target conformation. We use the conformations of real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use an edge-weighted connectivity graph to rank the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from the PDB, and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to the Basic Local Alignment Search Tool (BLAST), we obtained some sequences from the non-redundant protein sequence database that are similar to ours with an E-value of the order of 10^-7. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason could be either that the existing energy matrices cannot accurately represent the inter-residue interactions in the context of the protein environment, or that Nature does not push the optimization in the sequence space once the protein is able to perform its function.
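    The notion of minimizing energy over sequences for a fixed conformation can be sketched on a toy problem. The example below is an illustration only: it scores a sequence by summing pairwise energies over the contacts of a fixed conformation, and finds the minimum by exhaustive search over a two-letter reduced alphabet, which is a stand-in for the graph-based ranking and continuous optimization described in the abstract. The contact list `CONTACTS`, the matrix `ENERGY`, and both function names are hypothetical.

```python
from itertools import product

# Hypothetical contact map of a 6-residue toy conformation:
# each pair (i, j) marks two residue sites that are in contact.
CONTACTS = [(0, 3), (1, 4), (2, 5), (0, 5)]

# Hypothetical inter-residue energy matrix over a reduced H/P alphabet.
ENERGY = {
    ("H", "H"): -1.0,
    ("H", "P"): 0.0,
    ("P", "H"): 0.0,
    ("P", "P"): 0.0,
}


def sequence_energy(seq, contacts=CONTACTS, energy=ENERGY):
    """Sum the pairwise energies over all contacts of the conformation."""
    return sum(energy[(seq[i], seq[j])] for i, j in contacts)


def minimize_energy(length, contacts=CONTACTS, energy=ENERGY, alphabet="HP"):
    """Exhaustively search all sequences of the given length.

    Feasible only for tiny examples: the search space has
    len(alphabet) ** length sequences.
    """
    best = min(
        product(alphabet, repeat=length),
        key=lambda s: sequence_energy(s, contacts, energy),
    )
    return "".join(best), sequence_energy(best, contacts, energy)


if __name__ == "__main__":
    print(minimize_energy(6))
```

    Exhaustive search scales exponentially with sequence length, which is why a real study on PDB-sized proteins needs bounds and optimization methods rather than enumeration.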

    Evaluation of the Allergenicity Potential of TcPR-10 Protein from Theobroma cacao

    Background: The pathogenesis-related protein PR10 (TcPR-10), obtained from the Theobroma cacao-Moniliophthora perniciosa interaction library, presents antifungal activity against M. perniciosa and acts in vitro as a ribonuclease. However, despite its biotechnological potential, TcPR-10 has a P-loop motif similar to those of some allergenic proteins such as Bet v 1 (Betula verrucosa) and Pru av 1 (Prunus avium). The insertion of mutations in this motif can produce proteins with reduced allergenic power. The objective of the present work was to evaluate the allergenic potential of the wild-type and mutant recombinant TcPR-10 using bioinformatics tools and immunological assays. Methodology/Principal Findings: Mutant substitutions (T10P, I30V, H45S) were inserted in the TcPR-10 gene by site-directed mutagenesis, cloned into pET28a and expressed in Escherichia coli BL21(DE3) cells. Changes in the molecular surface caused by the mutant substitutions were evaluated by comparative protein modeling using the three-dimensional structure of the major cherry allergen, Pru av 1, as a template. The immunological assays were carried out in 8-12 week old female BALB/c mice. The mice were sensitized subcutaneously with the proteins (wild type and mutants) and challenged intranasally to induce allergic airway inflammation. Conclusions/Significance: We showed that the wild-type TcPR-10 protein has allergenic potential, whereas the insertion of mutations produced proteins with reduced capacity for IgE production and cellular infiltration in the lungs. On the other hand, in vitro assays show that the TcPR-10 mutants still present antifungal activity against M. perniciosa and ribonuclease activity against M. perniciosa RNA. In conclusion, the mutant proteins present less allergenic potential than the wild-type TcPR-10, without the loss of interesting biotechnological properties.

    Effects of fiber content and its chemical treatment on the mechanical properties of screw pine fiber reinforced vinyl ester composite

    Natural fiber-reinforced polymer composites have several advantages over traditional composites. The chemical modification of natural fibers helps to develop polymer composites with better mechanical properties. In the present work, mechanical properties such as tensile, flexural, and impact strength of chopped screw pine fiber reinforced vinyl ester composites have been evaluated under untreated and treated conditions for different volume fractions of screw pine fibers. The fibers were treated with a 5% NaOH solution for 1 h at room temperature. The hand lay-up method was used to prepare composite plates at room temperature. The results revealed that the mechanical properties of the composites increased with fiber content up to 35.57 vol% for both the untreated and treated conditions and then dropped. However, the modulus values increased continuously from a fiber content of 8.43 to 45.3 vol%. The critical or optimum fiber content for better mechanical properties was identified as 35.57 vol% for both the untreated and treated conditions. The percentage of improvement at every combination was obtained by comparing the composites prepared with the untreated and treated fibers. The fractured surface of the treated fiber composites was examined by scanning electron microscopy. Moreover, the tensile properties were predicted using the Hirsch and Modified Bowyer and Bader models and compared with experimental values. The predicted results revealed that the Modified Bowyer and Bader model shows better conformity.

    A Screen capture of APDbase showing query and property value deposition interface is shown

    <p><b>Copyright information:</b></p><p>Taken from "APDbase: Amino acid Physico-chemical properties Database"</p><p>Bioinformation 2005;1(1):2-4.</p><p>Published online 12 Mar 2005</p><p>PMCID:PMC1891621.</p> This database can be queried using either an amino acid property keyword or a database index number. A search result for hydrophobicity is shown here as a sample.