The evaluation of protein folding rate constant is improved by predicting the folding kinetic order with a SVM-based method
Protein folding is a problem of large interest since it concerns the
mechanism by which the genetic information is translated into proteins with
well defined three-dimensional (3D) structures and functions. Recently
theoretical models have been developed to predict the protein folding rate
considering the relationships of the process with topological parameters
derived from the native (atomic-solved) protein structures. Previous works
classified proteins in two different groups exhibiting either a
single-exponential or a multi-exponential folding kinetics. It is well known
that these two classes of proteins are related to different protein structural
features. The increasing number of available experimental kinetic data allows
a machine learning approach to be applied to the problem, in order to
predict the kinetic order of the folding process starting from the experimental
data collected so far. This information can be used to improve the prediction
of the folding rate. In this work first we describe a support vector
machine-based method (SVM-KO) to predict for a given protein the kinetic order
of the folding process. Using this method we can classify correctly 78% of the
folding mechanisms over a set of 63 experimental data. Secondly we focus on the
prediction of the logarithm of the folding rate. This value can be obtained as
a linear regression task with a SVM-based method. In this paper we show that
the linear correlation between predicted and experimental data can improve when the
regression task is computed over two different sets, instead of one, each of
them composed of the proteins with a correctly predicted two-state or
multistate kinetic order. Comment: The paper will be published in WSEAS Transactions on Biology and
Biomedicine
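The idea of running the regression separately for each kinetic class can be illustrated with a minimal sketch. It is not the authors' method: ordinary least squares stands in for the SVM regression, the kinetic-order labels are taken as given rather than predicted, and the feature values and log rates are invented toy numbers. The names `fit_line`, `pearson`, `pooled_predictions` and `split_predictions` are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def pearson(a, b):
    """Pearson linear correlation coefficient between two sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

# Invented toy data: (topological feature, log folding rate, kinetic order).
data = [
    (1.0, 5.0, "two"), (2.0, 4.1, "two"), (3.0, 3.2, "two"), (4.0, 1.9, "two"),
    (1.5, 2.0, "multi"), (2.5, 1.6, "multi"), (3.5, 0.9, "multi"), (4.5, 0.7, "multi"),
]

def pooled_predictions(rows):
    """Fit one line on all proteins and predict each log rate."""
    slope, icpt = fit_line([r[0] for r in rows], [r[1] for r in rows])
    return [slope * r[0] + icpt for r in rows]

def split_predictions(rows):
    """Fit one line per kinetic class and predict each protein with its class model."""
    models = {}
    for label in {r[2] for r in rows}:
        subset = [r for r in rows if r[2] == label]
        models[label] = fit_line([r[0] for r in subset], [r[1] for r in subset])
    return [models[r[2]][0] * r[0] + models[r[2]][1] for r in rows]

observed = [r[1] for r in data]
r_pooled = pearson(pooled_predictions(data), observed)
r_split = pearson(split_predictions(data), observed)
print(f"pooled r = {r_pooled:.3f}, per-class r = {r_split:.3f}")
```

Both fits here are evaluated on the data they were trained on, so the per-class correlation cannot be lower than the pooled one by construction. A realistic comparison would use held-out proteins and predicted, not given, class labels.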
The 4th Bologna Winter School: Hot Topics in Structural Genomics
The 4th Bologna Winter School on Biotechnologies was held on 9–15 February
2003 at the University of Bologna, Italy, with the specific aim of discussing recent
developments in bioinformatics. The school provided an opportunity for students
and scientists to debate current problems in computational biology and possible
solutions. The course, co-supported (as last year) by the European Science Foundation
program on Functional Genomics, focused mainly on hot topics in structural
genomics, including recent CASP and CAPRI results, recent and promising genomewide
predictions, protein–protein and protein–DNA interaction predictions and
genome functional annotation. The topics were organized into four main sections
(http://www.biocomp.unibo.it)
The posterior-Viterbi: a new decoding algorithm for hidden Markov models
Background: Hidden Markov models (HMM) are powerful machine learning tools
successfully applied to problems of computational Molecular Biology. In a
predictive task, the HMM is endowed with a decoding algorithm in order to
assign the most probable state path, and in turn the class labeling, to an
unknown sequence. The Viterbi and the posterior decoding algorithms are the
most common. The former is very efficient when one path dominates, while the
latter, even though it does not guarantee to preserve the automaton grammar, is
more effective when several concurring paths have similar probabilities. A
third good alternative is 1-best, which was shown to perform as well as or better
than Viterbi. Results: In this paper we introduce the posterior-Viterbi (PV), a
new decoding algorithm that combines the posterior and Viterbi algorithms. PV is a two-
step process: first the posterior probability of each state is computed and
then the best posterior allowed path through the model is evaluated by a
Viterbi algorithm.
Conclusions: We show that PV decoding performs better than other algorithms
first on toy models and then on the computational biological problem of the
prediction of the topology of beta-barrel membrane proteins. Comment: 23 pages, 3 figures
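The two-step process described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the model is a plain discrete HMM with an initial distribution and no explicit begin/end states, a transition counts as "allowed" when its probability is greater than zero, and the forward-backward pass is unscaled, so it suits only short sequences. The function names and the toy model are hypothetical.

```python
import math

def posteriors(obs, init, trans, emit):
    """Step 1: posterior P(state at t = k | obs) via unscaled forward-backward."""
    n, T = len(init), len(obs)
    fwd = [[0.0] * n for _ in range(T)]
    bwd = [[1.0] * n for _ in range(T)]
    for k in range(n):
        fwd[0][k] = init[k] * emit[k][obs[0]]
    for t in range(1, T):
        for k in range(n):
            fwd[t][k] = emit[k][obs[t]] * sum(fwd[t - 1][j] * trans[j][k] for j in range(n))
    for t in range(T - 2, -1, -1):
        for k in range(n):
            bwd[t][k] = sum(trans[k][j] * emit[j][obs[t + 1]] * bwd[t + 1][j] for j in range(n))
    total = sum(fwd[T - 1])  # likelihood of the whole observation sequence
    return [[fwd[t][k] * bwd[t][k] / total for k in range(n)] for t in range(T)]

def posterior_viterbi(obs, init, trans, emit):
    """Step 2: allowed path maximising the product of per-position posteriors."""
    post = posteriors(obs, init, trans, emit)
    n, T = len(init), len(obs)
    neg_inf = float("-inf")
    log = lambda p: math.log(p) if p > 0 else neg_inf
    score = [[neg_inf] * n for _ in range(T)]
    back = [[0] * n for _ in range(T)]
    for k in range(n):
        if init[k] > 0:  # only states the model can start in
            score[0][k] = log(post[0][k])
    for t in range(1, T):
        for k in range(n):
            best, arg = neg_inf, 0
            for j in range(n):
                # Transition probabilities are used only as an allowed/forbidden mask.
                if trans[j][k] > 0 and score[t - 1][j] > best:
                    best, arg = score[t - 1][j], j
            score[t][k] = best + log(post[t][k])
            back[t][k] = arg
    state = max(range(n), key=lambda k: score[T - 1][k])
    path = [state]
    for t in range(T - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]

# Hypothetical left-to-right toy model: state 0 cannot jump directly to state 2.
init = [1.0, 0.0, 0.0]
trans = [[0.8, 0.2, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
emit = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
obs = [0, 0, 1, 1, 1]
print(posterior_viterbi(obs, init, trans, emit))
```

Because the second pass scores only allowed transitions, the returned path always respects the model grammar, which plain posterior decoding does not guarantee.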
SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments
Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Similarly to other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, where different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localizations is crucial for large-scale functional studies.
Results: In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments that include inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach, which is a good candidate for integration into more general large-scale annotation pipelines of protein subcellular localization.
NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases
Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions underlying phenotypes, for enlarging the dataset of possibly related genes/proteins and for helping the interpretation and prioritization of newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network-based ones also include, in different ways, physical and functional relationships among different genes or proteins that can be extracted from the available biological networks of interactions.
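The abstract does not say which statistic is used. As an assumption for illustration only, a standard (non-network) enrichment test is often a one-sided hypergeometric test of whether an annotation term is over-represented in the input gene set. The function name and the numbers below are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of genes in the background
    annotated:  background genes carrying the annotation term
    sample:     size of the input gene set
    hits:       input genes carrying the term
    """
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)

# Toy example: 10 background genes, 3 annotated, input set of 2, both annotated.
print(enrichment_p_value(10, 3, 2, 2))
```

In a network-based variant, the set being tested would also reflect interaction partners of the input genes. How that is done differs between methods and is not covered by this sketch.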
In silico evidence of the relationship between miRNAs and siRNAs
Both short interfering RNAs (siRNAs) and microRNAs (miRNAs) mediate the
repression of specific sequences of mRNA through the RNA interference pathway.
In recent years several experiments have supported the hypothesis that siRNAs
and miRNAs may be functionally interchangeable, at least in cultured cells. In
this work we verify that this hypothesis is also supported by computational
evidence. We show that a method specifically trained to predict the activity of
the exogenous siRNAs assigns a high silencing level to experimentally
determined human miRNAs. This result not only supports the idea of siRNAs and
miRNAs equivalence but indicates that it is possible to use computational tools
developed using synthetic small interference RNAs to investigate endogenous
miRNAs. Comment: 8 pages, 2 figures
BUSCA: An integrative web server to predict subcellular localization of proteins
Here, we present BUSCA (http://busca.biocomp.unibo.it), a novel web server that integrates different computational tools for predicting protein subcellular localization. BUSCA combines methods for identifying signal and transit peptides (DeepSig and TPpred3), GPI-anchors (PredGPI) and transmembrane domains (ENSEMBLE3.0 and BetAware) with tools for discriminating subcellular localization of both globular and membrane proteins (BaCelLo, MemLoci and SChloro). Outcomes from the different tools are processed and integrated for annotating subcellular localization of both eukaryotic and bacterial protein sequences. We benchmark BUSCA against protein targets derived from recent CAFA experiments and other specific data sets, reporting state-of-the-art performance. BUSCA scores better than all other evaluated methods on 2732 targets from CAFA2, with an F1 value equal to 0.49, and is among the best methods when predicting targets from CAFA3. We propose BUSCA as an integrated and accurate resource for the annotation of protein subcellular localization.
I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure
I-Mutant2.0 is a support vector machine (SVM)-based tool for the automatic prediction of protein stability changes upon single point mutations. I-Mutant2.0 predictions are performed starting either from the protein structure or, more importantly, from the protein sequence. This latter task, to the best of our knowledge, is addressed here for the first time. The method was trained and tested on a data set derived from ProTherm, which is presently the most comprehensive available database of thermodynamic experimental data of free energy changes of protein stability upon mutation under different conditions. I-Mutant2.0 can be used both as a classifier for predicting the sign of the protein stability change upon mutation and as a regression estimator for predicting the related ΔΔG values. Acting as a classifier, I-Mutant2.0 correctly predicts (with a cross-validation procedure) 80% or 77% of the data set, depending on the usage of structural or sequence information, respectively. When predicting ΔΔG values associated with mutations, the correlation of predicted with expected/experimental values is 0.71 (with a standard error of 1.30 kcal/mol) and 0.62 (with a standard error of 1.45 kcal/mol) when structural or sequence information is used, respectively. Our web interface allows the selection of a predictive mode that depends on the availability of the protein structure and/or sequence. In this latter case, the web server requires only pasting of a protein sequence in a raw format. We therefore introduce I-Mutant2.0 as a unique and valuable helper for protein design, even when the protein structure is not yet known with atomic resolution. Availability:
Large scale analysis of protein stability in OMIM disease related human protein variants
Modern genomic techniques make it possible to associate several Mendelian human diseases with single residue variations in different proteins. Molecular mechanisms explaining the relationship between genotype and phenotype are still under debate. Change of protein stability upon variation appears to be particularly relevant in annotating whether or not a single residue substitution can be associated with a given disease. Thermodynamic properties of human proteins and of their disease-related variants are lacking. In the present work, we take advantage of the available three-dimensional structure of human proteins for predicting the role of disease-related variations in the perturbation of protein stability.
A new decoding algorithm for hidden Markov models improves the prediction of the topology of all-beta membrane proteins
BACKGROUND: Structure prediction of membrane proteins is still a challenging computational problem. Hidden Markov models (HMM) have been successfully applied to the problem of predicting membrane protein topology. In a predictive task, the HMM is endowed with a decoding algorithm in order to assign the most probable state path, and in turn the labels, to an unknown sequence. The Viterbi and the posterior decoding algorithms are the most common. The former is very efficient when one path dominates, while the latter, even though it does not guarantee to preserve the HMM grammar, is more effective when several concurring paths have similar probabilities. A third good alternative is 1-best, which was shown to perform as well as or better than Viterbi. RESULTS: In this paper we introduce the posterior-Viterbi (PV), a new decoding algorithm that combines the posterior and Viterbi algorithms. PV is a two-step process: first the posterior probability of each state is computed and then the best posterior allowed path through the model is evaluated by a Viterbi algorithm. CONCLUSION: We show that PV decoding performs better than other algorithms when tested on the problem of the prediction of the topology of beta-barrel membrane proteins.