Integration of molecular network data reconstructs Gene Ontology.
Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data.
Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker's yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling.
Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
Protein folding tames chaos
Protein folding produces characteristic and functional three-dimensional
structures from unfolded polypeptides or disordered coils. The emergence of
extraordinary complexity in the protein folding process poses astonishing
challenges to theoretical modeling and computer simulations. The present work
introduces molecular nonlinear dynamics (MND), or molecular chaotic dynamics,
as a theoretical framework for describing and analyzing protein folding. We
unveil the existence of intrinsically low dimensional manifolds (ILDMs) in the
chaotic dynamics of folded proteins. Additionally, we reveal that the
transition from disordered to ordered conformations in protein folding
increases the transverse stability of the ILDM. Stated differently, protein
folding reduces the chaoticity of the nonlinear dynamical system, and a folded
protein has the best ability to tame chaos. Additionally, we bring to light the
connection between the ILDM stability and the thermodynamic stability, which
enables us to quantify the disorderliness and relative energies of folded,
misfolded and unfolded protein states. Finally, we exploit chaos for protein
flexibility analysis and develop a robust chaotic algorithm for the prediction
of Debye-Waller factors, or temperature factors, of protein structures.
PTOMSM: A modified version of Topological Overlap Measure used for predicting Protein-Protein Interaction Network
A variety of methods have been developed to integrate diverse biological data in order to predict novel interactions between proteins. However, traditional integration can only generate protein interaction pairs within existing relationships. Therefore, we propose a modified version of the Topological Overlap Measure to identify not only extant direct PPI links, but also novel protein interactions that can be indirectly inferred from various relationships between proteins. Our method is more powerful than a naïve Bayesian-network-based integration in PPI prediction, and could generate more reliable candidate PPIs. Furthermore, we examined the influence of the sizes of training and test datasets on prediction, and further demonstrated the effectiveness of PTOMSM in predicting PPIs. More importantly, this method can be extended naturally to predict other types of biological networks, and may be combined with a Bayesian method to further improve the prediction.
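To make the idea of inferring indirect links concrete, the following is a minimal sketch of the standard (unmodified) topological overlap measure on an unweighted network. It is not the PTOMSM modification, whose details the abstract does not give. The function name `topological_overlap` and the toy network are illustrative assumptions.

```python
def topological_overlap(adj):
    """Standard topological overlap for a symmetric 0/1 adjacency matrix
    with a zero diagonal (assumed input format, not taken from the paper).

    TOM(i, j) = (shared_neighbours(i, j) + a_ij) / (min(k_i, k_j) + 1 - a_ij)
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                tom[i][j] = 1.0  # a node overlaps fully with itself
                continue
            # Count neighbours common to i and j, excluding i and j themselves.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            tom[i][j] = (shared + adj[i][j]) / (
                min(degree[i], degree[j]) + 1 - adj[i][j])
    return tom


# Toy network (hypothetical): proteins 0 and 1 do not interact directly,
# but both interact with proteins 2 and 3.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
scores = topological_overlap(toy)
```

In this toy network the unlinked pair (0, 1) scores 2/3, higher than the directly linked pair (0, 2) at 1/2, which illustrates how an overlap-based score can propose interactions that are absent from the input links.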
On the optimal contact potential of proteins
We analytically derive the lower bound of the total conformational energy of
a protein structure by assuming that the total conformational energy is well
approximated by the sum of sequence-dependent pairwise contact energies. The
condition for the native structure achieving the lower bound leads to the
contact energy matrix that is a scalar multiple of the native contact matrix,
i.e., the so-called Go potential. We also derive spectral relations between
contact matrix and energy matrix, and approximations related to one-dimensional
protein structures. Implications for protein structure prediction are
discussed.
Comment: 5 pages, text onl
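As an illustration of a contact energy matrix that is a scalar multiple of the native contact matrix, the sketch below evaluates a Go-type energy for two toy conformations. The cutoff, the exclusion of chain neighbours, the energy scale `eps`, and the four-bead coordinates are all assumptions made for the example, not values from the paper.

```python
import math
from itertools import combinations


def contact_matrix(coords, cutoff):
    """0/1 contact matrix: residues i and j are in contact when they are at
    least two positions apart along the chain (an assumed convention) and
    closer than `cutoff`."""
    n = len(coords)
    c = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if j - i >= 2 and math.dist(coords[i], coords[j]) <= cutoff:
            c[i][j] = c[j][i] = 1
    return c


def go_energy(native, conf, eps=1.0):
    """Pairwise contact energy with energy matrix E = -eps * native,
    i.e. only contacts that are also present in the native structure count."""
    n = len(native)
    return sum(-eps * native[i][j] * conf[i][j]
               for i in range(n) for j in range(i + 1, n))


# Hypothetical four-bead chain: a folded square versus an extended line.
folded = [(0, 0), (1, 0), (1, 1), (0, 1)]
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]

native = contact_matrix(folded, cutoff=1.1)
e_folded = go_energy(native, contact_matrix(folded, cutoff=1.1))
e_extended = go_energy(native, contact_matrix(extended, cutoff=1.1))
```

With this energy matrix, no conformation can score lower than the native one, because every native contact contributes `-eps` and non-native contacts contribute nothing.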
Multiscale virtual particle based elastic network model (MVP-ENM) for biomolecular normal mode analysis
In this paper, a multiscale virtual particle based elastic network model
(MVP-ENM) is proposed for biomolecular normal mode analysis. The multiscale
virtual particle model is proposed for the discretization of biomolecular
density data in different scales. Essentially, the model works as the
coarse-graining of the biomolecular structure, so that a delicate balance
between biomolecular geometric representation and computational cost can be
achieved. To form "connections" between these multiscale virtual particles, a
new harmonic potential function, which considers the influence from both mass
distributions and distance relations, is adopted between any two virtual
particles. Unlike the previous ENMs that use a constant spring constant, a
particle-dependent spring parameter is used in MVP-ENM. Two independent models,
i.e., multiscale virtual particle based Gaussian network model (MVP-GNM) and
multiscale virtual particle based anisotropic network model (MVP-ANM), are
proposed. Even with a rather coarse grid and a low resolution, the MVP-GNM is
able to predict the Debye-Waller factors (B-factors) with considerably good
accuracy. Similar properties have also been observed in MVP-ANM. More
importantly, in B-factor predictions, the mismatch between the predicted
results and experimental ones is predominantly from higher fluctuation regions.
Further, it is found that MVP-ANM can deliver very consistent low-frequency
eigenmodes at various scales. This demonstrates the great potential of MVP-ANM
in the deformation analysis of low resolution data. With the multiscale
rigidity function, the MVP-ENM can be applied to biomolecular data represented
in density distribution and atomic coordinates. Further, the great advantage of
my MVP-ENM model in computational cost has been demonstrated by using two
poliovirus structures. Finally, the paper ends with a conclusion.
Comment: 15 figures; 25 page
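For orientation, here is a minimal sketch of the conventional Gaussian network model that the abstract contrasts with, using a single constant spring and a distance cutoff rather than the particle-dependent spring parameter of MVP-GNM. Predicted B-factors are proportional to the diagonal of the pseudo-inverse of the Kirchhoff (connectivity) matrix. The function names, the cutoff, and the three-bead chain are assumptions for illustration, and the code assumes a connected contact network.

```python
import math


def kirchhoff(coords, cutoff):
    """Kirchhoff matrix: -1 for each pair within `cutoff`, degree on the diagonal."""
    n = len(coords)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                g[i][j] = g[j][i] = -1.0
                g[i][i] += 1.0
                g[j][j] += 1.0
    return g


def invert(m):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices only)."""
    n = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; is the contact network connected?")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def gnm_fluctuations(coords, cutoff):
    """Diagonal of the Kirchhoff pseudo-inverse (relative B-factors).

    For a connected network the pseudo-inverse equals
    inverse(K + J/n) - J/n, where J is the all-ones matrix.
    """
    n = len(coords)
    g = kirchhoff(coords, cutoff)
    shifted = [[g[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    return [inv[i][i] - 1.0 / n for i in range(n)]


# Hypothetical three-bead chain: the end beads have one contact each.
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
relative_b = gnm_fluctuations(chain, cutoff=1.5)
```

In this toy chain the two end beads receive larger relative fluctuations than the middle bead, which matches the intuition that less-connected regions are more flexible. Converting to absolute B-factors requires a scale factor that is usually fitted to experiment and is left out here.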