1,345 research outputs found

    Sequence-based Multiscale Model (SeqMM) for High-throughput chromosome conformation capture (Hi-C) data analysis

    In this paper, I introduce a Sequence-based Multiscale Model (SeqMM) for biomolecular data analysis. Combining it with a spectral graph method, I reveal the essential difference between global scale models and local scale ones in structure clustering, i.e., they optimize different quantities: Euclidean (or spatial) distances versus sequential (or genomic) distances. More specifically, clusters from global scale models optimize Euclidean distance relations. Local scale models, on the other hand, result in clusters that optimize genomic distance relations. For biomolecular data, Euclidean distances and sequential distances are two independent variables, which can never be optimized simultaneously in data clustering. However, the sequence scale in my SeqMM can work as a tuning parameter that balances these two variables and delivers different clusterings depending on the purpose. Further, my SeqMM is used to explore the hierarchical structures of chromosomes. I find that at the global scale, the Fiedler vector from my SeqMM bears a great similarity to the principal vector from principal component analysis, and can be used to study genomic compartments. In TAD analysis, I find that TADs evaluated at different scales are not consistent and vary a lot. Particularly when the sequence scale is small, the calculated TAD boundaries are dramatically different. Even for regions with high contact frequencies, TAD regions show no obvious consistency. However, when the scale value increases further, although TADs are still quite different, TAD boundaries in these high contact frequency regions become more and more consistent. Finally, I find that for a fixed local scale, my method can deliver very robust TAD boundaries across different cluster numbers.
    Comment: 22 pages, 13 figures
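The role of the sequence scale described in this abstract can be illustrated with a minimal sketch. The abstract does not give the SeqMM formulas, so the code below assumes that the sequence scale acts as a simple genomic-distance cutoff on a contact matrix; the names `fiedler_vector`, `toy_contacts`, and `seq_scale` are hypothetical and not taken from the paper.

```python
import numpy as np

def fiedler_vector(contacts, seq_scale=None):
    """Return the Fiedler vector of the graph Laplacian built from `contacts`.

    ASSUMPTION: `seq_scale` keeps only contacts with |i - j| <= seq_scale.
    This is an illustrative stand-in, not the SeqMM definition.
    """
    w = np.asarray(contacts, dtype=float).copy()
    n = w.shape[0]
    np.fill_diagonal(w, 0.0)  # no self-loops
    if seq_scale is not None:
        idx = np.arange(n)
        mask = np.abs(idx[:, None] - idx[None, :]) <= seq_scale
        w = np.where(mask, w, 0.0)  # local scale: drop long-range contacts
    laplacian = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order; index 1 is the Fiedler mode
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1]

def toy_contacts(n=20, split=10, strong=1.0, weak=0.05):
    """Synthetic contact matrix with two strongly self-interacting blocks."""
    w = np.full((n, n), weak)
    w[:split, :split] = strong
    w[split:, split:] = strong
    return w

contacts = toy_contacts()
# Global scale: the sign of the Fiedler vector splits the two blocks.
global_labels = fiedler_vector(contacts) > 0
# Local scale: only near-diagonal contacts contribute to the clustering.
local_vector = fiedler_vector(contacts, seq_scale=3)
```

On this toy matrix the global-scale partition recovers the two blocks, which mirrors the compartment-like behaviour the abstract attributes to the global scale. The local-scale vector depends only on near-diagonal contacts, so its partition follows genomic adjacency instead.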

    Multiscale virtual particle based elastic network model (MVP-ENM) for biomolecular normal mode analysis

    In this paper, a multiscale virtual particle based elastic network model (MVP-ENM) is proposed for biomolecular normal mode analysis. The multiscale virtual particle model is proposed for the discretization of biomolecular density data at different scales. Essentially, the model works as a coarse-graining of the biomolecular structure, so that a delicate balance between biomolecular geometric representation and computational cost can be achieved. To form "connections" between these multiscale virtual particles, a new harmonic potential function, which considers the influence of both mass distributions and distance relations, is adopted between any two virtual particles. Unlike previous ENMs, which use a constant spring constant, MVP-ENM uses a particle-dependent spring parameter. Two independent models, i.e., the multiscale virtual particle based Gaussian network model (MVP-GNM) and the multiscale virtual particle based anisotropic network model (MVP-ANM), are proposed. Even with a rather coarse grid and a low resolution, the MVP-GNM is able to predict the Debye-Waller factors (B-factors) with considerably good accuracy. Similar properties have also been observed in MVP-ANM. More importantly, in B-factor predictions, the mismatch between the predicted results and experimental ones comes predominantly from higher fluctuation regions. Further, it is found that MVP-ANM can deliver very consistent low-frequency eigenmodes at various scales. This demonstrates the great potential of MVP-ANM in the deformation analysis of low resolution data. With the multiscale rigidity function, the MVP-ENM can be applied to biomolecular data represented as density distributions and as atomic coordinates. Further, the great advantage of the MVP-ENM model in computational cost has been demonstrated using two poliovirus structures. Finally, the paper ends with a conclusion.
    Comment: 15 figures; 25 pages
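For context, the constant-spring baseline that this abstract contrasts with MVP-GNM can be sketched as follows. This is the classic Gaussian network model with a uniform distance cutoff, not the particle-dependent MVP-GNM, whose potential function is not given in the abstract. The function name `gnm_bfactors` and the 7.0 cutoff are illustrative assumptions.

```python
import numpy as np

def gnm_bfactors(coords, cutoff=7.0):
    """Classic GNM fluctuation profile, proportional to predicted B-factors.

    ASSUMPTION: uniform spring constant and a hard distance cutoff; this is
    the standard baseline, not the particle-dependent MVP-GNM of the paper.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Kirchhoff (connectivity) matrix: -1 for pairs in contact, degree on diagonal
    kirchhoff = -(dist <= cutoff).astype(float)
    np.fill_diagonal(kirchhoff, 0.0)
    np.fill_diagonal(kirchhoff, -kirchhoff.sum(axis=1))
    # Pseudo-inverse diagonal via eigendecomposition, skipping the zero mode
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    keep = eigvals > 1e-8
    return (eigvecs[:, keep] ** 2 / eigvals[keep]).sum(axis=1)

# Toy example: a straight chain of 12 particles spaced 3.8 units apart,
# so that with cutoff 7.0 each particle contacts only its direct neighbours.
chain = np.column_stack([3.8 * np.arange(12), np.zeros(12), np.zeros(12)])
profile = gnm_bfactors(chain)
```

The returned profile is defined only up to a scale factor, which in practice is fitted against experimental B-factors. On the toy chain the ends fluctuate more than the middle, as expected for weakly constrained termini.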

    Protein folding tames chaos

    Protein folding produces characteristic and functional three-dimensional structures from unfolded polypeptides or disordered coils. The emergence of extraordinary complexity in the protein folding process poses astonishing challenges to theoretical modeling and computer simulations. The present work introduces molecular nonlinear dynamics (MND), or molecular chaotic dynamics, as a theoretical framework for describing and analyzing protein folding. We unveil the existence of intrinsically low-dimensional manifolds (ILDMs) in the chaotic dynamics of folded proteins. Additionally, we reveal that the transition from disordered to ordered conformations in protein folding increases the transverse stability of the ILDM. Stated differently, protein folding reduces the chaoticity of the nonlinear dynamical system, and a folded protein has the best ability to tame chaos. We also bring to light the connection between the ILDM stability and the thermodynamic stability, which enables us to quantify the disorderliness and relative energies of folded, misfolded and unfolded protein states. Finally, we exploit chaos for protein flexibility analysis and develop a robust chaotic algorithm for the prediction of Debye-Waller factors, or temperature factors, of protein structures.