81,438 research outputs found

Fast Tree Search for Enumeration of a Lattice Model of Protein Folding

Author: Anfinsen
Camacho
Chan
Chan
Dill
Dill
Dill
Eisenberg
Govindarajan
Helling
Jernigan
Kauzmann
Klimov
Lau
Levitt
Li
Li
Li
Mélin
Miller
Miyazawa
Nozaki
Orland
Pande
Roseman
Sali
Schmalz
Schuster
Shakhnovich
Shih
Shrivastava
Socci
Publication venue: 'AIP Publishing'
Publication date: 26/07/2001

Using a fast tree-searching algorithm and a Pentium cluster, we enumerated all the sequences and compact conformations (structures) for a protein folding model on a cubic lattice of size $4\times3\times3$. We used two types of amino acids, hydrophobic (H) and polar (P), to make up the sequences, so there were $2^{36} \approx 6.87 \times 10^{10}$ different sequences. The total number of distinct structures was 84,731,192. We made use of a simple solvation model in which the energy of a sequence folded into a structure is minus the number of hydrophobic amino acids in the "core" of the structure. For every sequence, we found its ground state or ground states, i.e., the structure or structures for which its energy is lowest. About 0.3% of the sequences have a unique ground state. The number of structures that are unique ground states of at least one sequence is 2,662,050, about 3% of the total number of structures. However, these "designable" structures differ drastically in their designability, defined as the number of sequences whose unique ground state is that structure. To understand this variation in designability, we studied the distribution of structures in a high-dimensional space in which each structure is represented by a string of 1's and 0's, denoting core and surface sites, respectively. Comment: 18 pages, 10 figures
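The solvation model and the definition of designability in this abstract can be illustrated with a minimal sketch. The three 4-site structures below are a hypothetical toy set chosen for illustration, not structures from the paper, and the function names are assumptions; the paper's own enumeration uses a tree search over 36-site chains, which this brute-force loop does not attempt to reproduce.

```python
from collections import Counter
from itertools import product


def energy(sequence, structure):
    """Energy = minus the number of hydrophobic residues (1) on core sites (1)."""
    return -sum(h & c for h, c in zip(sequence, structure))


def designability(structures):
    """For each structure index, count sequences whose unique ground state it is."""
    n_sites = len(structures[0])
    counts = Counter()
    # Brute-force enumeration of all 2**n_sites HP sequences (1 = H, 0 = P).
    for seq in product((0, 1), repeat=n_sites):
        energies = [energy(seq, s) for s in structures]
        e_min = min(energies)
        ground = [i for i, e in enumerate(energies) if e == e_min]
        if len(ground) == 1:  # only unique ground states count
            counts[ground[0]] += 1
    return counts


# Hypothetical toy structures: strings of core (1) and surface (0) sites.
toy_structures = [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)]
result = designability(toy_structures)

# Size of the real sequence space quoted in the abstract.
n_sequences = 2 ** 36
```

In this toy set, 7 of the 16 sequences have a unique ground state, and the middle structure is less designable than the other two, which mirrors in miniature the variation in designability the abstract describes.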

arXiv.org e-Print Archive

CiteSeerX

Crossref

FoldExplorer: Fast and Accurate Protein Structure Search with Sequence-Enhanced Graph Embedding

Author: Liu Yuan
Shen Hong-Bin
Publication date: 29/11/2023

The advent of highly accurate protein structure prediction methods has fueled an exponential expansion of the protein structure database. Consequently, there is a rising demand for rapid and precise structural homolog search. Traditional alignment-based methods are dedicated to precise comparisons between pairs, exhibiting high accuracy. However, their sluggish processing speed is no longer adequate for managing the current massive volume of data. In response to this challenge, we propose a novel deep-learning approach, FoldExplorer. It harnesses the powerful capabilities of graph attention neural networks and protein large language models to process protein structure and sequence data and generate embeddings for protein structures. The structural embeddings can be used for fast and accurate protein search. The embeddings also provide insights into the protein space. FoldExplorer demonstrates a substantial performance improvement of 5% to 8% over the current state-of-the-art algorithm on the benchmark datasets. Meanwhile, FoldExplorer does not compromise on search speed and excels particularly in searching on a large-scale dataset. Comment: 14 pages, 8 figures

arXiv.org e-Print Archive

Functionals linear in curvature and statistics of helical proteins

Author: Ambjörn
Aminov
Arodź
Banavar
Banavar
Barbashov
Barbashov
Barbi
Barbi
Batlle
Braaten
Chan
Chervyakov
Chothia
Curtright
Curtright
David
Dill
Do Carmo
Duggal
Eisenhart
Griffiths
Grundberg
Harris
Hyde
Isberg
Iso
Kamien
Kholodenko
Kholodenko
Kholodenko
Kholodenko
Kleinert
Kleinert
Kratky
Maritan
McCammon
Nesterenko
Nesterenko
Nesterenko
Nesterenko
Nesterenko
Nesterenko
Nesterenko
Nesterenko
O'Neill
Pauling
Pavsic
Pieranski
Pisarski
Plyushchay
Plyushchay
Plyushchay
Plyushchay
Plyushchay
Polyakov
Polyakov
Postnikov
Schweber
Spivak
Stasiak
Publication venue: 'Elsevier BV'
Publication date: 02/10/2003

The effective free energy of a globular protein chain is considered to be a functional defined on smooth curves in three-dimensional Euclidean space. From the requirement of geometrical invariance, together with basic facts on the conformation of helical proteins and the dynamical characteristics of protein chains, we are able to determine, in a unique way, the exact form of the free energy functional. Namely, the free energy density should be a linear function of the curvature of the curves on which the free energy functional is defined. We briefly discuss the possibility of using the proposed model in Monte Carlo simulations that search exhaustively for the native stable state of the protein chain. The relation of this model to rigid relativistic particles and strings is also considered. Comment: 18 pages, LaTeX2e, no figures, no tables; the title is changed slightly, explanations are added concerning the physical content of the approach; the list of references is enlarged
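As a reading aid, a free energy whose density is linear in curvature can be written in the following generic form. The constants $a$ and $b$, the chain length $L$, and the arc-length parameter $s$ are notation assumed here for illustration; the abstract does not give the paper's own symbols or coefficient values.

```latex
F[\mathbf{x}] = \int_0^{L} \bigl( a + b\,k(s) \bigr)\, ds ,
\qquad
k(s) = \left| \frac{d^{2}\mathbf{x}}{ds^{2}} \right|
```

Here $\mathbf{x}(s)$ is the curve parametrized by arc length, so $k(s)$ is its curvature.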

arXiv.org e-Print Archive

Crossref

Joint Institute for Nuclear Research (JINR)

Efficient Algorithms for Node Disjoint Subgraph Homeomorphism Determination

Author: He Zhengying
Wang Wei
Wu Wentao
Xiao Yanghua
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/09/2007

Recently, great effort has been dedicated to research on the management of large-scale graph-based data such as the WWW, social networks, and biological networks. In the study of graph-based data management, the node disjoint subgraph homeomorphism relation between graphs is more suitable than (sub)graph isomorphism in many cases, especially where node skipping and node mismatching are allowed. However, no efficient node disjoint subgraph homeomorphism determination (ndSHD) algorithms have been available. In this paper, we propose two computationally efficient ndSHD algorithms based on state-space search with backtracking, which employ many heuristics to prune the search spaces. Experimental results on synthetic data sets show that the proposed algorithms are efficient, require relatively little time in most of the test cases, can scale to large or dense graphs, and can accommodate more complex fuzzy matching cases. Comment: 15 pages, 11 figures, submitted to DASFAA 200
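To make the problem statement concrete, here is a minimal brute-force sketch of node disjoint subgraph homeomorphism testing with backtracking. It is not either of the paper's two algorithms: the function names, the adjacency-dict graph representation, and the single degree-based pruning rule are all assumptions made for illustration, and the pattern is assumed to be a simple graph (no self-loops or parallel edges).

```python
from itertools import permutations


def simple_paths(adj, src, dst, blocked):
    """Yield simple paths from src to dst whose interior avoids `blocked`."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        for nxt in adj[node]:
            if nxt == dst:
                yield path + [dst]
            elif nxt not in blocked and nxt not in path:
                stack.append((nxt, path + [nxt]))


def route_edges(adj, edges, mapping, used):
    """Backtrack over pattern edges, giving each an internally disjoint path."""
    if not edges:
        return True
    (u, v), rest = edges[0], edges[1:]
    for path in simple_paths(adj, mapping[u], mapping[v], used):
        interior = frozenset(path[1:-1])
        if route_edges(adj, rest, mapping, used | interior):
            return True
    return False  # no path works for this edge: backtrack


def nd_subgraph_homeomorphic(pattern_nodes, pattern_edges, adj):
    """True if the pattern maps into `adj` with node-disjoint edge paths."""
    degree = {p: 0 for p in pattern_nodes}
    for u, v in pattern_edges:
        degree[u] += 1
        degree[v] += 1
    # Try every injective placement of pattern nodes on target nodes.
    for image in permutations(adj, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        # Pruning heuristic (assumed): an image needs at least as many
        # neighbours as its pattern node has incident edges.
        if any(degree[p] > len(adj[mapping[p]]) for p in pattern_nodes):
            continue
        if route_edges(adj, pattern_edges, mapping, frozenset(image)):
            return True
    return False


triangle_nodes = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

in_hexagon = nd_subgraph_homeomorphic(triangle_nodes, triangle_edges, hexagon)
in_path = nd_subgraph_homeomorphic(triangle_nodes, triangle_edges, path4)
```

The triangle matches the hexagon because one pattern edge can stretch over a path of skipped nodes, which is exactly the "node skipping" that (sub)graph isomorphism forbids. The search is exponential in the worst case, so this sketch is only suitable for very small graphs.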

arXiv.org e-Print Archive

CiteSeerX

Kinetics of protein-DNA interaction: facilitated target location in sequence-dependent potential

Author: Abkevich
Akke
Bell
Bell
Berg
Berg
Bouchaud
Bruinsma
Clarke
Derrida
Dyson
Elowitz
Erie
Finkelstein
Gerland
Goldhirsh
Grillo
Gutin
Hopfield
Hughes
Kalodimos
Kalodimos
Kalodimos
Kim
Landau
Levy
Lewis
Lomakin
Luscombe
Mirny
Murthy
O’Neil
Pande
Richter
Riggs
Riggs
Schumacher
Shimamoto
Shoemaker
Slutsky
Spolar
Stormo
Takeda
Uversky
von Hippel
Williams
Winkler
Winter
Wright
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

Recognition and binding of specific sites on DNA by proteins is central for many cellular functions such as transcription, replication, and recombination. In the process of recognition, a protein rapidly searches for its specific site on a long DNA molecule and then strongly binds this site. Here we aim to find a mechanism that can provide both a fast search (1-10 sec) and high stability of the specific protein-DNA complex ($K_d = 10^{-15}$ to $10^{-8}$ M). Earlier studies have suggested that rapid search involves the sliding of a protein along the DNA. Here we consider sliding as a one-dimensional (1D) diffusion in a sequence-dependent rough energy landscape. We demonstrate that, in spite of the landscape's roughness, rapid search can be achieved if 1D sliding is accompanied by 3D diffusion. We estimate the range of the specific and non-specific DNA-binding energy required for rapid search and suggest experiments that can test our mechanism. We show that optimal search requires a protein to spend half of the time sliding along the DNA and half diffusing in 3D. We also establish that, paradoxically, realistic energy functions cannot provide both rapid search and strong binding of a rigid protein. To reconcile these two fundamental requirements, we propose a search-and-fold mechanism that involves the coupling of protein binding and partial protein folding. The proposed mechanism has several important biological implications for search in the presence of other proteins and nucleosomes, simultaneous search by several proteins, etc. The proposed mechanism also provides a new framework for interpretation of experimental and structural data on protein-DNA interactions.
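The claim that the optimal search splits time equally between sliding and 3D diffusion can be checked numerically under a simple scaling assumption. The formula below is not stated in the abstract: it assumes that the search alternates rounds of 1D sliding (mean duration `tau_1d`) and 3D diffusion (mean duration `tau_3d`), and that the number of sites scanned per sliding round grows diffusively, as the square root of `tau_1d`. The function names and parameter values are hypothetical.

```python
import math


def search_time(tau_1d, tau_3d, genome_sites=1e6, d_1d=1.0):
    """Mean search time under the assumed round-based scaling model."""
    # Assumption: diffusive sliding scans ~sqrt(d_1d * tau_1d) sites per round.
    sites_per_round = math.sqrt(d_1d * tau_1d)
    rounds = genome_sites / sites_per_round
    return rounds * (tau_1d + tau_3d)


def best_sliding_time(tau_3d, grid):
    """Grid value of tau_1d that minimizes the search time."""
    return min(grid, key=lambda tau: search_time(tau, tau_3d))


tau_3d = 2.0
grid = [k / 100 for k in range(1, 1001)]  # 0.01 ... 10.0
tau_opt = best_sliding_time(tau_3d, grid)
```

Under this assumption the search time is proportional to `(tau_1d + tau_3d) / sqrt(tau_1d)`, whose derivative vanishes at `tau_1d = tau_3d`, so the grid minimum lands at equal sliding and 3D times.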

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

PubMed Central