869 research outputs found
Recoverable One-dimensional Encoding of Three-dimensional Protein Structures
Protein one-dimensional (1D) structures such as secondary structure and
contact number provide intuitive pictures to understand how the native
three-dimensional (3D) structure of a protein is encoded in the amino acid
sequence. However, it has not been clear whether a given set of 1D structures
contains sufficient information for recovering the underlying 3D structure.
Here we show that the 3D structure of a protein can be recovered from a set of
three types of 1D structures, namely, secondary structure, contact number and
residue-wise contact order which is introduced here for the first time. Using
simulated annealing molecular dynamics simulations, the structures satisfying
the given native 1D structural restraints were sought for 16 proteins of
various structural classes and of sizes ranging from 56 to 146 residues. By
selecting the structures best satisfying the restraints, all the proteins
showed a coordinate RMS deviation of less than 4 Å from the native
structure, and for most of them, the deviation was even less than 2 Å. The
present result opens a new possibility for protein structure prediction and our
understanding of the sequence-structure relationship. Comment: Corrected title. No Change In Content
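To make the 1D quantities named in this abstract concrete, the sketch below computes a contact number and a residue-wise contact order for each residue from one set of coordinates per residue. The hard distance cutoff, the minimum sequence separation, the normalization by chain length, and the function name `one_d_profiles` are assumptions chosen for illustration; the abstract does not give the paper's exact definitions.

```python
import math

def one_d_profiles(coords, cutoff=8.0, min_sep=3):
    """Return (contact_number, rwco) lists for a chain of residue coordinates.

    Assumed definitions (illustrative only):
      - residues i and j are in contact if |i - j| >= min_sep and their
        distance is below `cutoff`;
      - contact number = number of contacts of a residue;
      - residue-wise contact order = sum of sequence separations |i - j|
        over the contacts of a residue, divided by the chain length.
    """
    n = len(coords)
    contact_number = [0] * n
    rwco = [0.0] * n
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contact_number[i] += 1
                contact_number[j] += 1
                rwco[i] += (j - i) / n
                rwco[j] += (j - i) / n
    return contact_number, rwco

# Hypothetical six-residue hairpin on a 3.8 Å grid (not real protein data).
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
cn, rwco = one_d_profiles(coords)
print(cn)
print(rwco)
```

In this toy chain the two turn residues have no contacts under the assumed separation rule, while the terminal residues carry the largest residue-wise contact order because their partners are farthest away along the sequence.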
Predicting Secondary Structures, Contact Numbers, and Residue-wise Contact Orders of Native Protein Structure from Amino Acid Sequence by Critical Random Networks
Prediction of one-dimensional protein structures such as secondary structures
and contact numbers is useful for the three-dimensional structure prediction
and important for the understanding of sequence-structure relationship. Here we
present a new machine-learning method, critical random networks (CRNs), for
predicting one-dimensional structures, and apply it, with position-specific
scoring matrices, to the prediction of secondary structures (SS), contact
numbers (CN), and residue-wise contact orders (RWCO). The present method
achieves, on average, an accuracy of 77.8% for SS and correlation coefficients
of 0.726 and 0.601 for CN and RWCO, respectively. The accuracy of the SS
prediction is comparable to other state-of-the-art methods, and that of the CN
prediction is a significant improvement over previous methods. We give a
detailed formulation of the critical random network-based prediction scheme, and
examine the context-dependence of prediction accuracies. In order to study the
nonlinear and multi-body effects, we compare the CRNs-based method with a
purely linear method based on position-specific scoring matrices. Although not
superior to the CRNs-based method, the surprisingly good accuracy achieved by
the linear method highlights the difficulty in extracting structural features
of higher order from amino acid sequence beyond that provided by the
position-specific scoring matrices. Comment: 20 pages, 1 figure, 5 tables; minor revision; accepted for
publication in BIOPHYSIC
Properties of contact matrices induced by pairwise interactions in proteins
The total conformational energy is assumed to consist of pairwise interaction
energies between atoms or residues, each of which is expressed as a product of
a conformation-dependent function (an element of a contact matrix, C-matrix)
and a sequence-dependent energy parameter (an element of a contact energy
matrix, E-matrix). Such pairwise interactions in proteins force native
C-matrices to be in a relationship as if the interactions are a Go-like
potential [N. Go, Annu. Rev. Biophys. Bioeng. 12. 183 (1983)] for the native
C-matrix, because the lowest bound of the total energy function is equal to the
total energy of the native conformation interacting in a Go-like pairwise
potential. This relationship between C- and E-matrices corresponds to (a) a
parallel relationship between the eigenvectors of the C- and E-matrices and a
linear relationship between their eigenvalues, and (b) a parallel relationship
between a contact number vector and the principal eigenvectors of the C- and
E-matrices; the E-matrix is expanded in a series of eigenspaces with an
additional constant term, which corresponds to a threshold of contact energy
that approximately separates native contacts from non-native ones. These
relationships are confirmed in 182 representatives from each family of the SCOP
database by examining inner products between the principal eigenvector of the
C-matrix, that of the E-matrix evaluated with a statistical contact potential,
and a contact number vector. In addition, the spectral representation of C- and
E-matrices reveals that pairwise residue-residue interactions, which depend
only on the types of interacting amino acids but not on other residues in a
protein, are insufficient, and that other interactions including residue
connectivities and steric hindrance are needed to make native structures the
unique lowest energy conformations. Comment: Errata in DOI:10.1103/PhysRevE.77.051910 has been corrected in the
present versio
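The parallel relationship between the contact number vector and the principal eigenvector of the C-matrix can be checked numerically on a toy example. The sketch below builds a binary contact matrix from a hypothetical contact list, finds its principal eigenvector by power iteration, and reports the normalized inner product with the contact number vector. The contact list, the binary form of the C-matrix, and the helper names are assumptions for illustration, not the paper's data or procedure.

```python
import math

def principal_eigenvector(matrix, iterations=2000):
    """Power iteration on (matrix + I).

    Adding the identity leaves the eigenvectors unchanged and makes the
    top eigenvalue strictly dominant for a connected contact graph.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def cosine(a, b):
    """Normalized inner product of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical contact list for a six-residue toy chain.
contacts = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 3), (1, 4), (0, 4)]
n_res = 6
c_matrix = [[0.0] * n_res for _ in range(n_res)]
for i, j in contacts:
    c_matrix[i][j] = c_matrix[j][i] = 1.0

# The contact number of a residue is the corresponding row sum of the C-matrix.
contact_numbers = [sum(row) for row in c_matrix]
v1 = principal_eigenvector(c_matrix)
overlap = cosine(v1, contact_numbers)
print(contact_numbers, overlap)
```

For this toy matrix the overlap is close to 1, which is the kind of near-parallel relationship the abstract describes; real C-matrices would be built from native structures instead.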
Wang-Landau molecular dynamics technique to search for low-energy conformational space of proteins
Multicanonical molecular dynamics (MD) is a powerful technique for sampling
conformations on rugged potential surfaces such as those of proteins. However, it is
notoriously difficult to estimate the multicanonical temperature effectively.
Wang and Landau developed a convenient method for estimating the density of
states based on a multicanonical Monte Carlo method. In their method, the
density of states is calculated autonomously during a simulation. In this paper
we develop a set of techniques to effectively apply the Wang-Landau method to
MD simulations. In the multicanonical MD, the estimation of the derivative of
the density of states is critical. In order to estimate it accurately, we
devise two original improvements. First, the correction for the density of
states is made smooth by using the Gaussian distribution obtained by a short
canonical simulation. Second, an approximation is applied to the derivative,
which is based on the Gaussian distribution and the multiple weighted histogram
technique. A test of this method was performed with small polypeptides,
Met-enkephalin and Trp-cage, and it is demonstrated that Wang-Landau MD is
consistent with replica exchange MD but can sample much larger conformational
space. Comment: 8 pages, 7 figures, accepted for publication in Physical Review
Unique Interplay between Sugar and Lipid in Determining the Antigenic Potency of Bacterial Antigens for NKT Cells
Structural and biophysical studies reveal the induced-fit mechanism underlying the stringent specificity of invariant natural killer T cells for unique glycolipid antigens from the pathogen Streptococcus pneumoniae
Composite structural motifs of binding sites for delineating biological functions of proteins
Most biological processes are described as a series of interactions between
proteins and other molecules, and interactions are in turn described in terms
of atomic structures. To annotate protein functions as sets of interaction
states at atomic resolution, and thereby to better understand the relation
between protein interactions and biological functions, we conducted exhaustive
all-against-all atomic structure comparisons of all known binding sites for
ligands including small molecules, proteins and nucleic acids, and identified
recurring elementary motifs. By integrating the elementary motifs associated
with each subunit, we defined composite motifs which represent
context-dependent combinations of elementary motifs. It is demonstrated that
function similarity can be better inferred from composite motif similarity
compared to the similarity of protein sequences or of individual binding sites.
By integrating the composite motifs associated with each protein function, we
define meta-composite motifs each of which is regarded as a time-independent
diagrammatic representation of a biological process. It is shown that
meta-composite motifs provide richer annotations of biological processes than
sequence clusters. The present results serve as a basis for bridging atomic
structures to higher-order biological phenomena by classification and
integration of binding site structures. Comment: 34 pages, 7 figure
Release of magnesium from vermiculite by acid dissolution
Vermiculite from Paulistânia, State of Piauí, was used to study the release of magnesium by acid dissolution. The material was ground and sieved to separate two fractions: 0.50 to 0.15 mm and < 0.10 mm. Each fraction was divided into three parts, two of which were heated to 550°C and 950°C, respectively, in a muffle furnace for one hour. These vermiculites were treated with concentrated sulfuric acid and concentrated phosphoric acid to evaluate the efficiency of each acid in dissolving vermiculite. The release of magnesium as a function of the quantity of sulfuric acid added, and the amount of calcium carbonate necessary to neutralize the residual acidity of the product, were also investigated. Sulfuric acid was just as effective as phosphoric acid in dissolving the vermiculites and releasing magnesium. The particle size and heat treatment of the vermiculite had no influence on the amount of magnesium released by acid dissolution. The addition of sulfuric acid to vermiculite in equal amounts released more than 80% of the magnesium. The quantity of calcium carbonate necessary to neutralize the residual acidity of the product was about one half the weight of the vermiculite.
Predicting residue-wise contact orders in proteins by support vector regression
BACKGROUND: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. RESULTS: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance: local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55 and a root mean square error (RMSE) of 0.82, based on a well-defined dataset of 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance to a CC of 0.57 and an RMSE of 0.79. In addition, adding the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, which is at least comparable to that of other existing methods. CONCLUSION: The SVR method shows a prediction performance competitive with, or at least comparable to, the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the view that support vector regression is a powerful tool for extracting the protein sequence-structure relationship and for estimating protein structural profiles from amino acid sequences.
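The two figures of merit quoted in this abstract, the Pearson correlation coefficient and the root mean square error between predicted and observed values, can be computed as in the sketch below. The function names and the toy numbers are assumptions for illustration; the abstract does not say whether the RWCO values were normalized before the RMSE was taken.

```python
import math

def pearson_cc(pred, obs):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)

def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))

# Toy values only, not data from the study.
predicted = [1.0, 2.0, 3.0, 4.0]
observed = [2.0, 4.0, 6.0, 8.0]
print(pearson_cc(predicted, observed), rmse(predicted, observed))
```

The toy case shows why both measures are reported together: the predictions are perfectly correlated with the observations yet still carry a sizeable error, because correlation ignores scale and offset.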
Induction of Expandable Tissue-Specific Progenitor Cells from Human Pancreatic Tissue through Transient Expression of Defined Factors
We recently demonstrated the generation of mouse induced tissue-specific stem (iTS) cells through transient overexpression of reprogramming factors combined with tissue-specific selection. Here we induced expandable tissue-specific progenitor (iTP) cells from human pancreatic tissue through transient expression of genes encoding the reprogramming factors OCT4 (octamer-binding transcription factor 4), p53 small hairpin RNA (shRNA), SOX2 (sex-determining region Y-box 2), KLF4 (Kruppel-like factor 4), L-MYC, and LIN28. Transfection of episomal plasmid vectors into human pancreatic tissue efficiently generated iTP cells expressing genetic markers of endoderm and pancreatic progenitors. The iTP cells differentiated into insulin-producing cells more efficiently than human induced pluripotent stem cells (iPSCs). iTP cells continued to proliferate faster than pancreatic tissue cells until days 100–120 (passages 15–20). iTP cells subcutaneously inoculated into immunodeficient mice did not form teratomas. Genomic bisulfite nucleotide sequence analysis demonstrated that the OCT4 and NANOG promoters remained partially methylated in iTP cells. We compared the global gene expression profiles of iPSCs, iTP cells, and pancreatic cells (islets >80%). Microarray analyses revealed that the gene expression profiles of iTP cells were similar, but not identical, to those of iPSCs but different from those of pancreatic cells. The generation of human iTP cells may have important implications for the clinical application of stem/progenitor cells