The amino acid sequence of chicken histone F3
Histone F3 (III) from chicken erythrocytes was isolated by selective extraction from nucleoprotein with ethanolic-HCl and purified by a single gel filtration step. This protein was found to be homogeneous by the following criteria: gel filtration, electrophoretic mobility, N- and C-terminal amino acid residues, and amino acid analysis. The primary structure of this histone was established without resorting to the use of overlapping sequences. This was achieved with specific chemical cleavages, rather than enzymatic degradations, chosen and applied first to the original protein chain and subsequently to the generated polypeptides, to yield sets of no more than 3 peptides in any single cleavage. The relative positions of the peptides in the protein or polypeptides became evident after comparison of the N- and C-terminal amino acids in the cleavage products and the uncleaved starting material. The simplicity of the peptide mixture after each cleavage, resulting in easy separation of the peptides, together with the highly efficient Edman degradation of automatic sequencing, allowed a rapid and relatively nonlaborious primary structure determination. Finally, the amino acid sequence is compared with those of protamines and other histones. The evolution and the structure of this protein in relation to DNA are briefly considered.
A methodology for determining amino-acid substitution matrices from set covers
We introduce a new methodology for the determination of amino-acid
substitution matrices for use in the alignment of proteins. The new methodology
is based on a pre-existing set cover on the set of residues and on the
undirected graph that describes residue exchangeability given the set cover.
For fixed functional forms indicating how to obtain edge weights from the set
cover and, after that, substitution-matrix elements from weighted distances on
the graph, the resulting substitution matrix can be checked for performance
against some known set of reference alignments and for given gap costs. Finding
the appropriate functional forms and gap costs can then be formulated as an
optimization problem that seeks to maximize the performance of the substitution
matrix on the reference alignment set. We give computational results on the
BAliBASE suite using a genetic algorithm for optimization. Our results indicate
that it is possible to obtain substitution matrices whose performance is either
comparable to or surpasses that of several others, depending on the particular
scenario under consideration.
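The pipeline described in the abstract above (set cover, then exchangeability graph, then weighted distances, then matrix elements) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the toy cover, the edge weight 1/(number of shared sets), and the linear score with parameters `match`, `slope` and `floor` are hypothetical functional forms, not the ones the paper selects by optimization.

```python
from itertools import combinations

def exchange_graph(cover):
    """Connect two residues when they share at least one set of the cover.
    Assumed edge weight: 1 / (number of shared sets), so residues that
    co-occur in more sets are closer."""
    shared = {}
    for group in cover:
        for a, b in combinations(sorted(set(group)), 2):
            shared[(a, b)] = shared.get((a, b), 0) + 1
    return {pair: 1.0 / count for pair, count in shared.items()}

def all_pairs_distances(residues, edges):
    """Floyd-Warshall shortest weighted distances on the undirected graph."""
    inf = float("inf")
    dist = {a: {b: (0.0 if a == b else inf) for b in residues} for a in residues}
    for (a, b), w in edges.items():
        dist[a][b] = dist[b][a] = min(dist[a][b], w)
    for k in residues:
        for i in residues:
            for j in residues:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def substitution_scores(dist, match=4.0, slope=2.0, floor=-4.0):
    """Assumed functional form: score falls linearly with graph distance
    and is clipped at `floor` (also used for unreachable residue pairs)."""
    return {a: {b: max(floor, match - slope * d) for b, d in row.items()}
            for a, row in dist.items()}

# Toy cover over six residues (hypothetical, not taken from the paper).
cover = [{"I", "L", "V"}, {"I", "L", "M"}, {"D", "E"}]
residues = sorted(set().union(*cover))
scores = substitution_scores(all_pairs_distances(residues, exchange_graph(cover)))
```

In the abstract's terms, the choice of functional forms (together with the gap costs, which this sketch does not model) is what the optimization step would tune against reference alignments.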
Multiple sequence alignment based on set covers
We introduce a new heuristic for the multiple alignment of a set of
sequences. The heuristic is based on a set cover of the residue alphabet of the
sequences, and also on the determination of a significant set of blocks
comprising subsequences of the sequences to be aligned. These blocks are
obtained with the aid of a new data structure, called a suffix-set tree, which
is constructed from the input sequences with the guidance of the
residue-alphabet set cover and generalizes the well-known suffix tree of the
sequence set. We provide performance results on selected BAliBASE amino-acid
sequences and compare them with those yielded by some prominent approaches.
A^2-Net: Molecular Structure Estimation from Cryo-EM Density Volumes
Constructing molecular structural models from Cryo-Electron Microscopy
(Cryo-EM) density volumes is the critical last step of structure determination
by Cryo-EM technologies. Methods have evolved from manual construction by
structural biologists to performing 6D translation-rotation searching, which is
extremely compute-intensive. In this paper, we propose a learning-based method
and formulate this problem as a vision-inspired 3D detection and pose
estimation task. We develop a deep learning framework for amino acid
determination in a 3D Cryo-EM density volume. We also design a sequence-guided
Monte Carlo Tree Search (MCTS) to thread over the candidate amino acids to form
the molecular structure. This framework achieves 91% coverage on our newly
proposed dataset and takes only a few minutes for a typical structure with a
thousand amino acids. Our method is hundreds of times faster and several times
more accurate than existing automated solutions without any human intervention. Comment: 8 pages, 5 figures, 4 tables
Structure, kinetic characterization and subcellular localization of the two ribulose 5-phosphate epimerase isoenzymes from Trypanosoma cruzi
The pentose phosphate pathway (PPP) enzyme ribulose-5-phosphate epimerase (RPE) is encoded by two genes in the genome of the Trypanosoma cruzi CL Brener clone: TcRPE1 and TcRPE2. Despite high sequence similarity at the amino acid level, the recombinant isoenzymes show strikingly different kinetics. Whereas TcRPE2 follows typical Michaelian behavior, TcRPE1 shows a complex kinetic pattern, displaying a biphasic curve that suggests the coexistence of at least two kinetically different molecular forms. Regarding the subcellular localization in epimastigotes, whereas TcRPE1 is a cytosolic enzyme, TcRPE2 is localized in glycosomes. To our knowledge, TcRPE2 is the first PPP isoenzyme that is exclusively localized in glycosomes. Over-expression of TcRPE1, but not of TcRPE2, significantly reduces the parasite doubling time in vitro, as compared with wild type epimastigotes. Both TcRPEs represent single domain proteins exhibiting the classical α/β TIM-barrel fold, as expected for enzymes with this activity. With regard to the architecture of the active site, all the amino acid residues important for catalysis, with the exception of M58, are present in both TcRPE models. Superimposing the binding pockets of the two isoenzyme models shows that the residues adopt essentially identical positions in the active site, with a residue-specific RMSD < 2 Å, with the sole exception of S12, which displays a large deviation (residue-specific RMSD: 11.07 Å). Studies on the quaternary arrangement of these isoenzymes reveal that both are present in a mixture of various oligomeric species made up of an even number of molecules, probably pointing to the dimer as their minimal functional unit. This multiplicity of oligomeric species has not been reported for any of the other RPEs studied so far and it might bear implications for the regulation of TcRPE activity, although further investigation will be necessary to unravel the physiological significance of these structural findings.
Affiliation: Gonzalez, Soledad Natalia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
Affiliation: Valsecchi, Wanda Mariela. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Química y Físico-Química Biológicas "Prof. Alejandro C. Paladini". Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica. Instituto de Química y Físico-Química Biológicas; Argentina
Affiliation: Maugeri, Dante. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
Affiliation: Delfino, Jose Maria. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Química y Físico-Química Biológicas "Prof. Alejandro C. Paladini". Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica. Instituto de Química y Físico-Química Biológicas; Argentina
Affiliation: Cazzulo, Juan Jose. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
PRED-CLASS: cascading neural networks for generalized protein classification and genome-wide applications
A cascading system of hierarchical, artificial neural networks (named
PRED-CLASS) is presented for the generalized classification of proteins into
four distinct classes (transmembrane, fibrous, globular, and mixed) from
information solely encoded in their amino acid sequences. The architecture of
the individual component networks is kept very simple, reducing the number of
free parameters (network synaptic weights) for faster training, improved
generalization, and the avoidance of data overfitting. Capturing information
from as few as 50 protein sequences spread among the four target classes (6
transmembrane, 10 fibrous, 13 globular, and 17 mixed), PRED-CLASS was able to
obtain 371 correct predictions out of a set of 387 proteins (success rate
approximately 96%) unambiguously assigned into one of the target classes. The
application of PRED-CLASS to several test sets and complete proteomes of
several organisms demonstrates that such a method could serve as a valuable
tool in the annotation of genomic open reading frames with no functional
assignment or as a preliminary step in fold recognition and ab initio structure
prediction methods. Detailed results obtained for various data sets and
completed genomes, along with a web server running the PRED-CLASS algorithm, can
be accessed over the World Wide Web at http://o2.biol.uoa.gr/PRED-CLAS
Cryo-EM map interpretation and protein model-building using iterative map segmentation.
A procedure for building protein chains into maps produced by single-particle electron cryo-microscopy (cryo-EM) is described. The procedure is similar to the way an experienced structural biologist might analyze a map, focusing first on secondary structure elements such as helices and sheets, then varying the contour level to identify connections between these elements. Since the high density in a map typically follows the main-chain of the protein, the main-chain connection between secondary structure elements can often be identified as the unbranched path between them with the highest minimum value along the path. This chain-tracing procedure is then combined with finding side-chain positions based on the presence of density extending away from the main path of the chain, allowing generation of a Cα model. The Cα model is converted to an all-atom model and is refined against the map. We show that this procedure is as effective as other existing methods for interpretation of cryo-EM maps and that it is considerably faster and produces models with fewer chain breaks than our previous methods that were based on approaches developed for crystallographic maps.
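The criterion in the abstract above, "the path with the highest minimum value along the path", is the classic maximin (widest-path) problem. The sketch below shows one standard way to solve it, a Dijkstra-style search that maximizes the bottleneck instead of minimizing a sum. It is only an illustration under assumptions: the graph representation (`density` and `neighbors` dictionaries keyed by grid-point labels) and the function name are hypothetical, and nothing here is taken from the authors' software.

```python
import heapq

def maximin_path(density, neighbors, start, goal):
    """Return (bottleneck, path) for the path from start to goal whose
    lowest density value is as high as possible, or None if unreachable."""
    best = {start: density[start]}   # best bottleneck found so far per node
    prev = {}                        # back-pointers for path reconstruction
    heap = [(-density[start], start)]  # max-heap via negated bottleneck
    while heap:
        neg_b, node = heapq.heappop(heap)
        b = -neg_b
        if b < best.get(node, float("-inf")):
            continue  # stale heap entry
        if node == goal:
            break
        for nxt in neighbors[node]:
            cand = min(b, density[nxt])  # bottleneck if we step to nxt
            if cand > best.get(nxt, float("-inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (-cand, nxt))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return best[goal], path

# Toy example (made-up values): two routes from A to D; the route through C
# never drops below 0.7, while the route through B dips to 0.2.
density = {"A": 1.0, "B": 0.2, "C": 0.7, "D": 0.9}
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
result = maximin_path(density, neighbors, "A", "D")
```

Because the bottleneck can only stay the same or drop as a path grows, the greedy expansion order is valid here for the same reason it is in ordinary Dijkstra search.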
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is
highly motivated by the necessity to predict and characterize new biological
targets and new drugs. Biological targets are sought in a biological space
designed from the genomic data from Plasmodium falciparum, but using also the
millions of genomic data from other species. Drug candidates are sought in a
chemical space containing the millions of small molecules stored in public and
private chemolibraries. Data management should therefore be as reliable and
versatile as possible. In this context, we examined five aspects of the
organization and mining of malaria genomic and post-genomic data: 1) the
comparison of protein sequences including compositionally atypical malaria
sequences, 2) the high throughput reconstruction of molecular phylogenies, 3)
the representation of biological processes particularly metabolic pathways, 4)
the versatile methods to integrate genomic data, biological representations and
functional profiling obtained from X-omic experiments after drug treatments and
5) the determination and prediction of protein structures and their molecular
docking with drug candidate structures. Progress toward a grid-enabled
chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal
An automatic method for assessing structural importance of amino acid positions
Background: A great deal is known about the qualitative aspects of the sequence-structure relationship, for example that buried residues are usually more conserved between structurally similar homologues, but no attempts have been made to quantitate the relationship between evolutionary conservation at a sequence position and change to global tertiary structure. In this paper we demonstrate that the Spearman correlation between sequence and structural change is suitable for this purpose.
Results:
Buried residues, bends, cysteines, prolines and leucines were significantly more likely to occupy positions highly correlated with structural change than expected by chance. Some buried residues were found to be less informative than expected, particularly residues involved in active sites and the binding of small molecules.
Conclusion:
The correlation-based method generates predictions of structural importance for superfamily positions which agree well with previous results of manual analyses, and may be of use in automated residue annotation pipelines. A Perl script which implements the method is provided.
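For readers unfamiliar with the statistic named in the abstract above, the following is a minimal Python sketch of the Spearman rank correlation (Pearson correlation computed on average ranks, so ties are handled). It is not the Perl script the authors provide; the function names and the toy numbers are hypothetical, and how "sequence change" and "structural change" are actually measured per position is not specified in the abstract.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)

# Made-up example: per-pair sequence change at one alignment position
# versus a global structural-change measure for the same protein pairs.
sequence_change = [0.0, 1.0, 2.0, 3.0]
structural_change = [0.4, 0.9, 1.1, 2.5]
rho = spearman(sequence_change, structural_change)
```

A rank-based measure only asks whether larger sequence change goes with larger structural change, without assuming the relationship is linear.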