    The amino acid sequence of chicken histone F3

    Histone F3 (III) from chicken erythrocytes was isolated by selective extraction from nucleoprotein with ethanolic HCl and purified by a single gel filtration step. The protein was found to be homogeneous by the following criteria: gel filtration, electrophoretic mobility, N- and C-terminal amino acid residues, and amino acid analysis. The primary structure of this histone was established without resorting to overlapping sequences. This was achieved with specific chemical cleavages, rather than enzymatic degradations, chosen and applied first to the original protein chain and subsequently to the generated polypeptides, so that no single cleavage yielded more than 3 peptides. The relative positions of the peptides in the protein or polypeptides became evident after comparing the N- and C-terminal amino acids of the cleavage products with those of the uncleaved starting material. The simplicity of the peptide mixture after each cleavage, which made the peptides easy to separate, together with the highly efficient Edman degradation of automatic sequencing, allowed a rapid and relatively nonlaborious primary structure determination. Finally, the amino acid sequence is compared with those of protamines and other histones. The evolution and the structure of this protein in relation to DNA are briefly considered.

    A methodology for determining amino-acid substitution matrices from set covers

    We introduce a new methodology for the determination of amino-acid substitution matrices for use in the alignment of proteins. The new methodology is based on a pre-existing set cover on the set of residues and on the undirected graph that describes residue exchangeability given the set cover. For fixed functional forms indicating how to obtain edge weights from the set cover and, after that, substitution-matrix elements from weighted distances on the graph, the resulting substitution matrix can be checked for performance against some known set of reference alignments and for given gap costs. Finding the appropriate functional forms and gap costs can then be formulated as an optimization problem that seeks to maximize the performance of the substitution matrix on the reference alignment set. We give computational results on the BAliBASE suite using a genetic algorithm for optimization. Our results indicate that it is possible to obtain substitution matrices whose performance is either comparable to or surpasses that of several others, depending on the particular scenario under consideration
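The first two steps described in this abstract (a set cover on the residues, then an undirected exchangeability graph derived from it) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy cover, the rule that two residues are adjacent when they share at least one set, and the use of plain hop counts in place of the weighted distances, whose functional forms the abstract leaves to the optimization step.

```python
from collections import deque
from itertools import combinations


def exchange_graph(cover):
    """Build an undirected graph: residues are adjacent if they share a set (assumed rule)."""
    graph = {}
    for subset in cover:
        for residue in subset:
            graph.setdefault(residue, set())
        for a, b in combinations(sorted(subset), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def hop_distances(graph, source):
    """Breadth-first search: number of edges from source to every reachable residue."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical toy cover, not taken from the paper.
toy_cover = [{"I", "L", "V"}, {"V", "A"}, {"D", "E"}]
graph = exchange_graph(toy_cover)
distances_from_I = hop_distances(graph, "I")
```

In this toy cover, A is two hops from I (through V), while D and E are unreachable from I. A substitution-matrix element would then be some function of such distances; that function is not specified in the abstract and is not modeled here.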

    Multiple sequence alignment based on set covers

    We introduce a new heuristic for the multiple alignment of a set of sequences. The heuristic is based on a set cover of the residue alphabet of the sequences, and also on the determination of a significant set of blocks comprising subsequences of the sequences to be aligned. These blocks are obtained with the aid of a new data structure, called a suffix-set tree, which is constructed from the input sequences with the guidance of the residue-alphabet set cover and generalizes the well-known suffix tree of the sequence set. We provide performance results on selected BAliBASE amino-acid sequences and compare them with those yielded by some prominent approaches

    A^2-Net: Molecular Structure Estimation from Cryo-EM Density Volumes

    Constructing molecular structural models from Cryo-Electron Microscopy (Cryo-EM) density volumes is the critical last step of structure determination by Cryo-EM technologies. Methods have evolved from manual construction by structural biologists to 6D translation-rotation searching, which is extremely compute-intensive. In this paper, we propose a learning-based method and formulate this problem as a vision-inspired 3D detection and pose estimation task. We develop a deep learning framework for amino acid determination in a 3D Cryo-EM density volume. We also design a sequence-guided Monte Carlo Tree Search (MCTS) to thread over the candidate amino acids to form the molecular structure. This framework achieves 91% coverage on our newly proposed dataset and takes only a few minutes for a typical structure with a thousand amino acids. Our method is hundreds of times faster and several times more accurate than existing automated solutions without any human intervention. Comment: 8 pages, 5 figures, 4 tables

    Structure, kinetic characterization and subcellular localization of the two ribulose 5-phosphate epimerase isoenzymes from Trypanosoma cruzi

    The pentose phosphate pathway (PPP) enzyme ribulose-5-phosphate epimerase (RPE) is encoded by two genes in the genome of the Trypanosoma cruzi CL Brener clone: TcRPE1 and TcRPE2. Despite high sequence similarity at the amino acid residue level, the recombinant isoenzymes show strikingly different kinetics. Whereas TcRPE2 follows typical Michaelian behavior, TcRPE1 shows a complex kinetic pattern, displaying a biphasic curve that suggests the coexistence of at least two kinetically different molecular forms. Regarding the subcellular localization in epimastigotes, whereas TcRPE1 is a cytosolic enzyme, TcRPE2 is localized in glycosomes. To our knowledge, TcRPE2 is the first PPP isoenzyme that is exclusively localized in glycosomes. Over-expression of TcRPE1, but not of TcRPE2, significantly reduces the parasite doubling time in vitro, as compared with wild type epimastigotes. Both TcRPEs represent single domain proteins exhibiting the classical α/β TIM-barrel fold, as expected for enzymes with this activity. With regard to the architecture of the active site, all the important amino acid residues for catalysis, with the exception of M58, are also present in both TcRPE models. The superimposition of the binding pocket of both isoenzyme models shows that they adopt essentially identical positions in the active site with a residue-specific RMSD < 2 Å, with the sole exception of S12, which displays a large deviation (residue-specific RMSD: 11.07 Å). Studies on the quaternary arrangement of these isoenzymes reveal that both are present in a mixture of various oligomeric species made up of an even number of molecules, probably pointing to the dimer as their minimal functional unit.
    This multiplicity of oligomeric species has not been reported for any of the other RPEs studied so far and it might bear implications for the regulation of TcRPE activity, although further investigation will be necessary to unravel the physiological significance of these structural findings.
    Affiliation: Gonzalez, Soledad Natalia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
    Affiliation: Valsecchi, Wanda Mariela. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Química y Físico-Química Biológicas "Prof. Alejandro C. Paladini". Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica. Instituto de Química y Físico-Química Biológicas; Argentina
    Affiliation: Maugeri, Dante. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
    Affiliation: Delfino, Jose Maria. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Química y Físico-Química Biológicas "Prof. Alejandro C. Paladini". Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica. Instituto de Química y Físico-Química Biológicas; Argentina
    Affiliation: Cazzulo, Juan Jose. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina

    PRED-CLASS: cascading neural networks for generalized protein classification and genome-wide applications

    A cascading system of hierarchical, artificial neural networks (named PRED-CLASS) is presented for the generalized classification of proteins into four distinct classes (transmembrane, fibrous, globular, and mixed) from information solely encoded in their amino acid sequences. The architecture of the individual component networks is kept very simple, reducing the number of free parameters (network synaptic weights) for faster training, improved generalization, and the avoidance of data overfitting. Capturing information from as few as 50 protein sequences spread among the four target classes (6 transmembrane, 10 fibrous, 13 globular, and 17 mixed), PRED-CLASS was able to obtain 371 correct predictions out of a set of 387 proteins (success rate approximately 96%) unambiguously assigned into one of the target classes. The application of PRED-CLASS to several test sets and complete proteomes of several organisms demonstrates that such a method could serve as a valuable tool in the annotation of genomic open reading frames with no functional assignment or as a preliminary step in fold recognition and ab initio structure prediction methods. Detailed results obtained for various data sets and completed genomes, along with a web server running the PRED-CLASS algorithm, can be accessed over the World Wide Web at http://o2.biol.uoa.gr/PRED-CLAS

    Cryo-EM map interpretation and protein model-building using iterative map segmentation.

    A procedure for building protein chains into maps produced by single-particle electron cryo-microscopy (cryo-EM) is described. The procedure is similar to the way an experienced structural biologist might analyze a map, focusing first on secondary structure elements such as helices and sheets, then varying the contour level to identify connections between these elements. Since the high density in a map typically follows the main-chain of the protein, the main-chain connection between secondary structure elements can often be identified as the unbranched path between them with the highest minimum value along the path. This chain-tracing procedure is then combined with finding side-chain positions based on the presence of density extending away from the main path of the chain, allowing generation of a Cα model. The Cα model is converted to an all-atom model and is refined against the map. We show that this procedure is as effective as other existing methods for interpretation of cryo-EM maps and that it is considerably faster and produces models with fewer chain breaks than our previous methods that were based on approaches developed for crystallographic maps
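The chain-tracing criterion in this abstract, the path "with the highest minimum value along the path", is the classic maximin (widest-path) problem. The sketch below is not the authors' procedure: it is a generic Dijkstra-style search on a small 2D grid with 4-connectivity, where the grid, the function name `widest_path`, and the toy density values are all assumptions chosen for illustration. A real map would be a 3D volume.

```python
import heapq


def widest_path(density, start, goal):
    """Return (bottleneck, path) for the grid path whose minimum density is largest."""
    rows, cols = len(density), len(density[0])
    best = {start: density[start[0]][start[1]]}  # best bottleneck found per cell
    parent = {start: None}
    heap = [(-best[start], start)]  # max-heap via negated bottleneck
    while heap:
        neg_b, cell = heapq.heappop(heap)
        b = -neg_b
        if b < best[cell]:
            continue  # stale heap entry
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nb = min(b, density[nr][nc])
                if nb > best.get((nr, nc), float("-inf")):
                    best[(nr, nc)] = nb
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (-nb, (nr, nc)))
    if goal not in best:
        return None
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[goal], path[::-1]


# Hypothetical toy density: high values run down the left column and along the bottom row.
toy_density = [
    [0.9, 0.2, 0.1],
    [0.8, 0.1, 0.7],
    [0.7, 0.6, 0.9],
]
bottleneck, path = widest_path(toy_density, (0, 0), (2, 2))
```

Here the search follows the high-density ridge, and the weakest point on the returned path (0.6) is the contour level at which the two end points would first appear connected.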

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments, and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    An automatic method for assessing structural importance of amino acid positions

    Background: A great deal is known about the qualitative aspects of the sequence-structure relationship, for example that buried residues are usually more conserved between structurally similar homologues, but no attempts have been made to quantitate the relationship between evolutionary conservation at a sequence position and change to global tertiary structure. In this paper we demonstrate that the Spearman correlation between sequence and structural change is suitable for this purpose. Results: Buried residues, bends, cysteines, prolines and leucines were significantly more likely to occupy positions highly correlated with structural change than expected by chance. Some buried residues were found to be less informative than expected, particularly residues involved in active sites and the binding of small molecules. Conclusion: The correlation-based method generates predictions of structural importance for superfamily positions which agree well with previous results of manual analyses, and may be of use in automated residue annotation pipelines. A PERL script which implements the method is provided.
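The statistic named in this abstract, the Spearman correlation, is the Pearson correlation computed on ranks. The sketch below is a plain Python illustration of that statistic only, not the PERL script the abstract mentions. The two input lists are hypothetical: one imagined sequence-change score and one imagined structural-change score per homologue pair at a single alignment position.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(vx * vy)


# Hypothetical scores for five homologue pairs at one position (not data from the paper).
sequence_change = [0.0, 1.0, 2.0, 3.0, 4.0]
structural_change = [0.3, 0.9, 1.1, 2.5, 7.0]
rho = spearman(sequence_change, structural_change)
```

Because the statistic depends only on rank order, any monotonically increasing relationship gives a value of 1 regardless of its shape, which is why it suits comparisons between quantities measured on unrelated scales.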