1,256 research outputs found

    Protein Remote Homology Detection Based on an Ensemble Learning Approach

    MOLECULAR DYNAMICS STUDIES OF NUCLEIC ACIDS AND RIBONUCLEOPROTEIN COMPLEXES

    Molecular simulations of protein-nucleic acid complexes, as well as the HIV-1 Trans Activation Response Element (TAR) RNA molecule, were conducted. First, three different molecular dynamics techniques were studied on HIV-1 TAR RNA: classical molecular dynamics, steered molecular dynamics (SMD), and metadynamics. The classical molecular dynamics simulations were used to equilibrate the HIV-1 TAR RNA system, as well as every other system studied in this thesis. The SMD technique was used to observe the breaking force of the nucleotide interactions within TAR. This breaking force averaged about 100 pN. The metadynamics technique was used to accelerate the folding of HIV-1 TAR RNA from an unfolded state to its native state. With root mean square deviation (RMSD) and radius of gyration (RGYR) as collective variables (CVs), we were not able to fold HIV-1 TAR RNA from an unfolded state to its native state; however, we did obtain four unique conformations of TAR that were within 1 kcal/mol of the native state in free energy. Next, the classification of interaction strength between nine diverse nucleic acid-protein complexes was studied using the SMD technique. The nine chosen complexes vary in size (800-6000 atoms) as well as in the type of RNA binding protein (RBP) bound to RNA. In these simulations the RNA molecule in each system is partially fixed and the protein atoms in the binding interface are pulled at a constant velocity. Force data is obtained for each of the nine systems and the maximum force required to separate the molecules is compared using two different variables, percent composition of charged amino acid residues in the binding interface (percent composition) and buried surface area (BSA). We also look at the van der Waals and electrostatic interactions of each system over their respective trajectories.
It was found that an increase in BSA often resulted in a higher value of the maximum force. The percent composition did not correlate well with the maximum force; however, the arginine rich motif (1ETG) system had a surprisingly high maximum force value for such a small BSA and system size. Lastly, the binding affinity of an arginine residue bound to RNA and an adenosine monophosphate (AMP) molecule bound to RNA is determined using the well-tempered metadynamics technique. Binding affinity is an important aspect of drug targeting. An effective characterization of a molecule's binding affinity is the free energy of binding. Finding a way to calculate this value using molecular dynamics simulations could save much time in the drug development process. We apply well-tempered metadynamics to two small molecule systems that resemble drug-like molecular systems in order to determine the binding free energy of these systems. The aim here was to first test the technique on these two example systems such that the same process could be repeated for any system involving the binding of drug molecules to proteins or nucleic acids. Using well-tempered metadynamics with a center-of-mass distance CV, we were able to determine the binding free energy of the two model systems.
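
The abstract above names RMSD and radius of gyration as collective variables. As a rough illustration only, and not the thesis's own code, the two quantities could be computed from Cartesian coordinates as sketched below. The function names are hypothetical, and the RMSD here assumes the two structures are already superimposed (no alignment step is performed), which simulation packages normally handle for you.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a list of (x, y, z) points.

    If no masses are given, every point is assumed to have unit mass.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Center of mass, one component at a time.
    com = [sum(m * c[i] for m, c in zip(masses, coords)) / total
           for i in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    msd = sum(m * sum((c[i] - com[i]) ** 2 for i in range(3))
              for m, c in zip(masses, coords)) / total
    return math.sqrt(msd)


def rmsd(coords, reference):
    """Root mean square deviation between two coordinate lists.

    Assumes both lists have the same atom order and are already
    superimposed; no optimal rotation or translation is applied.
    """
    if len(coords) != len(reference):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(sum((a[i] - b[i]) ** 2 for i in range(3))
                  for a, b in zip(coords, reference))
    return math.sqrt(squared / len(coords))
```

For example, two unit-mass points at (1, 0, 0) and (-1, 0, 0) have a radius of gyration of 1, and shifting a structure by one unit along x gives an RMSD of 1 against the original.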

    IP6K gene identification in plant genomes by tag searching

    BACKGROUND: Plants have played a special role in inositol polyphosphate (IP) research since the first IP, the fully phosphorylated inositol ring of phytic acid (IP6), was discovered in plant seeds. It is now known that phytic acid is further metabolized by the IP6 Kinases (IP6Ks) to generate IPs containing a pyrophosphate moiety. The IP6Ks are evolutionarily conserved enzymes identified in several mammalian, fungal and amoeba species. Although IP6K has not yet been identified in plant chromosomes, there are many clues suggesting its presence in plant cells. RESULTS: In this paper we propose a new approach to search for the plant IP6K gene, which led to the identification in a plant genome of a nucleotide sequence corresponding to a specific tag of the IP6K family. Such a tag has been found in all IP6K genes identified up to now, as well as in all genes belonging to the Inositol Polyphosphate Kinases superfamily (IPK). The tag sequence corresponds to the inositol-binding site of the enzyme, and it can be considered as characterizing all IPK genes. To this aim we applied a technique based on motif discovery. We exploited DLSME, a recently proposed software tool that allows the motif structure to be only partially specified by the user. First we applied the new method to the mitochondrial DNA (mtDNA) of plants, where such a gene could have been nested, possibly encrypted and hidden by virtue of the editing and/or trans-splicing processes. Then we looked for the gene in the nuclear genomes of two model plants, Arabidopsis thaliana and Oryza sativa. CONCLUSIONS: The analysis we conducted in plant mitochondria provided the negative, though we argue relevant, result that IP6K does not actually occur in plant mtDNA. Very interestingly, the tag search in nuclear genomes led us to identify a promising sequence in chromosome 5 of Oryza sativa. Further analyses are under way to confirm that this sequence actually corresponds to the mammalian IP6K gene.
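
The tag search described above can be pictured with a much simpler stand-in: scanning both strands of a nucleotide sequence for a partially specified tag written with IUPAC ambiguity codes. This is a minimal sketch and not DLSME, whose interface is not described in the abstract; the function names and the tag used in the example are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_tag(sequence, tag):
    """Return (strand, start) pairs for every match of a degenerate tag.

    Both strands are scanned, overlapping matches are reported, and
    `start` is always a 0-based position on the forward strand.
    """
    sequence = sequence.upper()
    body = "".join(IUPAC[ch] for ch in tag.upper())
    # A lookahead lets finditer report overlapping matches.
    pattern = re.compile("(?=(" + body + "))")
    hits = [("+", m.start()) for m in pattern.finditer(sequence)]
    rc = reverse_complement(sequence)
    for m in pattern.finditer(rc):
        # Map the reverse-strand position back onto the forward strand.
        hits.append(("-", len(sequence) - m.start() - len(tag)))
    return hits
```

With the illustrative tag `GNAC`, `find_tag("TTGAACTT", "GNAC")` reports one forward-strand hit at position 2, and `find_tag("AAGTTCAA", "GNAC")` reports the same site on the reverse strand. A real search would work at genome scale and, as the abstract notes, would have to account for editing and trans-splicing, which this sketch ignores.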

    Knowledge discovery in biological databases : a neural network approach

    Knowledge discovery in databases, also known as data mining, aims to find significant information in a set of data. The knowledge to be mined from the dataset may refer to patterns, association rules, classification and clustering rules, and so forth. In this dissertation, we present a neural network approach to finding knowledge in biological databases. Specifically, we propose new methods to process biological sequences in two case studies: the classification of protein sequences and the prediction of E. coli promoters in DNA sequences. Our proposed methods, based on neural network architectures, combine techniques ranging from Bayesian inference, coding theory, feature selection, and dimensionality reduction to dynamic programming and machine learning algorithms. Empirical studies show that the proposed methods outperform previously published methods and have excellent performance on the latest dataset. We have implemented the proposed algorithms in an infrastructure, called Genome Mining, developed for biosequence classification and recognition.

    Bioinformatics

    This book is divided into different research areas relevant to Bioinformatics, such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research works presented here.