6 research outputs found

    Structural relation matching: an algorithm to identify structural patterns into RNAs and their interactions

    RNA molecules play crucial roles in various biological processes. Their three-dimensional configurations determine their functions and, in turn, influence their interactions with other molecules. RNAs and their interaction structures, the so-called RNA-RNA interactions, can be abstracted in terms of secondary structures, i.e., lists of the nucleotide bases paired by hydrogen bonding within the nucleotide sequence. Each secondary structure, in turn, can be abstracted into cores and shadows, both obtained by suitably collapsing nucleotides and arcs. We formalize all of these abstractions as arc diagrams, whose arcs determine loops. A secondary structure is pseudoknot-free if its arc diagram has no crossing arcs; otherwise, it is said to be pseudoknotted. In this study, we address the problem of identifying a given structural pattern in the secondary structures, or in the associated cores or shadows, of both RNAs and RNA-RNA interactions characterized by arbitrary pseudoknots. These abstractions are mapped into a matrix whose elements represent the relations among loops. Therefore, the identification problem is recast in terms of matrices and submatrices. The algorithms, implemented in Python, run in polynomial time. We test our approach on a set of 16S ribosomal RNAs of Thermus thermophilus with inhibitors, and we quantify the structural effect of the inhibitors.
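
The crossing condition and the relation matrix described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that each arc is a pair `(i, j)` of paired positions with `i < j`, that no position is shared by two arcs, and that arcs stand in for the loops they close. The names `relation`, `relation_matrix`, and `is_pseudoknot_free` are hypothetical.

```python
from itertools import combinations

def relation(a, b):
    """Classify how two arcs (i, j), i < j, relate in an arc diagram.

    Assumption: arcs never share an endpoint (each base pairs at most once).
    """
    (i, j), (k, l) = sorted([a, b])  # after sorting, i < k
    if j < k:
        return "sequential"  # the first arc closes before the second opens
    if l < j:
        return "nested"      # the second arc lies entirely inside the first
    return "crossing"        # i < k < j < l, i.e. a pseudoknot

def relation_matrix(arcs):
    """Symmetric matrix of pairwise arc relations; the diagonal stays None."""
    n = len(arcs)
    matrix = [[None] * n for _ in range(n)]
    for x, y in combinations(range(n), 2):
        matrix[x][y] = matrix[y][x] = relation(arcs[x], arcs[y])
    return matrix

def is_pseudoknot_free(arcs):
    """True when no two arcs cross."""
    return all(relation(a, b) != "crossing" for a, b in combinations(arcs, 2))
```

For example, `is_pseudoknot_free([(0, 5), (1, 4)])` holds because the second arc is nested in the first, while `[(0, 2), (1, 3)]` contains a crossing. The pairwise check is quadratic in the number of arcs, which is consistent with, but does not reproduce, the polynomial-time claim in the abstract.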

    Hierarchical representation for PPI sites prediction

    Background: Protein–protein interactions have pivotal roles in life processes, and aberrant interactions are associated with various disorders. Interaction site identification is key for understanding disease mechanisms and designing new drugs. Effective and efficient computational methods for PPI site prediction are of great value given the overall cost of experimental methods. Promising results have been obtained using machine learning methods and deep learning techniques, but their effectiveness depends on protein representation and feature selection. Results: We define a new abstraction of the protein structure, called hierarchical representation, that considers and quantifies spatial and sequential neighboring among amino acids. We also investigate the effect of molecular abstractions using Graph Convolutional Networks to classify amino acids as interface or non-interface residues. Our study takes into account three abstractions (hierarchical representation, contact map, and residue sequence) and considers the eight functional classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0. The performance of our method, evaluated using standard metrics, is compared with that of some state-of-the-art protein interface predictors. The analysis of the performance values shows that our method outperforms the considered competitors when the considered molecules are structurally similar. Conclusions: The hierarchical representation can capture the structural properties that promote the interactions and can be used to represent proteins with unknown structures by codifying only their sequential neighboring. Analyzing the results, we conclude that classes should be arranged according to their architectures rather than functions.
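
Two of the abstractions named above, the contact map and sequential neighboring, can be sketched as adjacency matrices of the kind a graph network would consume. This is only an illustration under stated assumptions: one 3D coordinate per residue, an 8.0 distance cutoff, and a neighborhood window of 1 are all choices made here, not values taken from the abstract, and the hierarchical representation itself is not reproduced. The function names are hypothetical.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Adjacency matrix of spatial neighbors.

    coords: one (x, y, z) tuple per residue (assumed representative atom).
    cutoff: distance threshold; 8.0 is an assumed value, not from the source.
    """
    n = len(coords)
    return [
        [1 if i != j and math.dist(coords[i], coords[j]) <= cutoff else 0
         for j in range(n)]
        for i in range(n)
    ]

def sequence_map(n, window=1):
    """Adjacency matrix of sequential neighbors within `window` positions."""
    return [
        [1 if 0 < abs(i - j) <= window else 0 for j in range(n)]
        for i in range(n)
    ]
```

A sequence-only matrix such as `sequence_map(n)` needs no coordinates, which mirrors the abstract's remark that proteins with unknown structures can still be represented through their sequential neighboring.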

    Automatic generation of pseudoknotted RNAs taxonomy

    Background: The ability to compare RNA secondary structures is important for understanding their biological function and for grouping similar organisms into families by looking at evolutionarily conserved sequences such as 16S rRNA. Most comparison methods and benchmarks in the literature focus on pseudoknot-free structures due to the difficulty of mapping pseudoknots in classical tree representations. Some approaches permit clustering pseudoknotted RNAs, but there is no general framework for evaluating their performance. Results: We introduce an evaluation framework based on a similarity/dissimilarity measure obtained by a comparison method and agglomerative clustering. Their combination automatically partitions a set of molecules into groups. To illustrate the framework, we define and make available a benchmark of pseudoknotted (16S and 23S) and pseudoknot-free (5S) rRNA secondary structures belonging to Archaea, Bacteria and Eukaryota. We also consider five different comparison methods from the literature that are able to manage pseudoknots. For each method, we cluster the molecules in the benchmark to obtain the taxa at the phylum rank according to the European Nucleotide Archive curated taxonomy. We compute appropriate metrics for each method and compare their suitability for reconstructing the taxa.
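
The combination of a dissimilarity measure with agglomerative clustering can be sketched as follows. The abstract does not state which linkage criterion is used, so average linkage is an assumption here, and the function name `agglomerative` is hypothetical. The input is a precomputed symmetric dissimilarity matrix, as any of the comparison methods would produce.

```python
def agglomerative(dist, n_clusters):
    """Naive agglomerative clustering on a dissimilarity matrix.

    dist: symmetric matrix, dist[i][j] = dissimilarity of molecules i and j.
    n_clusters: number of groups to stop at (e.g. the number of phyla).
    Average linkage is an assumed choice, not taken from the source.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean dissimilarity over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        x, y = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return clusters
```

With four molecules where the first two and the last two are mutually close, asking for two clusters yields `[[0, 1], [2, 3]]`. The resulting partition could then be scored against a reference taxonomy, which is the evaluation step the abstract describes.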

    Topological Classification of RNA Structures via Intersection Graph

    We introduce a new algebraic representation of RNA secondary structures as a composition of hairpins, and we define an appropriate abstract algebraic representation. Moreover, we propose a novel method to classify RNA structures based on two topological invariants, the genus and the crossing number. Starting from the classic arc representation of RNA secondary structures, the proposed method takes advantage of the algebraic representation to easily obtain an interaction graph of the RNA molecule through an appropriate procedure. Each vertex of the graph is a loop and each edge represents the interaction between two loops; thus the number of edges is the number of crossings of the RNA molecule. Through the definition and application of a new procedure, the intersection graph of the RNA shape is obtained. The cardinality of the resulting graph corresponds to the crossing number of the shape associated with the RNA molecule. This crossing number is a topological invariant, as is the genus. Neither uniquely identifies an RNA graph, but the crossing number makes it possible to add a term that is proportional to the standard free energy of the RNA molecule. Thus, a more precise free-energy parametrization can be obtained. Finally, our method is validated on a subset of real RNA structures from the Pseudobase++ database, and we classify the RNA structures according to their topological genus and crossing number.
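
The idea of an intersection graph whose edge count gives the number of crossings can be sketched on individual arcs. This is a simplification and not the paper's procedure: the abstract works on loops and on the collapsed shape, whereas this sketch uses one vertex per base pair, so stacked pairs are counted separately. The input format (extended dot-bracket notation with `()`, `[]`, `{}`) and the function names are assumptions made for illustration.

```python
from itertools import combinations

# Closing bracket -> matching opening bracket (extended dot-bracket notation).
PAIRS = {")": "(", "]": "[", "}": "{"}

def parse_dot_bracket(structure):
    """Return the sorted list of arcs (i, j), i < j, encoded by the string."""
    stacks = {opening: [] for opening in PAIRS.values()}
    arcs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in PAIRS:
            if not stacks[PAIRS[ch]]:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            arcs.append((stacks[PAIRS[ch]].pop(), pos))
        # any other character (e.g. '.') is an unpaired position
    if any(stacks.values()):
        raise ValueError("unclosed bracket")
    return sorted(arcs)

def intersection_graph(arcs):
    """Edges between crossing arcs; vertices are the arcs themselves."""
    return [
        (a, b)
        for a, b in combinations(sorted(arcs), 2)  # sorted, so a[0] < b[0]
        if b[0] < a[1] < b[1]
    ]
```

For the minimal pseudoknot `"([)]"`, the graph has a single edge, so the arc-level crossing count is 1. For `"((..[[..))..]]"` it has four edges because each of the two stacked round pairs crosses each of the two stacked square pairs; collapsing stacks first, as a shape-level method would, is left out of this sketch.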

    A Loop Grammar to Understand the roles of miRNAs in the Tumor Cell

    A miRNA is a small non-coding RNA molecule that regulates gene expression. Current studies have shown that miRNAs may function both as oncogenes and as tumor suppressors, but have not revealed the precise conditions under which miRNAs alter gene expression in cancer cells. In this study, we introduce a context-free grammar, Loop Grammar, that formalizes the primary and secondary structure as a composition of loops, corresponding to concatenation or nesting of hairpins. We also formalize concatenation and nesting on fatgraphs, oriented surfaces with boundary, and we define a Surface Loop Grammar, whose algebraic expressions uniquely identify the surfaces associated with given RNA structures. The Loop Grammar has been used to model tumor and healthy miRNAs of the mir-515 family, and we observed that mutations of the primary-structure elements involved in loop formation changed the secondary structure of tumor miRNAs. The Surface Loop Grammar is useful for classifying RNA structures in terms of loops and the relations among them. References: 1) Peng, Y., Croce, C.M.: The role of MicroRNAs in human cancer. Signal Transduction and Targeted Therapy, 2016, 1, 15004. 2) Penner, R.C., Knudsen, M., Wiuf, C., Andersen, J.E.: Fatgraph models of proteins. Communications on Pure and Applied Mathematics, 2010, 63(10), 1249–1297. 3) Quadrini, M., Culmone, R., Merelli, E.: Topological Classification of RNA Structures via Intersection Graph. In: International Conference on Theory and Practice of Natural Computing, Springer, 2017, 203–215. 4) Quadrini, M., Merelli, E.: Loop-loop interaction metrics on RNA secondary structures with pseudoknots. In: International Conference on Bioinformatics Models, Methods and Algorithms, Proceedings; Part of 11th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2018, 3, 2018.
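
The notion of building structures by concatenating and nesting hairpins can be illustrated with a small expression tree. This is a hypothetical toy, not the Loop Grammar defined by the authors: the class names `Hairpin`, `Nest`, and `Concat`, the default loop size, and the rendering to dot-bracket notation are all assumptions, and the sketch ignores primary structure, pseudoknots, and the surface (fatgraph) version entirely.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Hairpin:
    loop: int = 3  # unpaired nucleotides enclosed by the closing arc (assumed default)

@dataclass(frozen=True)
class Nest:
    inner: "Structure"  # wrap an existing structure in one more arc

@dataclass(frozen=True)
class Concat:
    left: "Structure"   # place two structures side by side
    right: "Structure"

Structure = Union[Hairpin, Nest, Concat]

def to_dot_bracket(s):
    """Render an expression tree as a dot-bracket string."""
    if isinstance(s, Hairpin):
        return "(" + "." * s.loop + ")"
    if isinstance(s, Nest):
        return "(" + to_dot_bracket(s.inner) + ")"
    if isinstance(s, Concat):
        return to_dot_bracket(s.left) + to_dot_bracket(s.right)
    raise TypeError(f"not a structure: {s!r}")
```

For instance, `to_dot_bracket(Concat(Nest(Hairpin(3)), Hairpin(4)))` gives `"((...))(....)"`: a hairpin nested under one extra arc, followed by a second hairpin. Keeping the expression tree rather than only the rendered string is what lets two structures be compared by how their loops are composed.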