    TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element-specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts. Comment: 20 pages, 8 figures, 5 tables.

    Algebraic shortcuts for leave-one-out cross-validation in supervised network inference

    Supervised machine learning techniques have traditionally been very successful at reconstructing biological networks, such as protein-ligand interaction, protein-protein interaction and gene regulatory networks. Many supervised techniques for network prediction use linear models on a possibly nonlinear pairwise feature representation of edges. Recently, much emphasis has been placed on the correct evaluation of such supervised models. It is vital to distinguish between using a model to predict new interactions in a given network and using it to predict interactions for a new vertex not present in the original network. This distinction matters because (i) the performance might differ dramatically between the prediction settings and (ii) tuning the model hyperparameters to obtain the best possible model depends on the setting of interest. Specific cross-validation schemes need to be used to assess the performance in such different prediction settings. In this work we discuss a state-of-the-art kernel-based network inference technique called two-step kernel ridge regression. We show that this regression model can be trained efficiently, with a time complexity scaling with the number of vertices rather than the number of edges. Furthermore, this framework leads to a series of cross-validation shortcuts that allow one to rapidly estimate the model performance for any relevant network prediction setting. This allows computational biologists to fully assess the capabilities of their models.
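    To make the idea of a cross-validation shortcut concrete, here is a minimal sketch of the standard leave-one-out identity for ordinary (single-kernel) kernel ridge regression. It is not the two-step method of the paper: the function names, the dense Gaussian-elimination solver, and the regularisation parameter `lam` are all assumptions introduced for illustration. With hat matrix H = K (K + lam I)^-1 and fitted values f = H y, the held-out prediction for sample i is (f_i - H_ii y_i) / (1 - H_ii), so no model has to be refitted.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_matrix(K, lam):
    """Return K + lam * I for a square kernel matrix K."""
    n = len(K)
    return [[K[i][j] + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def loo_shortcut(K, y, lam):
    """Leave-one-out predictions from a single fit, via the hat-matrix diagonal."""
    n = len(K)
    A = ridge_matrix(K, lam)
    alpha = solve(A, y)  # dual coefficients of the full model
    fitted = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
    # H = (K + lam I)^-1 K, so H_ii is entry i of the solution for column i of K.
    h = [solve(A, [K[r][i] for r in range(n)])[i] for i in range(n)]
    return [(fitted[i] - h[i] * y[i]) / (1.0 - h[i]) for i in range(n)]


def loo_naive(K, y, lam):
    """Reference implementation: refit the model n times, once per held-out sample."""
    n = len(K)
    preds = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        K_sub = [[K[a][b] for b in keep] for a in keep]
        alpha = solve(ridge_matrix(K_sub, lam), [y[j] for j in keep])
        preds.append(sum(K[i][j] * a for j, a in zip(keep, alpha)))
    return preds
```

    Calling `loo_shortcut` and `loo_naive` on the same kernel matrix should give matching predictions up to rounding. The shortcut version is written for clarity rather than speed; a practical implementation would factorise the matrix once.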

    The Nondeterministic Waiting Time Algorithm: A Review

    We briefly present the Nondeterministic Waiting Time algorithm. Our technique for the simulation of biochemical reaction networks has the ability to mimic the Gillespie Algorithm for some networks and solutions to ordinary differential equations for other networks, depending on the rules of the system, the kinetic rates and numbers of molecules. We provide a full description of the algorithm as well as specifics on its implementation. Some results for two well-known models are reported. We have used the algorithm to explore Fas-mediated apoptosis models in cancerous and HIV-1 infected T cells.
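    For readers unfamiliar with the Gillespie Algorithm mentioned in this abstract, the following is a minimal sketch of its direct method. It does not implement the Nondeterministic Waiting Time algorithm itself, whose details the abstract does not give. The function name, the `(propensity function, state change)` reaction format, and the example species are assumptions made for illustration.

```python
import math
import random


def gillespie_direct(state, reactions, t_end, rng):
    """Simulate one trajectory with the Gillespie direct method.

    state:     dict mapping species name to molecule count
    reactions: list of (propensity_fn, change) pairs, where propensity_fn
               takes the state dict and change maps species to count deltas
    rng:       a random.Random instance (pass a seeded one for repeatability)
    """
    t = 0.0
    state = dict(state)
    trajectory = [(t, dict(state))]
    while True:
        props = [rate(state) for rate, _ in reactions]
        total = sum(props)
        if total <= 0:  # no reaction can fire any more
            break
        # Waiting time to the next event is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        pick = rng.random() * total
        acc = 0.0
        chosen = len(reactions) - 1  # fallback guards against rounding
        for idx, p in enumerate(props):
            acc += p
            if pick < acc:
                chosen = idx
                break
        for species, delta in reactions[chosen][1].items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory
```

    As a usage example, a single conversion A -> B with rate constant 1.0 can be written as `gillespie_direct({"A": 50, "B": 0}, [(lambda s: 1.0 * s["A"], {"A": -1, "B": 1})], 100.0, random.Random(0))`; the total A + B stays constant along the trajectory.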

    Collagens - structure, function and biosynthesis.

    The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens, with more than 20 different collagen types identified so far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. This review focuses on the distribution and function of various collagen types in different tissues. It introduces their basic structural subunits and points out major steps in the biosynthesis and supramolecular processing of fibrillar collagens as prototypical members of this protein family. A final outlook indicates the importance of different collagen types not only for the understanding of collagen-related diseases, but also as a basis for the therapeutic use of members of this protein family discussed in other chapters of this issue.

    MicroRNA in control of gene expression: An overview of nuclear functions

    The finding that small non-coding RNAs (ncRNAs) are able to control gene expression in a sequence-specific manner has had a massive impact on biology. Recent improvements in high-throughput sequencing and computational prediction methods have allowed the discovery and classification of several types of ncRNAs. Based on their precursor structures, biogenesis pathways and modes of action, ncRNAs are classified as small interfering RNAs (siRNAs), microRNAs (miRNAs), PIWI-interacting RNAs (piRNAs), endogenous small interfering RNAs (endo-siRNAs or esiRNAs), promoter-associated RNAs (pRNAs), small nucleolar RNAs (snoRNAs) and sno-derived RNAs. Among these, miRNAs appear as important cytoplasmic regulators of gene expression. miRNAs act as post-transcriptional regulators of their messenger RNA (mRNA) targets via mRNA degradation and/or translational repression. However, it is becoming evident that miRNAs also have specific nuclear functions. Among these, the most studied and debated activity is the miRNA-guided transcriptional control of gene expression. Although available data detail quite precisely the effectors of this activity, the mechanisms by which miRNAs identify their gene targets to control transcription are still a matter of debate. Here, we focus on nuclear functions of miRNAs and on alternative mechanisms of target recognition, at the promoter level, by miRNAs in carrying out transcriptional gene silencing.

    Genomics and proteomics: a signal processor's tour

    The theory and methods of signal processing are becoming increasingly important in molecular biology. Digital filtering techniques, transform domain methods, and Markov models have played important roles in gene identification, biological sequence analysis, and alignment. This paper contains a brief review of molecular biology, followed by a review of the applications of signal processing theory. This includes the problem of gene finding using digital filtering, and the use of transform domain methods in the study of protein binding spots. The relatively new topic of noncoding genes, and the associated problem of identifying ncRNA buried in DNA sequences, are also described. This includes a discussion of hidden Markov models and context-free grammars. Several new directions in genomic signal processing are briefly outlined at the end.
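    One widely used transform-domain approach to gene finding looks for a period-3 component in a DNA sequence, since protein-coding regions tend to show one. The sketch below is an illustration of that general idea and is not taken from the paper: the function name and the normalisation by sequence length are assumptions. It maps the sequence to four binary indicator sequences (one per base) and sums the squared magnitude of each one's discrete Fourier transform at frequency 1/3.

```python
import cmath


def period3_power(seq):
    """Normalised spectral power of a DNA string at period 3.

    For each base, the binary indicator sequence (1 where the base occurs,
    0 elsewhere) is projected onto the complex exponential with period 3;
    the squared magnitudes are summed and divided by the sequence length.
    """
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        coeff = sum(cmath.exp(-2j * cmath.pi * k / 3)
                    for k, ch in enumerate(seq) if ch == base)
        total += abs(coeff) ** 2
    return total / n
```

    A perfectly 3-periodic string such as `"ACG" * 30` gives a large value, while a 4-periodic string such as `"ACGT" * 30` gives a value near zero. In practice the measure would be evaluated over a sliding window along the sequence, which is where the digital-filtering view comes in.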

    The ever-evolving concept of the gene: The use of RNA/Protein experimental techniques to understand genome functions

    The completion of the human genome sequence together with advances in sequencing technologies have shifted the paradigm of the genome, as composed of discrete and heritable coding entities, and have shown the abundance of functional noncoding DNA. This part of the genome, previously dismissed as "junk" DNA, increases proportionally with organismal complexity and contributes to gene regulation beyond the boundaries of known protein-coding genes. Different classes of functionally relevant nonprotein-coding RNAs are transcribed from noncoding DNA sequences. Among them are the long noncoding RNAs (lncRNAs), which are thought to participate in the basal regulation of protein-coding genes at both transcriptional and post-transcriptional levels. Although knowledge of this field is still limited, the ability of lncRNAs to localize in different cellular compartments, to fold into specific secondary structures and to interact with different molecules (RNA or proteins) endows them with multiple regulatory mechanisms. It is becoming evident that lncRNAs may play a crucial role in most biological processes such as the control of development, differentiation and cell growth. This review places the evolution of the concept of the gene in its historical context, from Darwin's hypothetical mechanism of heredity to the post-genomic era. We discuss how the original idea of protein-coding genes as unique determinants of phenotypic traits has been reconsidered in light of the existence of noncoding RNAs. We summarize the technological developments which have been made in the genome-wide identification and study of lncRNAs and emphasize the methodologies that have aided our understanding of the complexity of lncRNA-protein interactions in recent years.