14,044 research outputs found

    Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

    A huge amount of genetic information is available thanks to recent advances in sequencing technologies and greater computational capabilities, but the interpretation of such genetic data at the phenotypic level remains elusive. One of the reasons is that proteins do not act alone, but interact specifically with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design to drug discovery. However, despite advances in experimental structure determination, the number of interactions for which structural data are available is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which usually comprises two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring to identify the correct orientations. In addition, prediction of interface and hot-spot residues is very useful for guiding and interpreting mutagenesis experiments, as well as for understanding functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.
    Funding: Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology.
    Peer Reviewed. Postprint (author's final draft).

    Under-approximating Cut Sets for Reachability in Large Scale Automata Networks

    In the scope of discrete finite-state models of interacting components, we present a novel algorithm for identifying sets of local states of components whose activity is necessary for the reachability of a given local state. If all the local states from such a set are disabled in the model, the reachability in question becomes impossible. Those sets are referred to as cut sets and are computed from a particular abstract causality structure, the so-called Graph of Local Causality, inspired by previous work and generalised here to finite automata networks. The extracted sets of local states form an under-approximation of the complete minimal cut sets of the dynamics: there may exist smaller or additional cut sets for the given reachability. Applied to qualitative models of biological systems, such cut sets provide potential therapeutic targets that are proven to prevent molecules of interest from becoming active, up to the correctness of the model. Our new method makes the formal analysis of very large scale networks tractable, as illustrated by the computation of cut sets within a Boolean model of biological pathway interactions gathering more than 9000 components.

    Hierarchy of protein loop-lock structures: a new server for the decomposition of a protein structure into a set of closed loops

    HoPLLS (Hierarchy of protein loop-lock structures) (http://leah.haifa.ac.il/~skogan/Apache/mydata1/main.html) is a web server that identifies closed loops, a structural basis for protein domain hierarchy. The server is based on the loop-and-lock theory for structural organisation of natural proteins. We describe this web server, the algorithms for the decomposition of a 3D protein into loops and the results of scientific investigations into a structural "alphabet" of loops and locks.
    Comment: 11 pages, 4 figures

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space built from the genomic data of Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed.
    Comment: 43 pages, 4 figures, to appear in Malaria Journal

    Computational analysis of a plant receptor interaction network

    Master's thesis in Bioinformatics and Computational Biology.
    In all organisms, complex protein-protein interaction (PPI) networks control major biological functions, yet studying their structural features presents a major analytical challenge. In plants, leucine-rich-repeat receptor kinases (LRR-RKs) are key in sensing and transmitting non-self as well as self signals from the cell surface. As such, LRR-RKs have both developmental and immune functions that allow plants to make the most of their environments. In Arabidopsis thaliana, the model organism of plant molecular biology, most LRR-RKs are still biochemically and genetically uncharacterized. To address this, an LRR-based Cell Surface Interaction (CSI LRR) network was obtained in 2018: a protein-protein interaction network of the extracellular domains of 170 LRR-RKs that contains 567 bidirectional interactions. Several network analyses have been performed with CSI LRR. However, these analyses have so far not considered the spatial and temporal expression of its proteins. Nor has the role of extracellular domain (ECD) size in the network structure been characterized in detail. The objective of the present work is therefore to carry out more in-depth analyses of the CSI LRR network, which should provide insights that facilitate the functional characterization of LRR-RKs. The first aim of this work is to test the fit of the CSI LRR network to a scale-free topology. To that end, the degree distribution of the CSI LRR network was compared with the degree distributions of the scale-free and random network models. Additionally, three network attack algorithms were implemented and applied to these two network models and to the CSI LRR network to compare their behavior. However, since the CSI LRR interaction data come from an in vitro screening, there is no direct evidence that its protein-protein interactions occur inside plant cells.
To gain insight into how the network composition changes depending on transcriptional regulation, the CSI LRR interaction data were integrated with 4 different RNA-Seq datasets related to the biological functions of the network. To automate this task, a Python script was written. Furthermore, the role of the LRR-RKs in the network structure was evaluated according to the size of their extracellular domain (large or small). For that, centrality parameters were measured and size-targeted attacks were performed. Finally, gene regulatory information was integrated into the CSI LRR network to classify its proteins according to the function of the transcription factors that regulate their expression. The results show that CSI LRR fits a power-law degree distribution and approximates a scale-free topology. Moreover, CSI LRR displays high resistance to random attacks and reduced resistance to hub/bottleneck-directed attacks, similarly to the scale-free network model. The integration of CSI LRR interaction data and RNA-Seq data also suggests that the transcriptional regulation of the network is more relevant for developmental programs than for defense responses. Another result was that the LRR-RKs with a small ECD have a major role in maintaining the integrity of the CSI LRR network. Lastly, it was hypothesized that the integration of CSI LRR interaction data with predicted gene regulatory networks could shed light on the functioning of growth-immunity signaling crosstalk.
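The comparison of random and hub-directed attacks described in this abstract can be illustrated with a minimal sketch. It is not the thesis's script: the function names (`largest_component`, `attack`, `hub_order`, `random_order`), the adjacency-dictionary representation, and the choice of largest connected component size as the robustness measure are all assumptions made for illustration, and only two attack strategies are shown.

```python
import random
from collections import deque


def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over the surviving nodes
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def attack(adj, order):
    """Remove nodes in `order`; record the largest component size after each removal."""
    removed = set()
    sizes = []
    for node in order:
        removed.add(node)
        sizes.append(largest_component(adj, removed))
    return sizes


def hub_order(adj):
    """Hub-directed attack: highest-degree nodes first (degrees are not recomputed)."""
    return sorted(adj, key=lambda node: len(adj[node]), reverse=True)


def random_order(adj, seed=0):
    """Random attack: nodes in a seeded random order (the seed is an assumption)."""
    nodes = list(adj)
    random.Random(seed).shuffle(nodes)
    return nodes


# Toy star network: node 0 is the hub, nodes 1 to 5 are leaves.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
targeted_curve = attack(star, hub_order(star))
random_curve = attack(star, random_order(star))
```

A network that tolerates random removals but fragments quickly under `hub_order` behaves like the scale-free model in this respect; a real analysis would average the random curve over many seeds.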

    Graph Kernels

    We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3). We find a spectral decomposition approach even more efficient when computing entire kernel matrices. For labeled graphs we develop conjugate gradient and fixed-point methods that take O(dn^3) time per iteration, where d is the size of the label set. By extending the necessary linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for d-dimensional edge kernels, and O(n^4) in the infinite-dimensional case; on sparse graphs these algorithms take only O(n^2) time per iteration in all cases. Experiments on graphs from bioinformatics and other application domains show that these techniques can speed up computation of the kernel by an order of magnitude or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004), when specialized to graphs, reduce to our random walk graph kernel. Finally, we relate our framework to R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semi-definite.
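To make the fixed-point idea concrete, here is a minimal sketch of a geometric random-walk kernel between two unlabeled graphs. It is an illustration under stated assumptions rather than the paper's implementation: the name `random_walk_kernel`, the uniform starting and stopping distributions, the use of raw (unnormalized) adjacency matrices, and the decay parameter `lam` are all choices made here. The sketch iterates X <- P + lam * A_g X A_h^T, which applies the direct-product-graph adjacency to X without ever forming the n^2 by n^2 matrix, so each iteration costs O(n^3) with dense matrices.

```python
def matmul(a, b):
    """Dense product of two matrices stored as sequences of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_walk_kernel(adj_g, adj_h, lam=0.1, tol=1e-10, max_iter=1000):
    """Geometric random-walk kernel q^T (I - lam * W)^-1 p for unlabeled graphs.

    W is the adjacency matrix of the direct product graph; p and q are assumed
    uniform. The iteration converges only if lam is smaller than the inverse of
    the product of the two spectral radii.
    """
    n, m = len(adj_g), len(adj_h)
    start = 1.0 / (n * m)  # uniform starting probability on product-graph nodes
    adj_h_t = list(zip(*adj_h))  # transpose, so directed graphs are handled too
    x = [[start] * m for _ in range(n)]
    for _ in range(max_iter):
        # One fixed-point step: X <- P + lam * A_g X A_h^T
        walked = matmul(matmul(adj_g, x), adj_h_t)
        new_x = [[start + lam * walked[i][j] for j in range(m)] for i in range(n)]
        diff = max(abs(new_x[i][j] - x[i][j]) for i in range(n) for j in range(m))
        x = new_x
        if diff < tol:
            break
    # Uniform stopping distribution: average the entries of the solution.
    return sum(sum(row) for row in x) / (n * m)


edge = [[0, 1], [1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
k_edge_triangle = random_walk_kernel(edge, triangle, lam=0.1)
```

The same linear system could instead be solved in closed form, for example via the Sylvester-equation route the abstract mentions; the iterative form is shown here only because it needs no linear algebra library.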

    Functional nucleic acids as substrate for information processing

    Information processing applications driven by the self-assembly and conformation dynamics of nucleic acids are possible. These underlying paradigms (self-assembly and conformation dynamics) are essential for natural information processors, as illustrated by proteins. A key advantage of using nucleic acids as information processors is the availability of computational tools to support the design process. This provides us with a platform to develop an integrated environment in which an orchestration of molecular building blocks can be realised. Strict arbitrary control over the design of these computational nucleic acids is not feasible. The microphysical behaviour of these molecular materials must be taken into consideration during the design phase. This thesis investigated to what extent the construction of molecular building blocks for a particular purpose is possible with the support of a software environment. In this work we developed a computational protocol that functions at a multi-molecular level, which enables us to directly incorporate the dynamic characteristics of nucleic acid molecules. To allow the implementation of this computational protocol, we developed a designer that is able to solve the nucleic acid inverse prediction problem, not only at the level of multi-stable states but also including the interactions among molecules that occur in each meta-stable state. The realisation of our computational protocol is evaluated by generating computational nucleic acid units that resemble synthetic RNA devices that have been successfully implemented in the laboratory. Furthermore, we demonstrated the feasibility of the protocol for designing various types of computational units. The accuracy and diversity of the generated candidates are significantly better than those of the best candidates produced by conventional designers. With the computational protocol, the design of a nucleic acid information processor using a network of interconnecting nucleic acids is now feasible.

    Kernel methods in genomics and computational biology

    Support vector machines and kernel methods are increasingly popular in genomics and computational biology, due to their good performance in real-world applications and strong modularity that makes them suitable for a wide range of problems, from the classification of tumors to the automatic annotation of proteins. Their ability to work in high dimension, to process non-vectorial data, and the natural framework they provide to integrate heterogeneous data are particularly relevant to various problems arising in computational biology. In this chapter we survey some of the most prominent applications published so far, highlighting the particular developments in kernel methods triggered by problems in biology, and mention a few promising research directions likely to expand in the future.