Search CORE

15,563 research outputs found

Progress in the development and application of computational methods for probabilistic protein design

Author: Boder Eric T.
Kono Hidetoshi
Park Sheldon
Saven Jeffery G.
Wang Wei
Publication venue: ScholarlyCommons
Publication date: 06/07/2004
Field of study

Proteins exhibit a wide range of physical and chemical properties, including highly selective molecular recognition and catalysis, and are also key components in biological metabolic, catabolic, and signaling pathways. Given that proteins are well-structured and can now be rapidly synthesized, they are excellent targets for engineering of both molecular structure and biological function. Computational analysis of the protein design problem allows scientists to explore sequence space and systematically discover novel protein molecules. Nonetheless, the complexity of proteins, the subtlety of the determinants of folding, and the exponentially large number of possible sequences impede the search for peptide sequences compatible with a desired structure and function. Directed search algorithms, which identify directly a small number of sequences, have achieved some success in identifying sequences with desired structures and functions. Alternatively, one can adopt a probabilistic approach. Instead of a finite number of sequences, such calculations result in a probabilistic description of the sequence ensemble. In particular, by casting the formalism in the language of statistical mechanics, the site-specific amino acid probabilities of sequences compatible with a target structure may be readily identified. The computational probabilities are well suited for both de novo protein design of particular sequences as well as combinatorial, library-based protein engineering. The computed site-specific amino acid profile may be converted to a nucleotide base distribution to allow assembly of a partially randomized gene library. The ability to synthesize readily such degenerate oligonucleotide sequences according to the prescribed distribution is key to constructing a biased peptide library genuinely reflective of the computational design. Herein we illustrate how a standard DNA synthesizer can be used with only a slight modification to the synthesis protocol to generate a pool of degenerate DNA sequences, which encodes a predetermined amino acid distribution with high fidelity

Kernel methods in genomics and computational biology

Author: Vert Jean-Philippe
Publication venue
Publication date: 17/10/2005
Field of study

Support vector machines and kernel methods are increasingly popular in genomics and computational biology, due to their good performance in real-world applications and strong modularity that makes them suitable to a wide range of problems, from the classification of tumors to the automatic annotation of proteins. Their ability to work in high dimension, to process non-vectorial data, and the natural framework they provide to integrate heterogeneous data are particularly relevant to various problems arising in computational biology. In this chapter we survey some of the most prominent applications published so far, highlighting the particular developments in kernel methods triggered by problems in biology, and mention a few promising research directions likely to expand in the future

arXiv.org e-Print Archive

7th German Conference on Chemoinformatics: 25 CIC-Workshop : Goslar, Germany, 6 - 8 November 2011 ; meeting abstracts / Edited by Frank Oellien, Uli Fechner and Thomas Engel

Author: Engel Thomas
Fechner Uli
Oellien Frank
Publication venue
Publication date: 01/05/2012
Field of study

Link Prediction in Complex Networks: A Survey

Author: Adamic
Airoldi
Albert
Alon
Amaral
Arenas
Baiesi
Barabási
Barahona
Bayes
Bianconi
Blondel
Boccaletti
Breiman
Brin
Buntine
Burke
Butts
Caldarelli
Carmi
Casella
Casella
Chebotarev
Chu
Clauset
Colizza
Cui
da
Dasgupta
Dawah
Dorelan
Dorogovtsev
Fouss
Fouss
Gallagher
Gastner
Geisser
Getoor
Girvan
Granger
Guha
Guimerà
Guimerà
Guimerà
Guimerà
Hanely
Heckerman
Heckerman
Herlocker
Holland
Holme
Holme
Huang
Huang
Huss
Jaccard
Jeh
Jung
Kaluza
Katz
Kim
Klein
Kohavi
Kossinets
Krebs
Kunegis
Lambiotte
Leicht
Leroy
Leskovec
Liben-Nowell
Lin
Linyuan Lü
Liu
Liu
Liu
Liu
Liu
Lusseau
Lü
Lü
Mann
Manning
Mantrach
Marvel
Metropolis
Molloy
Moore
Mossel
Murata
Neal
Neville
Newman
Newman
Newman
Newman
Newman
Newman
Newman
Newman
Ou
O’Madadhain
Pan
Pastor-Satorras
Pastor-Satorras
Penrose
Perotti
Polikar
Ravasz
Redner
Reed
Reichardt
Sales-Pardo
Salton
Salton
Schafer
Schafer
Shang
Shang
Shawe-Taylor
Spiegelhalter
Spring
Stumpf
Su
Sun
Szell
Sørensen
Tao Zhou
Taskar
Tong
Traag
Tylenda
Valverde
von Mering
Vázquez
Wang
Watts
White
White
White
Wilcoxon
Xiao
Xie
Yan
Yin
Yin
Yu
Yu
Yu
Zachary
Zeng
Zhang
Zhang
Zhang
Zhang
Zhang
Zheleva
Zhou
Zhou
Zhou
Zhou
Zhou
Zhou
Zhou
Publication venue: 'Elsevier BV'
Publication date: 04/10/2010
Field of study

Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summaries recent progress about link prediction algorithms, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labelled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.Comment: 44 pages, 5 figure

arXiv.org e-Print Archive

Design and Development of Software Tools for Bio-PEPA

Author: Duguid Adam
Gilmore Stephen
Guerriero Maria
Hillston Jane
Loewe Laurence
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009
Field of study

This paper surveys the design of software tools for the Bio-PEPA process algebra. Bio-PEPA is a high-level language for modelling biological systems such as metabolic pathways and other biochemical reaction networks. Through providing tools for this modelling language we hope to allow easier use of a range of simulators and model-checkers thereby freeing the modeller from the responsibility of developing a custom simulator for the problem of interest. Further, by providing mappings to a range of different analysis tools the Bio-PEPA language allows modellers to compare analysis results which have been computed using independent numerical analysers, which enhances the reliability and robustness of the results computed.

CiteSeerX

ADAM: Analysis of Discrete Models of Biological Systems Using Computer Algebra

Author: Blekherman Grigoriy
Brandon Madison
Guang Bonny
Hinkelmann Franziska
Laubenbacher Reinhard
McNeill Rustin
Veliz-Cuba Alan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Field of study

Background: Many biological systems are modeled qualitatively with discrete models, such as probabilistic Boolean networks, logical models, Petri nets, and agent-based models, with the goal to gain a better understanding of the system. The computational complexity to analyze the complete dynamics of these models grows exponentially in the number of variables, which impedes working with complex models. Although there exist sophisticated algorithms to determine the dynamics of discrete models, their implementations usually require labor-intensive formatting of the model formulation, and they are oftentimes not accessible to users without programming skills. Efficient analysis methods are needed that are accessible to modelers and easy to use. Method: By converting discrete models into algebraic models, tools from computational algebra can be used to analyze their dynamics. Specifically, we propose a method to identify attractors of a discrete model that is equivalent to solving a system of polynomial equations, a long-studied problem in computer algebra. Results: A method for efficiently identifying attractors, and the web-based tool Analysis of Dynamic Algebraic Models (ADAM), which provides this and other analysis methods for discrete models. ADAM converts several discrete model types automatically into polynomial dynamical systems and analyzes their dynamics using tools from computer algebra. Based on extensive experimentation with both discrete models arising in systems biology and randomly generated networks, we found that the algebraic algorithms presented in this manuscript are fast for systems with the structure maintained by most biological systems, namely sparseness, i.e., while the number of nodes in a biological network may be quite large, each node is affected only by a small number of other nodes, and robustness, i.e., small number of attractors

arXiv.org e-Print Archive

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

Inherent limitations of probabilistic models for protein-DNA binding specificity

Author: Ruan Shuxiang
Stormo Gary D
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017
Field of study

The specificities of transcription factors are most commonly represented with probabilistic models. These models provide a probability for each base occurring at each position within the binding site and the positions are assumed to contribute independently. The model is simple and intuitive and is the basis for many motif discovery algorithms. However, the model also has inherent limitations that prevent it from accurately representing true binding probabilities, especially for the highest affinity sites under conditions of high protein concentration. The limitations are not due to the assumption of independence between positions but rather are caused by the non-linear relationship between binding affinity and binding probability and the fact that independent normalization at each position skews the site probabilities. Generally probabilistic models are reasonably good approximations, but new high-throughput methods allow for biophysical models with increased accuracy that should be used whenever possible

Directory of Open Access Journals

FigShare