5,081 research outputs found

Minimum message length inference of secondary structure from protein coordinate data

Author: A. S. Konagurthu
A. M. Lesk
L. Allison
Publication venue: Oxford University Press
Publication date: 01/01/2012

Motivation: Secondary structure underpins the folding pattern and architecture of most proteins. Accurate assignment of the secondary structure elements is therefore an important problem. Although many approximate solutions of the secondary structure assignment problem exist, the statement of the problem has resisted a consistent and mathematically rigorous definition. A variety of comparative studies have highlighted major disagreements in the way the available methods define and assign secondary structure to coordinate data.

Crossref

PubMed Central

Monash University Research Portal

Kernel Belief Propagation

Author: Bickson Danny
Gretton Arthur
Guestrin Carlos
Low Yucheng
Song Le
Publication date: 01/01/2011

We propose a nonparametric generalization of belief propagation, Kernel Belief Propagation (KBP), for pairwise Markov random fields. Messages are represented as functions in a reproducing kernel Hilbert space (RKHS), and message updates are simple linear operations in the RKHS. KBP makes none of the assumptions commonly required in classical BP algorithms: the variables need not arise from a finite domain or a Gaussian distribution, nor must their relations take any particular parametric form. Rather, the relations between variables are represented implicitly, and are learned nonparametrically from training data. KBP has the advantage that it may be used on any domain where kernels are defined (R^d, strings, groups), even where explicit parametric models are not known, or closed-form expressions for the BP updates do not exist. The computational cost of message updates in KBP is polynomial in the training data size. We also propose a constant-time approximate message update procedure by representing messages using a small number of basis functions. In experiments, we apply KBP to image denoising, depth prediction from still images, and protein configuration prediction: KBP is faster than competing classical and nonparametric approaches (by orders of magnitude, in some cases), while providing significantly more accurate results.

arXiv.org e-Print Archive

CiteSeerX

Factored expectation propagation for input-output FHMM models in systems biology

Author: Cseke Botond
Sanguinetti Guido
Publication date: 17/05/2013

We consider the problem of jointly modelling metabolic signals and gene expression in systems biology applications. We propose an approach based on input-output factorial hidden Markov models, together with a structured variational inference scheme to infer the structure and states of the model. We start from the classical free-form structured variational mean field approach and use expectation propagation to approximate the expectations needed in the variational loop. We show that this corresponds to a factored expectation constrained approximate inference. We validate our model through extensive simulations and demonstrate its applicability on a real-world bacterial data set.

arXiv.org e-Print Archive

CiteSeerX

Direct-coupling analysis of residue co-evolution captures native contacts across many protein families

Author: F. Morcos
A. Pagnani
B. Lunt
A. Bertolino
D. S. Marks
C. Sander
R. Zecchina
J. N. Onuchic
T. Hwa
M. Weigt
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/01/2011

The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced Direct Coupling Analysis (DCA) (Weigt et al. (2009) Proc Natl Acad Sci 106:67). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intra-domain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and inter-domain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, provided that a large number of homologous sequences exist; such sequences are rapidly becoming available due to advances in genome sequencing. Comment: 28 pages, 7 figures, to appear in PNA
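
The "correlations among amino acid compositions at different sequence positions" mentioned in this abstract can be illustrated with the mutual information between two columns of a multiple sequence alignment. The sketch below is only that naive correlation measure, not DCA itself, which goes further by separating direct from indirect couplings. The toy alignment and the function name `column_mutual_information` are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter
from math import log


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    # Empirical single-column and pair frequencies (no pseudocounts, no reweighting).
    fi = Counter(seq[i] for seq in msa)
    fj = Counter(seq[j] for seq in msa)
    fij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in fij.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 1 co-vary, column 2 is conserved.
toy_msa = ["AKG", "AKG", "DEG", "DEG"]
print(column_mutual_information(toy_msa, 0, 1))  # co-varying pair
print(column_mutual_information(toy_msa, 0, 2))  # conserved column carries no signal
```

A high value for a pair of columns flags co-variation, but it cannot tell whether the two positions are in contact or merely both coupled to a third position, which is the ambiguity the abstract says DCA is designed to resolve.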

arXiv.org e-Print Archive

Crossref

PubMed Central

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

HAL-Paris1

Hal-Diderot

PORTO Publications Open Repository TOrino

Direct-coupling analysis of residue co-evolution captures native contacts across many protein families

Author: Morcos Faruck
Pagnani Andrea
Lunt Bryan
Bertolino Arianna
Marks Debora S.
Sander Chris
Zecchina Riccardo
Onuchic José N.
Hwa Terence
Weigt Martin
Publication date: 01/01/2011

arXiv.org e-Print Archive

Crossref

PubMed Central

Archivio istituzionale della ricerca - Università di Cagliari

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

HAL-Paris1

CERN Document Server

PORTO Publications Open Repository TOrino

Hierarchical dynamics of individual RNA helix base pair formation and disruption

Author: Hon Jason J.
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2017

This thesis explores the RNA folding problem using single-molecule field effect transistors (smFETs) to measure the lifetimes of individual RNA base-pairing rearrangements. In the course of this research, considerable computational, chemical, and engineering contributions were developed so that the single-molecule measurements could be conducted and quantified. On the basis of the smFET data collected herein, these advancements have allowed the quantification of a kinetic model for RNA stem-loop structures. The model has been generalized to explore quantitatively the phenomenological observation that an RNA found in Bacillus subtilis acts as a metabolite-sensing switch, allowing RNA polymerase to transcribe the messenger RNA when the metabolite is present and preventing transcription when the metabolite is absent. Together, the data presented quantify a simple model for the base-pairing rearrangements that underlie RNA folding.

Columbia University Academic Commons

A protein-dependent side-chain rotamer library

Author: Bhuyan Md Shariful Islam
Gao Xin
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The protein side-chain packing problem has remained one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries quantitatively summarize the existing knowledge of experimentally determined structures. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since the backbone structure is given in the side-chain prediction problem, spatially local information should ideally be encoded into the rotamer libraries. Methods: In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by the inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Results: Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of the side-chain prediction accuracy and the rotamer ranking ability. Furthermore, without global optimization or search, the side-chain prediction power of the protein-dependent library is still comparable to that of global-search-based side-chain prediction methods.
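
The re-ranking step described in this abstract (model the protein as a Markov random field, estimate per-residue marginals, re-rank the library's rotamers by them) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' method: brute-force enumeration stands in for the unspecified inference algorithms and only works for a handful of residues, the Boltzmann-style weight `exp(-E)` is an assumed form of the pairwise potential, and the priors, energies, and the name `rerank_rotamers` are all hypothetical.

```python
from itertools import product
from math import exp


def rerank_rotamers(priors, pair_energy):
    """Exact marginals of a tiny pairwise Markov random field by enumeration.

    priors[r][k]   library probability of rotamer k at residue r
    pair_energy    dict mapping (r, s) to a matrix E[k][l]: assumed interaction
                   energy between rotamer k at residue r and rotamer l at s
    """
    marginals = [[0.0] * len(p) for p in priors]
    total = 0.0
    # Enumerate every joint rotamer assignment (exponential cost; toy sizes only).
    for assignment in product(*(range(len(p)) for p in priors)):
        weight = 1.0
        for r, k in enumerate(assignment):
            weight *= priors[r][k]
        for (r, s), energy in pair_energy.items():
            weight *= exp(-energy[assignment[r]][assignment[s]])
        total += weight
        for r, k in enumerate(assignment):
            marginals[r][k] += weight
    marginals = [[w / total for w in row] for row in marginals]
    # Re-rank each residue's rotamers by updated probability, highest first.
    ranking = [sorted(range(len(row)), key=lambda k: -row[k]) for row in marginals]
    return marginals, ranking


# Hypothetical example: rotamer 0 of residue 0 clashes with both rotamers of residue 1.
priors = [[0.6, 0.4], [0.5, 0.5]]
pair_energy = {(0, 1): [[2.0, 2.0], [0.0, 0.0]]}
marginals, ranking = rerank_rotamers(priors, pair_energy)
print(marginals[0], ranking[0])
```

In this toy case the library prefers rotamer 0 at residue 0, but the spatial context reverses the order, which is the kind of context-dependent re-ranking the abstract describes. A realistic implementation would replace the enumeration with an approximate inference scheme such as belief propagation.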

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central