16,018 research outputs found

Alignment of helical membrane protein sequences using AlignMe

Author: Forrest Lucy R.
Khafizov Kamil
Stamm Marcus
Staritzbichler René
Publication date: 01/01/2013

Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically consider membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein-optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T), in addition to evolutionary information and secondary-structure predictions, in an approach called AlignMePST, generally reduces the accuracy of the alignments of closely-related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open source AlignMe code is available at https://sourceforge.net/projects/alignme/, and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.
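The idea of favoring matches between predicted transmembrane segments inside a standard dynamic programming alignment can be illustrated with a minimal sketch. This is not the AlignMe implementation: the function name `align_with_tm_bonus`, the simple match/mismatch scores, the linear gap penalty, and the `tm_bonus` weight are all assumptions chosen for illustration, and the `'M'` labels stand in for whatever a transmembrane predictor would output.

```python
def align_with_tm_bonus(seq_a, seq_b, tm_a, tm_b,
                        match=2.0, mismatch=-1.0, gap=-2.0, tm_bonus=1.0):
    """Global (Needleman-Wunsch) alignment with a bonus for pairing TM residues.

    tm_a and tm_b are label strings as long as their sequences; 'M' marks a
    residue predicted to be transmembrane (hypothetical labelling).
    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]  # traceback pointers

    # First column and row: a prefix aligned entirely against gaps.
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            # Extra reward when both residues are predicted transmembrane.
            if tm_a[i - 1] == "M" and tm_b[j - 1] == "M":
                pair += tm_bonus
            candidates = [
                (score[i - 1][j - 1] + pair, "D"),  # align the two residues
                (score[i - 1][j] + gap, "U"),       # gap in seq_b
                (score[i][j - 1] + gap, "L"),       # gap in seq_a
            ]
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Follow the pointers back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "D":
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif step == "U":
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Setting `tm_bonus=0.0` reduces the sketch to a plain sequence-only alignment, which makes it easy to compare alignments with and without the transmembrane term.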

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Hochschulschriftenserver - Universität Frankfurt am Main

FigShare

CATHEDRAL: A Fast and Effective Algorithm to Predict Folds and Domain Boundaries from Multidomain Protein Structures

Author: Andrew Harrison
Christine A Orengo
Frances M. G Pearl
Oliver C Redfern
Robert B Russell
Tim Dallman
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2007

We present CATHEDRAL, an iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure–based method (using graph theory) to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm, which is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme. Once a domain is verified, it is excised, and the search protocol is repeated in an iterative fashion until all recognisable domains have been identified. We have performed an initial benchmark of CATHEDRAL against other publicly available structure comparison methods using a consensus dataset of domains derived from the CATH and SCOP domain classifications. CATHEDRAL shows superior performance in fold recognition and alignment accuracy when compared with many equivalent methods. If a novel multidomain structure contains a known fold, CATHEDRAL will locate it in 90% of cases, with <1% false positives. For nearly 80% of assigned domains in a manually validated test set, the boundaries were correctly delineated within a tolerance of ten residues. For the remaining cases, previously classified domains were very remotely related to the query chain so that embellishments to the core of the fold caused significant differences in domain sizes and manual refinement of the boundaries was necessary. To put this performance in context, a well-established sequence method based on hidden Markov models was only able to detect 65% of domains, with 33% of the subsequent boundaries assigned within ten residues. 
Since, on average, 50% of newly determined protein structures contain more than one domain unit, and typically 90% or more of these domains are already classified in CATH, CATHEDRAL will considerably facilitate the automation of protein structure classification.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

Sussex Research Online

Structural Alignment of RNAs Using Profile-csHMMs and Its Application to RNA Homology Search: Overview and New Results

Author: Vaidyanathan P. P.
Yoon Byung-Jun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

Systematic research on noncoding RNAs (ncRNAs) has revealed that many ncRNAs are actively involved in various biological networks. Therefore, in order to fully understand the mechanisms of these networks, it is crucial to understand the roles of ncRNAs. Unfortunately, the annotation of ncRNA genes that give rise to functional RNA molecules has begun only recently, and it is far from being complete. Considering the huge amount of genome sequence data, we need efficient computational methods for finding ncRNA genes. One effective way of finding ncRNA genes is to look for regions that are similar to known ncRNA genes. As many ncRNAs have well-conserved secondary structures, we need statistical models that can represent such structures for this purpose. In this paper, we propose a new method for representing RNA sequence profiles and finding structural alignments of RNAs based on profile context-sensitive hidden Markov models (profile-csHMMs). Unlike existing models, the proposed approach can handle any kind of RNA secondary structure, including pseudoknots. We show that profile-csHMMs can provide an effective framework for the computational analysis of RNAs and the identification of ncRNA genes.

CiteSeerX

Caltech Authors

Design of RNAi reagents for invertebrate model organisms and human disease vectors

Author: Michael Boutros
Thomas Horn
Publication date: 04/12/2011

RNAi has become an important tool to silence gene expression in a variety of organisms, in particular when classical genetic methods are missing. However, application of this method in functional studies has raised new challenges in the design of RNAi reagents in order to minimize false positive and false negative results. Since the performance of reagents can rarely be validated on a genome-wide scale, improved computational methods are required that consider experimentally derived design parameters. Here, we describe computational methods for the design of RNAi reagents for invertebrate model organisms and human disease vectors, such as Anopheles. We describe procedures for designing short and long double-stranded RNAs for single genes, and for evaluating their predicted specificity and efficiency. Using a bioinformatics pipeline, we also describe how to design a genome-wide RNAi library for Anopheles gambiae.
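One simple way to think about the predicted specificity of a long double-stranded RNA is to cut it into short windows and ask how many of them also occur in transcripts other than the intended target. The sketch below illustrates that idea only; it is not the pipeline described in the abstract. The function name `offtarget_fraction`, the window length `k=19`, the exact-match criterion, and the single-strand comparison are all assumptions made for illustration.

```python
def offtarget_fraction(dsrna, target_id, transcripts, k=19):
    """Fraction of k-mers in `dsrna` that exactly match a non-target transcript.

    transcripts maps a (hypothetical) transcript name to its sequence.
    Only the given strand is compared; reverse complements are ignored.
    """
    # All overlapping windows of length k along the reagent sequence.
    kmers = [dsrna[i:i + k] for i in range(len(dsrna) - k + 1)]
    if not kmers:
        return 0.0
    # Every transcript except the intended target is a potential off-target.
    others = [seq for name, seq in transcripts.items() if name != target_id]
    hits = sum(1 for kmer in kmers if any(kmer in seq for seq in others))
    return hits / len(kmers)
```

A value near 0 would suggest few shared windows with other transcripts under these assumptions; a real design tool would also need to consider mismatches, both strands, and efficiency criteria.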

Nature Precedings

Predicted structures of agonist and antagonist bound complexes of adenosine A_3 receptor

Author: Abrol Ravinder
Goddard William A., III
Jacobson Kenneth A.
Kim Soo-Kyung
Riley Lindsay
Publication venue: 'Royal College of Obstetricians & Gynaecologists (RCOG)'
Publication date: 01/06/2011

We used the GEnSeMBLE Monte Carlo method to predict an ensemble of the 20 best packings (helix rotations and tilts) based on the neutral total energy (E) from a vast number (10 trillion) of potential packings for each of the four subtypes of the adenosine G protein-coupled receptors (GPCRs), which are involved in many cytoprotective functions. We then used the DarwinDock Monte Carlo methods to predict the binding pose for the human A_3 adenosine receptor (hAA_3R) for subtype selective agonists and antagonists. We found that all four A_3 agonists stabilize the 15th lowest conformation of apo-hAA_3R while also binding strongly to the 1st and 3rd. In contrast, the four A_3 antagonists stabilize the 2nd or 3rd lowest conformation. These results show that different ligands can stabilize different GPCR conformations, which will likely affect function, complicating the design of functionally unique ligands. Interestingly, all agonists lead to a trans χ1 angle for W6.48 that experiments on other GPCRs associate with G-protein activation, while all 20 apo-AA_3R conformations have a W6.48 gauche+ χ1 angle associated experimentally with inactive GPCRs for other systems. Thus, docking calculations have identified critical ligand-GPCR structures involved with activation. We found that the predicted binding site for selective agonist Cl-IB-MECA to the predicted structure of hAA_3R shows favorable interactions to three subtype variable residues, I253^(6.58), V169^(EL2), and Q167^(EL2), while the predicted structure for hAA_(2A)R shows weakened interactions with the corresponding amino acids: T256^(6.58), E169^(EL2), and L167^(EL2), explaining the observed subtype selectivity.

Caltech Authors

Reconstructing an ancestral genotype of two hexachlorocyclohexane-degrading Sphingobium species using metagenomic sequence data.

Author: Gilbert Jack A
Khurana Jitendra P
Khurana Paramjit
Kumar Roshan
Lal Rup
Lax Simon
Negi Vivek
Sangwan Naseer
Verma Helianthous
Publication venue: eScholarship, University of California
Publication date: 01/02/2014

Over the last 60 years, the use of hexachlorocyclohexane (HCH) as a pesticide has resulted in the production of >4 million tons of HCH waste, which has been dumped in open sinks across the globe. Here, the combination of the genomes of two genetic subspecies (Sphingobium japonicum UT26 and Sphingobium indicum B90A; isolated from two discrete geographical locations, Japan and India, respectively) capable of degrading HCH, with metagenomic data from an HCH dumpsite (∼450 mg HCH per g soil), enabled the reconstruction and validation of the last-common ancestor (LCA) genotype. Mapping the LCA genotype (3128 genes) to the subspecies genomes demonstrated that >20% of the genes in each subspecies were absent in the LCA. This includes two enzymes from the 'upper' HCH degradation pathway, suggesting that the ancestor was unable to degrade HCH isomers, but descendants acquired lin genes by transposon-mediated lateral gene transfer. In addition, anthranilate and homogentisate degradation traits were found to be strain (selectively retained only by UT26) and environment (absent in the LCA and subspecies, but prevalent in the metagenome) specific, respectively. One draft secondary chromosome, two near complete plasmids and eight complete lin transposons were assembled from the metagenomic DNA. Collectively, these results reinforce the elastic nature of the genus Sphingobium, and describe the evolutionary acquisition mechanism of a xenobiotic degradation phenotype in response to environmental pollution. This also demonstrates for the first time the use of metagenomic data in ancestral genotype reconstruction, highlighting its potential to provide significant insight into the development of such phenotypes.

eScholarship - University of California