9,105 research outputs found

Analysis of Three-Dimensional Protein Images

Author: Baxter K.
Fortier S.
Glasgow J.
Leherte L.
Steeg E.
Publication date: 01/01/1997

A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as alpha-helices and beta-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction.

Comment: See http://www.jair.org/ for any accompanying file.
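To make the idea of segmenting a 3D image into a graph of critical points more concrete, here is a minimal sketch. It is not the authors' method: it treats only local density maxima as critical points, and the function names (`local_maxima`, `link_peaks`), the 26-neighbour test, and the distance cutoff are all assumptions chosen for illustration.

```python
from itertools import product

def local_maxima(grid, threshold=0.0):
    """Return (x, y, z) indices of voxels strictly greater than all in-bounds neighbours."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # The 26 neighbour offsets of a voxel (assumed neighbourhood definition).
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    peaks = []
    for x, y, z in product(range(nx), range(ny), range(nz)):
        value = grid[x][y][z]
        if value <= threshold:
            continue  # ignore background voxels
        is_peak = True
        for dx, dy, dz in offsets:
            i, j, k = x + dx, y + dy, z + dz
            inside = 0 <= i < nx and 0 <= j < ny and 0 <= k < nz
            if inside and grid[i][j][k] >= value:
                is_peak = False
                break
        if is_peak:
            peaks.append((x, y, z))
    return peaks

def link_peaks(peaks, max_dist):
    """Connect peaks no farther apart than max_dist (voxel units) into graph edges."""
    edges = []
    for a in range(len(peaks)):
        for b in range(a + 1, len(peaks)):
            d2 = sum((p - q) ** 2 for p, q in zip(peaks[a], peaks[b]))
            if d2 <= max_dist ** 2:
                edges.append((peaks[a], peaks[b]))
    return edges

# Toy 4x4x4 "density map" with three isolated peaks (made-up values).
grid = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
grid[1][1][1] = 5.0
grid[1][1][3] = 4.0
grid[3][3][3] = 2.0

peaks = local_maxima(grid)
edges = link_peaks(peaks, max_dist=2.0)
```

In this toy map, the two peaks that lie two voxels apart are joined by an edge, while the third peak stays unconnected. A real analysis would work on an experimental electron density map and would then classify the resulting graph, which this sketch does not attempt.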

arXiv.org e-Print Archive

CiteSeerX

Repository of the University of Namur

CATHEDRAL: A Fast and Effective Algorithm to Predict Folds and Domain Boundaries from Multidomain Protein Structures

Author: Andrew Harrison
Christine A Orengo
Frances M. G Pearl
Oliver C Redfern
Robert B Russell
Tim Dallman
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2007

We present CATHEDRAL, an iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure–based method (using graph theory) to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm, which is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme. Once a domain is verified, it is excised, and the search protocol is repeated in an iterative fashion until all recognisable domains have been identified. We have performed an initial benchmark of CATHEDRAL against other publicly available structure comparison methods using a consensus dataset of domains derived from the CATH and SCOP domain classifications. CATHEDRAL shows superior performance in fold recognition and alignment accuracy when compared with many equivalent methods. If a novel multidomain structure contains a known fold, CATHEDRAL will locate it in 90% of cases, with <1% false positives. For nearly 80% of assigned domains in a manually validated test set, the boundaries were correctly delineated within a tolerance of ten residues. For the remaining cases, previously classified domains were very remotely related to the query chain so that embellishments to the core of the fold caused significant differences in domain sizes and manual refinement of the boundaries was necessary. To put this performance in context, a well-established sequence method based on hidden Markov models was only able to detect 65% of domains, with 33% of the subsequent boundaries assigned within ten residues. 
Since, on average, 50% of newly determined protein structures contain more than one domain unit, and typically 90% or more of these domains are already classified in CATH, CATHEDRAL will considerably facilitate the automation of protein structure classification.
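The verify, excise, and repeat loop described in this abstract can be sketched as follows. This is only an illustration of the iterative control flow, not the CATHEDRAL implementation: `assign_domains`, `toy_scanner`, the `min_score` cutoff, and the hit list are hypothetical, and the fold search, alignment, and SVM scoring steps are hidden behind a callback.

```python
def assign_domains(chain_length, find_best_domain, min_score=0.5):
    """Iteratively accept and excise domains until none is recognisable.

    find_best_domain(unassigned) is a hypothetical callback returning
    (start, end, score) for the best hit among unassigned residues, or None.
    """
    unassigned = set(range(chain_length))
    domains = []
    while unassigned:
        hit = find_best_domain(unassigned)
        if hit is None or hit[2] < min_score:
            break  # nothing recognisable is left
        start, end, _score = hit
        covered = unassigned & set(range(start, end + 1))
        if not covered:
            break  # guard against a callback that makes no progress
        domains.append((start, end))
        unassigned -= covered  # excise the verified domain
    return domains

def toy_scanner(hits):
    """Stand-in for the fold search: pick the best-scoring hit that still fits."""
    def find(unassigned):
        candidates = [h for h in hits
                      if set(range(h[0], h[1] + 1)) <= unassigned]
        return max(candidates, key=lambda h: h[2], default=None)
    return find

# Made-up candidate hits as (start, end, score) on a 200-residue chain.
hits = [(0, 99, 0.9), (100, 180, 0.7), (90, 120, 0.8), (181, 199, 0.2)]
domains = assign_domains(200, toy_scanner(hits))
```

Here the overlapping hit (90, 120) is discarded once residues 0 to 99 have been excised, and the low-scoring tail is left unassigned, which loosely mirrors cases where manual refinement would be needed.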

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

Sussex Research Online

Kernel-based machine learning protocol for predicting DNA-binding proteins

Author: Bhardwaj Nitin
Langlois Robert E.
Lu Hui
Zhao Guijun
Publication venue: Oxford University Press
Publication date: 01/01/2005

DNA-binding proteins (DNA-BPs) play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Attempts have been made to identify DNA-BPs based on their sequence and structural information with moderate accuracy. Here we develop a machine learning protocol for the prediction of DNA-BPs in which the classifier is a Support Vector Machine (SVM). Information used for classification is derived from characteristics that include surface and overall composition, overall charge and positive potential patches on the protein surface. In total, 121 DNA-BPs and 238 non-binding proteins are used to build and evaluate the protocol. In the self-consistency test, an accuracy of 100% is achieved. For cross-validation (CV) optimization over the entire dataset, we report an accuracy of 90%. Using leave-1-pair holdout evaluation, an accuracy of 86.3% is achieved. When we restrict the dataset to less than 20% sequence identity amongst the proteins, the holdout accuracy is 85.8%. Furthermore, seven DNA-BPs with unbound structures are all correctly predicted. The current performance is better than previously published results. The higher accuracy achieved here originates from two factors: the ability of the SVM to handle features that demonstrate a wide range of discriminatory power, and a different definition of the positive patch. Since our protocol does not rely on sequence or structural homology, it can be used to identify or predict proteins with DNA-binding function(s) regardless of their homology to known ones.
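As a rough illustration of two of the feature types named above (overall composition and overall charge), the sketch below builds a feature vector from a sequence alone. It is an assumption-laden simplification: the surface composition and positive-patch features described in the abstract need a 3D structure and are not covered, the charge model ignores histidine and pH, and `feature_vector` is a hypothetical helper, not part of the published protocol.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}

def net_charge(seq):
    """Crude net charge: +1 for each K/R, -1 for each D/E (histidine ignored)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

def feature_vector(seq):
    """20 composition fractions followed by the net charge (21 features)."""
    comp = composition(seq)
    return [comp[aa] for aa in AMINO_ACIDS] + [net_charge(seq)]

# Made-up toy sequence, for illustration only.
vec = feature_vector("MKRKDEAL")
```

Vectors of this kind could be passed to any SVM implementation. The choice of kernel and parameters is left open here because the abstract does not specify them.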

CiteSeerX

Crossref

PubMed Central

Characterization of Aptamer-Protein Complexes by X-ray Crystallography and Alternative Approaches

Author: Bauke W. Dijkstra
Hauke Smidt
Johan Hekelaar
John van der Oost
Mark Levisson
Vincent J. B. Ruigrok
Publication date: 01/01/2012

Aptamers are oligonucleotide ligands, either RNA or ssDNA, selected for high-affinity binding to molecular targets, such as small organic molecules, proteins or whole microorganisms. While reports of new aptamers are numerous, characterization of their specific interaction is often restricted to the affinity of binding (KD). Over the years, only a few crystal structures of aptamer-protein complexes have become available. Here we describe some relevant technical issues in crystallizing aptamer-protein complexes and highlight some biochemical details of the molecular basis of selected aptamer-protein interactions. In addition, alternative experimental and computational approaches for studying aptamer-protein interactions are discussed.

Multidisciplinary Digital Publishing Institute

University of Groningen

Directory of Open Access Journals

Wageningen University & Research Publications

CiteSeerX

Crossref

Proceedings - University of Groningen

ARTS repository - University of Groningen

PubMed Central

University of Groningen Digital Archive

Dissertations of the University of Groningen

Prediction of peptides binding to MHC class I alleles by partial periodic pattern mining

Author: Meydan Cem
Otu Hasan
Sezerman Uğur
Publication venue: ODTU (Ortadoğu Teknik Üniversitesi)
Publication date: 16/04/2009

MHC (Major Histocompatibility Complex) is a key player in the immune response of an organism. It is important to be able to predict which antigenic peptides will bind to a specific MHC allele and which will not, creating possibilities for controlling the immune response and for applications in immunotherapy. However, a problem encountered in computational binding prediction methods for MHC class I is the presence of bulges and loops in the peptides, which change the total length. Most machine learning methods in use today require the sequences to be of the same length to successfully mine the binding motifs. We propose the use of time-based data mining methods in motif mining so that motifs can be mined position-independently. In addition, information from both binding and non-binding peptides is used, in contrast to other methods, which rely only on binding peptides. The prediction results range from 70% to 80% for the tested alleles.
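The idea of mining motifs position-independently from peptides of different lengths, using both binders and non-binders, can be sketched as below. This is a simplified stand-in and not the partial periodic pattern mining algorithm of the paper: it only considers residue pairs separated by a fixed gap, and the function names, gap range, and peptide sequences are made up for illustration.

```python
from collections import Counter

def gapped_patterns(peptide, gaps=(0, 1, 2)):
    """Set of residue pairs with a fixed gap, e.g. 'L.D' = L, any residue, D.

    The pattern records only the spacing between the two residues, not their
    absolute position, so it can match anywhere in a peptide of any length.
    """
    found = set()
    for i, first in enumerate(peptide):
        for gap in gaps:
            j = i + gap + 1
            if j < len(peptide):
                found.add(first + "." * gap + peptide[j])
    return found

def score_patterns(binders, non_binders):
    """Fraction of binders minus fraction of non-binders containing each pattern."""
    pos = Counter(p for pep in binders for p in gapped_patterns(pep))
    neg = Counter(p for pep in non_binders for p in gapped_patterns(pep))
    return {p: pos[p] / len(binders) - neg[p] / len(non_binders)
            for p in set(pos) | set(neg)}

# Made-up peptides of different lengths; the shared motif sits at different offsets.
binders = ["ALKDVAAL", "GALRDVTLKA", "SLADVNVY"]
non_binders = ["ALKAAGTL", "GGSRTTLKA"]

scores = score_patterns(binders, non_binders)
```

Patterns that appear in every binder and in no non-binder receive the maximum score of 1.0, regardless of where in the peptide they occur. Patterns that are more common among non-binders receive negative scores, which is how the non-binding peptides contribute information.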

Sabanci University Research Database