Search CORE

27,590 research outputs found

A spatio-temporal mining approach towards summarizing and analyzing protein folding trajectories

Author: B Zagrovic
CD Snow
CD Snow
CE Shannon
D Berrar
D Russel
Duygu Ucar
G Lancia
H Yang
H Yang
H Yang
H Yang
Hui Yang
J Hu
J MacQueen
M DE
M Shirts
M Vendruscolo
MR Garey
P Ferrara
P Ferrara
P KW
RL Baldwin
SR Griffiths-Jones
Srinivasan Parthasarathy
VS Pande
Publication venue: BioMed Central
Publication date: 01/01/2007
Field of study

Understanding the protein folding mechanism remains a grand challenge in structural biology. In the past several years, computational theories in molecular dynamics have been employed to shed light on the folding process. Coupled with high computing power and large scale storage, researchers now can computationally simulate the protein folding process in atomistic details at femtosecond temporal resolution. Such simulation often produces a large number of folding trajectories, each consisting of a series of 3D conformations of the protein under study. As a result, effectively managing and analyzing such trajectories is becoming increasingly important. In this article, we present a spatio-temporal mining approach to analyze protein folding trajectories. It exploits the simplicity of contact maps, while also integrating 3D structural information in the analysis. It characterizes the dynamic folding process by first identifying spatio-temporal association patterns in contact maps, then studying how such patterns evolve along a folding trajectory. We demonstrate that such patterns can be leveraged to summarize folding trajectories, and to facilitate the detection and ordering of important folding events along a folding path. We also show that such patterns can be used to identify a consensus partial folding pathway across multiple folding trajectories. Furthermore, we argue that such patterns can capture both local and global structural topology in a 3D protein conformation, thereby facilitating effective structural comparison amongst conformations. We apply this approach to analyze the folding trajectories of two small synthetic proteins-BBA5 and GSGS (or Beta3S). We show that this approach is promising towards addressing the above issues, namely, folding trajectory summarization, folding events detection and ordering, and consensus partial folding pathway identification across trajectories

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Recommended from our members

An Overview of the Use of Neural Networks for Data Mining Tasks

Author: Alberts B
Alpaydin E
Ando T
Blake CL
Bramer MA
Castanheira LG
Han J
Lu H
Mitchell M
Ni X
Quinlan RJ
Rumelhart DE
Shafer JC
Shendure J
Simić D
Stahl F
Steinwart I
Surjandari I
Wei JS
Widrow B
Witten IH
Zaslavsky B
Zhang D
Publication venue: 'Wiley'
Publication date: 01/01/2012
Field of study

In the recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is a substantial commercial interest as well as research investigations in the area that aim to develop new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks

Central Archive at the University of Reading

Crossref

Portsmouth University Research Portal (Pure)

Bournemouth University Research Online

Embedding machine-readable proteins interactions data in scientific articles for easy access and retrieval

Author: Alberto Termanini
Claudio Franceschi
Paolo Tieri
Piero Fariselli
Publication venue
Publication date: 29/09/2008
Field of study

Extraction of protein-protein interactions data from scientific literature remains a hard, time- and resource-consuming task. This task would be greatly simplified by embedding in the source, i.e. research articles, a standardized, synthetic, machine-readable codification for protein-protein interactions data description, to make the identification and the retrieval of such very valuable information easier, faster, and more reliable than now.
We shortly discuss how this information can be easily encoded and embedded in research papers with the collaboration of authors and scientific publishers, and propose an online demonstrative tool that shows how to help and allow authors for the easy and fast conversion of such valuable biological data into an embeddable, accessible, computer-readable codification

Nature Precedings

WormBase: a multi-species resource for nematode biology and genomics

Author: Antoshechkin Igor
Bastiani Carol
Bieri Tamberlyn
Blasiar Darin
Bradnam Keith
Chan Juancarlos
Chen Chao-Kung
Chen Nansheng
Chen Wen J.
Cunningham Fiona
Davis Paul
Durbin Richard
Harris Todd W.
Kenny Eimear
Kishore Ranjana
Lawson Daniel
Lee Raymond Y. N.
Müller Hans-Michael
Nakamura Cecilia
Ozersky Philip
Petcherski Andrei
Rogers Anthony
Sabo Aniko
Schwarz Erich M.
Spieth John
Stein Lincoln D.
Sternberg Paul W.
Tello-Ruiz Marcela
Van Auken Kimberly
Wang Qinghua
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2004
Field of study

WormBase (http://www.wormbase.org/) is the central data repository for information about Caenorhabditis elegans and related nematodes. As a model organism database, WormBase extends beyond the genomic sequence, integrating experimental results with extensively annotated views of the genome. The WormBase Consortium continues to expand the biological scope and utility of WormBase with the inclusion of large-scale genomic analyses, through active data and literature curation, through new analysis and visualization tools, and through refinement of the user interface. Over the past year, the nearly complete genomic sequence and comparative analyses of the closely related species Caenorhabditis briggsae have been integrated into WormBase, including gene predictions, ortholog assignments and a new synteny viewer to display the relationships between the two species. Extensive site-wide refinement of the user interface now provides quick access to the most frequently accessed resources and a consistent browsing experience across the site. Unified single-page views now provide complete summaries of commonly accessed entries like genes. These advances continue to increase the utility of WormBase for C.elegans researchers, as well as for those researchers exploring problems in functional and comparative genomics in the context of a powerful genetic system

CiteSeerX

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Caltech Authors

GPCRTree: online hierarchical classification of GPCR function

Author: Alex A Freitas
Alex A Freitas
Andrew Secker
Darren R Flower Open Access
David S Moss
David S Moss
Edward Clark
Jon Timmis
Mark Halling-brown
Matthew N Davies
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

Background: G protein-coupled receptors (GPCRs) play important physiological roles transducing extracellular signals into intracellular responses. Approximately 50% of all marketed drugs target a GPCR. There remains considerable interest in effectively predicting the function of a GPCR from its primary sequence. Findings: Using techniques drawn from data mining and proteochemometrics, an alignment-free approach to GPCR classification has been devised. It uses a simple representation of a protein's physical properties. GPCRTree, a publicly-available internet server, implements an algorithm that classifies GPCRs at the class, sub-family and sub-subfamily level. Conclusion: A selective top-down classifier was developed which assigns sequences within a GPCR hierarchy. Compared to other publicly available GPCR prediction servers, GPCRTree is considerably more accurate at every level of classification. The server has been available online since March 2008 at URL: http://igrid-ext.cryst.bbk.ac.uk/gpcrtree

CiteSeerX

Crossref

Aberystwyth Research Portal

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Aston Publications Explorer

Birkbeck Institutional Research Online

Kent Academic Repository

Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem

Author: Giuliani Alessandro
Livi Lorenzo
Rizzi Antonello
Publication venue: 'Elsevier BV'
Publication date: 29/04/2015
Field of study

This paper builds upon the fundamental work of Niwa et al. [34], which provides the unique possibility to analyze the relative aggregation/folding propensity of the elements of the entire Escherichia coli (E. coli) proteome in a cell-free standardized microenvironment. The hardness of the problem comes from the superposition between the driving forces of intra- and inter-molecule interactions and it is mirrored by the evidences of shift from folding to aggregation phenotypes by single-point mutations [10]. Here we apply several state-of-the-art classification methods coming from the field of structural pattern recognition, with the aim to compare different representations of the same proteins gathered from the Niwa et al. data base; such representations include sequences and labeled (contact) graphs enriched with chemico-physical attributes. By this comparison, we are able to identify also some interesting general properties of proteins. Notably, (i) we suggest a threshold around 250 residues discriminating "easily foldable" from "hardly foldable" molecules consistent with other independent experiments, and (ii) we highlight the relevance of contact graph spectra for folding behavior discrimination and characterization of the E. coli solubility data. The soundness of the experimental results presented in this paper is proved by the statistically relevant relationships discovered among the chemico-physical description of proteins and the developed cost matrix of substitution used in the various discrimination systems.Comment: 17 pages, 3 figures, 46 reference

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Mining Representative Unsubstituted Graph Patterns Using Prior Similarity Matrix

Author: Dhifli Wajdi
Nguifo Engelbert Mephu
Saidi Rabie
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013
Field of study

One of the most powerful techniques to study protein structures is to look for recurrent fragments (also called substructures or spatial motifs), then use them as patterns to characterize the proteins under study. An emergent trend consists in parsing proteins three-dimensional (3D) structures into graphs of amino acids. Hence, the search of recurrent spatial motifs is formulated as a process of frequent subgraph discovery where each subgraph represents a spatial motif. In this scope, several efficient approaches for frequent subgraph discovery have been proposed in the literature. However, the set of discovered frequent subgraphs is too large to be efficiently analyzed and explored in any further process. In this paper, we propose a novel pattern selection approach that shrinks the large number of discovered frequent subgraphs by selecting the representative ones. Existing pattern selection approaches do not exploit the domain knowledge. Yet, in our approach we incorporate the evolutionary information of amino acids defined in the substitution matrices in order to select the representative subgraphs. We show the effectiveness of our approach on a number of real datasets. The results issued from our experiments show that our approach is able to considerably decrease the number of motifs while enhancing their interestingness

arXiv.org e-Print Archive

HAL Clermont Université

Hal-Diderot