9,709 research outputs found

Peptide vocabulary analysis reveals ultra-conservation and homonymity in protein sequences

Author: Gatherer D.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2007

A new algorithm is presented for vocabulary analysis (word detection) in texts of human origin. It performs at 60%–70% overall accuracy and greater than 80% accuracy for longer words, and approximately 85% sensitivity on Alice in Wonderland, a considerable improvement on previous methods. When applied to protein sequences, it detects short sequences analogous to words in human texts, i.e. intolerant to changes in spelling (mutation), and relatively context-independent in their meaning (function). Some of these are homonyms of up to 7 amino acids, which can assume different structures in different proteins. Others are ultra-conserved stretches of up to 18 amino acids within proteins of less than 40% overall identity, reflecting extreme constraint or convergent evolution. Different species are found to have qualitatively different major peptide vocabularies, e.g. some are dominated by large gene families, while others are rich in simple repeats or dominated by internally repetitive proteins. This suggests the possibility of a peptide vocabulary signature, analogous to genome signatures in DNA. Homonyms may be useful in detecting convergent evolution and positive selection in protein evolution. Ultra-conserved words may be useful in identifying structures intolerant to substitution over long periods of evolutionary time.

Directory of Open Access Journals

Enlighten

Lancaster E-Prints

'The frozen accident' as an evolutionary adaptation: A rate distortion theory perspective on the dynamics and symmetries of genetic coding mechanisms

Author: James F. Glazebrook
Rodrick Wallace
Publication date: 22/02/2011

We survey some interpretations and related issues concerning the frozen hypothesis due to F. Crick and how it can be explained in terms of several natural mechanisms involving error correction codes, spin glasses, symmetry breaking and the characteristic robustness of genetic networks. The approach to most of these questions involves using elements of Shannon's rate distortion theory incorporating a semantic system which is meaningful for the relevant alphabets and vocabulary implemented in transmission of the genetic code. We apply the fundamental homology between information source uncertainty and the free energy density of a thermodynamical system with respect to transcriptional regulators and the communication channels of sequence/structure in proteins. This leads to the suggestion that the frozen accident may have been a type of evolutionary adaptation.

Evolutionary constraints on the complexity of genetic regulatory networks allow predictions of the total number of genetic interactions

Author: Campos-González Adrian I.
Freyre-González Julio A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/01/2019

Genetic regulatory networks (GRNs) have been widely studied, yet there is a lack of understanding with regard to the final size and properties of these networks, mainly because no network is currently complete. In this study, we analyzed the distribution of GRN structural properties across a large set of distinct prokaryotic organisms and found a set of constrained characteristics such as network density and number of regulators. Our results allowed us to estimate the number of interactions that complete networks would have, a valuable insight that could aid in the daunting task of network curation, prediction, and validation. Using state-of-the-art statistical approaches, we also provided new evidence to settle a previously stated controversy that raised the possibility of complete biological networks being random and therefore attributing the observed scale-free properties to an artifact emerging from the sampling process during network discovery. Furthermore, we identified a set of properties that enabled us to assess the consistency of the connectivity distribution for various GRNs against different alternative statistical distributions. Our results favor the hypothesis that highly connected nodes (hubs) are not a consequence of network incompleteness. Finally, an interaction coverage computed for the GRNs as a proxy for completeness revealed that high-throughput based reconstructions of GRNs could yield biased networks with a low average clustering coefficient, showing that classical targeted discovery of interactions is still needed. Comment: 28 pages, 5 figures, 12 pages supplementary information

arXiv.org e-Print Archive

Directory of Open Access Journals

University of Queensland eSpace

Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

Author: Bastien Olivier
Birkholtz Lyn-Marie
Breton Vincent
Grando Delphine
Hofmann-Apitius Martin
Jacq Nicolas
Joubert Fourie
Kasam Vinod
Louw Abraham I
Maréchal Eric
Ortet Philippe
Roy Sylvaine
Saïdani Nadia
Wells Gordon
Zimmermann Marc
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

Hal - Université Grenoble Alpes

HAL AMU

HAL Clermont Université

ProdInra

arXiv.org e-Print Archive

Springer - Publisher Connector

UPSpace at the University of Pretoria

Revisiting Date and Party Hubs: Novel Approaches to Role Assignment in Protein Interaction Networks

Author: A Gursoy
A Rapoport
AC Gavin
ACF Lewis
AI Su
AP Gasch
AS Schwartz
AW Rives
B Adamcsek
C von Mering
Charlotte M. Deane
CM Deane
E Hubbell
E Zotenko
G Kar
GD Bader
H Jeong
H Yu
HW Mewes
I Maraziotis
IW Taylor
J Chen
J Reichardt
JA Hartigan
JDJ Han
JF Rual
JS Bader
K Komurov
K Tarassov
K Venkatesan
KR Brown
L Hakes
LC Freeman
M Fromont-Racine
M Girvan
MA Porter
Mason A. Porter
MEJ Newman
MR Wilkins
MS Granovetter
N Bertin
Nick S. Jones
NN Batada
P Braun
P Jaccard
P Kemmeren
P Resnik
P Uetz
Philip E. Bourne
PM Hartigan
PM Kim
PV Missiuro
R Dunn
R Guimerà
R Saeed
RR Vallabhajosyula
S Fortunato
SH Yook
Sumeet Agarwal
T Ito
T Obayashi
T Yamada
V Spirin
WK Lim
Y Ho
Z Wu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2010

The idea of 'date' and 'party' hubs has been influential in the study of protein-protein interaction networks. Date hubs display low co-expression with their partners, whilst party hubs have high co-expression. It was proposed that party hubs are local coordinators whereas date hubs are global connectors. Here we show that the reported importance of date hubs to network connectivity can in fact be attributed to a tiny subset of them. Crucially, these few, extremely central, hubs do not display particularly low expression correlation, undermining the idea of a link between this quantity and hub function. The date/party distinction was originally motivated by an approximately bimodal distribution of hub co-expression; we show that this feature is not always robust to methodological changes. Additionally, topological properties of hubs do not in general correlate with co-expression. Thus, we suggest that a date/party dichotomy is not meaningful and it might be more useful to conceive of roles for protein-protein interactions rather than individual proteins. We find significant correlations between interaction centrality and the functional similarity of the interacting proteins. Comment: 27 pages, 5 main figures, 4 supplementary figures

arXiv.org e-Print Archive

Directory of Open Access Journals

Oxford University Research Archive

Spiral - Imperial College Digital Repository