Translation conditional models for protein coding sequences

Authors: Mathé Catherine; Rodolphe François
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2000

A coding sequence is defined as a DNA sequence coding the primary structure of a protein (a polypeptide). Such a sequence must satisfy a specific constraint: it must code a functional protein. As the genetic code is degenerate, there exists, for a given polypeptide, a set of synonymous sequences which would code the same polypeptide. Translation conditional models are defined on such sets. The aim of this paper is to give a common formalism. Besides the codon bias model, a few other conditional models will be defined. Statistical estimators and comparison methods will be briefly presented. These models can be used for gene classification, or to find remarkable features in a real sequence. An example will be presented on Escherichia coli genes.
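The idea of a model defined on the set of synonymous sequences can be illustrated with a minimal Python sketch. It assumes the standard genetic code and treats codons as independent given their amino acid, which is one simple reading of a codon bias model and not necessarily the paper's formalism. The names `count_synonymous`, `log_prob_given_protein` and `UNIFORM_USAGE` are hypothetical, introduced only for this illustration.

```python
from itertools import product
from math import log, prod

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string (length assumed to be a multiple of 3)."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_synonymous(protein):
    """Number of DNA sequences that code exactly this polypeptide."""
    return prod(len(SYNONYMS[aa]) for aa in protein)


def log_prob_given_protein(dna, usage):
    """Log-probability of dna given its translation, with usage[codon] = P(codon | amino acid)."""
    return sum(log(usage[dna[i:i + 3]]) for i in range(0, len(dna), 3))


# Assumed baseline: every synonymous codon equally likely (no codon bias).
UNIFORM_USAGE = {c: 1 / len(SYNONYMS[aa]) for c, aa in CODON_TO_AA.items()}
```

With uniform usage, every sequence in the synonymous set receives probability 1 / `count_synonymous(protein)`; a biased usage table shifts probability mass within that set without changing the encoded polypeptide.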

Ghent University Academic Bibliography

Translation Conditional Models for Protein Coding Sequences

Authors: Catherine Mathé; François Rodolphe
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Metabolic constraints on the evolution of genetic codes: Did multiple 'preaerobic' ecosystem transitions entrain richer dialects via Serial Endosymbiosis?

Author: Rodrick Wallace
Publication date: 12/01/2010

A mathematical model based on Tlusty's topological deconstruction suggests that multiple punctuated ecosystem shifts in available metabolic free energy, broadly akin to the 'aerobic' transition, enabled a punctuated sequence of increasingly complex genetic codes and protein translators under mechanisms similar to the Serial Endosymbiosis effecting the Eukaryotic transition. These evolved until the ancestor to the present narrow spectrum of nearly maximally robust codes became locked-in by path dependence.

Nature Precedings

A Rate Distortion approach to protein symmetry

Author: Rodrick Wallace
Publication date: 14/03/2010

A spontaneous symmetry breaking argument is applied to the problem of protein form, via a Rate Distortion analysis of the relation between genome coding and the final condensation of the protein 'molten globule'. The Rate Distortion Function, under coding constraints, serves as a temperature analog, so that low values act to drive proteins to simple symmetries. The Rate Distortion Function itself is significantly constrained by the availability of metabolic free energy. This work extends Tlusty's (2007) elegant exploration of the evolution of the genetic code, suggesting that rate distortion considerations may play a critical role across a broad spectrum of molecular expressions of evolutionary process.

Nature Precedings

The glucocorticoid receptor in inflammatory processes : transrepression is not enough

Authors: Dejager Lien; Hübner Sabine; Libert Claude; Tuckermann Jan P
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2015

Glucocorticoids (GCs) are the most commonly used anti-inflammatory agents to treat inflammatory and immune diseases. However, steroid therapies are accompanied by severe side-effects during long-term treatment. The dogma that transrepression of genes, by tethering of the glucocorticoid receptor (GR) to DNA-bound pro-inflammatory transcription factors, is the main anti-inflammatory mechanism, is now challenged. Recent discoveries using conditional GR mutant mice and genomic approaches reveal that transactivation of anti-inflammatory acting genes is essential to suppress many inflammatory disease models. This novel view radically changes the concept to design selective acting GR ligands with a reduced side-effect profile.

Ghent University Academic Bibliography

Archivsystem Ask23

Codon Bias Patterns of $E.coli$'s Interacting Proteins

Authors: Cimini Giulio; Deiana Antonio; Dilucca Maddalena; Giansanti Andrea; Semmoloni Andrea
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

Synonymous codons, i.e., DNA nucleotide triplets coding for the same amino acid, are used differently across the variety of living organisms. The biological meaning of this phenomenon, known as codon usage bias, is still controversial. In order to shed light on this point, we propose a new codon bias index, CompAI, that is based on the competition between cognate and near-cognate tRNAs during translation, without being tuned to the usage bias of highly expressed genes. We perform a genome-wide evaluation of codon bias for E. coli, comparing CompAI with other widely used indices: tAI, CAI, and Nc. We show that CompAI and tAI capture similar information by being positively correlated with gene conservation, measured by ERI, and essentiality, whereas CAI and Nc appear to be less sensitive to evolutionary-functional parameters. Notably, the rate of variation of tAI and CompAI with ERI makes it possible to obtain sets of genes that consistently belong to specific clusters of orthologous genes (COGs). We also investigate the correlation of codon bias at the genomic level with the network features of protein-protein interactions in E. coli. We find that the most densely connected communities of the network share a similar level of codon bias (as measured by CompAI and tAI). Conversely, a small difference in codon bias between two genes is, statistically, a prerequisite for the corresponding proteins to interact. Importantly, among all codon bias indices, CompAI turns out to have the most coherent distribution over the communities of the interactome, pointing to the significance of competition among cognate and near-cognate tRNAs for explaining codon usage adaptation.
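The abstract does not define CompAI, so it cannot be reproduced here. As a point of reference, the sketch below computes CAI, one of the comparison indices named above, in its commonly used form (Sharp and Li): the geometric mean of each codon's frequency relative to the most frequent synonymous codon in a reference gene set. The function names and the toy reference set are hypothetical. Skipping codons absent from the reference is a simplifying assumption, since practical implementations usually apply a pseudocount instead.

```python
from collections import Counter
from itertools import product
from math import exp, log

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(dna):
    """Split an in-frame DNA string into codons, dropping any trailing partial codon."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """w[codon] = count(codon) / count of the most used synonymous codon in the reference set."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    best = Counter()  # highest codon count seen for each amino acid
    for codon, n in counts.items():
        aa = CODON_TO_AA[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TO_AA[codon]] for codon, n in counts.items()}


def cai(gene, weights):
    """Geometric mean of w over the gene's codons (Met, Trp and stop codons are excluded)."""
    logs = [log(weights[c]) for c in codons(gene)
            if c in weights and CODON_TO_AA[c] not in "MW*"]
    return exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set (made up for illustration): CTG is used three times, CTA once.
WEIGHTS = relative_adaptiveness(["CTGCTGCTGCTA"])
```

Because the weights are tuned to a reference set, typically highly expressed genes, CAI depends on that choice, which is the dependence the abstract says CompAI avoids.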

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Archivio della ricerca della Scuola IMT Alti Studi Lucca

ART

Archivio della ricerca- Università di Roma La Sapienza

IMT Institutional Repository

FigShare

Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome

Authors: Alvarez-Valin Fernando; Basterrech Sebastián; Guerberoff Gustavo; Mesa Andrea
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/10/2015

The article presents an application of Hidden Markov Models (HMMs) for pattern recognition on genome sequences. We apply HMMs to identify genes encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma brucei (T. brucei) and other African trypanosomes. These parasitic protozoa are the causative agents of sleeping sickness and several diseases in domestic and wild animals. These parasites have a peculiar strategy to evade the host's immune system that consists in periodically changing their predominant cellular surface protein (VSG). The motivation for using pattern recognition methods to identify these genes, instead of traditional homology-based ones, is that the levels of sequence identity (amino acid and DNA sequence) amongst these genes are often below what is considered reliable in these methods. Among pattern recognition approaches, HMMs are particularly suitable to tackle this problem because they can handle more naturally the determination of gene edges. We evaluate the performance of the model using different numbers of states in the Markov model, as well as several performance metrics. The model is applied using public genomic data. Our empirical results show that the VSG genes of T. brucei can be safely identified (high sensitivity and low rate of false positives) using HMMs.

Comment: Accepted in July 2015 in Pattern Analysis and Applications, Springer. The article contains 23 pages, 4 figures, 8 tables and 51 references.
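How an HMM can determine gene edges may be sketched with a two-state Viterbi decoder in Python. This is an illustration under stated assumptions, not the authors' model: the two states, all probabilities, and the helper names `viterbi` and `gene_segments` are invented here, and the abstract reports experiments with varying numbers of states.

```python
from math import log

# Hypothetical two-state model: a GC-rich "gene" state and a uniform "background" state.
STATES = ("background", "gene")
START = {"background": 0.9, "gene": 0.1}
TRANS = {"background": {"background": 0.9, "gene": 0.1},
         "gene": {"background": 0.1, "gene": 0.9}}
EMIT = {"background": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
        "gene": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}


def viterbi(seq):
    """Most probable state path for a non-empty DNA string, computed in log space."""
    scores = {s: log(START[s]) + log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for sym in seq[1:]:
        prev, scores, pointers = scores, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + log(TRANS[r][s]))
            scores[s] = prev[best] + log(TRANS[best][s]) + log(EMIT[s][sym])
            pointers[s] = best
        back.append(pointers)
    state = max(STATES, key=scores.get)
    path = [state]
    for pointers in reversed(back):  # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]


def gene_segments(path):
    """Half-open (start, end) intervals of consecutive positions labelled 'gene'."""
    segments, start = [], None
    for i, state in enumerate(list(path) + ["background"]):
        if state == "gene" and start is None:
            start = i
        elif state != "gene" and start is not None:
            segments.append((start, i))
            start = None
    return segments
```

For example, `gene_segments(viterbi("AT" * 4 + "GC" * 6 + "AT" * 4))` locates the GC-rich stretch as the interval `(8, 20)`: the edges fall where the cost of switching state is outweighed by the better emission fit.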

arXiv.org e-Print Archive

DSpace at VSB Technical University of Ostrava

Bacterial riboproteogenomics : the era of N-terminal proteoform existence revealed

Authors: Fijalkowska Daria; Fijalkowski Igor; Van Damme Petra; Willems Patrick
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2020

With the rapid increase in the number of sequenced prokaryotic genomes, relying on automated gene annotation became a necessity. Multiple lines of evidence, however, suggest that current bacterial genome annotations may contain inconsistencies and are incomplete, even for so-called well-annotated genomes. We here discuss underexplored sources of protein diversity and new methodologies for high-throughput genome re-annotation. The expression of multiple molecular forms of proteins (proteoforms) from a single gene, particularly driven by alternative translation initiation, is gaining interest as a prominent contributor to bacterial protein diversity. In consequence, riboproteogenomic pipelines were proposed to comprehensively capture proteoform expression in prokaryotes by the complementary use of (positional) proteomics and the direct readout of translated genomic regions using ribosome profiling. To complement these discoveries, tailored strategies are required for the functional characterization of newly discovered bacterial proteoforms.

Ghent University Academic Bibliography

Maximum entropy models capture melodic styles

Authors: Loreto Vittorio; Pachet François; Sakellariou Jason; Tria Francesca
Publication date: 11/10/2016

We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of the musical corpus which was used to train it. Instead of using the n-body interactions of (n-1)-order Markov models, traditionally used in automatic music generation, we use a k-nearest neighbour model with pairwise interactions only. In that way, we keep the number of parameters low and avoid over-fitting problems typical of Markov models. We show that long-range musical phrases don't need to be explicitly enforced using high-order Markov interactions, but can instead emerge from multiple, competing, pairwise interactions. We validate our Maximum Entropy model by contrasting how much the generated sequences capture the style of the original corpus without plagiarizing it. To this end we use a data-compression approach to discriminate the levels of borrowing and innovation featured by the artificial sequences. The results show that our modelling scheme outperforms both fixed-order and variable-order Markov models. This shows that, despite being based only on pairwise interactions, this Maximum Entropy scheme opens the possibility to generate musically sensible alterations of the original phrases, providing a way to generate innovation.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza