PolyU Institutional Repository

The Hong Kong Polytechnic University Pao Yue-kong Library

Visualization of the protein-coding regions with a self adaptive spectral rotation approach

Author: Akhtar
Anastassiou
Azad
Bennetzen
Berthelsen
Bo Chen
Borodovsky
Burge
Cao
Cebrat
Chang
Claverie
Do
Dodin
Fickett
Frenkel
Gao
Haimovich
Henderson
Jiang
Kotlar
Li
Masoom
Olson
Orlov
Peng
Ping Ji
Ré
Salzberg
Staden
Stanke
Te Boekhorst
Tiwari
Tuqan
Voss
Yan
Yin
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2011

PolyU Institutional Repository

Springer - Publisher Connector

Periodicity of DNA in exons

Author: Eskesen Frank N
Eskesen Stephen T
Kinghorn Brian
Ruvinsky Anatoly
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: The periodic pattern of DNA in exons is a known phenomenon. It has been suggested that one of the initial causes of this periodicity could be the universal (RNY)n pattern (R = A or G, Y = C or U, N = any base) of ancient RNA. This paper addresses two major questions. First, the cause of DNA periodicity was investigated by comparing real and simulated coding sequences. Second, DNA periodicity was quantified using an evolutionary algorithm, which had not previously been used for this purpose. RESULTS: Simulated coding sequences composed using codon usage frequencies only show DNA periodicity very similar to that observed in real exons. DNA periodicity disappears in the simulated sequences when the codon frequencies become equal. Frequencies of the nucleotides (and the dinucleotide AG) at each location along phase 0 exons were calculated for C. elegans, D. melanogaster and H. sapiens. Two models were fitted to these data, with the key objective of describing periodicity. For both models, the best-fit curves closely matched the actual data points. The first model, with dynamic period determination, consistently generated a value very close to a period of 3 nucleotides. The second, fixed-period model kept the period exactly equal to 3, as expected, and this did not detract from its goodness of fit. CONCLUSIONS: DNA periodicity in exons is determined by codon usage frequencies. It is essential to differentiate between DNA periodicity itself and the length of the period, which equals 3. Periodicity itself results from certain combinations of codons with different frequencies typical for a species. The period length of 3, in contrast, is caused by the triplet nature of the genetic code. The models and evolutionary algorithm used for characterising DNA periodicity proved to be an effective tool for describing the periodicity pattern in a species when a number of exons in the same phase are analysed.
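The simulation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' evolutionary algorithm or their fitted models: it only draws codons independently from a usage table and compares the per-position frequency of one base under biased and equal codon usage. The preference table `POSITION_PREFS`, the helper names, and the use of all 64 triplets (stop codons are not excluded) are assumptions made for illustration.

```python
import itertools
import random

BASES = "ACGT"
CODONS = ["".join(t) for t in itertools.product(BASES, repeat=3)]

# Hypothetical per-position base preferences (illustrative, not measured data).
POSITION_PREFS = [
    {"A": 0.25, "C": 0.10, "G": 0.55, "T": 0.10},  # codon position 1
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # codon position 2
    {"A": 0.15, "C": 0.40, "G": 0.15, "T": 0.30},  # codon position 3
]

def biased_codon_usage():
    """Codon weights built as products of the hypothetical preferences above."""
    return {c: POSITION_PREFS[0][c[0]] * POSITION_PREFS[1][c[1]] * POSITION_PREFS[2][c[2]]
            for c in CODONS}

def uniform_codon_usage():
    """Equal weight for every triplet (simplification: stop codons included)."""
    return {c: 1.0 for c in CODONS}

def simulate_exons(usage, n_exons, n_codons, rng):
    """Draw codons independently according to `usage`; all exons start in phase 0."""
    codons = list(usage)
    weights = [usage[c] for c in codons]
    return ["".join(rng.choices(codons, weights=weights, k=n_codons))
            for _ in range(n_exons)]

def base_profile(exons, base):
    """Frequency of `base` at each location across equal-length exons."""
    length = len(exons[0])
    return [sum(e[i] == base for e in exons) / len(exons) for i in range(length)]

def frame_amplitude(profile):
    """Spread (max - min) of the mean frequency over the three codon positions."""
    means = [sum(profile[f::3]) / len(profile[f::3]) for f in range(3)]
    return max(means) - min(means)

rng = random.Random(0)
biased = frame_amplitude(
    base_profile(simulate_exons(biased_codon_usage(), 200, 100, rng), "G"))
uniform = frame_amplitude(
    base_profile(simulate_exons(uniform_codon_usage(), 200, 100, rng), "G"))
print(f"G amplitude, biased codon usage: {biased:.3f}")
print(f"G amplitude, equal codon usage:  {uniform:.3f}")
```

With biased usage the frequency of G differs markedly between the three codon positions, which is what produces a period-3 pattern along the exon. With equal codon frequencies only sampling noise remains, which matches the qualitative behaviour reported in the abstract.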

Localizing triplet periodicity in DNA and cDNA sequences

Author: AA Tsonis
AWC Liew
D Anastassiou
DL Black
G Gutierrez
I Daubechies
J Epps
J Sanchez
J Tuqan
JK Pickrell
JP Mena-Chalco
K Okamura
Lincoln D Stein
Liya Wang
M Stanke
M Yan
R Lewis
S Tiwari
TP George
WG Fairbrother
WJ Kent
YT Chan
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: The protein-coding regions (coding exons) of a DNA sequence exhibit a triplet periodicity (TP) because coding exons contain a series of three-nucleotide codons that encode specific amino acid residues. Such periodicity is usually not observed in introns and intergenic regions. If a DNA sequence is divided into small segments and a Fourier Transform is applied to each segment, a strong peak at frequency 1/3 is typically observed in the Fourier spectrum of coding segments, but not in non-coding regions. This property has been used to identify the locations of protein-coding genes in unannotated sequence. The method is fast and requires no training. However, the need to compute the Fourier Transform across a segment (window) of arbitrary size affects the accuracy with which one can localize TP boundaries. Here, we report a technique that provides higher-resolution identification of these boundaries, and use the technique to explore the biological correlates of TP regions in the genome of the model organism C. elegans. RESULTS: Using both simulated TP signals and the real C. elegans sequence F56F11 as an example, we demonstrate that (1) the Modified Wavelet Transform (MWT) can define the boundary of a TP region better than the conventional Short Time Fourier Transform (STFT); (2) the scale parameter (a) of the MWT determines the precision of TP boundary localization: larger values of a give sharper TP boundaries but result in a lower signal-to-noise ratio; (3) RNA splicing sites have weaker TP signals than coding regions; (4) TP signals in coding regions can be destroyed or recovered by frame-shift mutations; (5) 6 bp periodicities in introns and intergenic regions can generate false positive signals, which can be removed with a 6 bp MWT. CONCLUSIONS: The MWT can provide more precise TP boundaries than the STFT, and the boundaries can be further refined by a larger-scale MWT. Subtraction of 6 bp periodicity signals reduces the number of false positives. Experimentally introduced frame-shift mutations help recover TP signals that have been lost through possible ancient frame-shifts. More importantly, the TP signal has the potential to be used to detect the splice junctions in fully spliced mRNA sequence.
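The windowed Fourier approach that this abstract uses as its baseline can be sketched as follows. This is a plain sliding-window measurement of spectral power at frequency 1/3, not the Modified Wavelet Transform proposed by the authors. The window length of 351, the step of 3, the toy sequence layout, and all function names are assumptions chosen for illustration.

```python
import cmath
import random

BASES = "ACGT"
# exp(-2*pi*i*n/3) depends only on n mod 3, so three roots of unity suffice.
ROOTS = [cmath.exp(-2j * cmath.pi * r / 3) for r in range(3)]

def tp_power(segment):
    """Power at frequency 1/3, summed over the four base indicator sequences
    and divided by the segment length."""
    total = 0.0
    for base in BASES:
        coeff = sum(ROOTS[n % 3] for n, s in enumerate(segment) if s == base)
        total += abs(coeff) ** 2
    return total / len(segment)

def sliding_tp(seq, window=351, step=3):
    """List of (window start, power at 1/3) pairs along the sequence."""
    return [(start, tp_power(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]

def toy_sequence(rng, flank=600, coding=900):
    """Hypothetical test signal: uniform random flanks around a core whose
    base composition depends on the codon position."""
    prefs = [[0.10, 0.10, 0.70, 0.10],   # position 1 favours G (order: A, C, G, T)
             [0.25, 0.25, 0.25, 0.25],   # position 2 unbiased
             [0.10, 0.60, 0.10, 0.20]]   # position 3 favours C
    left = rng.choices(BASES, k=flank)
    core = [rng.choices(BASES, weights=prefs[i % 3])[0] for i in range(coding)]
    right = rng.choices(BASES, k=flank)
    return "".join(left + core + right)

rng = random.Random(1)
seq = toy_sequence(rng)
profile = sliding_tp(seq)
best_start, best_power = max(profile, key=lambda item: item[1])
print(f"strongest window starts at {best_start} with power {best_power:.1f}")
```

Windows that straddle the edge of the biased core mix coding and non-coding bases, so the power rises and falls gradually over roughly one window length. That smearing is the boundary-resolution limit of the fixed-window approach described in the abstract.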

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Mechanisms of Geomagnetic Field Influence on Gene Expression Using Influenza as a Model System: Basics of Physical Epidemiology

Author: Ponomarenko Andriy
Zaporozhan Valeriy
Publication date: 01/01/2010

Recent studies demonstrate distinct changes in gene expression in cells exposed to a weak magnetic field (MF). The mechanisms of this phenomenon are not yet understood. We propose that proteins of the Cryptochrome family (CRY) are "epigenetic sensors" of MF fluctuations, i.e., the magnetic field-sensitive part of the epigenetic controlling mechanism. It has been shown that CRY represses the activity of the major circadian transcriptional complex CLOCK/BMAL1. At the same time, the function of CRY is apparently highly responsive to weak MF because of radical pairs that periodically arise in the functionally active site of CRY and mediate the radical pair mechanism of magnetoreception. It is known that the circadian complex influences the function of every organ and tissue, including modulation of both NF-κB-dependent and glucocorticoid-dependent signaling pathways. Thus, MFs and solar cycle-dependent geomagnetic field fluctuations are capable of altering the expression of genes related to the function of NF-κB, hormones and other biological regulators. Notably, NF-κB, along with its significant role in the immune response, also participates in differential regulation of influenza virus RNA synthesis. The data presented suggest that in the case of global application (for example, the geomagnetic field), MF-mediated regulation may have epidemiological and other consequences.

Odessa National Medical University Institutional Repository

Patterns of nucleotides that flank substitutions in human orthologous genes

Author: Huang Zhuoran
Jiang Xiaoqian
Ma Lei
Tao Shiheng
Zhang Tingting
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Sequence context is an important aspect of base mutagenesis, and three-base periodicity is an intrinsic property of coding sequences. However, how three-base periodicity is influenced in the vicinity of substitutions is still unclear. The effect of context on mutagenesis should be revealed in the usage of nucleotides that flank substitutions. Relative entropy (also known as Kullback-Leibler divergence) is useful for finding unusual patterns in biological sequences. RESULTS: Using relative entropy, we visualized the periodic patterns in the context of substitutions in human orthologous genes. Neighbouring patterns differed both among substitution categories and within a category that occurred at three codon positions. Transitions tended to occur in periodic sequences relative to transversions. Periodic signals were stronger in a set of flanking sequences of substitutions that occurred at third-codon positions than in those that occurred at first- or second-codon positions. To determine how the three-base periodicity was affected near the substitution sites, we fitted a sine model to the values of the relative entropy. A sine of period equal to 3 is a good approximation for the three-base periodicity at sites not in close vicinity to some substitutions. These periods were interrupted near the substitution site and then reappeared away from substitutions. A comparative analysis between the native and codon-shuffled datasets suggested that the codon usage frequency was not the sole origin of the three-base periodicity, implying that the native order of codons also played an important role in this periodicity. Synonymous codon shuffling revealed that synonymous codon usage bias was one of the factors responsible for the observed three-base periodicity. CONCLUSIONS: Our results offer an efficient way to illustrate unusual periodic patterns in the context of substitutions and provide further insight into the origin of three-base periodicity. This periodicity is a result of the native codon order in the reading frame. The length of the period equal to 3 is caused by the usage bias of nucleotides in synonymous codons. The periodic features in nucleotides surrounding substitutions aid in further understanding genetic variation and nucleotide mutagenesis.
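The relative entropy measure named in this abstract can be sketched for a set of aligned flanking sequences. This is a minimal illustration, assuming a uniform background distribution and a tiny invented alignment. The authors' actual background model, data, and sine fitting are not reproduced here, and the function names are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.
    Terms with p_i = 0 contribute nothing; q_i must be > 0 wherever p_i > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def column_frequencies(seqs, i):
    """Base frequencies (order A, C, G, T) at alignment column i."""
    counts = Counter(s[i] for s in seqs)
    return [counts[b] / len(seqs) for b in BASES]

def entropy_profile(seqs, background):
    """Relative entropy of each alignment column against the background."""
    return [relative_entropy(column_frequencies(seqs, i), background)
            for i in range(len(seqs[0]))]

# Invented toy alignment: G is fixed at columns 0 and 3, i.e. every third site.
flanks = ["GACGTC", "GTCGAC", "GCTGGC", "GAAGCT"]
uniform_background = [0.25, 0.25, 0.25, 0.25]
profile = entropy_profile(flanks, uniform_background)
print([round(x, 3) for x in profile])
```

Columns where one base dominates score high (up to 2 bits against a uniform background), while columns with balanced usage score near zero. A profile of this kind, computed over real flanking sequences, is the sort of signal to which a period-3 sine model could then be fitted.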

Springer - Publisher Connector

Public Library of Science (PLOS)

On the Evolution of the Standard Genetic Code: Vestiges of Critical Scale Invariance from the RNA World in Current Prokaryote Genomes

Author: A Arneodo
BB Mandelbrot
BJ West
C Guerrier-Takada
C Woese
C.-K Peng
CJ Michel
CR Woese
D Arquès
D Sornette
DG Arquès
DJ Kenneth
E Szathmáry
EN Trifonov
F Jacob
FHC Crick
G Eriani
G Frey
GF Joyce
GM Nagel
H Herzel
HB Nicholas
I López-Villaseñor
J Konecny
J Maynard-Smith
JA García
JB Bassingthwaighte
JC Shepherd
JCW Shepherd
JEM Hornos
José A. García
JT Trevors
JTze-Fei Wong
Juan R. Bobadilla
K Kruger
K Wilson
L Ribas de Pouplana
LE Orgel
M Balter
M Delarue
M Eigen
Marco V. José
Mukund Thattai
MV José
N Paul
P Bernáola-Galván
P Nissen
PP Amaral
R Jolivet
R Sánchez
RD Knight
SJ Freeland
SN Rodin
SV Buldyrev
TH Jukes
Tzipe Govezensky
W Gilbert
WK Johnston
Publication venue: Public Library of Science
Publication date: 02/02/2009

Herein we derive two genetic codes through which the primeval RNA code could have given rise to the standard genetic code (SGC). One of them, called extended RNA code type I, consists of all codons of the type RNY (purine-any base-pyrimidine) plus codons obtained by considering the RNA code in the second (NYR type) and third (YRN type) reading frames. The extended RNA code type II comprises all codons of the type RNY plus codons that arise from transversions of the RNA code in the first (YNY type) and third (RNR type) nucleotide bases. To test whether putative nucleotide sequences in the RNA World and in both extended RNA codes share the same scaling and statistical properties as those encountered in current prokaryotes, we used the genomes of four Eubacteria and three Archaea. For each prokaryote, we obtained the respective genomes obeying the RNA code or the extended RNA codes types I and II. In each case, we estimated the scaling properties of triplet sequences via a renormalization group approach, and we calculated the frequency distributions of distances for each codon. Remarkably, the scaling properties of the distance series of some codons from the RNA code and most codons from both extended RNA codes turned out to be identical or very close to the scaling properties of codons of the SGC. To test the robustness of these results, we show, via computer simulation experiments, that random mutations of current genomes, at a rate of 10^-10 per site per year over three billion years, were not enough to destroy the observed patterns. Therefore, we conclude that most current prokaryotes may still contain relics of the primeval RNA World and that both extended RNA codes may well represent two plausible evolutionary paths between the RNA code and the current SGC.
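The codon sets described in this abstract can be enumerated directly from their patterns. The sketch below builds the RNA code and both extended codes as sets of triplets, and computes a distance series for one codon. Measuring distances in codons between successive in-frame occurrences is an assumption, since the abstract does not state the unit. The renormalization group analysis is not reproduced, and the function names are hypothetical.

```python
from itertools import product

# Nucleotide classes used in the abstract: R = purine, Y = pyrimidine, N = any base.
CLASSES = {"R": "AG", "Y": "CU", "N": "ACGU"}

def codons(pattern):
    """All triplets matching a three-letter pattern over the classes R, Y, N."""
    return {"".join(t) for t in product(*(CLASSES[c] for c in pattern))}

rna_code = codons("RNY")
# Type I: RNY plus the same code read in the second (NYR) and third (YRN) frames.
extended_type_1 = rna_code | codons("NYR") | codons("YRN")
# Type II: RNY plus transversions at the first (YNY) and third (RNR) bases.
extended_type_2 = rna_code | codons("YNY") | codons("RNR")

def codon_distances(seq, codon):
    """Distances, in codons (an assumed unit), between successive in-frame
    occurrences of `codon` in `seq`."""
    positions = [i // 3 for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] == codon]
    return [b - a for a, b in zip(positions, positions[1:])]

print(len(rna_code), len(extended_type_1), len(extended_type_2))  # 16 48 48
print(codon_distances("GCUAAAGCUGCU", "GCU"))                     # [2, 1]
```

The three patterns in each extended code are mutually exclusive (for example, RNY requires a pyrimidine in the third position while NYR requires a purine there), so each extended code contains 48 of the 64 possible triplets.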