1,035 research outputs found

Mutational Biases and Selective Forces Shaping the Structure of Arabidopsis Genes

Author: AB Rose
AE Vinogradov
Andrea Porceddu
AO Urrutia
BC Meyers
BM Bolstad
C Anselmi
C Seoighe
CI Castillo-Davis
D Wegmann
Domenico Rau
E Eisenberg
HH Le
J Colinas
JM Comeron
KR Bradnam
M Kreitman
M Schmid
MJ Lercher
MZ Radic
N Carels
Pär K. Ingvarsson
Salvatore Camiolo
SW Li
T Mourier
XY Ren
Publication venue: Public Library of Science
Publication date: 27/07/2009

Recently, features of gene expression profiles have been associated with structural parameters of gene sequences in organisms representing a diverse set of taxa. The emerging picture indicates that natural selection, mediated by gene expression profiles, has a significant role in determining genic structures. However, the situation is less clear in plants, as the available data indicate that the effect of natural selection mediated by gene expression is very weak. Moreover, the direction of the patterns in plants appears to contradict those observed in animal genomes. In the present work we analyzed expression data for >18,000 Arabidopsis genes retrieved from public datasets obtained with different technologies (MPSS and high-density chip arrays) and compared them with gene parameters. Our results show that the impact of natural selection mediated by expression on gene sequences is significant and distinguishable from the effects of regional mutational biases. In addition, we provide evidence that the level and the breadth of gene expression are related in opposite ways to many structural parameters of gene sequences. Higher levels of expression abundance are associated with smaller transcripts, consistent with the need to reduce costs of both transcription and translation. Expression breadth, however, shows a contrasting pattern, i.e. longer genes have higher breadth of expression, possibly to ensure those structural features associated with gene plasticity. Based on these results, we propose that the specific balance between these two selective forces plays a significant role in shaping the structure of Arabidopsis genes.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Random Nature of Genome Architecture: Predicting Open Reading Frame Distributions

Author: AE Vinogradov
Andrew P. Allen
BM Bolker
Cecile Fairhead
D Charif
G Bernardi
J Zhang
James F. Gillooly
JL Oliver
JP Kastenmayer
JS Hawkins
KP Burnham
L Cottret
L Loewe
M Lynch
M Pagel
M Skovgaard
M Todinov
MA Basrai
MG Kidwell
Michael W. McCoy
N Siew
P Carpena
P Senapathy
RDC Team
RE Barlow
S Engen
SB Carroll
SN Wood
SR Wessler
SV Yi
T Cavalier-Smith
TR Gregory
V Daubin
Publication venue: Public Library of Science
Publication date: 01/01/2009

Background: A better understanding of the size and abundance of open reading frames (ORFs) in whole genomes may shed light on the factors that control genome complexity. Here we examine the statistical distributions of open reading frames (i.e. distribution of start and stop codons) in the fully sequenced genomes of 297 prokaryotes and 14 eukaryotes. Methodology/Principal Findings: By fitting mixture models to data from whole genome sequences we show that the size-frequency distributions for ORFs are strikingly similar across prokaryotic and eukaryotic genomes. Moreover, we show that (i) a large fraction (60–80%) of ORF size-frequency distributions can be predicted a priori with a stochastic assembly model based on GC content, and that (ii) size-frequency distributions of the remaining “non-random” ORFs are well-fitted by log-normal or gamma distributions, and similar to the size distributions of annotated proteins. Conclusions/Significance: Our findings suggest stochastic processes have played a primary role in the evolution of genome complexity, and that common processes govern the conservation and loss of functional genomic units in both prokaryotes and eukaryotes.
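
As a rough illustration of how GC content alone can predict the sizes of random ORFs, the sketch below assumes independent bases with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2. Under that assumption the stop codons TAA, TAG and TGA occur at a fixed per-codon rate, so the length of a random ORF follows a geometric distribution. This is a simplified stand-in and not the stochastic assembly model used in the paper; all function names are hypothetical.

```python
def stop_codon_probability(gc: float) -> float:
    """Per-codon probability of TAA, TAG or TGA, assuming independent bases."""
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be in [0, 1]")
    p_at = (1.0 - gc) / 2.0  # P(A) = P(T)
    p_gc = gc / 2.0          # P(G) = P(C)
    # P(TAA) + P(TAG) + P(TGA)
    return p_at ** 3 + 2.0 * p_at ** 2 * p_gc


def random_orf_length_pmf(gc: float, n_codons: int) -> float:
    """P(the first stop codon is codon number n_codons): geometric distribution."""
    if n_codons < 1:
        raise ValueError("n_codons must be >= 1")
    p = stop_codon_probability(gc)
    return (1.0 - p) ** (n_codons - 1) * p


def mean_random_orf_length(gc: float) -> float:
    """Expected number of codons up to and including the first stop codon."""
    p = stop_codon_probability(gc)
    return float("inf") if p == 0.0 else 1.0 / p
```

At GC = 0.5 the stop probability is 3/64, the familiar "3 stop codons out of 64" figure. Because all three stop codons are AT-rich, GC-rich genomes are expected to carry longer random ORFs under this toy model.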

CiteSeerX

Public Library of Science (PLOS)

Crossref

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

PubMed Central

Macquarie University ResearchOnline

A Structural Split in the Human Genome

Author: A Wagner
AB Reams
AE Vinogradov
Clara S.M. Tang
CS Tang
CT Nguyen
E Beutler
F Antequera
FA Feltus
Guillaume Bourque
H Philippe
JE Horvath
JS Mattick
K Jabbari
K Lin
K Yusa
M Lynch
ME Brun
NG Smith
P Caiafa
P Dimitri
PA Jones
R Frankham
R Kurek
Richard J. Epstein
RJ Epstein
RR Copley
SF Wolf
SL Rogers
T Yoshikawa
TJ Meza
YC Li
Publication venue: Public Library of Science
Publication date: 01/01/2007

Background: Promoter-associated CpG islands (PCIs) mediate methylation-dependent gene silencing, yet tend to co-locate to transcriptionally active genes. To address this paradox, we used data mining to assess the behavior of PCI-positive (PCI+) genes in the human genome. Results: PCI+ genes exhibit a bimodal distribution: (1) a 'housekeeping-like' subset characterized by higher GC content and lower intron length/number, and (2) a 'pseudogene paralog' subset characterized by lower GC content and higher intron length/number (p<0.001). These subsets are functionally distinguishable, with the former gene group characterized by higher expression levels and lower evolutionary rate (p<0.001). PCI-negative (PCI-) genes exhibit higher evolutionary rate and narrower expression breadth than PCI+ genes (p<0.001), consistent with more frequent tissue-specific inactivation. Conclusions: Adaptive evolution of the human genome appears driven in part by declining transcription of a subset of PCI+ genes, predisposing to both CpG→TpA mutation and intron insertion. We propose a model of evolving biological complexity in which environmentally-selected gains or losses of PCI methylation respectively favor positive or negative selection, thus polarizing PCI+ gene structures around a genomic core of ancestral PCI- genes. © 2007 Tang, Epstein.

Public Library of Science (PLOS)

Intergenic and Genic Sequence Lengths Have Opposite Relationships with Respect to Gene Expression

Author: A Bar-Even
A Gondor
A Taddei
AE Vinogradov
AO Urrutia
Borislav Iordanov
C Seoighe
CE Nelson
CI Castillo-Davis
D Walther
DL Mace
E Eisenberg
F Chiaromonte
F Mignone
Gil Bohrer
H Le Hir
J Colinas
JC Pinheiro
JR Newman
JS Mattick
Juan Valcarcel
Juliette Colinas
JY Lee
K Birnbaum
M Gaszner
MP Levesque
MQ Zhang
Philip N. Benfey
S Cai
Scott C. Schmidler
SR Searle
T Nawy
XY Ren
Publication venue: Public Library of Science
Publication date: 01/01/2008

Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length, which was interpreted as selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether the measure is tissue variability, median expression, or expression noise. To explain these results we propose a model in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression.
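
The three expression measures named in this abstract can be sketched for a single gene as follows. The abstract does not say how variability was quantified, so the coefficient of variation used here is an assumption, and the function names and input layout are hypothetical.

```python
from statistics import mean, median, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed variability measure)."""
    m = mean(values)
    return pstdev(values) / m if m else float("nan")


def expression_summaries(replicates_by_tissue):
    """Summarize one gene given {tissue: [replicate expression values]}.

    Returns (level, tissue_variability, noise):
      level              - median of the per-tissue mean expression
      tissue_variability - CV of the per-tissue means (between tissues)
      noise              - mean CV across replicates within each tissue
    """
    tissue_means = [mean(v) for v in replicates_by_tissue.values()]
    level = median(tissue_means)
    tissue_variability = coefficient_of_variation(tissue_means)
    noise = mean([coefficient_of_variation(v) for v in replicates_by_tissue.values()])
    return level, tissue_variability, noise
```

With such per-gene summaries, each measure could then be correlated against intergenic and genic sequence lengths, which is the kind of comparison the abstract describes.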

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

DukeSpace

Predicting Housekeeping Genes Based on Fourier Analysis

Author: AE Vinogradov
AI Su
BM Bolstad
Bo Dong
BR Kim
CD Eller
D Karolchik
E Eisenberg
G Rustici
HJ de Jonge
J Ye
J Zhu
JA Warrington
Jen-Tsan Ashley Chi
KS Pollard
Li Liu
LL Breeden
LL Hsiao
M Ashburner
MJ Lawson
ML Whitfield
Peng Zhang
Runsheng Chen
S Greer
Shunmin He
T Yamada
U de Lichtenberg
X Ge
Xiaowei Chen
Yunfei Wang
Publication venue: Public Library of Science

Housekeeping genes (HKGs) generally have fundamental functions in basic biochemical processes in organisms, and usually have relatively steady expression levels across various tissues. They play an important role in the normalization of microarray data. Using Fourier analysis, we transformed gene expression time series from a HeLa cell cycle gene expression dataset into Fourier spectra, and designed an effective computational method for discriminating between HKGs and non-HKGs using the support vector machine (SVM) supervised learning algorithm, which can extract significant features of the spectra, providing a basis for identifying specific gene expression patterns. Using our method we identified 510 human HKGs, and then validated them by comparison with two independent sets of tissue expression profiles. Results showed that our predicted HKG set is more reliable than three previously identified sets of HKGs.
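
To make the Fourier step concrete, the sketch below turns an expression time series into a power spectrum with a plain discrete Fourier transform and reports how much power lies outside the zero-frequency (mean) term. It is only an illustration of the kind of spectral feature an SVM could be trained on: the SVM step is omitted, the toy profiles are made-up numbers, and the function names are hypothetical rather than taken from the paper.

```python
import cmath
import math


def power_spectrum(series):
    """Return |DFT|^2 for frequency indices 0..N//2 of a real-valued series."""
    n = len(series)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(series)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def non_dc_power_fraction(series):
    """Fraction of one-sided spectral power outside the zero-frequency term."""
    spec = power_spectrum(series)
    total = sum(spec)
    return 0.0 if total == 0 else sum(spec[1:]) / total


# Toy profiles (made-up values): steady expression vs. one oscillation per series.
steady = [10.0, 10.1, 9.9, 10.0, 10.1, 9.9, 10.0, 10.0]
periodic = [10.0 + 5.0 * math.sin(2 * math.pi * t / 8) for t in range(8)]
```

A steady, housekeeping-like profile concentrates almost all of its power at frequency zero, whereas a cell-cycle-like profile shows a clear peak at a non-zero frequency.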

Crossref

Directory of Open Access Journals

PubMed Central

Nucleosome DNA sequence structure of isochores

Author: AE Rapoport
AE Vinogradov
CE Shannon
DA Denisov
Edward N Trifonov
EN Trifonov
F Salih
G Bernardi
G Mengeritsky
HR Chung
I Gabdank
M Costantini
M Kato
S Kogan
T Bettecken
Thomas Bettecken
VB Zhurkin
Zakharia M Frenkel
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results: Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show only very weak CG periodicity. Conclusions: The nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and differs between isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.
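
A dinucleotide periodicity analysis of the kind mentioned in this abstract can be approximated by counting the distances between occurrences of the same dinucleotide and looking for peaks at regular spacings. The sketch below is a minimal version under that assumption; it is not the authors' procedure, and the function name and the `max_dist` parameter are hypothetical.

```python
from collections import Counter


def dinucleotide_distance_counts(seq: str, dinuc: str, max_dist: int = 30) -> Counter:
    """Count distances (in bp) between all pairs of occurrences of `dinuc`."""
    seq = seq.upper()
    dinuc = dinuc.upper()
    # Start positions of every occurrence of the dinucleotide, in ascending order.
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, start in enumerate(positions):
        for other in positions[idx + 1:]:
            dist = other - start
            if dist > max_dist:
                break  # positions are sorted, so later ones are even farther away
            counts[dist] += 1
    return counts
```

For a sequence in which CG recurs every 10 bases, the counts peak at distances 10, 20 and 30 and are zero in between; in real isochore sequences the signal would appear as a modest enrichment on top of background counts.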

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Nanotechnology approaches to crossing the blood-brain barrier and drug delivery to the CNS

Author: A Friese
AE Gulyaev
AM Ercolini
B Dupas
C Olbrich
C Rousselle
E Peira
GA Silva
Gabriel A Silva
I Brigger
J Kreuter
JR Kanwar
P Calvo
RN Alyaudtin
RN Alyautdin
S Pathak
SC Steiniger
SS Feng
SV Vinogradov
U Schroeder
Y Zhang
Publication venue: Springer Science and Business Media LLC

Crossref

Flow cytometric determination of genome size in European sunbleak Leucaspius delineatus (Heckel, 1843)

Author: A Boron
A Homatowska
A Valcarcel
AE Vinogradov
AJR Hickey
CI Castillo-Davis
D Juchno
DK Lamatsch
DW Hedley
F Foresti
Grzegorz Tylko
H Swarup
IL Vindelow
J Ciudad
JN Falco
JS Taylor
M Lynch
Marta Filipiak
MR Pie
PC Fenerich
R Hertwig
R Hinergardner
S Peruzzi
SC Le Comber
TL MacIntire
TR Gregory
Wincenty Kilarski
Publication venue: Springer Netherlands
Publication date: 01/01/2011

The aim of this study was to compare DNA content in hepatocyte and erythrocyte nuclei of the European sunbleak, Leucaspius delineatus, in relation to nuclear and cell size by means of flow cytometry and fluorescence microscopy. The DNA standards, chicken and rainbow trout erythrocytes, were prepared in parallel with both cell types, with initial separation of liver cells in pepsin solution followed by cell filtering. Standards and investigated cells were stained with a mixture of propidium iodide, citric acid, and Nonidet P40 in the presence of RNase, and fluorescence of at least 50,000 nuclei was analyzed by flow cytometry. Average cell size was determined by flow cytometry, using fresh cell suspension in relation to latex beads of known diameter. The size of nuclei was examined on the basis of digital micrographs obtained by fluorescence microscopy after nuclei staining with DAPI. The sunbleak’s erythrocyte nuclei contain 2.25 ± 0.06 pg of DNA, whereas the hepatocyte nuclei contain 2.46 ± 0.06 pg of DNA. This difference in DNA content was determined spectroscopically using isolated DNA from the two cell types. The modal diameters of the erythrocytes and hepatocytes were estimated to be 5.1 ± 0.2 and 22.3 ± 5.0 μm, respectively, and the corresponding modal dimensions of their nuclei (measured as surface area) were 15.2 and 21.4 μm², respectively. The nucleoplasmic index, as calculated from diameters estimated from surface area of nuclear profiles, was 2.51 for the erythrocytes compared with 0.08 for hepatocytes.

Crossref

Springer - Publisher Connector

PubMed Central

Jagiellonian University Repository

Relationship between amino acid composition and gene expression in the mouse genome

Background: Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. Findings: We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of the variation in expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference. Conclusion: These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

Author: A Marchler-Bauer
A Mortazavi
A Nekrutenko
A Rambaut
A Resch
AA Patel
AB Carvalho
AE Vinogradov
AG Clark
AM McGuire
AN Ladd
AR Hatton
AS Chang
AV Alekseyenko
AV Philips
B Budagyan
B Modrek
B Prud'homme
BR Graveley
C Lee
C Notredame
CI Castillo-Davis
CL Fitzpatrick
CS Thummel
D Babushok
D Baek
D Gatfield
D Monroe
D Ortíz-Barrientos
DA Petrov
DG Gilbert
DI Nurminsky
DJ Kenan
DL Black
DL Swofford
E Betran
E Kim
E Wagner
EA Glazov
F-C Chen
FA Kondrashov
G Ast
G Lev-Maor
G Marais
GD Schuler
H Itoh
IA Swinburne
International Human Genome Sequencing Consortium
J Delaunay
J Felsenstein
J Rozas
JM Burnette
JM Comeron
JM Johnson
K Tamura
KL Fox-Walsh
KM Neugebauer
LF Lareau
M Clamp
M Guo
M Labrador
M Lynch
M Roy
M Talerico
MA Noor
MD Adams
MF Wilkinson
Mohamed AF Noor
MT Levine
Nikolai P Kandul
NJ Proudfoot
NM Kopelman
P Haddrill
PA Sharp
PJ Good
PM O'Grady
Q Pan
R Sorek
S Karlin
S Misra
S Richards
S-T Chen
SF Altschul
SM Berget
TE Royce
V Stolc
W Wang
WG Hill
Y Xing
Z Kan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution, likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, gene structure can constrain and/or promote gene evolution.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Caltech Authors