
Springer - Publisher Connector

Digital Repository @ Iowa State University (ISU)

Is There a Twelfth Protein-Coding Gene in the Genome of Influenza A? A Selection-Based Approach to the Detection of Overlapping Genes in Closely Related Sequences

Author: Dan Graur
Jeffrey S Morris
Niv Sabath
Publication date: 02/04/2020

Abstract: Protein-coding genes often contain long overlapping open-reading frames (ORFs), which may or may not be functional. Current methods that use the signature of purifying selection to detect functional overlapping genes are limited to the analysis of sequences from divergent species, which renders them inapplicable to genes found only in closely related sequences. Here, we present a method for detecting selection signatures on overlapping reading frames by using closely related sequences. We apply the method to several known overlapping genes and to an overlapping ORF on the negative strand of segment 8 of influenza A virus (NEG8), which has been suggested to be functional. We find no evidence that NEG8 is under selection, suggesting that the intact reading frame might be non-functional, although we cannot fully exclude the possibility that the method is not sensitive enough to detect the signature of selection acting on this gene. We present the limitations of the method using known overlapping genes and suggest several approaches to improve it in future studies. Finally, we examine alternative explanations for the sequence conservation of NEG8 in the absence of selection. We show that overlap type and genomic context affect the conservation of intact overlapping ORFs and should therefore be considered in any attempt to estimate the signature of selection in overlapping genes.
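To make the notion of an "intact overlapping ORF on the negative strand" concrete, the following minimal sketch reverse-complements a sequence and reports the longest stop-free run of codons in each of its three frames. It is only an illustration of what an intact reading frame is, not the selection-based method the abstract describes. The function names, the use of the standard stop codons, and the toy sequence are assumptions made for this example.

```python
# Illustrative sketch only: finds stop-free stretches on the reverse strand.
# It does NOT estimate selection; all names here are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_stop_free_run(seq: str, frame: int) -> int:
    """Length, in codons, of the longest run without a stop codon in `frame`."""
    best = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0  # a stop codon interrupts the open frame
        else:
            current += 1
            best = max(best, current)
    return best


def negative_strand_open_frames(seq: str) -> dict:
    """Map each frame (0, 1, 2) of the reverse strand to its longest open run."""
    rc = reverse_complement(seq)
    return {frame: longest_stop_free_run(rc, frame) for frame in range(3)}


if __name__ == "__main__":
    toy = "TTATTTCAT"  # toy sequence; its reverse complement is ATGAAATAA
    print(negative_strand_open_frames(toy))
```

A long stop-free run found this way only shows that a frame is open. As the abstract stresses, deciding whether such a frame is functional requires evidence of selection, and conservation can have other explanations.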

CiteSeerX

Is there a twelfth protein-coding gene in the genome of influenza A? A selection-based approach to the detection of overlapping genes in closely related sequences

Author: Graur Dan
Morris Jeffrey S
Sabath Niv
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/12/2011

ZORA

Sex determination, longevity, and the birth and death of reptilian species

Author: Feldman Anat
Itescu Yuval
Mayrose Itay
Meiri Shai
Sabath Niv
Valenzuela Nicole
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016

Vertebrate sex-determining mechanisms (SDMs) are triggered by the genotype (GSD), by temperature (TSD), or occasionally, by both. The causes and consequences of SDM diversity remain enigmatic. Theory predicts SDM effects on species diversification, and life-span effects on SDM evolutionary turnover. Yet, evidence is conflicting in clades with labile SDMs, such as reptiles. Here, we investigate whether SDM is associated with diversification in turtles and lizards, and whether alternative factors, such as lifespan's effect on transition rates, could explain the relative prevalence of SDMs in turtles and lizards (including and excluding snakes). We assembled a comprehensive dataset of SDM states for squamates and turtles and leveraged large phylogenies for these two groups. We found no evidence that SDMs affect turtle, squamate, or lizard diversification. However, SDM transition rates differ between groups. In lizards, TSD-to-GSD transitions surpass GSD-to-TSD transitions, explaining the predominance of GSD lizards in nature. SDM transitions are fewer in turtles and the rates are similar to each other (TSD-to-GSD equals GSD-to-TSD), which, coupled with TSD ancestry, could explain TSD's predominance in turtles. These contrasting patterns can be explained by differences in life history. Namely, our data support the notion that, in general, shorter lizard lifespan renders TSD detrimental, favoring GSD evolution in squamates, whereas turtle longevity permits TSD retention. Thus, based on the macro-evolutionary evidence we uncovered, we hypothesize that turtles and lizards followed different evolutionary trajectories with respect to SDM, likely mediated by differences in lifespan. Combined, our findings reveal a complex evolutionary interplay between SDMs and life histories that warrants further research, which should make use of expanded datasets on unexamined taxa to enable more conclusive analyses.

Repository for Publications and Research Data

Estimates of Positive Darwinian Selection Are Inflated by Errors in Sequencing, Annotation, and Alignment

Author: Adrian Schneider
Alexander Souvorov
Anisimova
Arbiza
Bakewell
Cannarozzi
Clark
Dan Graur
Dessimoz
Gaston H. Gonnet
Gibbs
Giddy Landan
Gonnet
Hill
Hubbard
Hughes
Jorgensen
Kosiol
Landan
Li
Murphy
Niv Sabath
Rom
Schneider
Studer
Yang
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2009

Published estimates of the proportion of positively selected genes (PSGs) in human vary over three orders of magnitude. In mammals, estimates of the proportion of PSGs cover an even wider range of values. We used 2,980 orthologous protein-coding genes from human, chimpanzee, macaque, dog, cow, rat, and mouse, as well as an established phylogenetic topology, to infer the fraction of PSGs in all seven terminal branches. The inferred fraction of PSGs ranged from 0.9% in human through 17.5% in macaque to 23.3% in dog. We found three factors that influence the fraction of genes that exhibit telltale signs of positive selection: the quality of the sequence, the degree of misannotation, and ambiguities in the multiple sequence alignment. The inferred fraction of PSGs in sequences that are deficient in all three criteria of coverage, annotation, and alignment is 7.2 times higher than that in genes with high trace sequencing coverage, “known” annotation status, and perfect alignment scores. We conclude that some estimates of the prevalence of positive Darwinian selection in the literature may be inflated and should be treated with caution.

Public Library of Science (PLOS)

A Method for the Simultaneous Estimation of Selection Intensities in Overlapping Genes

Author: A Narechania
A Pavesi
AL Hughes
AM Pedersen
BG Barrell
CE Jones
Dan Graur
DC Krakauer
EC Holmes
F Lillo
Giddy Landan
H Okamoto
HL Zaaijer
I Makalowska
IB Rogozin
J Hein
J Montoya
J Zhang
JC Obenauer
KR Sakharkar
KS Li
L Campitelli
M Nei
N Goldman
Niv Sabath
Oliver G. Pybus
P Pamilo
PK Keese
PR Cooper
R Belshaw
R Nielsen
RA Smith
S de Groot
S Guyader
S McCauley
S Normark
SB Needleman
T Miyata
WH Li
Y Bao
Y Suzuki
Z Yang
ZI Johnson
Publication venue: Public Library of Science
Publication date: 01/01/2008

Inferring the intensity of positive selection in protein-coding genes is important because it sheds light on the process of adaptation. Recently, it has been reported that overlapping genes, which are ubiquitous in all domains of life, seem to exhibit inordinate degrees of positive selection. Here, we present a new method for the simultaneous estimation of selection intensities in overlapping genes. We show that the appearance of positive selection is caused by assuming that selection operates independently on each gene in an overlapping pair, thereby ignoring the unique evolutionary constraints on overlapping coding regions. Our method uses an exact evolutionary model, thereby obviating the need for approximation or intensive computation. We test the method by simulating the evolution of overlapping genes of different types as well as under diverse evolutionary scenarios. Our results indicate that the independent estimation approach leads to the false appearance of positive selection even though the gene is in reality subject to negative selection. Finally, we use our method to estimate selection in two influenza A genes for which positive selection was previously inferred. We find no evidence for positive selection in either case.

CiteSeerX

Public Library of Science (PLOS)

Geographic variation in plant community structure of salt marshes: species, functional and phylogenetic perspectives.

Author: A Astorga
AE Kunza
Alana-Rose Lynes
Amy Vargas
B Gilbert
B Zetler
BJ McGill
Bo Li
C Baraloto
CH Graham
CO Webb
Emily M. Zelig
Eran Elhaik
FW Judd
G Sullivan
G Vivian-Smith
H Morlon
H Tuomisto
HE Epstein
Hongyu Guo
Inder Jalli
J Cavender-Bares
J Oksanen
JC Duckworth
JC Nekola
JP Stout
K Cottenie
Kazimierz Więski
L Eleuterius
Laurie Marczak
LN Eleuterius
M Bahram
M Rasser
M Tessier
MD Bertness
MW Cadotte
N Rowe
Niv Sabath
NJB Kraft
O Purschke
P Adam
PP Garcillán
R Condit
RG Wiegert
RH Whittaker
S Pavoine
S Suchrow
SC Pennings
Scott A. Chamberlain
SP Hubbell
Steven C. Pennings
T Fukami
TJ Davies
TK Rajaniemi
VJ Chapman
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

In general, community similarity is thought to decay with distance; however, this view may be complicated by the relative roles of different ecological processes at different geographical scales, and by the compositional perspective (e.g. species, functional group and phylogenetic lineage) used. Coastal salt marshes are widely distributed worldwide, but no studies have explicitly examined variation in salt marsh plant community composition across geographical scales, and from species, functional and phylogenetic perspectives. Based on studies in other ecosystems, we hypothesized that, in coastal salt marshes, community turnover would be more rapid at local versus larger geographical scales; and that community turnover patterns would diverge among compositional perspectives, with a greater distance decay at the species level than at the functional or phylogenetic levels. We tested these hypotheses in salt marshes of two regions: the southern Atlantic and Gulf Coasts of the United States. We examined the characteristics of plant community composition at each salt marsh site, how community similarity decayed with distance within individual salt marshes versus among sites in each region, and how community similarity differed among regions, using species, functional and phylogenetic perspectives. We found that results from the three compositional perspectives generally showed similar patterns: there was strong variation in community composition within individual salt marsh sites across elevation; in contrast, community similarity decayed with distance four to five orders of magnitude more slowly across sites within each region. Overall, community dissimilarity of salt marshes was lowest on the southern Atlantic Coast, intermediate on the Gulf Coast, and highest between the two regions. Our results indicated that local gradients are relatively more important than regional processes in structuring coastal salt marsh communities. Our results also suggested that in ecosystems with low species diversity, functional and phylogenetic approaches may not provide additional insight over a species-based approach.

Lund University Publications

White Rose Research Online

FigShare

DoGFinder: a software for the discovery and quantification of readthrough transcripts from RNA-seq

Author: Niv Sabath
Reut Shalgi
Yuval Wiesel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2018

Abstract. Background: Recent studies have described a widespread induction of transcriptional readthrough as a consequence of various stress conditions in mammalian cells. This novel phenomenon, initially identified from analysis of RNA-seq data, suggests intriguing new levels of gene expression regulation. However, the mechanism underlying naturally occurring transcriptional readthrough, as well as its regulatory consequences, remain elusive. Furthermore, the readthrough response to stress has thus far not been investigated outside of mammalian species, and the occurrence of readthrough in many physiological and disease conditions remains to be explored. Results: To facilitate a wider investigation into transcriptional readthrough, we created the DoGFinder software package for the streamlined identification and quantification of readthrough transcripts, also known as DoGs (Downstream of Gene-containing transcripts), from any RNA-seq dataset. Using DoGFinder, we explore the dependence of DoG discovery potential on RNA-seq library depth, and show that the discovery of stress-induced readthrough is robust to sequencing depth and input parameter settings. We further demonstrate the use of the DoGFinder software package on a new publicly available RNA-seq dataset, and discover DoG induction in human PME cells following hypoxia, a previously unknown readthrough-inducing stress type. Conclusions: DoGFinder will enable users to explore, in a few simple steps, the readthrough phenomenon in any condition and organism. DoGFinder is freely available at https://github.com/shalgilab/DoGFinder

Same-strand overlapping genes in bacteria: compositional determinants of phase bias

Author: Graur Dan
Landan Giddy
Sabath Niv
Publication venue: BMC
Publication date: 01/01/2008

Abstract. Background: Same-strand overlapping genes may occur in frameshifts of one (phase 1) or two nucleotides (phase 2). In previous studies of bacterial genomes, long phase-1 overlaps were found to be more numerous than long phase-2 overlaps. This bias was explained by either genomic location or an unspecified selection advantage. Models that focused on the ability of the two genes to evolve independently did not predict this phase bias. Here, we propose that a purely compositional model explains the phase bias in a more parsimonious manner. Same-strand overlapping genes may arise through either a mutation at the termination codon of the upstream gene or a mutation at the initiation codon of the downstream gene. We hypothesized that, given these two scenarios, the frequencies of initiation and termination codons in the two phases may determine the number of overlapping genes. Results: We examined the frequencies of initiation and termination codons in the two phases, and found that termination codons do not significantly differ between the two phases, whereas initiation codons are more abundant in phase 1. We found that the primary factors explaining the phase inequality are the frequencies of amino acids whose codons may combine to form start codons in the two phases. We show that the frequencies of start codons in each of the two phases, and, hence, the potential for the creation of overlapping genes, are determined by a universal amino-acid frequency and species-specific codon usage, leading to a correlation between long phase-1 overlaps and genomic GC content. Conclusion: Our model explains the phase bias in same-strand overlapping genes by compositional factors without invoking selection. Therefore, it can be used as a null model of neutral evolution to test selection hypotheses concerning the evolution of overlapping genes. Reviewers: This article was reviewed by Bill Martin, Itai Yanai, and Mikhail Gelfand.
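The comparison of initiation- and termination-codon frequencies between phases can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the choice of ATG/GTG/TTG as start codons, and the convention that phase k means reading the sequence from an offset of k nucleotides are all assumptions made for this example.

```python
# Illustrative sketch only: counts start and stop codons in shifted frames.
# The codon sets and the phase convention are assumptions for this example.

START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: common bacterial starts
STOP_CODONS = {"TAA", "TAG", "TGA"}   # assumption: standard stop codons


def codons_in_phase(seq: str, phase: int) -> list:
    """Split `seq` into codons, reading from an offset of `phase` nucleotides."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(phase, len(seq) - 2, 3)]


def count_codons(seq: str, phase: int, codon_set: set) -> int:
    """Count codons from `codon_set` that occur in the given phase."""
    return sum(1 for codon in codons_in_phase(seq, phase) if codon in codon_set)


def phase_summary(seq: str) -> dict:
    """Start and stop codon counts for phase 1 and phase 2 of a coding sequence."""
    return {
        phase: {
            "starts": count_codons(seq, phase, START_CODONS),
            "stops": count_codons(seq, phase, STOP_CODONS),
        }
        for phase in (1, 2)
    }


if __name__ == "__main__":
    toy_gene = "ATGAAATGA"  # toy sequence, not real data
    print(phase_summary(toy_gene))
```

Summing such counts over all genes in a genome would give per-phase frequencies of the kind the abstract compares. Relating those frequencies to amino-acid composition and codon usage, as the abstract describes, would require further analysis that this sketch does not attempt.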

Springer - Publisher Connector