
    Is There a Twelfth Protein-Coding Gene in the Genome of Influenza A? A Selection-Based Approach to the Detection of Overlapping Genes in Closely Related Sequences

    Protein-coding genes often contain long overlapping open-reading frames (ORFs), which may or may not be functional. Current methods that utilize the signature of purifying selection to detect functional overlapping genes are limited to the analysis of sequences from divergent species, thus rendering them inapplicable to genes found only in closely related sequences. Here, we present a method for the detection of selection signatures on overlapping reading frames by using closely related sequences, and apply the method to several known overlapping genes, and to an overlapping ORF on the negative strand of segment 8 of influenza A virus (NEG8), which has been suggested to be functional. We find no evidence that NEG8 is under selection, suggesting that the intact reading frame might be non-functional, although we cannot fully exclude the possibility that the method is not sensitive enough to detect the signature of selection acting on this gene. We present the limitations of the method using known overlapping genes and suggest several approaches to improve it in future studies. Finally, we examine alternative explanations for the sequence conservation of NEG8 in the absence of selection. We show that overlap type and genomic context affect the conservation of intact overlapping ORFs and should therefore be considered in any attempt to estimate the signature of selection in overlapping genes.

    A potentially novel overlapping gene in the genomes of Israeli acute paralysis virus and its relatives

    The Israeli acute paralysis virus (IAPV) is a honeybee-infecting virus that was found to be associated with colony collapse disorder. The IAPV genome contains two genes, encoding a structural and a nonstructural polyprotein. We applied a recently developed method for the estimation of selection in overlapping genes to detect purifying selection and, hence, functionality. We provide evolutionary evidence for the existence of a functional overlapping gene, which is translated in the +1 reading frame of the structural polyprotein gene. Conserved orthologs of this putative gene, which we provisionally call pog (predicted overlapping gene), were also found in the genomes of a monophyletic clade of dicistroviruses that includes IAPV, acute bee paralysis virus, Kashmir bee virus, and Solenopsis invicta (red imported fire ant) virus 1.

    Sex determination, longevity, and the birth and death of reptilian species

    Vertebrate sex-determining mechanisms (SDMs) are triggered by the genotype (GSD), by temperature (TSD), or occasionally, by both. The causes and consequences of SDM diversity remain enigmatic. Theory predicts SDM effects on species diversification, and lifespan effects on SDM evolutionary turnover. Yet, evidence is conflicting in clades with labile SDMs, such as reptiles. Here, we investigate whether SDM is associated with diversification in turtles and lizards, and whether alternative factors, such as lifespan's effect on transition rates, could explain the relative prevalence of SDMs in turtles and lizards (including and excluding snakes). We assembled a comprehensive dataset of SDM states for squamates and turtles and leveraged large phylogenies for these two groups. We found no evidence that SDMs affect turtle, squamate, or lizard diversification. However, SDM transition rates differ between groups. In lizards, TSD-to-GSD transitions surpass GSD-to-TSD transitions, explaining the predominance of GSD lizards in nature. SDM transitions are fewer in turtles and the rates are similar to each other (TSD-to-GSD equals GSD-to-TSD), which, coupled with TSD ancestry, could explain TSD's predominance in turtles. These contrasting patterns can be explained by differences in life history. Namely, our data support the notion that, in general, shorter lizard lifespan renders TSD detrimental, favoring GSD evolution in squamates, whereas turtle longevity permits TSD retention. Thus, based on the macro-evolutionary evidence we uncovered, we hypothesize that turtles and lizards followed different evolutionary trajectories with respect to SDM, likely mediated by differences in lifespan. Combined, our findings revealed a complex evolutionary interplay between SDMs and life histories that warrants further research, which should make use of expanded datasets on unexamined taxa to enable more conclusive analyses.

    Estimates of Positive Darwinian Selection Are Inflated by Errors in Sequencing, Annotation, and Alignment

    Published estimates of the proportion of positively selected genes (PSGs) in human vary over three orders of magnitude. In mammals, estimates of the proportion of PSGs cover an even wider range of values. We used 2,980 orthologous protein-coding genes from human, chimpanzee, macaque, dog, cow, rat, and mouse as well as an established phylogenetic topology to infer the fraction of PSGs in all seven terminal branches. The inferred fraction of PSGs ranged from 0.9% in human through 17.5% in macaque to 23.3% in dog. We found three factors that influence the fraction of genes that exhibit telltale signs of positive selection: the quality of the sequence, the degree of misannotation, and ambiguities in the multiple sequence alignment. The inferred fraction of PSGs in sequences that are deficient in all three criteria of coverage, annotation, and alignment is 7.2 times higher than that in genes with high trace sequencing coverage, “known” annotation status, and perfect alignment scores. We conclude that some estimates on the prevalence of positive Darwinian selection in the literature may be inflated and should be treated with caution

    A Method for the Simultaneous Estimation of Selection Intensities in Overlapping Genes

    Inferring the intensity of positive selection in protein-coding genes is important since it is used to shed light on the process of adaptation. Recently, it has been reported that overlapping genes, which are ubiquitous in all domains of life, seem to exhibit inordinate degrees of positive selection. Here, we present a new method for the simultaneous estimation of selection intensities in overlapping genes. We show that the appearance of positive selection is caused by assuming that selection operates independently on each gene in an overlapping pair, thereby ignoring the unique evolutionary constraints on overlapping coding regions. Our method uses an exact evolutionary model, thereby avoiding the need for approximation or intensive computation. We test the method by simulating the evolution of overlapping genes of different types as well as under diverse evolutionary scenarios. Our results indicate that the independent estimation approach leads to the false appearance of positive selection even though the gene is in reality subject to negative selection. Finally, we use our method to estimate selection in two influenza A genes for which positive selection was previously inferred. We find no evidence for positive selection in either case.

    Geographic variation in plant community structure of salt marshes: species, functional and phylogenetic perspectives.

    In general, community similarity is thought to decay with distance; however, this view may be complicated by the relative roles of different ecological processes at different geographical scales, and by the compositional perspective (e.g. species, functional group and phylogenetic lineage) used. Coastal salt marshes are widely distributed worldwide, but no studies have explicitly examined variation in salt marsh plant community composition across geographical scales, and from species, functional and phylogenetic perspectives. Based on studies in other ecosystems, we hypothesized that, in coastal salt marshes, community turnover would be more rapid at local versus larger geographical scales; and that community turnover patterns would diverge among compositional perspectives, with a greater distance decay at the species level than at the functional or phylogenetic levels. We tested these hypotheses in salt marshes of two regions: the southern Atlantic and Gulf Coasts of the United States. We examined the characteristics of plant community composition at each salt marsh site, how community similarity decayed with distance within individual salt marshes versus among sites in each region, and how community similarity differed among regions, using species, functional and phylogenetic perspectives. We found that results from the three compositional perspectives generally showed similar patterns: there was strong variation in community composition within individual salt marsh sites across elevation; in contrast, community similarity decayed with distance four to five orders of magnitude more slowly across sites within each region. Overall, community dissimilarity of salt marshes was lowest on the southern Atlantic Coast, intermediate on the Gulf Coast, and highest between the two regions. Our results indicated that local gradients are relatively more important than regional processes in structuring coastal salt marsh communities. Our results also suggested that in ecosystems with low species diversity, functional and phylogenetic approaches may not provide additional insight over a species-based approach.

    DoGFinder: a software for the discovery and quantification of readthrough transcripts from RNA-seq

    Background: Recent studies have described a widespread induction of transcriptional readthrough as a consequence of various stress conditions in mammalian cells. This novel phenomenon, initially identified from analysis of RNA-seq data, suggests intriguing new levels of gene expression regulation. However, the mechanism underlying naturally occurring transcriptional readthrough, as well as its regulatory consequences, remain elusive. Furthermore, the readthrough response to stress has thus far not been investigated outside of mammalian species, and the occurrence of readthrough in many physiological and disease conditions remains to be explored. Results: To facilitate a wider investigation into transcriptional readthrough, we created the DoGFinder software package for the streamlined identification and quantification of readthrough transcripts, also known as DoGs (Downstream of Gene-containing transcripts), from any RNA-seq dataset. Using DoGFinder, we explore the dependence of DoG discovery potential on RNA-seq library depth, and show that discovery of stress-induced readthrough is robust to sequencing depth and input parameter settings. We further demonstrate the use of the DoGFinder software package on a new publicly available RNA-seq dataset, and discover DoG induction in human PME cells following hypoxia, a previously unknown readthrough-inducing stress type. Conclusions: DoGFinder will enable users to explore, in a few simple steps, the readthrough phenomenon in any condition and organism. DoGFinder is freely available at https://github.com/shalgilab/DoGFinder

    Same-strand overlapping genes in bacteria: compositional determinants of phase bias

    Background: Same-strand overlapping genes may occur in frameshifts of one (phase 1) or two nucleotides (phase 2). In previous studies of bacterial genomes, long phase-1 overlaps were found to be more numerous than long phase-2 overlaps. This bias was explained by either genomic location or an unspecified selection advantage. Models that focused on the ability of the two genes to evolve independently did not predict this phase bias. Here, we propose that a purely compositional model explains the phase bias in a more parsimonious manner. Same-strand overlapping genes may arise through either a mutation at the termination codon of the upstream gene or a mutation at the initiation codon of the downstream gene. We hypothesized that given these two scenarios, the frequencies of initiation and termination codons in the two phases may determine the number of overlapping genes. Results: We examined the frequencies of initiation and termination codons in the two phases, and found that termination codons do not significantly differ between the two phases, whereas initiation codons are more abundant in phase 1. We found that the primary factors explaining the phase inequality are the frequencies of amino acids whose codons may combine to form start codons in the two phases. We show that the frequencies of start codons in each of the two phases, and, hence, the potential for the creation of overlapping genes, are determined by a universal amino-acid frequency and species-specific codon usage, leading to a correlation between long phase-1 overlaps and genomic GC content. Conclusion: Our model explains the phase bias in same-strand overlapping genes by compositional factors without invoking selection. Therefore, it can be used as a null model of neutral evolution to test selection hypotheses concerning the evolution of overlapping genes. Reviewers: This article was reviewed by Bill Martin, Itai Yanai, and Mikhail Gelfand.
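
    The comparison of start- and stop-codon frequencies between the two phases, as described in the abstract above, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `count_codons_in_phase`, the toy sequence, and the choice of ATG/GTG/TTG as start codons and TAA/TAG/TGA as stop codons are assumptions made here for illustration. The sketch simply reads a coding sequence at an offset of 0, 1, or 2 nucleotides from its own frame and counts the triplets that match a given codon set.

```python
# Illustrative sketch only; names and codon sets are assumptions,
# not taken from the paper.
START_CODONS = frozenset({"ATG", "GTG", "TTG"})  # assumed bacterial start codons
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})   # standard stop codons


def count_codons_in_phase(seq, phase, codons):
    """Count triplets belonging to `codons` when `seq` is read
    starting `phase` nucleotides (0, 1, or 2) into its own frame."""
    if phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1, or 2")
    seq = seq.upper()
    count = 0
    # Step through non-overlapping triplets of the shifted frame.
    for i in range(phase, len(seq) - 2, 3):
        if seq[i:i + 3] in codons:
            count += 1
    return count


if __name__ == "__main__":
    toy_gene = "AATGGCATGA"  # hypothetical sequence, in-frame at position 0
    for phase in (1, 2):
        starts = count_codons_in_phase(toy_gene, phase, START_CODONS)
        stops = count_codons_in_phase(toy_gene, phase, STOP_CODONS)
        print(f"phase {phase}: starts={starts} stops={stops}")
```

    Applied to every annotated gene in a genome, counts of this kind would give per-phase start- and stop-codon frequencies, which could then be compared against the observed numbers of phase-1 and phase-2 overlaps.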