
    Bioinformatic analysis suggests that the Orbivirus VP6 cistron encodes an overlapping gene

    BACKGROUND: The genus Orbivirus includes several species that infect livestock – including Bluetongue virus (BTV) and African horse sickness virus (AHSV). These viruses have linear dsRNA genomes divided into ten segments, all of which have previously been assumed to be monocistronic. RESULTS: Bioinformatic evidence is presented for a short overlapping coding sequence (CDS) in the Orbivirus genome segment 9, overlapping the VP6 cistron in the +1 reading frame. In BTV, a 77–79 codon AUG-initiated open reading frame (hereafter ORFX) is present in all 48 segment 9 sequences analysed. The pattern of base variations across the 48-sequence alignment indicates that ORFX is subject to functional constraints at the amino acid level (even when the constraints due to coding in the overlapping VP6 reading frame are taken into account; MLOGD software). In fact the translated ORFX shows greater amino acid conservation than the overlapping region of VP6. The ORFX AUG codon has a strong Kozak context in all 48 sequences. Each has only one or two upstream AUG codons, always in the VP6 reading frame, and (with a single exception) always with weak or medium Kozak context. Thus, in BTV, ORFX may be translated via leaky scanning. A long (83–169 codon) ORF is present in a corresponding location and reading frame in all other Orbivirus species analysed except Saint Croix River virus (SCRV; the most divergent). Again, the pattern of base variations across sequence alignments indicates multiple coding in the VP6 and ORFX reading frames. CONCLUSION: At ~9.5 kDa, the putative ORFX product in BTV is too small to appear on most published protein gels. Nonetheless, a review of past literature reveals a number of possible detections. We hope that presentation of this bioinformatic analysis will stimulate an attempt to experimentally verify the expression and functional role of ORFX, and hence lead to a greater understanding of the molecular biology of these important pathogens.
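
The leaky-scanning argument above rests on comparing the Kozak context of the ORFX AUG with that of the upstream AUGs. The abstract does not say how "strong", "medium" and "weak" were scored; the sketch below assumes one widely used convention (a purine at position -3 and a G at position +4, counting the A of AUG as +1). The function names `kozak_strength` and `upstream_augs` are hypothetical and do not come from the paper or from MLOGD.

```python
def kozak_strength(mrna: str, aug_index: int) -> str:
    """Classify the context of the AUG whose A sits at aug_index (0-based).

    Assumed convention: strong = purine at -3 AND G at +4,
    medium = exactly one of the two, weak = neither.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    # Position -3 is three nucleotides before the A; +4 follows the G.
    minus3 = seq[aug_index - 3] if aug_index >= 3 else ""
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "medium"
    return "weak"


def upstream_augs(mrna: str, target_index: int):
    """List every AUG upstream of the target AUG.

    Returns (index, frame offset relative to the target, context) tuples;
    an offset of 0 means the upstream AUG is in the target's frame.
    """
    seq = mrna.upper().replace("T", "U")
    return [
        (i, (target_index - i) % 3, kozak_strength(seq, i))
        for i in range(target_index)
        if seq[i:i + 3] == "AUG"
    ]


# Toy sequence (not a real segment 9): a weak upstream AUG, then a strong one.
toy = "CCUUUCAUGUCAGCCGACCAUGGAA"
print(kozak_strength(toy, 19))
print(upstream_augs(toy, 19))
```

On a real segment 9 sequence, a scanning ribosome would be expected to bypass weak-context upstream AUGs more often, which is the pattern the abstract reports for BTV.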

    A case for a CUG-initiated coding sequence overlapping torovirus ORF1a and encoding a novel 30 kDa product

    The genus Torovirus (order Nidovirales) includes a number of species that infect livestock. These viruses have a linear positive-sense ssRNA genome of ~25-30 kb, encoding a large polyprotein that is expressed from the genomic RNA, and several additional proteins expressed from a nested set of 3'-coterminal subgenomic RNAs. In this brief report, we describe the bioinformatic discovery of a new, apparently coding, ORF that overlaps the 5' end of the polyprotein coding sequence, ORF1a, in the +2 reading frame. The new ORF has a strong coding signature and, in fact, is more conserved at the amino acid level than the overlapping region of ORF1a. We propose that the new ORF utilizes a non-AUG initiation codon - namely a conserved CUG codon in a strong Kozak context - upstream of the ORF1a AUG initiation codon, resulting in a novel 258 amino acid protein, dubbed '30K'

    Detecting overlapping coding sequences in virus genomes

    BACKGROUND: Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret – especially within overlapping genes; and viruses often employ non-canonical translational mechanisms – e.g. frameshifting, stop codon read-through, leaky-scanning and internal ribosome entry sites – which can conceal potentially coding open reading frames (ORFs). RESULTS: In a previous paper we introduced a new statistic – MLOGD (Maximum Likelihood Overlapping Gene Detector) – for detecting and analysing overlapping CDSs. Here we present (a) an improved MLOGD statistic, (b) a greatly extended suite of software using MLOGD, (c) a database of results for 640 virus sequence alignments, and (d) a web-interface to the software and database. Tests show that, from an alignment with just 20 mutations, MLOGD can discriminate non-overlapping CDSs from non-coding ORFs with a typical accuracy of up to 98%, and can detect CDSs overlapping known CDSs with a typical accuracy of 90%. In addition, the software produces a variety of statistics and graphics, useful for analysing an input multiple sequence alignment. CONCLUSION: MLOGD is an easy-to-use tool for virus genome annotation, detecting new CDSs – in particular overlapping or short CDSs – and for analysing overlapping CDSs following frameshift sites. The software, web-server, database and supplementary material are available at

    Bioinformatic analysis suggests that the Cypovirus 1 major core protein cistron harbours an overlapping gene

    Members of the genus Cypovirus (family Reoviridae) are common pathogens of insects. These viruses have linear dsRNA genomes divided into 10–11 segments, which have generally been assumed to be monocistronic. Here, bioinformatic evidence is presented for a short overlapping coding sequence (CDS) in the cypovirus genome segment encoding the major core capsid protein VP1, overlapping the 5'-terminal region of the VP1 ORF in the +1 reading frame. In Cypovirus type 1 (CPV-1), a 62-codon AUG-initiated open reading frame (hereafter ORFX) is present in all four available segment 1 sequences. The pattern of base variations across the sequence alignment indicates that ORFX is subject to functional constraints at the amino acid level (even when the constraints due to coding in the overlapping VP1 reading frame are taken into account; MLOGD software). In fact the translated ORFX shows greater amino acid conservation than the overlapping region of VP1. The genomic location of ORFX is consistent with translation via leaky scanning. A 62–64 codon AUG-initiated ORF is present in a corresponding location and reading frame in other available cypovirus sequences (2 CPV-14, 1 CPV-15) and an 87-codon ORFX homologue may also be present in Aedes pseudoscutellaris reovirus. The ORFX amino acid sequences are hydrophilic and basic, with between 12 and 16 Arg/Lys residues in each though, at 7.5–10.2 kDa, the putative ORFX product is too small to appear on typical published protein gels
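
As a rough illustration of what "an AUG-initiated ORF in the +1 reading frame" means in practice, the sketch below scans a sequence for such ORFs relative to a known main ORF. It only finds candidate ORFs; it does not reproduce the MLOGD conservation analysis described in the abstract. The function name `orfs_in_shifted_frame` and its parameters are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def orfs_in_shifted_frame(seq: str, main_start: int, shift: int = 1,
                          min_codons: int = 10):
    """Find AUG-initiated ORFs in the frame offset by `shift` from the main ORF.

    Returns (start, end, n_codons) tuples, where `end` is the index just past
    the stop codon and n_codons excludes the stop. Only the first AUG of each
    ORF is reported, and ORFs without a stop codon are ignored.
    """
    rna = seq.upper().replace("T", "U")
    found = []
    i = (main_start + shift) % 3  # first codon position in the shifted frame
    while i + 3 <= len(rna):
        if rna[i:i + 3] == "AUG":
            j = i
            # Walk codon by codon until a stop codon or the sequence end.
            while j + 3 <= len(rna) and rna[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(rna):
                break  # ran off the end without meeting a stop codon
            n_codons = (j - i) // 3
            if n_codons >= min_codons:
                found.append((i, j + 3, n_codons))
            i = j + 3  # resume scanning after the stop codon
        else:
            i += 3
    return found


# Toy sequence: the main ORF starts at 0; a 3-codon ORF sits in the +1 frame.
toy = "AUGCAUGAAACCCUAGCAUAA"
print(orfs_in_shifted_frame(toy, main_start=0, shift=1, min_codons=3))
```

For the ORFs discussed here, a threshold near `min_codons=60` would be a plausible starting point, since the reported ORFX lengths are 62 to 87 codons.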

    A conserved predicted pseudoknot in the NS2A-encoding sequence of West Nile and Japanese encephalitis flaviviruses suggests NS1' may derive from ribosomal frameshifting

    Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2A(N-term)-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'.

    A +1 ribosomal frameshifting motif prevalent among plant amalgaviruses.

    Sequence accessions attributable to novel plant amalgaviruses have been found in the Transcriptome Shotgun Assembly database. Sixteen accessions, derived from 12 different plant species, appear to encompass the complete protein-coding regions of the proposed amalgaviruses, which would substantially expand the size of genus Amalgavirus from 4 current species. Other findings include evidence for UUU_CGN as a +1 ribosomal frameshifting motif prevalent among plant amalgaviruses; for a variant version of this motif found thus far in only two amalgaviruses from solanaceous plants; for a region of α-helical coiled coil propensity conserved in a central region of the ORF1 translation product of plant amalgaviruses; and for conserved sequences in a C-terminal region of the ORF2 translation product (RNA-dependent RNA polymerase) of plant amalgaviruses, seemingly beyond the region of conserved polymerase motifs. These results additionally illustrate the value of mining the TSA database and others for novel viral sequences for comparative analyses. M.L.N. was supported in part by a subcontract from NIH grant 5R01GM033050-33. J.D.P. completed his work on this project during a lab rotation for the Ph.D. Training Program in Virology at Harvard University, Cambridge, MA, USA and was supported in part by NIH grant 2T32AI007245-31. A.E.F. was supported in part by the Wellcome Trust (grant 106207). This is the final version of the article. It first appeared from Elsevier via https://doi.org/10.1016/j.virol.2016.07.00
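
A motif written as UUU_CGN is read with the underscore marking the boundary between two codons of the upstream (ORF1) reading frame, so a candidate site is a UUU codon followed in frame by any CGN codon. The sketch below, a minimal illustration rather than the authors' pipeline, separates in-frame hits from out-of-frame occurrences of the same six nucleotides. The function name `find_uuu_cgn` and the assumption that the ORF1 start index is known are both hypothetical.

```python
import re


def find_uuu_cgn(seq: str, orf_start: int):
    """Locate UUU_CGN candidates relative to an ORF starting at orf_start.

    Returns (in_frame, all_hits): 0-based indices of the first U, where
    in_frame keeps only hits whose UUU is a codon of the ORF's frame.
    """
    rna = seq.upper().replace("T", "U")
    # The lookahead lets overlapping occurrences be reported as well.
    all_hits = [m.start() for m in re.finditer(r"(?=UUUCG[ACGU])", rna)]
    in_frame = [i for i in all_hits
                if i >= orf_start and (i - orf_start) % 3 == 0]
    return in_frame, all_hits


# Toy sequence: one in-frame UUU_CGA (index 3), one out-of-frame UUUCGC (index 11).
toy = "AUGUUUCGAGCUUUCGCUAA"
print(find_uuu_cgn(toy, orf_start=0))
```

An in-frame hit is only a candidate: the abstract's evidence also depends on conservation across the sixteen accessions, which this scan does not assess.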

    Translational autoregulation of BZW1 and BZW2 expression by modulating the stringency of start codon selection.

    The efficiency of start codon selection during ribosomal scanning in eukaryotic translation initiation is influenced by the context or flanking nucleotides surrounding the AUG codon. The levels of eukaryotic translation initiation factors 1 (eIF1) and 5 (eIF5) play critical roles in controlling the stringency of translation start site selection. The basic leucine zipper and W2 domain-containing proteins 1 and 2 (BZW1 and BZW2), also known as eIF5-mimic proteins, are paralogous human proteins containing C-terminal HEAT domains that resemble the HEAT domain of eIF5. We show that translation of mRNAs encoding BZW1 and BZW2 homologs in fungi, plants and metazoans is initiated by AUG codons in conserved unfavorable initiation contexts. This conservation is reminiscent of the conserved unfavorable initiation context that enables autoregulation of EIF1. We show that overexpression of BZW1 and BZW2 proteins enhances the stringency of start site selection, and that their poor initiation codons confer autoregulation on BZW1 and BZW2 mRNA translation. We also show that overexpression of these two proteins significantly diminishes the effect of overexpressing eIF5 on stringency of start codon selection, suggesting they antagonize this function of eIF5. These results reveal a surprising role for BZW1 and BZW2 in maintaining homeostatic stringency of start codon selection, and taking into account recent biochemical, genetic and structural insights into eukaryotic initiation, suggest a model for BZW1 and BZW2 function

    Discovery of frameshifting in Alphavirus 6K resolves a 20-year enigma.

    BACKGROUND: The genus Alphavirus includes several potentially lethal human viruses. Additionally, species such as Sindbis virus and Semliki Forest virus are important vectors for gene therapy, vaccination and cancer research, and important models for virion assembly and structural analyses. The genome encodes nine known proteins, including the small '6K' protein. 6K appears to be involved in envelope protein processing, membrane permeabilization, virion assembly and virus budding. In protein gels, 6K migrates as a doublet--a result that, to date, has been attributed to differing degrees of acylation. Nonetheless, despite many years of research, its role is still relatively poorly understood. RESULTS: We report that ribosomal -1 frameshifting, with an estimated efficiency of approximately 10-18%, occurs at a conserved UUUUUUA motif within the sequence encoding 6K, resulting in the synthesis of an additional protein, termed TF (TransFrame protein; approximately 8 kDa), in which the C-terminal amino acids are encoded by the -1 frame. The presence of TF in the Semliki Forest virion was confirmed by mass spectrometry. The expression patterns of TF and 6K were studied by pulse-chase labelling, immunoprecipitation and immunofluorescence, using both wild-type virus and a TF knockout mutant. We show that it is predominantly TF that is incorporated into the virion, not 6K as previously believed. Investigation of the 3' stimulatory signals responsible for efficient frameshifting at the UUUUUUA motif revealed a remarkable diversity of signals between different alphavirus species. CONCLUSION: Our results provide a surprising new explanation for the 6K doublet, demand a fundamental reinterpretation of existing data on the alphavirus 6K protein, and open the way for future progress in the further characterization of the 6K and TF proteins. 
    The results have implications for alphavirus biology, virion structure, viroporins, ribosomal frameshifting, and bioinformatic identification of novel frameshift-expressed genes, both in viruses and in cellular organisms.
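
To make concrete how a -1 frameshift at UUUUUUA yields a protein whose C-terminal residues come from the -1 frame, the sketch below translates a toy sequence with and without the shift. It assumes the commonly described tandem-slippage model, in which the two zero-frame codons of the motif (U UUU UUA) are decoded normally and the ribosome then resumes reading one nucleotide back; it also assumes the standard genetic code. The function names and the toy sequence are hypothetical, and the efficiency of frameshifting is not modelled.

```python
BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str, orf_start: int, slippery_motif=None) -> str:
    """Translate from orf_start until the first stop codon.

    If slippery_motif (a heptanucleotide) is given and occurs in frame as
    X_XXY_YYZ, reading resumes one nucleotide back (-1) after its last
    codon is decoded. At most one shift is applied.
    """
    rna = seq.upper().replace("T", "U")
    protein = []
    pos = orf_start
    shifted = False
    while pos + 3 <= len(rna):
        aa = CODON_TABLE[rna[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
        # The codon just decoded is YYZ if the motif began 4 nt upstream.
        at_site = (slippery_motif is not None and not shifted
                   and pos >= 4 and rna[pos - 4:pos + 3] == slippery_motif)
        if at_site:
            pos += 2  # a -1 slip: advance 2 nt instead of 3
            shifted = True
        else:
            pos += 3
    return "".join(protein)


# Toy sequence, not a real alphavirus 6K gene.
toy = "AUGGCUUUUUUACCGUAAGGUGAC"
print(translate(toy, 0))             # zero-frame product
print(translate(toy, 0, "UUUUUUA"))  # transframe product
```

In this toy case the two products share their N-terminal residues and diverge after the motif, mirroring the relationship the abstract describes between 6K and TF.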