16 research outputs found
Integration of retroviral DNA and generation of short direct repeats flanking the provirus.
<p>(A) DNA breaking and joining steps during integration. Viral and target DNA strands are represented by thick black and parallel lines, respectively, and the viral long terminal repeats (LTRs) are depicted as grey boxes. Nucleotides at the top and bottom strands are denoted by uppercase and lowercase letters, respectively. During 3′-end processing, IN removes two nucleotides from the 3′ end of each strand of linear viral DNA so that the viral 3′ ends terminate with a conserved CA dinucleotide. Closed arrowheads denote the positions of strand transfer, a concerted cleavage-ligation reaction during which IN makes a staggered break in the target DNA. Host DNA repair enzymes fill in the resulting single-stranded gaps, denoted by D1 to D4 in the upper strand and d1 to d4 in the lower strand of target DNA, and remove the two unpaired nucleotides at the 5′ ends of the viral DNA (open arrowheads), thereby generating the short direct repeats flanking the provirus. (B) A potential pathway for generating a base transversion in the short direct repeat during XMRV integration. A coordinated integration of the two viral ends occurred at the 4-bp staggered positions, as depicted by the closed arrowheads. During repair of the single-stranded gap adjacent to the upstream LTR, an adenine nucleotide was introduced at the D4 position either by misincorporation or by aberrant processing of the unpaired AA dinucleotide at the viral 5′ end. Subsequent repair of the mismatch resulted in the observed transversion (denoted by bold type).</p>
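The way a staggered cut plus gap repair yields a short direct repeat can be illustrated with a minimal sketch. It models only the top strand, assumes an error-free repair, and uses a hypothetical helper name (`integrate`) and a placeholder provirus string; none of these come from the original figure.

```python
def integrate(target: str, provirus: str, pos: int, stagger: int = 4) -> str:
    """Return the top strand after integration and gap repair (illustrative).

    The top strand is cut just before index `pos` (between -1 and D1) and the
    bottom strand `stagger` bases downstream (between D4 and +1). Filling the
    two single-stranded gaps copies target[pos:pos + stagger] on both sides
    of the provirus, which produces the direct repeat.
    """
    if not 0 <= pos <= len(target) - stagger:
        raise ValueError("staggered cut must lie within the target sequence")
    repeat = target[pos:pos + stagger]  # positions D1..D4
    return target[:pos] + repeat + provirus + repeat + target[pos + stagger:]


# Placeholder provirus (lowercase) flanked by the conserved TG...CA ends.
product = integrate("AAACGTGTTT", "tgnnnnca", pos=3)
print(product)
```

In the printed product, the uppercase 4-bp sequence `CGTG` appears once on each side of the lowercase provirus, which corresponds to the duplication described in panel A.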
Base composition surrounding XMRV integration sites.
<p>Base compositions of the 4-bp target site duplication (positions D1 to D4; demarcated by the thick vertical lines) and 10 bp upstream (positions −1 to −10) and downstream (positions +1 to +10) of the direct repeat were calculated. The datasets include the 13 integration sites with a correct 4-bp direct repeat (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0010255#pone-0010255-t001" target="_blank">Table 1</a>), 472 integration sites from acutely infected DU145 cells (GenBank accession numbers EU981292 to EU981799) and 14 integration sites from human prostate cancer tissues (GenBank accession numbers EU981800 to EU981813) <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0010255#pone.0010255-Kim1" target="_blank">[14]</a>. Integration occurs between positions −1 and D1 on the top strand, and between positions D4 and +1 on the bottom strand (blue arrows). Any base at a position that is significantly overrepresented relative to the random dataset (<i>P</i><0.0001) is highlighted in green, while any base at a position that is significantly underrepresented relative to the random dataset (<i>P</i><0.0001) is highlighted in red.</p>
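A per-position base composition of this kind can be sketched as follows. The sketch assumes each integration site has already been extracted as a fixed-width top-strand window (10 bp upstream, D1 to D4, 10 bp downstream); the function names and the toy windows are hypothetical, and the comparison against a random dataset is not reproduced here.

```python
from collections import Counter


def position_labels(flank: int = 10, repeat: int = 4) -> list[str]:
    """Labels -10..-1, D1..D4, +1..+10 (for the default sizes)."""
    return ([str(-i) for i in range(flank, 0, -1)]
            + [f"D{i}" for i in range(1, repeat + 1)]
            + [f"+{i}" for i in range(1, flank + 1)])


def base_composition(windows: list[str], flank: int = 10, repeat: int = 4):
    """Fraction of A, C, G and T at each labelled position across all windows."""
    labels = position_labels(flank, repeat)
    if not windows:
        raise ValueError("at least one window is required")
    if any(len(w) != len(labels) for w in windows):
        raise ValueError("every window must span flank + repeat + flank bases")
    composition = {}
    for idx, label in enumerate(labels):
        counts = Counter(w[idx].upper() for w in windows)
        composition[label] = {b: counts[b] / len(windows) for b in "ACGT"}
    return composition


# Toy example with 2-bp flanks so the windows stay short.
toy = base_composition(["AACGTGTT", "CACGTATT"], flank=2, repeat=4)
print(toy["D1"], toy["D4"])
```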
Positions of XMRV integration sites and lengths of the target site sequence duplication.
<p>*The nucleotide position corresponds to the position of viral DNA insertion at the top strand of the chromosome indicated. The symbols + and − within the parentheses indicate that the orientation of viral transcription is the same as or opposite to, respectively, the polarity of the top strand. GenBank accession numbers for the integration site sequences are GU816075 to GU816104.</p><p>†The left LTR of the provirus contains a 5-bp deletion that includes the conserved CA dinucleotide at the viral end.</p><p>ĻThe target DNA contains a T to A transversion immediately adjacent to the left LTR (position 4).</p>
Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality
<div><p>Viruses often encode proteins with multiple functions due to their compact genomes. Existing approaches to identify functional residues largely rely on sequence conservation analysis. Inferring functional residues from sequence conservation can produce false positives, in which the conserved residues are functionally silent, or false negatives, where functional residues are not identified since they are species-specific and therefore non-conserved. Furthermore, the tedious process of constructing and analyzing individual mutations limits the number of residues that can be examined in a single study. Here, we developed a systematic approach to identify the functional residues of a viral protein by coupling experimental fitness profiling with protein stability prediction using the influenza virus polymerase PA subunit as the target protein. We identified a significant number of functional residues that were influenza type-specific and were evolutionarily non-conserved among different influenza types. Our results indicate that type-specific functional residues are prevalent and may not otherwise be identified by sequence conservation analysis alone. More importantly, this technique can be adapted to any viral (and potentially non-viral) protein where structural information is available.</p></div>
Structure-function relationship of residue 281.
<p>(A) The interaction of influenza A PA with the RNA phosphate backbone located between bases 3 and 4 is shown. RNA is colored in green. PA is colored in cyan. Hydrogen bonds are represented by dotted black lines. Numbering of residue position is based on A/WSN/33. Conversion of residue position numbering is described in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.s018" target="_blank">S3 Table</a>. (B) The interaction of influenza B PA with the RNA phosphate backbone located between bases 3 and 4 is shown. RNA is colored in green. PA is colored in cyan. Hydrogen bonds are represented by dotted black lines. Numbering of residue position is based on A/WSN/33. Conversion of residue position numbering is described in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.s018" target="_blank">S3 Table</a>.</p>
Sequence entropy analysis.
<p>(A) Distribution of sequence entropy for functional residues, structural residues, and “other” residues. (B) Distribution of dN/dS for functional residues, structural residues, and “other” residues. (C) Sequence entropy, dN/dS, the natural consensus residue, FRcons category, and FRsubtype category are shown for the validated functional residues in this study. The dashed line indicates the median value across the entire PA segment. For FRcons and FRsubtype, we considered a residue with a category of ≥ 8 as a hit (a total of 72 residues were identified as hits by each of these two methods).</p>
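Per-residue sequence entropy is commonly computed as the Shannon entropy of each alignment column. The sketch below assumes that definition with a base-2 logarithm, which the caption does not state; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column must contain at least one residue")
    # p * log2(1/p) keeps every term non-negative.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences: list[str]) -> list[float]:
    """Entropy of every column in a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Toy alignment: column 1 is conserved, column 2 is split evenly between K and R.
print(alignment_entropy(["MK", "MK", "MR", "MR"]))
```

A fully conserved column scores 0 bits, and entropy rises as more residues share a position, which is why low entropy is often read as conservation.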
Fitness profiling of PA influenza virus polymerase subunit.
<p>(A) Correlations of log<sub>10</sub> relative frequency of individual point mutations between replicates are shown. Relative frequency<sub><i>mutation</i><i>i</i></sub> = (Occurrence frequency<sub><i>mutation</i><i>i</i></sub>)/(Occurrence frequency<sub><i>WT</i></sub>). (B) Log<sub>10</sub> RF indices for silent mutations, nonsense mutations, and missense mutations are shown as histograms. Point mutations located in the 5′-terminal 400 bp and 3′-terminal 400 bp regions are not included in this analysis to avoid complication by the vRNA packaging signal [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.ref093" target="_blank">93</a>, <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.ref094" target="_blank">94</a>]. (C) The locations of the PA C-terminal domain and the PA N-terminal domain are shown as white boxes. The locations of the mutated regions in each mutant library are shown as green boxes. Log<sub>10</sub> RF indices for individual point mutations are plotted across the PA gene. Each point mutation is color coded as in panel B. Purple: silent mutations; Cyan: nonsense mutations; Brown: missense mutations. A smooth curve was fitted by loess and plotted for each point mutation type.</p>
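The relative frequency defined in panel A can be sketched as follows. The sketch assumes that "occurrence frequency" means a read count divided by the total read count of the same sample; the function names and the example counts are hypothetical.

```python
import math


def relative_frequency(mutation_reads: int, wt_reads: int, total_reads: int) -> float:
    """Occurrence frequency of a mutation divided by that of wild type (WT)."""
    if total_reads <= 0 or wt_reads <= 0:
        raise ValueError("total_reads and wt_reads must be positive")
    mutation_freq = mutation_reads / total_reads
    wt_freq = wt_reads / total_reads
    return mutation_freq / wt_freq


def log10_relative_frequency(mutation_reads: int, wt_reads: int, total_reads: int) -> float:
    """log10 of the relative frequency; undefined for mutations with zero reads."""
    if mutation_reads <= 0:
        raise ValueError("log10 is undefined for a mutation with zero reads")
    return math.log10(relative_frequency(mutation_reads, wt_reads, total_reads))


# Hypothetical counts: 50 mutant reads and 5,000 WT reads out of 100,000 reads.
print(relative_frequency(50, 5_000, 100_000))
print(log10_relative_frequency(50, 5_000, 100_000))
```

Because both frequencies share the same denominator within a sample, the ratio reduces to mutant reads over WT reads; the total is kept explicit only to mirror the caption's wording.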
Construction of the mutant libraries.
<p>(A) A schematic representation of the fitness profiling experiment is shown. A 240 bp insert was generated by error-prone PCR and BsaI digestion. The corresponding vector was generated by high-fidelity PCR and BsmBI digestion. Each of the nine plasmid libraries in this study consists of ∼50,000 clones. Each viral mutant library was rescued by transfecting ∼35 million 293T cells. Each infection was performed with ∼10 million A549 cells. (B) A schematic representation of the sequencing library preparation is shown. DNA plasmid mutant library or viral cDNA was used for PCR. This PCR amplified the 240 bp randomized region. The amplicon product was then digested with BpmI, end-repaired, dA-tailed, ligated to sequencing adapters, and sequenced using the Illumina MiSeq platform. BpmI digestion removed the primer region in the amplicon PCR, resulting in sequencing reads covering only the barcode for multiplex sequencing and the 240 bp region that was randomized in the mutant library. With this experimental design, the number of mutations carried by individual genomes in the mutant libraries could be precisely determined.</p>
Structural analysis of putative functional residues.
<p>(A) The location of a putative functional subdomain is shown on the structure of the influenza polymerase heterotrimeric complex (PDB: 4WSB) [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.ref064" target="_blank">64</a>]. For PA, residues are colored according to the scheme presented in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.g004" target="_blank">Fig 4</a>. A putative host determinant residue, S552, is colored in magenta. Note that residue 559 carries an arginine [R] instead of a lysine [K] on the PA of A/WSN/33. (B) The effects of different PA point mutations on influenza polymerase activity were measured using an influenza A virus-inducible luciferase reporter assay [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005310#pgen.1005310.ref063" target="_blank">63</a>]. Error bars represent the standard deviation of three biological replicates. (C) The expression level of each C-terminal Flag-tagged PA mutant or WT was tested by immunoblot analysis. Actin expression served as a loading control.</p>
Systematic identification of functional residues.
<p>(A) Predicted ΔΔG for each point mutation is plotted against the log<sub>10</sub> RF index. The horizontal green line represents the RF index cutoff used in this study, RF index = 0.15. For the N-terminal domain, the Spearman’s rank correlation between log<sub>10</sub> RF index and predicted ΔΔG is -0.20 (P = 1.3e<sup>−4</sup>). For the C-terminal domain, the Spearman’s rank correlation between log<sub>10</sub> RF index and predicted ΔΔG is -0.18 (P = 6.8e<sup>−10</sup>). (B) The distributions of relative SASA are shown for residues that carried at least one substitution of interest (RF index < 0.15 and a predicted ΔΔG < 0) and for residues that did not carry any substitutions of interest. (C) This analysis was performed on solvent-exposed residues (relative SASA > 0.2) that carried a deleterious mutation (RF index < 0.15). The pie chart shows the fraction of residues that carried a substitution of interest (ΔΔG < 0) and the fraction that did not (ΔΔG ≥ 0).</p>
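The filtering criteria quoted in panels B and C can be expressed as a short sketch. The thresholds (RF index < 0.15, predicted ΔΔG < 0, relative SASA > 0.2) are taken from the caption; the `Substitution` record, the function names, and the example values are hypothetical and do not reproduce the study's data or pipeline.

```python
from dataclasses import dataclass

RF_CUTOFF = 0.15    # RF index cutoff stated in the caption
SASA_CUTOFF = 0.2   # relative SASA above this counts as solvent exposed


@dataclass(frozen=True)
class Substitution:
    residue: int          # residue position
    rf_index: float       # measured RF index
    ddg: float            # predicted ddG (delta-delta-G)
    relative_sasa: float  # relative solvent-accessible surface area


def is_substitution_of_interest(s: Substitution) -> bool:
    """Deleterious in the fitness profile and with a predicted ddG below zero."""
    return s.rf_index < RF_CUTOFF and s.ddg < 0


def summarize_exposed(subs: list[Substitution]) -> tuple[int, int]:
    """Panel C style tally over exposed residues carrying a deleterious mutation.

    Returns (residues with a substitution of interest, residues without one).
    """
    has_hit: dict[int, bool] = {}
    for s in subs:
        if s.relative_sasa > SASA_CUTOFF and s.rf_index < RF_CUTOFF:
            has_hit[s.residue] = has_hit.get(s.residue, False) or s.ddg < 0
    with_hit = sum(has_hit.values())
    return with_hit, len(has_hit) - with_hit


# Hypothetical values for illustration only.
example = [
    Substitution(10, 0.05, -0.5, 0.4),  # exposed, deleterious, ddG < 0
    Substitution(10, 0.10, 1.2, 0.4),   # same residue, ddG >= 0
    Substitution(20, 0.01, 2.0, 0.5),   # exposed, deleterious, ddG >= 0
    Substitution(30, 0.02, -1.0, 0.1),  # buried, excluded
    Substitution(40, 0.90, -1.0, 0.6),  # not deleterious, excluded
]
print(summarize_exposed(example))
```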