SDT: a virus classification tool based on pairwise sequence alignment and identity calculation
The perpetually increasing rate at which viral full-genome sequences are being determined is creating a pressing demand for computational tools that will aid the objective classification of these genome sequences. Taxonomic classification approaches that are based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, various issues with the calculation of such measures that could potentially undermine the accuracy and consistency with which they can be applied to virus classification. Firstly, pairwise sequence identities computed based on multiple sequence alignments rather than on multiple independent pairwise alignments can lead to the deflation of identity scores with increasing dataset sizes. Secondly, when gap characters need to be introduced during sequence alignments to account for insertions and deletions, methodological variations in the way that these characters are introduced and handled during pairwise genetic identity calculations can cause high degrees of inconsistency in the way that different methods classify the same sets of sequences. Here we present Sequence Demarcation Tool (SDT), a free user-friendly computer program that aims to provide a robust and highly reproducible means of objectively using pairwise genetic identity calculations to classify any set of nucleotide or amino acid sequences. SDT can produce publication-quality pairwise identity plots and colour-coded distance matrices to further aid the classification of sequences according to ICTV-approved taxonomic demarcation criteria. Besides a graphical interface version of the program for Windows computers, command-line versions of the program are available for a variety of different operating systems (including a parallel version for cluster computing platforms).
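To make the gap-handling issue described above concrete, the following minimal Python sketch computes identity for two sequences that are assumed to be already aligned to each other. The function name `pairwise_identity` and its `gaps` parameter are hypothetical illustrations and are not taken from SDT; the two modes simply show how one alignment can yield two different identity scores depending on whether gapped columns are skipped or counted as mismatches.

```python
def pairwise_identity(a: str, b: str, gaps: str = "ignore") -> float:
    """Identity between two already-aligned sequences of equal length.

    gaps="ignore":   columns where either sequence has a gap are skipped.
    gaps="mismatch": a gap opposite a residue is counted as a mismatch.
    Columns where both sequences have a gap are always skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if gaps not in ("ignore", "mismatch"):
        raise ValueError("gaps must be 'ignore' or 'mismatch'")

    matches = 0
    compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap carries no information
        if x == "-" or y == "-":
            if gaps == "mismatch":
                compared += 1  # count the column, but not as a match
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return matches / compared


seq_a = "ACGT--ACGT"
seq_b = "ACGTTTACGA"
print(pairwise_identity(seq_a, seq_b, gaps="ignore"))    # 7 of 8 columns match
print(pairwise_identity(seq_a, seq_b, gaps="mismatch"))  # 7 of 10 columns match
```

For this toy pair the score drops from 0.875 to 0.7 purely because of the gap convention, which is the kind of method-dependent inconsistency the abstract refers to.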
Patterns of recombination in HIV-1M are influenced by selection disfavouring the survival of recombinants with disrupted genomic RNA and protein structures
Genetic recombination is a major contributor to the ongoing diversification of HIV. It is clearly apparent that across the HIV genome there are defined recombination hot and cold spots which tend to co-localise both with genomic secondary structures and with either inter-gene boundaries or intra-gene domain boundaries. There is also good evidence that most recombination breakpoints that are detectable within the genes of natural HIV recombinants are likely to be minimally disruptive of intra-protein amino acid contacts and that these breakpoints should therefore have little impact on protein folding. Here we further investigate the impact on patterns of genetic recombination in HIV of selection favouring the maintenance of functional RNA and protein structures. We confirm that chimaeric Gag p24, reverse transcriptase, integrase, gp120 and Nef proteins that are expressed by natural HIV-1 recombinants have significantly lower degrees of predicted folding disruption than randomly generated recombinants. Similarly, we use a novel single-stranded RNA folding disruption test to show that there is significant, albeit weak, evidence that natural HIV recombinants tend to have genomic secondary structures that more closely resemble parental structures than do randomly generated recombinants. These results are consistent with the hypothesis that natural selection has acted both in the short term to purge recombinants with disrupted RNA and protein folds, and in the longer term to modify the genome architecture of HIV to ensure that recombination-prone sites correspond with those where recombination will be minimally deleterious.
The influence of secondary structure, selection and recombination on rubella virus nucleotide substitution rate estimates
BACKGROUND: Annually, rubella virus (RV) still causes severe congenital defects in around 100 000 children globally. An attempt to eradicate RV is currently underway and analytical tools to monitor the global decline of the last remaining RV lineages will be useful for assessing the effectiveness of this endeavour. RV evolves rapidly enough that much of this information might be inferable from RV genomic sequence data. METHODS: Using BEAST v1.8.0, we analysed publicly available RV sequence data to estimate genome-wide and gene-specific nucleotide substitution rates to test whether current estimates of RV substitution rates are representative of the entire RV genome. We specifically accounted for possible confounders of nucleotide substitution rate estimates, such as temporally biased sampling, sporadic recombination, and natural selection favouring either increased or decreased genetic diversity (estimated by the PARRIS and FUBAR methods), at nucleotide sites within the genomic secondary structures (predicted by the NASP method). RESULTS: We determine that RV nucleotide substitution rates range from 1.19 × 10⁻³ substitutions/site/year in the E1 region to 7.52 × 10⁻⁴ substitutions/site/year in the P150 region. We find that differences between substitution rate estimates in different RV genome regions are largely attributable to temporal sampling biases such that datasets containing higher proportions of recently sampled sequences will tend to have inflated estimates of mean substitution rates. Although there exists little evidence of positive selection or natural genetic recombination in RV, we show that RV genomes possess pervasive biologically functional nucleic acid secondary structure and that purifying selection acting to maintain this structure contributes substantially to variations in estimated nucleotide substitution rates across RV genomes.
CONCLUSION: Both temporal sampling biases and purifying selection favouring the conservation of RV nucleic acid secondary structures have an appreciable impact on substitution rate estimates but do not preclude the use of RV sequence data to date ancestral sequences. The combination of uniformly high substitution rates across the RV genome and strong temporal structure within the available sequence data suggests that such data should be suitable for tracking the demographic, epidemiological and movement dynamics of this virus during eradication attempts.
Extensive recombination detected among beak and feather disease virus isolates from breeding facilities in Poland
Beak and feather disease virus (BFDV) causes the highly contagious, in some cases fatal,
psittacine beak and feather disease in parrots. The European continent has no native parrots, yet
in the past has been one of the world’s biggest importers of wild-caught exotic parrot species.
Following the banning of this practice in 2007, the demand for exotic pet parrots has largely been
met by established European breeding facilities, which can also supply buyers outside Europe.
However, the years of unregulated importation have provided numerous opportunities for BFDV to
enter Europe, meaning the likelihood of birds within captive breeding facilities being BFDV
positive is high. This study examined the BFDV status of such facilities in Poland, a country
previously shown to have BFDV among captive birds. A total of 209 birds from over 50 captive
breeding facilities across Poland were tested, and 43 birds from 18 different facilities tested
positive for BFDV. The full BFDV genomes from these 43 positive birds were determined, and
phylogenetic analysis revealed that these samples harboured a relatively high degree of diversity
and that they were highly recombinant. It is evident that there have been multiple introductions of
BFDV into Poland over a long period of time, and the close association of different species of
birds in the captive environment has probably facilitated the evolution of new BFDV strains
through recombination.
Recombinant Goose Circoviruses Circulating in Domesticated and Wild Geese in Poland
Circoviruses are circular single-stranded DNA (ssDNA) viruses that infect a variety of animals, both domestic and wild. Circovirus infection in birds is associated with immunosuppression and this in turn predisposes the infected animals to secondary infections that can lead to mortality. Farmed geese (Anser anser) in many parts of the world are infected with circoviruses. The majority of the current genomic information for goose circoviruses (GoCVs) (n = 40) is from birds sampled in China and Taiwan, and only two genome sequences are available from Europe (Germany and Poland). In this study, we sampled 23 wild and 19 domestic geese from the Gopło Lake area in Poland. We determined the genomes of GoCV from 21 geese: 14 domestic Greylag geese (Anser anser), three wild Greylag geese (A. anser), three bean geese (A. fabalis), and one white-fronted goose (A. albifrons). These genomes share 83–95% nucleotide pairwise identities with previously identified GoCV genomes, and most are recombinants with exchanged fragment sizes up to 50% of the genome. Higher diversity levels can be seen within the genomes from domestic geese compared with those from wild geese. In the GoCV capsid protein (cp) and replication-associated protein (rep) gene sequences we found that patterns of episodic positive selection appear to largely mirror those of beak and feather disease virus and pigeon circovirus. Analysis of the secondary structure of the ssDNA genome revealed a conserved stem-loop structure, with the nucleotides of the G-C rich stem being under a high degree of negative selection.
Pigeon circoviruses display patterns of recombination, genomic secondary structure and selection similar to those of beak and feather disease viruses
Pigeon circovirus (PiCV) has a ~2 kb circular ssDNA genome. All but one of the known PiCV isolates have been found infecting pigeons in various parts of the world. In this study, we screened 324 swab and tissue samples from Polish pigeons and recovered 30 complete genomes, 16 of which came from birds displaying no obvious pathology. Together with 17 other publicly available PiCV complete genomes sampled throughout the Northern Hemisphere and Australia, we find that PiCV displays a similar degree of genetic diversity to that of the related psittacine-infecting circovirus species, beak and feather disease virus (BFDV). We show that, as is the case with its pathology and epidemiology, PiCV also displays patterns of recombination, genomic secondary structure and natural selection that are generally very similar to those of BFDV. It is likely that breeding facilities play a significant role in the emergence of new recombinant PiCV variants and, given that ~50 % of the domestic pigeon population is infected subclinically, all pigeon breeding stocks should be screened routinely for this virus.
Evidence of pervasive biologically functional secondary structures within the Genomes of Eukaryotic Single-Stranded DNA Viruses
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures
through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base
pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are
(i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding
methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally
inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA
viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these
for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying
selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we
also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving
in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive
secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure
is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized
structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
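The association between predicted base pairing and complementary coevolution described above can be illustrated with a simplified counting heuristic. The Python sketch below is an assumption-laden toy, not the statistical test used in the study: it takes a predicted structure in dot-bracket notation and a small alignment, and flags predicted pairs where more than one Watson-Crick combination is observed and no non-complementary combination appears. The names `parse_dot_bracket` and `covarying_pairs` are hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return sorted(pairs)


def covarying_pairs(alignment: list[str], structure: str) -> list[tuple[int, int]]:
    """Predicted pairs whose two sites change while staying complementary.

    A pair is reported when the aligned sequences show at least two
    distinct Watson-Crick combinations at (i, j) and nothing else.
    """
    if any(len(seq) != len(structure) for seq in alignment):
        raise ValueError("every sequence must match the structure length")
    result = []
    for i, j in parse_dot_bracket(structure):
        observed = {(seq[i].upper(), seq[j].upper()) for seq in alignment}
        if len(observed) >= 2 and observed <= WATSON_CRICK:
            result.append((i, j))
    return result


# Toy ssDNA alignment: sites 1 and 4 switch from C-G to T-A together.
alignment = ["GCAAGC", "GTAAAC", "GCAAGC"]
structure = "((..))"
print(covarying_pairs(alignment, structure))
```

In this toy input the outer pair (0, 5) is perfectly conserved and so is not reported, while the inner pair (1, 4) changes in both partners at once and stays complementary. A real analysis would also need to account for phylogenetic relatedness among the sequences, which this sketch ignores.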
PGE2 alters chromatin through H2A.Z-variant enhancer nucleosome modification to promote hematopoietic stem cell fate
Prostaglandin E2 (PGE2) and 16,16-dimethyl-PGE2 (dmPGE2) are important regulators of hematopoietic stem and progenitor cell (HSPC) fate and offer potential to enhance stem cell therapies [C. Cutler et al. Blood 122, 3074–3081 (2013); W. Goessling et al. Cell Stem Cell 8, 445–458 (2011); W. Goessling et al. Cell 136, 1136–1147 (2009)]. Here, we report that PGE2-induced changes in chromatin at enhancer regions through histone-variant H2A.Z permit acute inflammatory gene induction to promote HSPC fate. We found that dmPGE2-inducible enhancers retain MNase-accessible, H2A.Z-variant nucleosomes permissive of CREB transcription factor (TF) binding. CREB binding to enhancer nucleosomes following dmPGE2 stimulation is concomitant with deposition of histone acetyltransferases p300 and Tip60 on chromatin. Subsequent H2A.Z acetylation improves chromatin accessibility at stimuli-responsive enhancers. Our findings support a model where histone-variant nucleosomes retained within inducible enhancers facilitate TF binding. Histone-variant acetylation by TF-associated nucleosome remodelers creates the accessible nucleosome landscape required for immediate enhancer activation and gene induction. Our work provides a mechanism through which inflammatory mediators, such as dmPGE2, lead to acute transcriptional changes and modify HSPC behavior to improve stem cell transplantation.
The evolutionary impacts of secondary structures within genomes of eukaryote-infecting single-stranded DNA viruses
Secondary structures forming through base-pairing in virus genomes have been proven to regulate several processes during viral replication cycles, including genome replication, transcription, post-transcriptional activities, protein synthesis, genome packaging, generation of viral sub-genomes and evasion of host-cell immune responses. Although computational DNA/RNA folding methods based on free energy minimisation approaches are capable of predicting structures that form within virus genomes, these methods are not entirely accurate. Notably, many of the structures that are accurately predicted will likely have no biological importance within the genomes in which they reside because even randomly generated single-stranded RNA/DNA sequences will form stable secondary structures. Nevertheless, with additional genome evolution analyses involving the detection of natural selection, sequence co-evolution, and genetic recombination, it is possible to both validate the existence of, and infer the biological importance of, computationally predicted structures. Here I implement and deploy free bioinformatics tools to (1) automate the classification of nucleotide and protein sequences into datasets useful for downstream molecular evolution analyses; (2) improve the accuracy of computational virus-genome-scale secondary structure prediction; (3) enable the identification of biologically relevant secondary structures using signals of purifying selection, coevolution and recombination within aligned sequence datasets; and (4) enable efficient visualisation of structural and selection data for better characterisation of individual secondary structural elements.
Using these tools I carried out large-scale studies that predicted and characterised novel functional secondary structures that potentially regulate transcription, translation, gene splicing, and replication within the genomes of eukaryote-infecting ssDNA viruses (Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae). I show that purifying selection tends to be stronger at base-paired sites than it is at unpaired sites and, wherever mutations are tolerable within paired regions, I demonstrate that there exist strong associations between base-pairing and complementary coevolution. Finally, I show that the recombinant genomes of some, but not all, eukaryote-infecting ssDNA virus groups display weak evidence of both homologous and non-homologous recombination breakpoints preferentially occurring at genome sites that minimally disrupt secondary structures. Altogether, these results suggest that natural selection acting to maintain important biologically functional secondary structural elements has been a major process during the evolution of eukaryote-infecting ssDNA viruses.
Distribution of pairwise genetic/evolutionary distances of the same set of 25 mastrevirus full genome sequences in the context of progressively larger sequence datasets.
The constant frequency distribution (represented by the red graph) illustrates the consistency of pairwise distance calculation based on pairwise alignments, while the changing frequency distributions (represented by the blue and green graphs) indicate how pairwise distance scores based on multiple sequence alignment tend to become inflated as dataset sizes get larger.