934 research outputs found

    A phylogenetic method for detecting positive epistasis in gene sequences and its application to RNA virus evolution

    RNA virus genomes are compact, often containing multiple overlapping reading frames and functional secondary structure. Consequently, it is thought that evolutionary interactions between nucleotide sites are commonplace in the genomes of these infectious agents. However, the role of epistasis in natural populations of RNA viruses remains unclear. To investigate the pervasiveness of epistasis in RNA viruses, we used a parsimony-based computational method to identify pairs of co-occurring mutations along phylogenies of 177 RNA virus genes. This analysis revealed widespread evidence for positive epistatic interactions at both synonymous and nonsynonymous nucleotide sites and in both clonal and recombining viruses, with the majority of these interactions spanning very short sequence regions. These findings have important implications for understanding the key aspects of RNA virus evolution, including the dynamics of adaptation. Additionally, many comparative analyses that utilize the phylogenetic relationships among gene sequences assume that mutations represent independent, uncorrelated events. Our results show that this assumption may often be invalid.

    Phylogenetic surveillance of viral genetic diversity and the evolving molecular epidemiology of human immunodeficiency virus type 1

    With ongoing generation of viral genetic diversity and increasing levels of migration, the global human immunodeficiency virus type 1 (HIV-1) epidemic is becoming increasingly heterogeneous. In this study, we investigate the epidemiological characteristics of 5,675 HIV-1 pol gene sequences sampled from distinct infections in the United Kingdom. These sequences were phylogenetically analyzed in conjunction with 976 complete-genome and 3,201 pol gene reference sequences sampled globally and representing the broad range of HIV-1 genetic diversity, allowing us to estimate the probable geographic origins of the various strains present in the United Kingdom. A statistical analysis of phylogenetic clustering in this data set identified several independent transmission chains within the United Kingdom involving recently introduced strains and indicated that strains more commonly associated with infections acquired heterosexually in East Africa are spreading among men who have sex with men. Coalescent approaches were also used and indicated that the transmission chains that we identify originated in the late 1980s to early 1990s. Similar changes in the epidemiological structuring of HIV epidemics are likely to be taking place in other industrialized nations with large immigrant populations. The framework implemented here takes advantage of the vast amount of routinely generated HIV-1 sequence data and can provide epidemiological insights not readily obtainable through standard surveillance methods.

    Robust design for coalescent model inference

    The coalescent process describes how changes in the size or structure of a population influence the genealogical patterns of sequences sampled from that population. The estimation of (effective) population size changes from genealogies that are reconstructed from these sampled sequences is an important problem in many biological fields. Often, population size is characterised by a piecewise-constant function, with each piece serving as a population size parameter to be estimated. Estimation quality depends on both the statistical coalescent inference method employed, and on the experimental protocol, which controls variables such as the sampling of sequences through time and space, or the transformation of model parameters. While there is an extensive literature on coalescent inference methodology, there is comparatively little work on experimental design. The research that does exist is largely simulation-based, precluding the development of provable or general design theorems. We examine three key design problems: temporal sampling of sequences under the skyline demographic coalescent model, spatio-temporal sampling under the structured coalescent model, and time discretisation for sequentially Markovian coalescent models. In all cases we prove that (i) working in the logarithm of the parameters to be inferred (e.g. population size), and (ii) distributing informative coalescent events uniformly among these log-parameters, is uniquely robust. 'Robust' means that the total and maximum uncertainty of our parameter estimates are minimised, and made insensitive to their unknown (true) values. This robust design theorem provides rigorous justification for several existing coalescent experimental design decisions, and leads to usable guidelines for future empirical or simulation-based investigations. Given its persistence among models, this theorem may form the basis of an experimental design paradigm for coalescent inference.
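The role of the log transform in point (i) can be illustrated with a minimal worked sketch. It assumes the simplest possible case, a single constant population size estimated from one genealogy under the standard Kingman coalescent, and is not the paper's general proof. The symbols $N$ (population size), $m$ (number of coalescent events), $k_i$ (lineages present during the $i$-th interval), $t_i$ (its duration) and $\eta = \log N$ are notation introduced here for illustration.

```latex
\begin{align}
  \ell(N) &= -m \log N - \frac{S}{N} + \text{const},
  \qquad S = \sum_{i=1}^{m} \binom{k_i}{2}\, t_i,
  \qquad \mathbb{E}[S] = mN, \\
  I(N) &= -\mathbb{E}\!\left[\frac{\partial^2 \ell}{\partial N^2}\right]
        = -\frac{m}{N^2} + \frac{2\,\mathbb{E}[S]}{N^3}
        = \frac{m}{N^2}, \\
  \ell(\eta) &= -m\eta - S e^{-\eta} + \text{const},
  \qquad
  I(\eta) = -\mathbb{E}\!\left[\frac{\partial^2 \ell}{\partial \eta^2}\right]
          = \mathbb{E}[S]\, e^{-\eta} = m .
\end{align}
```

In this special case the Fisher information for $N$ depends on the unknown true value, whereas the information for $\log N$ depends only on the number of coalescent events $m$, which is consistent with the abstract's statement that working in log-parameters makes uncertainty insensitive to the true values.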

    Genomic surveillance of avian-origin influenza A viruses causing human disease

    Avian influenza A viruses (AIVs) pose a threat to global health because of their sporadic zoonotic transmission and potential to cause pandemics. Genomic surveillance of AIVs has become a powerful, cost-effective approach for studying virus transmission, evolution, and dissemination, and has the potential to inform outbreak control efforts and policies.

    Disease-associated XMRV sequences are consistent with laboratory contamination

    BACKGROUND: Xenotropic murine leukaemia viruses (MLV-X) are endogenous gammaretroviruses that infect cells from many species, including humans. Xenotropic murine leukaemia virus-related virus (XMRV) is a retrovirus that has been the subject of intense debate since its detection in samples from humans with prostate cancer (PC) and chronic fatigue syndrome (CFS). Controversy has arisen from the failure of some studies to detect XMRV in PC or CFS patients and from inconsistent detection of XMRV in healthy controls. RESULTS: Here we demonstrate that Taqman PCR primers previously described as XMRV-specific can amplify common murine endogenous viral sequences from mouse, suggesting that mouse DNA can contaminate patient samples and confound specific XMRV detection. To consider the provenance of XMRV we sequenced XMRV from the cell line 22Rv1, which is infected with an MLV-X that is indistinguishable from patient-derived XMRV. Bayesian phylogenies clearly show that XMRV sequences reportedly derived from unlinked patients form a monophyletic clade with interspersed 22Rv1 clones (posterior probability >0.99). The cell line-derived sequences are ancestral to the patient-derived sequences (posterior probability >0.99). Furthermore, pol sequences apparently amplified from PC patient material (VP29 and VP184) are recombinants of XMRV and Moloney MLV (MoMLV), a virus with an envelope that lacks tropism for human cells. Considering the diversity of XMRV we show that the mean pairwise genetic distance among env and pol 22Rv1-derived sequences exceeds that of patient-associated sequences (Wilcoxon rank sum test: p = 0.005 and p < 0.001 for pol and env, respectively). Thus XMRV sequences acquire diversity in a cell line but not in patient samples. These observations are difficult to reconcile with the hypothesis that published XMRV sequences are related by a process of infectious transmission. CONCLUSIONS: We provide several independent lines of evidence that XMRV detected by sensitive PCR methods in patient samples is the likely result of PCR contamination with mouse DNA and that the described clones of XMRV arose from the tumour cell line 22Rv1, which was probably infected with XMRV during xenografting in mice. We propose that XMRV might not be a genuine human pathogen.
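The diversity comparison in the RESULTS rests on a mean pairwise genetic distance. A minimal sketch of that quantity follows, assuming aligned sequences of equal length and the simple uncorrected p-distance (proportion of differing sites). The abstract does not say which distance measure the authors used, and the function names `p_distance` and `mean_pairwise_distance` are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: fraction of compared sites that differ.

    Assumption: sequences are already aligned. Sites where either
    sequence has a gap or an ambiguity code are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise_distance(seqs: list[str]) -> float:
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy alignment, not real data.
    toy = ["ACGT", "ACGA", "TCGA"]
    print(mean_pairwise_distance(toy))
```

Two groups of sequences could then be compared by collecting the individual pairwise distances for each group and applying a rank-based test such as the Wilcoxon rank sum test named in the abstract.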

    A phylogenetic codon substitution model for antibody lineages

    Phylogenetic methods have shown promise in understanding the development of broadly neutralizing antibody lineages (bNAbs). However, the mutational process that generates these lineages – somatic hypermutation (SHM) – is biased by hotspot motifs, which violates important assumptions in most phylogenetic substitution models. Here, we develop a modified GY94-type substitution model that partially accounts for this context-dependency while preserving independence of sites during calculation. This model shows a substantially better fit to three well-characterized bNAb lineages than the standard GY94 model. We also demonstrate how our model can be used to test hypotheses concerning the roles of different hotspot and coldspot motifs in the evolution of B-cell lineages. Further, we explore the consequences of the idea that the number of hotspot motifs – and perhaps the mutation rate in general – is expected to decay over time in individual bNAb lineages.

    Genomic, epidemiological and digital surveillance of Chikungunya virus in the Brazilian Amazon

    Background: Since its first detection in the Caribbean in late 2013, chikungunya virus (CHIKV) has affected 51 countries in the Americas. The CHIKV epidemic in the Americas was caused by the CHIKV-Asian genotype. In August 2014, local transmission of the CHIKV-Asian genotype was detected in the Brazilian Amazon region. However, a distinct lineage, the CHIKV-East-Central-South-America (ECSA)-genotype, was detected nearly simultaneously in Feira de Santana, Bahia state, northeast Brazil. The genomic diversity and the dynamics of CHIKV in the Brazilian Amazon region remain poorly understood despite their importance for understanding the epidemiological spread and public health impact of CHIKV in the country. Methodology/Principal findings: We report a large CHIKV outbreak (5,928 notified cases between August 2014 and August 2018) in Boa Vista municipality, the capital of Roraima state, located in the Brazilian Amazon region. We generated 20 novel CHIKV-ECSA genomes from the Brazilian Amazon region using MinION portable genome sequencing. Phylogenetic analyses revealed that despite an early introduction of the Asian genotype in 2015 in Roraima, the large CHIKV outbreak in 2017 in Boa Vista was caused by an ECSA lineage most likely introduced from northeastern Brazil. Epidemiological analyses suggest a basic reproductive number (R0) of 1.66, which translates into an estimated 39% (95% CI: 36 to 45%) of Roraima’s population infected with CHIKV-ECSA. Finally, we find a strong association between Google search activity and the local laboratory-confirmed CHIKV cases in Roraima. Conclusions/Significance: This study highlights the potential of combining traditional surveillance with portable genome sequencing technologies and digital epidemiology to inform public health surveillance in the Amazon region. Our data reveal a large CHIKV-ECSA outbreak in Boa Vista, limited potential for future CHIKV outbreaks, and indicate a replacement of the Asian genotype by the ECSA genotype in the Amazon region.