Automatic differentiation is no panacea for phylogenetic gradient computation
Gradients of probabilistic model likelihoods with respect to their parameters
are essential for modern computational statistics and machine learning. These
calculations are readily available for arbitrary models via automatic
differentiation implemented in general-purpose machine-learning libraries such
as TensorFlow and PyTorch. Although these libraries are highly optimized, it is
not clear if their general-purpose nature will limit their algorithmic
complexity or implementation speed for the phylogenetic case compared to
phylogenetics-specific code. In this paper, we compare six gradient
implementations of the phylogenetic likelihood functions, in isolation and also
as part of a variational inference procedure. We find that although automatic
differentiation can scale approximately linearly in tree size, it is much
slower than the carefully-implemented gradient calculation for tree likelihood
and ratio transformation operations. We conclude that a mixed approach
combining phylogenetic libraries with machine learning libraries will provide
the optimal combination of speed and model flexibility moving forward.
Comment: 15 pages and 2 figures in main text, plus supplementary material
Routes of importation and spatial dynamics of SARS-CoV-2 variants during localized interventions in Chile
Human mobility is strongly associated with the spread of SARS-CoV-2 via air travel on an international scale and with population mixing and the number of people moving between locations on a local scale. However, these conclusions are drawn mostly from observations in the context of the global north where international and domestic connectivity is heavily influenced by the air travel network; scenarios where land-based mobility can also dominate viral spread remain understudied. Furthermore, research on the effects of nonpharmaceutical interventions (NPIs) has mostly focused on national- or regional-scale implementations, leaving gaps in our understanding of the potential benefits of implementing NPIs at higher granularity. Here, we use Chile as a model to explore the role of human mobility on disease spread within the global south; the country implemented a systematic genomic surveillance program and NPIs at a very high spatial granularity. We combine viral genomic data, anonymized human mobility data from mobile phones and official records of international travelers entering the country to characterize the routes of importation of different variants, the relative contributions of airport and land border importations, and the real-time impact of the country's mobility network on the diffusion of SARS-CoV-2. The introduction of variants that are dominant in neighboring countries (and not detected through airport genomic surveillance) is predicted by land border crossings and not by air travelers, and the strength of connectivity between comunas (Chile's lowest administrative divisions) predicts the time of arrival of imported lineages to new locations. A higher stringency of local NPIs was also associated with fewer domestic viral importations. Our analysis sheds light on the drivers of emerging respiratory infectious disease spread outside of air travel and on the consequences of disrupting regular movement patterns at lower spatial scales.
The molecular epidemiology of multiple zoonotic origins of SARS-CoV-2
Understanding the circumstances that lead to pandemics is important for their prevention. Here, we analyze the genomic diversity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) early in the coronavirus disease 2019 (COVID-19) pandemic. We show that SARS-CoV-2 genomic diversity before February 2020 likely comprised only two distinct viral lineages, denoted A and B. Phylodynamic rooting methods, coupled with epidemic simulations, reveal that these lineages were the result of at least two separate cross-species transmission events into humans. The first zoonotic transmission likely involved lineage B viruses around 18 November 2019 (23 October–8 December), while the separate introduction of lineage A likely occurred within weeks of this event. These findings indicate that it is unlikely that SARS-CoV-2 circulated widely in humans prior to November 2019 and define the narrow window between when SARS-CoV-2 first jumped into humans and when the first cases of COVID-19 were reported. As with other coronaviruses, SARS-CoV-2 emergence likely resulted from multiple zoonotic events.
Ebola virus transmission initiated by systemic ebola virus disease relapse
During the 2018-2020 Ebola virus disease (EVD) outbreak in North Kivu province in the Democratic Republic of Congo, EVD was diagnosed in a patient who had received the recombinant vesicular stomatitis virus-based vaccine expressing a ZEBOV glycoprotein (rVSV-ZEBOV) (Merck). His treatment included an Ebola virus (EBOV)-specific monoclonal antibody (mAb114), and he recovered within 14 days. However, 6 months later, he presented again with severe EVD-like illness and EBOV viremia, and he died. We initiated epidemiologic and genomic investigations that showed that the patient had had a relapse of acute EVD that led to a transmission chain resulting in 91 cases across six health zones over 4 months. (Funded by the Bill and Melinda Gates Foundation and others.)
Genomic epidemiology reveals multiple introductions of Zika virus into the United States
Zika virus (ZIKV) is causing an unprecedented epidemic linked to severe congenital abnormalities. In July 2016, mosquito-borne ZIKV transmission was reported in the continental United States; since then, hundreds of locally acquired infections have been reported in Florida. To gain insights into the timing, source, and likely route(s) of ZIKV introduction, we tracked the virus from its first detection in Florida by sequencing ZIKV genomes from infected patients and Aedes aegypti mosquitoes. We show that at least 4 introductions, but potentially as many as 40, contributed to the outbreak in Florida and that local transmission is likely to have started in the spring of 2016, several months before its initial detection. By analysing surveillance and genetic data, we show that ZIKV moved among transmission zones in Miami. Our analyses show that most introductions were linked to the Caribbean, a finding corroborated by the high incidence rates and traffic volumes from the region into the Miami area. Our study provides an understanding of how ZIKV initiates transmission in new regions.
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples
Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 d by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.
Genome sequencing reveals Zika virus diversity and spread in the Americas
Although the recent Zika virus (ZIKV) epidemic in the Americas and its link to birth defects have attracted a great deal of attention, much remains unknown about ZIKV disease epidemiology and ZIKV evolution, in part owing to a lack of genomic data. Here we address this gap in knowledge by using multiple sequencing approaches to generate 110 ZIKV genomes from clinical and mosquito samples from 10 countries and territories, greatly expanding the observed viral genetic diversity from this outbreak. We analysed the timing and patterns of introductions into distinct geographic regions; our phylogenetic evidence suggests rapid expansion of the outbreak in Brazil and multiple introductions of outbreak strains into Puerto Rico, Honduras, Colombia, other Caribbean islands, and the continental United States. We find that ZIKV circulated undetected in multiple regions for many months before the first locally transmitted cases were confirmed, highlighting the importance of surveillance of viral infections. We identify mutations with possible functional implications for ZIKV biology and pathogenesis, as well as those that might be relevant to the effectiveness of diagnostic tests.