60 research outputs found

Automatic differentiation is no panacea for phylogenetic gradient computation

Author: Fourment Mathieu
Galloway Jared G.
Gangavarapu Karthik
Ji Xiang
Matsen IV Frederick A.
Suchard Marc A.
Swanepoel Christiaan J.
Publication date: 03/11/2022

Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via automatic differentiation implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully-implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward.

Comment: 15 pages and 2 figures in main text, plus supplementary material
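
To make the contrast in this abstract concrete, the following is a minimal sketch of forward-mode automatic differentiation next to a hand-derived gradient. It is not code from the paper and does not use TensorFlow or PyTorch: the `Dual` class, the function names, and the choice of a two-sequence Jukes-Cantor (JC69) likelihood are all assumptions made for illustration.

```python
import math


class Dual:
    """Dual number val + der*eps with eps**2 == 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.val * o.der + self.der * o.val)

    __rmul__ = __mul__


def d_exp(x):
    e = math.exp(x.val)
    return Dual(e, e * x.der)


def d_log(x):
    return Dual(math.log(x.val), x.der / x.val)


def jc69_loglik(t, n_sites, n_diff):
    """JC69 log-likelihood of two aligned sequences at distance t (a Dual)."""
    decay = d_exp(t * (-4.0 / 3.0))
    p_same = 0.25 + 0.75 * decay   # P(no change) along the branch
    p_diff = 0.25 - 0.25 * decay   # P(change to one specific other base)
    return (n_sites - n_diff) * d_log(0.25 * p_same) + n_diff * d_log(0.25 * p_diff)


def jc69_grad_analytic(t, n_sites, n_diff):
    """Hand-derived d(loglik)/dt for the same model (t is a plain float)."""
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return (n_sites - n_diff) * (-decay) / p_same + n_diff * (decay / 3.0) / p_diff


# Hypothetical data: 100 aligned sites, 8 of which differ, evaluated at t = 0.1.
result = jc69_loglik(Dual(0.1, 1.0), n_sites=100, n_diff=8)
ad_grad = result.der
hand_grad = jc69_grad_analytic(0.1, 100, 8)
```

Both routes should agree to floating-point precision. The automatic version needs no derivation but pays for extra bookkeeping on every arithmetic operation, which is the kind of overhead the abstract weighs against specialised gradient code.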

arXiv.org e-Print Archive

eScholarship - University of California

The molecular epidemiology of multiple zoonotic origins of SARS-CoV-2

Author: Andersen Kristian G.
Ching Zi Yan Katherine
Crits-Christoph Alexander
Gangavarapu Karthik
Garry Robert F.
Havens Jennifer L.
Holmes Edward C.
Hughes Scott
Izhikevich Katherine
Lee Jungmin
Levy Joshua I.
Lin Raymond Tzer Pin
Magee Andrew
Malpica Serrano Lorena Mariana
Mat Isa Mohd Noor
Matteson Nathaniel L.
Moshiri Niema
Noor Yusuf Muhammad
Park Heedo
Park Man Seong
Parker Edyth
Pekar Jonathan E.
Rambaut Andrew
Suchard Marc A.
Vasylyeva Tetyana I.
Wang Jade C.
Wertheim Joel O.
Worobey Michael
Zeller Mark
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 26/07/2022

Understanding the circumstances that lead to pandemics is important for their prevention. Here, we analyze the genomic diversity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) early in the coronavirus disease 2019 (COVID-19) pandemic. We show that SARS-CoV-2 genomic diversity before February 2020 likely comprised only two distinct viral lineages, denoted A and B. Phylodynamic rooting methods, coupled with epidemic simulations, reveal that these lineages were the result of at least two separate cross-species transmission events into humans. The first zoonotic transmission likely involved lineage B viruses around 18 November 2019 (23 October–8 December), while the separate introduction of lineage A likely occurred within weeks of this event. These findings indicate that it is unlikely that SARS-CoV-2 circulated widely in humans prior to November 2019 and define the narrow window between when SARS-CoV-2 first jumped into humans and when the first cases of COVID-19 were reported. As with other coronaviruses, SARS-CoV-2 emergence likely resulted from multiple zoonotic events.

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

Ebola virus transmission initiated by systemic ebola virus disease relapse

Author: Ahuka-Mundeke Steve
Akonga Marceline
Andersen Kristian G.
Aziza Amuri
Bedford Trevor
Bile Faustin
Bisento Nella
Black Allison
Bula Bula Junior
Crozier Ian
Darnycka Belizaire Marie Roseline
Diagne Moussa M
Diallo Boubacar
Edidi Francois
Epaso Victor
Fall Ibrahima Soce
Faye Martin
Faye Ousmane
Gangavarapu Karthik
Hadfield James
Hensley Lisa E.
Kazadi Donatien
Keita Mory
Kinganda Lusamaki Eddy
Mambu Fabrice
Matondo Meris
Mbala-Kingebeni Placide
Misasi John
Mukadi Daniel
Muyembe-Tamfum Jean-Jacques
Nkuba Ndaye Antoine
Nsunda Bibiche
N’kasar Yannick Tutu Tshia
Pauthner Matthias G
Ploquin Aurelie
Pratt Catherine
Rambaut Andrew
Rimion Anne W.
Ruffin Mbusa Mutafali
Sabue Mulangu
Salfati Elias L
Sall Amadou A.
Sana Paka Emilia
Suchard Marc A.
Sullivan Nancy J.
Torkamani Ali
Tshiani Olivier
White Bailey
Wiley Michael R.
Yam Abdoulaye
Publication venue: 'Massachusetts Medical Society'
Publication date: 01/04/2021

During the 2018-2020 Ebola virus disease (EVD) outbreak in North Kivu province in the Democratic Republic of Congo, EVD was diagnosed in a patient who had received the recombinant vesicular stomatitis virus-based vaccine expressing a ZEBOV glycoprotein (rVSV-ZEBOV) (Merck). His treatment included an Ebola virus (EBOV)-specific monoclonal antibody (mAb114), and he recovered within 14 days. However, 6 months later, he presented again with severe EVD-like illness and EBOV viremia, and he died. We initiated epidemiologic and genomic investigations that showed that the patient had had a relapse of acute EVD that led to a transmission chain resulting in 91 cases across six health zones over 4 months. (Funded by the Bill and Melinda Gates Foundation and others.)

Edinburgh Research Explorer

eScholarship - University of California

Genomic epidemiology reveals multiple introductions of Zika virus into the United States

Author: Andersen Kristian G.
Bailey Varian K
Baniecki Mary Lynn
Barcellona Carolyn M
Barnes Kayla G
Bedford Trevor
Bingham Andrea
Brent Shannon E
Cannons Andrew C
Chak Bridget
Christopher Tomkins-Tinch
Cone Marshall R
Cummings Derek AT
Dudas Gytis
Faria Nuno R
Fauver Joseph R
Freije Catherine A
Gangavarapu Karthik
Garry Robert F
Gillis Leah D
Gladden-Young Adrianne
Gnirke Andreas
Grubaugh Nathan D.
Hogan Kelly N
Isern Sharon
Jean Reynald
Khan Kamran
Kopp IV Edgar W
Kraemer Moritz UG
Ladner Jason T.
Lichtenberger Paola N
Loman Nicholas J
Luo Cynthia
MacInnis Bronwyn
Magnani Diogo M
Matranga Christian B
Metsky Hayden C
Michael Scott F
Monaghan Andrew J
Nagle Elyse R
Oliveira Glenn
Palacios Gustavo F.
Park Daniel J
Paul Lauren M
Porcelli Mario C
Prieto Karla
Pronty Darryl
Pybus Oliver G.
Qu James
Quick Joshua
Rambaut Andrew
Reiner Jr Robert C
Reyes Daniel
Ricciardi Michael
Robles-Sikisaka Refugio
Sabeti Pardis C
Sanchez-Lockhart Mariano
Schaffner Stephen F
Stanek Danielle
Tan Amanda L
Theze Julien
Vasquez Chalmers
Watkins David I
West Kendra L
White Stephen
Wiley Michael R.
Winnicki Sarah M
Wohl Shirlee
Yozwiak Nathan L
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Zika virus (ZIKV) is causing an unprecedented epidemic linked to severe congenital abnormalities. In July 2016, mosquito-borne ZIKV transmission was reported in the continental United States; since then, hundreds of locally acquired infections have been reported in Florida. To gain insights into the timing, source, and likely route(s) of ZIKV introduction, we tracked the virus from its first detection in Florida by sequencing ZIKV genomes from infected patients and Aedes aegypti mosquitoes. We show that at least 4 introductions, but potentially as many as 40, contributed to the outbreak in Florida and that local transmission is likely to have started in the spring of 2016, several months before its initial detection. By analysing surveillance and genetic data, we show that ZIKV moved among transmission zones in Miami. Our analyses show that most introductions were linked to the Caribbean, a finding corroborated by the high incidence rates and traffic volumes from the region into the Miami area. Our study provides an understanding of how ZIKV initiates transmission in new regions.

Crossref

University of Birmingham Research Portal

Edinburgh Research Explorer

University of Miami: Scholarship Miami

Oxford University Research Archive

HAL Université de Tours

Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples

Author: Alcantara Luiz Carlos
Andersen Kristian G
Baylis Sally A
Bedford Trevor
Beutler Nathan A
Black Allison
Burton Dennis R
Carroll Miles W
Claro Ingra M
de Jesus Jaqueline Goes
Faria Nuno R
Gangavarapu Karthik
Giovanetti Marta
Grubaugh Nathan D
Hill Sarah C
Lewis-Ximenez Lia Laura
Loman Nicholas J
Loose Matthew
Nunes Marcio
Oliveira Glenn
Pullan Steven T
Pybus Oliver G
Quick Joshua
Robles-Sikisaka Refugio
Rogers Thomas F
Sabino Ester C
Simpson Jared T
Smith Andrew D
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 days by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

University of Birmingham Research Portal

Oxford University Research Archive

Genome sequencing reveals Zika virus diversity and spread in the Americas

Author: A Piantadosi
A Rambaut
A-C Gourinat
AA Sall
Aaron E. Lin
Adrianne Gladden-Young
AJ Drummond
Amanda L. Tan
Andreas Gnirke
Andrew C. Cannons
Andrew Rambaut
AS Fauci
B Shapiro
BEE Martina
Brandon Sabina
Bridget Chak
Bronwyn L. MacInnis
C Aurrecoechea
Carolyn M. Barcellona
Catherine A. Freije
Catherine M. Brown
CB Matranga
Chalmers Vasquez
Christian B. Matranga
Christopher H. Tomkins-Tinch
CJ Villabona-Arenas
Clarissa Valim
Cynthia Y. Luo
D Hyatt
D Tappe
Daniel J. Park
DE Wood
Diana P. Rojas
DJ Park
Edgar W. Kopp
Edson Delatorre
F Cribari-Neto
Fernando A. Bozza
G Baele
Gabriela Paz-Bailey
Giselle Barbosa-Lima
Glenn Oliveira
H Li
Hayden C. Metsky
Irene Bosch
Ivette Lorenzana
J Josse
James Qu
JD Morlan
Jose Cerbino-Neto
Joshua J. Anzinger
JR Brister
JS Schieffelin
K Clark
K Katoh
KA Tsetsarkin
Karthik Gangavarapu
Kayla G. Barnes
Kelly N. Hogan
Kendra West
Kimberly F. Garcia
Kristian G. Andersen
L Fu
Lauren M. Paul
Leda A. Parham
Lee Gehrke
Luis Villar
M Kearse
M Worobey
M. Elizabeth Halloran
MA Brinton
MAR Ferreira
Maria C. Miranda Montoya
Mario C. Porcelli
Marshall R. Cone
Mary Lynn Baniecki
MG Grabherr
MND Balm
MR Reynolds
MRT Nunes
NA O’Leary
Nathan D. Grubaugh
Nathan L. Yozwiak
NR Faria
O Faye
P Yarza
Pardis C. Sabeti
Patricia T. Bozza
Refugio Robles-Sikisaka
Rickey R. Shah
Rosa M. Gélvez Ramírez
RS Lanciotti
S Duchêne
S Guindon
S Henikoff
S Lê
Salim Mattar
Sandra Smole
Sarah M. Winnicki
Sarah Scotland
Scott F. Michael
Scott Hennigan
Sharon Isern
Shirlee Wohl
SI Sardi
Simon H. Ye
SK Gire
Stephen F. Schaffner
Thiago M. L. Souza
Wim Degrave
Yasmine R. Vieira
Z Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Although the recent Zika virus (ZIKV) epidemic in the Americas and its link to birth defects have attracted a great deal of attention, much remains unknown about ZIKV disease epidemiology and ZIKV evolution, in part owing to a lack of genomic data. Here we address this gap in knowledge by using multiple sequencing approaches to generate 110 ZIKV genomes from clinical and mosquito samples from 10 countries and territories, greatly expanding the observed viral genetic diversity from this outbreak. We analysed the timing and patterns of introductions into distinct geographic regions; our phylogenetic evidence suggests rapid expansion of the outbreak in Brazil and multiple introductions of outbreak strains into Puerto Rico, Honduras, Colombia, other Caribbean islands, and the continental United States. We find that ZIKV circulated undetected in multiple regions for many months before the first locally transmitted cases were confirmed, highlighting the importance of surveillance of viral infections. We identify mutations with possible functional implications for ZIKV biology and pathogenesis, as well as those that might be relevant to the effectiveness of diagnostic tests.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

DSpace@MIT

ResearchOnline at James Cook University

Edinburgh Research Explorer

Many-core algorithms for high-dimensional gradients on phylogenetic trees.

Author: Gangavarapu Karthik
Publication date: 21/02/2024

Ezid

Inferring the risk factors behind the geographical spread and transmission of Zika in the Americas

Author: Bóta András
Gangavarapu Karthik
Gardner Lauren M.
Grubaugh Nathan D.
Kraemer Moritz U. G.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

Background: An unprecedented Zika virus epidemic occurred in the Americas during 2015-2016. The size of the epidemic in conjunction with newly recognized health risks associated with the virus attracted significant attention across the research community. Our study complements several recent studies which have mapped epidemiological elements of Zika, by introducing a newly proposed methodology to simultaneously estimate the contribution of various risk factors for geographic spread resulting in local transmission and to compute the risk of spread (or re-introductions) between each pair of regions. The focus of our analysis is on the Americas, where the set of regions includes all countries, overseas territories, and the states of the US.
Methodology/Principal findings: We present a novel application of the Generalized Inverse Infection Model (GIIM). The GIIM uses real observations from the outbreak and seeks to estimate the risk factors driving transmission. The observations are derived from the dates of reported local transmission of Zika virus in each region, the network structure is defined by the passenger air travel movements between all pairs of regions, and the risk factors considered include regional socioeconomic factors, vector habitat suitability, travel volumes, and epidemiological data. The GIIM relies on a multi-agent based optimization method to estimate the parameters, and utilizes a data driven stochastic-dynamic epidemic model for evaluation. As expected, we found that mosquito abundance, incidence rate at the origin region, and human population density are risk factors for Zika virus transmission and spread. Surprisingly, air passenger volume was less impactful, and the most significant factor was (a negative relationship with) the regional gross domestic product (GDP) per capita.
Conclusions/Significance: Our model generates country level exportation and importation risk profiles over the course of the epidemic and provides quantitative estimates for the likelihood of introduced Zika virus resulting in local transmission, between all origin-destination travel pairs in the Americas. Our findings indicate that local vector control, rather than travel restrictions, will be more effective at reducing the risks of Zika virus transmission and establishment. Moreover, the inverse relationship between Zika virus transmission and GDP suggests that Zika cases are more likely to occur in regions where people cannot afford to protect themselves from mosquitoes. The modeling framework is not specific for Zika virus, and could easily be employed for other vector-borne pathogens with sufficient epidemiological and entomological data.

Harvard University - DASH

Directory of Open Access Journals

Many-core algorithms for high-dimensional gradients on phylogenetic trees.

Author: Baele Guy
Fourment Mathieu
Gangavarapu Karthik
Ji Xiang
Lemey Philippe
Matsen Frederick
Suchard Marc
Publication venue: eScholarship, University of California
Publication date: 01/02/2024

MOTIVATION: Advancements in high-throughput genomic sequencing are delivering genomic pathogen data at an unprecedented rate, positioning statistical phylogenetics as a critical tool to monitor infectious diseases globally. This rapid growth spurs the need for efficient inference techniques, such as Hamiltonian Monte Carlo (HMC) in a Bayesian framework, to estimate parameters of these phylogenetic models where the dimensions of the parameters increase with the number of sequences N. HMC requires repeated calculation of the gradient of the data log-likelihood with respect to (wrt) all branch-length-specific (BLS) parameters that traditionally takes O(N^2) operations using the standard pruning algorithm. A recent study proposes an approach to calculate this gradient in O(N), enabling researchers to take advantage of gradient-based samplers such as HMC. The CPU implementation of this approach makes the calculation of the gradient computationally tractable for nucleotide-based models but falls short in performance for larger state-space size models, such as Markov-modulated and codon models. Here, we describe novel massively parallel algorithms to calculate the gradient of the log-likelihood wrt all BLS parameters that take advantage of graphics processing units (GPUs) and result in manyfold higher speedups over previous CPU implementations. RESULTS: We benchmark these GPU algorithms on three computing systems using three evolutionary inference examples exploring complete genomes from 997 dengue viruses, 62 carnivore mitochondria and 49 yeasts, and observe a >128-fold speedup over the CPU implementation for codon-based models and >8-fold speedup for nucleotide-based models. As a practical demonstration, we also estimate the timing of the first introduction of West Nile virus into the continental United States under a codon model with a relaxed molecular clock from 104 full viral genomes, an inference task previously intractable.
AVAILABILITY AND IMPLEMENTATION: We provide an implementation of our GPU algorithms in BEAGLE v4.0.0 (https://github.com/beagle-dev/beagle-lib), an open-source library for statistical phylogenetics that enables parallel calculations on multi-core CPUs and GPUs. We employ a BEAGLE-implementation using the Bayesian phylogenetics framework BEAST (https://github.com/beast-dev/beast-mcmc).
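
The linear-time gradient mentioned in this abstract can be sketched on a toy example. The code below is not the BEAGLE implementation and involves no GPU code. It is a single-site, pure-Python illustration that assumes a Jukes-Cantor (JC69) model, a binary tree, and hypothetical names throughout (`post_order`, `pre_order`, `loglik_and_gradient`, the node labels). One post-order pass and one pre-order pass give every branch-length derivative, where re-running the pruning pass once per branch would cost quadratic time.

```python
import math


def jc69_p(t):
    """JC69 transition probabilities for branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return [[0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e for j in range(4)]
            for i in range(4)]


def jc69_dp(t):
    """Derivative of the JC69 transition probabilities wrt t."""
    e = math.exp(-4.0 * t / 3.0)
    return [[-e if i == j else e / 3.0 for j in range(4)] for i in range(4)]


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def vecmat(v, m):
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]


def post_order(node, children, lengths, tips, post):
    """Pruning pass: post[node][i] = P(data below node | node in state i)."""
    if node in tips:
        post[node] = [1.0 if s == tips[node] else 0.0 for s in range(4)]
        return
    partial = [1.0] * 4
    for c in children[node]:
        post_order(c, children, lengths, tips, post)
        msg = matvec(jc69_p(lengths[c]), post[c])
        partial = [a * b for a, b in zip(partial, msg)]
    post[node] = partial


def pre_order(node, above, children, lengths, post, pre):
    """pre[c][i] = P(data outside c's subtree, parent of c in state i).

    `above` plays the same role for `node` itself, indexed by node's state.
    The sibling loop is constant work per node on a binary tree.
    """
    msgs = {c: matvec(jc69_p(lengths[c]), post[c]) for c in children.get(node, [])}
    for c in msgs:
        pre[c] = list(above)
        for s in msgs:
            if s != c:
                pre[c] = [a * b for a, b in zip(pre[c], msgs[s])]
        pre_order(c, vecmat(pre[c], jc69_p(lengths[c])), children, lengths, post, pre)


def loglik_and_gradient(children, lengths, tips, root="root"):
    post, pre = {}, {}
    post_order(root, children, lengths, tips, post)
    pre_order(root, [0.25] * 4, children, lengths, post, pre)  # uniform root prior
    lik = sum(0.25 * x for x in post[root])
    grad = {}
    for b, t in lengths.items():
        # dL/dt_b = sum_ij pre[b][i] * dP_ij(t_b) * post[b][j]
        d = matvec(jc69_dp(t), post[b])
        grad[b] = sum(p * x for p, x in zip(pre[b], d)) / lik
    return math.log(lik), grad


# Hypothetical four-taxon tree; tip states are indices into (A, C, G, T).
children = {"root": ["X", "Y"], "X": ["A", "B"], "Y": ["C", "D"]}
lengths = {"X": 0.1, "Y": 0.2, "A": 0.05, "B": 0.3, "C": 0.15, "D": 0.25}
tips = {"A": 0, "B": 1, "C": 0, "D": 2}
loglik, grad = loglik_and_gradient(children, lengths, tips)
```

A central finite difference on each branch length, which needs two full pruning passes per branch, is a simple way to check `grad`. A real implementation would also handle many sites, rate heterogeneity, numerical rescaling, and larger state spaces such as codon models.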

eScholarship - University of California