Spread of Middle East Respiratory Coronavirus: Genetic versus Epidemiological Data
Objective
Here we use novel methods of phylogenetic transmission graph analysis to reconstruct the geographic spread of MERS-CoV. We compare these results to those derived from text mining and visualization of the World Health Organization’s (WHO) Disease Outbreak News.

Introduction
MERS-CoV was discovered in 2012 in the Middle East, and human cases around the world have been carefully reported by the WHO. MERS-CoV is a novel betacoronavirus closely related to a virus (NeoCov) hosted by a bat, Neoromicia capensis. MERS-CoV infects humans and camels. In 2015, MERS-CoV spread from the Middle East to South Korea, which sustained an outbreak. Thus, it is clear that the virus can spread among humans in areas in which camels are not husbanded.

Methods
Phylogenetic analyses
We calculated a phylogenetic tree from 100 genomic sequences of MERS-CoV hosted by humans and camels, using NeoCov as the outgroup. To evaluate the relative order and significance of geographic places in the spread of the virus, we generated a transmission graph (Figure 1) based on methods described in [1]. The graph represents places as nodes and transmission events as edges. Transmission direction and frequency are depicted with directed and weighted edges. Betweenness centrality, represented by node size, measures the number of shortest paths between all pairs of other nodes that pass through the corresponding node. Places with high betweenness represent key hubs for the spread of the disease. In contrast, smaller nodes at the periphery of the network are less important for the spread of the disease.

Web scraping and mapping
Because of the journalistic style of the WHO data, it had to be structured so that mapping software could ingest it. We used Import.io to build the API.
We provided the software with a sample page, selected the pertinent data, then provided a list of all URLs for the software. We used Tableau to map the information both geographically and temporally.

Results
Geographic spread of MERS-CoV based on transmissions identified in phylogenetic data
Most important among the places in the MERS-CoV epidemic is Saudi Arabia, as measured by the betweenness metric applied to changes in place mapped to a phylogenetic tree. In Figure 1, the circle representing Saudi Arabia is slightly larger than those of the other locations, indicating its high importance in the epidemic. Saudi Arabia is the source of virus for Jordan, England, Qatar, South Korea, the United Arab Emirates (UAE), Indiana, and Egypt. The United Arab Emirates has a bidirectional connection with Saudi Arabia, indicating that the virus has spread in both directions between the two countries. The United Arab Emirates also has high betweenness: it lies between Saudi Arabia and Oman and between Saudi Arabia and France. South Korea and Qatar have mild betweenness. South Korea is between Saudi Arabia and China. Qatar is between Saudi Arabia and Florida. Other locations (Jordan, England, Indiana, and Egypt) have low betweenness as they have no outbound connections.

Visualization of geographical transmissions in WHO data
Certain articles include the infected individuals’ countries of origin. In contrast, many reports are in a lean format consisting of a single paragraph that only summarizes the total number of cases for that country. If we build the API in a manner that recognizes features in the detailed reports, we can generate a map that draws lines from origin to reporting country and create visualizations. However, since only some of the articles contain this extra information, mapping in this manner will miss many of the cases that are reported in the lean format.

Conclusions
Our goal is to develop methods for understanding syndromic and pathogen genetic data on the spread of diseases.
Drawing parallels between the transmission events in the WHO data and the genetic data has proven challenging. Analyses of the genetic information can be used to infer a transmission pathway, but it is hard to find epidemiological data in the public domain to corroborate that pathway. There are rare cases in the WHO data that include travel history (e.g. “The patient is from Riyadh and flew to the UK”). We conclude that epidemiological data combined with genetic data and metadata have strong potential to help us understand the geographic progression of an infectious disease. However, reporting standards need to be improved where travel history does not impinge on privacy.

Figure 1. A transmission graph for MERS-CoV based on viral genomes and place of isolation metadata. The direction of transmission is represented by the arrow. The frequency of transmission is indicated by the number. The size of the nodes indicates betweenness.
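
The betweenness measure described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only the directed connections named in the Results text, treats every edge as unweighted (the transmission frequencies appear only in Figure 1), and computes unnormalized betweenness with Brandes' algorithm. The names `EDGES` and `betweenness` are illustrative assumptions.

```python
from collections import deque

# Directed edges (source -> destination) named in the Results text.
# Assumption: edges are unweighted, because the transmission frequencies
# are shown only in Figure 1 and are not given in the text.
EDGES = [
    ("Saudi Arabia", "Jordan"),
    ("Saudi Arabia", "England"),
    ("Saudi Arabia", "Qatar"),
    ("Saudi Arabia", "South Korea"),
    ("Saudi Arabia", "United Arab Emirates"),
    ("Saudi Arabia", "Indiana"),
    ("Saudi Arabia", "Egypt"),
    ("United Arab Emirates", "Saudi Arabia"),
    ("United Arab Emirates", "Oman"),
    ("United Arab Emirates", "France"),
    ("South Korea", "China"),
    ("Qatar", "Florida"),
]


def betweenness(edges):
    """Unnormalized betweenness for a directed, unweighted graph (Brandes)."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])
    score = {node: 0.0 for node in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma)
        # and record each node's predecessors on those paths.
        dist = {s: 0}
        sigma = {node: 0 for node in adj}
        sigma[s] = 1
        preds = {node: [] for node in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating the share of
        # shortest paths from s that pass through each intermediate node.
        delta = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


scores = betweenness(EDGES)
for place, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{place}: {value:g}")
```

In this simplified graph Saudi Arabia receives the highest score, and locations with no outbound connections score zero, which agrees with the description in the Results. The remaining values need not match the node sizes in Figure 1, which may reflect edges and frequencies not listed in the text.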