Evaluation of next generation sequencing protocols for HIV complete genome sequencing
Human immunodeficiency virus (HIV) is a retrovirus that gave rise to a worldwide epidemic after its successful zoonotic transmission in the first half of the twentieth century. Current therapy, referred to as Highly Active AntiRetroviral Therapy (HAART), can significantly delay disease progression. However, despite more than 25 years of intensive research, there is still no cure available.
All available antiretroviral drugs are faced with the insurmountable challenge posed by the high evolutionary potential of HIV. This implies that regardless of the administered drug cocktail, drug resistance can and will develop. To manage these negative effects, patients should be screened on a regular basis in order to detect the development of drug resistance in an early phase, so that the therapy regimen can be adjusted in a timely manner. Importantly, both drug-resistant variants that have evolved de novo and those acquired through transmission can negatively impact therapy outcome. Thus, therapy-naive patients should also be screened before therapy onset.
This screening usually involves genotyping of the viral population through direct sequencing of the RT-PCR products. Unfortunately, this approach does not allow the reliable detection of viral variants present in less than about 20%-25% of the population. The association of such minor variants harboring drug resistance mutations with therapy failure fueled investigations to exploit the recently developed Roche® 454 NGS platform in an attempt to gain a more accurate in-depth view of the viral population. These inquiries are characterized by two major drawbacks: their focus on limited genomic regions and the need for large amounts of input material, which is characteristic of the proprietary Roche® 454 fragmentation approach.
As part of a larger project on the comparison of currently available sample preprocessing protocols for complete genome sequencing of clinical HIV plasma and PBMC samples, and the identification of the most suitable viral reservoir for resistance testing in newly infected patients as a secondary objective, this thesis focuses on the corresponding practical aspects of pre-processing prior to sequence data generation.
Specifically, all wet-lab procedures for both the sequence-specific and random priming amplification strategies were carried out. For the former, we generated 6 overlapping amplicons to cover the entire HIV-1 genome. After equimolar pooling of all amplicons for each sample, we performed two enzymatic fragmentation methods. These will be compared to conventional mechanical 454 shearing.
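The equimolar pooling step mentioned above can be illustrated with a short calculation. The following is a minimal sketch, not the protocol used in the thesis: it assumes double-stranded DNA with an average molar mass of about 660 g/mol per base pair (a common approximation), and the function names, amplicon labels, lengths and concentrations are all hypothetical.

```python
# Sketch of an equimolar pooling calculation for PCR amplicons.
# Assumption: dsDNA with an average molar mass of ~660 g/mol per base pair.
BP_MOLAR_MASS = 660.0


def nanomolar(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA concentration from ng/uL to nmol/L (equal to fmol/uL)."""
    return conc_ng_per_ul * 1e6 / (length_bp * BP_MOLAR_MASS)


def pooling_volumes(amplicons: dict, fmol_each: float) -> dict:
    """Return the volume (uL) of each amplicon that contains `fmol_each` fmol.

    `amplicons` maps a label to a (length_bp, conc_ng_per_ul) tuple.
    """
    volumes = {}
    for name, (length_bp, conc) in amplicons.items():
        # 1 nmol/L equals 1 fmol/uL, so volume = amount / molar concentration.
        volumes[name] = fmol_each / nanomolar(conc, length_bp)
    return volumes


if __name__ == "__main__":
    # Hypothetical lengths (bp) and concentrations (ng/uL) for six amplicons.
    example = {
        "amplicon_1": (1900, 25.0),
        "amplicon_2": (2100, 18.5),
        "amplicon_3": (1800, 30.2),
        "amplicon_4": (2000, 12.7),
        "amplicon_5": (1700, 22.1),
        "amplicon_6": (2200, 15.9),
    }
    for name, vol in pooling_volumes(example, fmol_each=50.0).items():
        print(f"{name}: {vol:.2f} uL")
```

Longer or more dilute amplicons need a larger volume to contribute the same number of molecules, which is why pooling by mass alone would over-represent short fragments.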
The successful sequencing of one sample and the completion of all sample pre-processing procedures are promising for further applications, but a comprehensive evaluation of the sequence data to be generated is necessary to make an informed choice among the different approaches.
SARS-CoV-2 introductions and early dynamics of the epidemic in Portugal
Background Genomic surveillance of SARS-CoV-2 in Portugal was rapidly implemented by
the National Institute of Health in the early stages of the COVID-19 epidemic, in collaboration
with more than 50 laboratories distributed nationwide.
Methods By applying recent phylodynamic models that allow integration of individual-based
travel history, we reconstructed and characterized the spatio-temporal dynamics of SARS-CoV-2 introductions and early dissemination in Portugal.
Results We detected at least 277 independent SARS-CoV-2 introductions, mostly from
European countries (namely the United Kingdom, Spain, France, Italy, and Switzerland),
which were consistent with the countries with the highest connectivity with Portugal.
Although most introductions were estimated to have occurred during early March 2020, it is
likely that SARS-CoV-2 was silently circulating in Portugal throughout February, before the
first cases were confirmed.
Conclusions Here we conclude that the earlier implementation of measures could have
minimized the number of introductions and subsequent virus expansion in Portugal. This
study lays the foundation for genomic epidemiology of SARS-CoV-2 in Portugal, and highlights the need for systematic and geographically-representative genomic surveillance.
We gratefully acknowledge Sara Hill and Nuno Faria (University of Oxford) and
Joshua Quick and Nick Loman (University of Birmingham) for kindly providing us with
the initial sets of Artic Network primers for NGS; Rafael Mamede (MRamirez team,
IMM, Lisbon) for developing and sharing a bioinformatics script for sequence curation
(https://github.com/rfm-targa/BioinfUtils); Philippe Lemey (KU Leuven) for providing
guidance on the implementation of the phylodynamic models; Joshua L. Cherry
(National Center for Biotechnology Information, National Library of Medicine, National
Institutes of Health) for providing guidance with the subsampling strategies; and all
authors, originating and submitting laboratories who have contributed genome data on
GISAID (https://www.gisaid.org/) on which part of this research is based. The opinions
expressed in this article are those of the authors and do not reflect the view of the
National Institutes of Health, the Department of Health and Human Services, or the
United States government. This study is co-funded by Fundação para a Ciência e Tecnologia
and Agência de Investigação Clínica e Inovação Biomédica (234_596874175) on
behalf of the Research 4 COVID-19 call. Some infrastructural resources used in this study
come from the GenomePT project (POCI-01-0145-FEDER-022184), supported by
COMPETE 2020 - Operational Programme for Competitiveness and Internationalisation
(POCI), Lisboa Portugal Regional Operational Programme (Lisboa2020), Algarve Portugal
Regional Operational Programme (CRESC Algarve2020), under the PORTUGAL
2020 Partnership Agreement, through the European Regional Development Fund
(ERDF), and by Fundação para a Ciência e a Tecnologia (FCT).
The genetic traceability of viral pathogens: from isolation by distance to proximity by mobility (De genetische traceerbaarheid van virale pathogenen: van isolatie door afstand tot nabijheid via mobiliteit)
The emergence of viral infectious diseases imposes a heavy burden on public health and global economies.
The high mortality and morbidity this can bring about has been demonstrated during the Influenza A H1N1 pandemic in 2009 and more recently during the Ebola outbreak in West Africa in 2014.
It is hypothesised that their emergence is largely driven by ecological, environmental and socio-economic factors.
This has prompted a variety of research fields, from mathematical modelling to epidemiology, to work towards understanding the epidemiological patterns of these diseases and ultimately controlling them.
As a part of this research dynamic, a new field arose - phylodynamics - aiming to extract epidemiological information from the evolutionary imprint in viral genomes.
Phylodynamics benefits from the unprecedented amounts of genetic sequence data becoming available and uses statistical models of molecular sequence variation and evolution, population dynamics, geographic and ecological information to reconstruct ancestral history and test hypotheses about the spatio-temporal patterns and drivers that shape epidemic dynamics.
This thesis builds on recent developments in phylodynamics and aims to extend the current phylogeographic models and visualisation techniques in order to trace viral evolutionary history and identify the factors that shaped the spatial distributions of plant, animal and human emerging viruses with genetic structures ranging from 'isolation by distance' to 'proximity by mobility'.
Chapter 1 begins with a brief account of how events of infectious disease emergence have been occurring since antiquity, while the fundamental concepts to study these outbreaks arose around two centuries ago.
The chapter also discusses the aspects that make RNA viruses particularly interesting systems to study from an evolutionary perspective.
Such research requires statistical models and computational inference tools to retrieve evolutionary and epidemiological patterns from genetic sequences.
Here, we focus on a flexible Bayesian framework that permits the integration of different sources of information and that infers the posterior distribution of phylodynamic histories, which appropriately characterise the uncertainty of the estimates.
The chapter concludes by presenting the epidemiological and ecological setting of the different viral systems studied in this doctoral thesis,
and sets out the objectives for the development and application of state-of-the-art phylogeographic approaches to elucidate how processes of movement and growth of the host population can drive the viral evolutionary and dispersal dynamics.
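The Bayesian framework described for Chapter 1 rests on Bayes' theorem applied to phylodynamic histories. As a generic sketch only (the symbols are assumptions introduced here, not notation taken from the thesis), let D be the sequence alignment, τ a time-scaled phylogeny, and θ the remaining model parameters such as substitution, clock, demographic and diffusion parameters:

```latex
P(\tau, \theta \mid D) \;=\; \frac{P(D \mid \tau, \theta)\, P(\tau \mid \theta)\, P(\theta)}{P(D)}
```

The marginal likelihood P(D) is generally intractable, which is why the posterior is typically approximated with sampling algorithms such as Markov chain Monte Carlo, and the spread of the sampled histories is what characterises the uncertainty of the estimates.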
Chapter 2 examines the evolutionary and spatio-temporal history of Rice yellow mottle virus, a pathogen that infects rice with important socio-economic consequences in Africa, and formally demonstrates that the relatively recent epidemic expansion was driven by intensification of rice agriculture in Africa.
Chapter 3 reveals that viral genetic sequence data can provide important insights into the complex host ecology of the highly pathogenic avian Influenza A H5N1 virus.
In particular, we identify avian hosts belonging to the Anatidae family as the main contributors to viral dissemination.
Subsequently, Chapter 4 focuses on modelling the temporal heterogeneity in spatial spread characteristic of the seasonal dynamics of human Influenza A and B viruses worldwide, while pointing at global air transportation as the main predictor driving their global circulation.
The final research project, presented in Chapter 5, offers preliminary findings on the dispersal dynamics of seasonal Influenza A at a smaller geographical scale, that of the continental USA, and more specifically on the human mobility networks shaping its spatio-temporal patterns.
Chapter 6 puts the methods used in this thesis in the context of past and current efforts towards improved phylodynamic reconstructions.
It discusses the key achievements of each project and how they advance the knowledge on the evolutionary processes of these viruses.
Finally, the chapter concludes with a reflection on potential extensions of the work presented and how state-of-the-art statistical inference approaches may shape the future of evolutionary and spatial reconstructions in infectious diseases.
By capitalising on genetic sequence data and the integration of various sources of information, including data about underlying connectivity and mobility of the hosts, we recovered epidemiological patterns that constitute evidence-based data for policy makers that ultimately could help increase our preparedness for future emerging infectious diseases.
Contents
Acknowledgments
Contents
Abbreviations
Summary
Samenvatting
List of Figures
List of Tables
1 Introduction
1.1 Evolutionary dynamics of RNA viruses
1.2 Phylogenetic trees: characterising epidemiological linkage
1.3 Modelling evolutionary and population genetic processes
1.3.1 Nucleotide substitution models
1.3.2 Molecular clock model
1.3.3 Temporal signal
1.3.4 Coalescent models
1.4 Classic phylogenetic inference approaches
1.5 Bayesian inference
1.5.1 Bayes theorem
1.5.2 Sampling algorithms
1.5.3 The Bayesian phylogenetic framework: models and sources of information
1.5.4 Phylogeographic hypothesis testing
1.5.5 Model selection
1.5.6 Reducing computational intensity
1.6 Viral pathogens studied in this thesis
1.7 Research goals: tracing viral pathogens
2 Host ecology determines the dispersal patterns of a plant virus
2.1 Abstract
2.2 Introduction
2.3 Materials and Methods
2.3.1 Dataset compilation
2.3.2 Temporal signal
2.3.3 Bayesian evolutionary inference
2.4 Results
2.4.1 Evolutionary rate and divergence time estimation
2.4.2 Discrete geography
2.4.3 Continuous phylogeography
2.5 Discussion and Conclusion
3 Bayesian inference reveals host-specific contributions to the epidemic expansion of Influenza A H5N1
3.1 Abstract
3.2 Introduction
3.3 Materials and Methods
3.3.1 Data collection
3.3.2 Bayesian evolutionary inference
3.3.3 Grid-based visualisation of continuous spatial diffusion
3.4 Results
3.4.1 Spatial expansion
3.4.2 Host transmission patterns
3.5 Discussion and Conclusion
4 Phylogeographic modelling of temporal heterogeneity in the global circulation of human influenza lineages
4.1 Abstract
4.2 Introduction
4.3 Methodology
4.3.1 Data compilation
4.3.2 Bayesian evolutionary inference
4.3.3 Discrete phylogeography and temporal heterogeneity
4.4 Results
4.5 Discussion and Conclusion
5 Determinants of seasonal influenza A dispersal patterns in the USA
5.1 Abstract
5.2 Introduction
5.3 Methodology
5.3.1 Data compilation
5.3.2 Bayesian evolutionary inference
5.3.3 Discrete geography and temporal heterogeneity
5.4 Results
5.5 Discussion and Conclusion
6 Discussion and future perspectives
6.1 Advances in statistical inference procedures for viral phylodynamics
6.2 Future directions: Phylodynamics of plant, animal and human viruses
Bibliography
Curriculum vitae