
197 research outputs found

Bioinformatics tools for analysing viral genomic data

Author: Davison A.
Gu Q.
Hughes J.
Maabar M.
Modha S.
Orton R.J.
Vattipally Sreenu
Wilkie G.S.
Publication venue: 'O.I.E (World Organisation for Animal Health)'
Publication date: 01/04/2016
Field of study

The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing
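As an illustration of the first step named in this abstract, quality control of raw reads, here is a minimal sketch in Python. It is not taken from the paper: the function names, the Phred+33 encoding and the mean-quality threshold of 20 are all assumptions chosen for the example, and dedicated tools do considerably more (adapter trimming, per-base trimming, and so on).

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def filter_reads(fastq_lines, min_mean_quality=20.0):
    """Yield (header, sequence, quality) for reads whose mean Phred score passes.

    fastq_lines is a list of lines in the usual 4-line-per-record FASTQ layout.
    The threshold is a hypothetical default, not a recommendation.
    """
    for i in range(0, len(fastq_lines) - 3, 4):
        header, seq, _plus, qual = (line.strip() for line in fastq_lines[i:i + 4])
        scores = phred_scores(qual)
        if scores and sum(scores) / len(scores) >= min_mean_quality:
            yield header, seq, qual


example = [
    "@read1", "ACGT", "+", "IIII",  # 'I' encodes Phred 40 under Phred+33
    "@read2", "ACGT", "+", "!!!!",  # '!' encodes Phred 0 under Phred+33
]
kept = list(filter_reads(example))
```

With the two toy records above, only `@read1` survives the filter.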

Enlighten

Recommended from our members

Quantitative Approaches to the Genomics of Clonal Evolution

Author: Zairis Sakellarios
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018
Field of study

Many problems in the biological sciences reduce to questions of genetic evolution. Entire classes of medical pathology, such as malignant neoplasia or infectious disease, can be viewed in the light of Darwinian competition of genomes. With the benefit of today's maturing sequencing technologies we can observe and quantify genetic evolution with nucleotide resolution. This provides a molecular view of genetic material that has adapted, or is in the process of adapting, to its local selection pressures. A series of problems will be discussed in this thesis, all involving the mathematical modeling of genomic data derived from clonally evolving populations. We use a variety of computational approaches to characterize over-represented features in the data, with the underlying hypothesis that we may be detecting fitness-conferring features of the biology. In Part I we consider the cross-sectional sampling of human tumors via RNA-sequencing, and devise computational pipelines for detecting oncogenic gene fusions and oncovirus infections. Genomic translocation and oncovirus infection can each be a highly penetrant alteration in a tumor's evolutionary history, with famous examples of both populating the cancer biology literature. In order to exert a transforming influence over the host cell, gene fusions and viral genetic programs need to be expressed and thus can be detected via whole transcriptome sequencing of a malignant cell population. We describe our approaches to predicting oncogenic gene fusions (Chapter 2) and quantifying host-viral interactions (Chapter 3) in large panels of human tumor tissue. The alterations that we characterize prompt the larger question of how the genetics of tumors and viruses might vary in time, leading us to the study of serially sampled populations. In Part II we consider longitudinal sampling of a clonally evolving population. 
Phylogenetic trees are the standard representation of a clonal process, an evolutionary picture as old as Darwin's voyages on the Beagle. Chapter 4 first reviews phylogenetic inference and then introduces a certain phylogenetic tree space that forms the starting point of our work on the topic. Specifically, Chapter 4 describes the construction of our projective tree space along with an explicit implementation for visualizing point clouds of rescaled trees. The Chapter finishes by defining a method for stable dimensionality reduction of large phylogenies, which is useful for analyzing long genomic time series. In Chapter 5 we consider medically relevant instances of clonal evolution and the longitudinal genetic data sets to which they give rise. We analyze data from (i) the sequencing of cancers along their therapeutic course, (ii) the passaging of a xenografted tumor through a mouse model, and (iii) the seasonal surveillance of H3N2 influenza's hemagglutinin segment. A novel approach to predicting influenza vaccine effectiveness is demonstrated using statistics of point clouds in tree spaces. Our investigations into clonal processes may be extended beyond naturally occurring genomes. In Part III we focus on the directed clonal evolution of populations of synthetic RNAs in vitro. Analogous to the selection pressures exerted upon malignant cells or viral particles, these synthetic RNA genomes can be evolved against a desired fitness objective. We investigate fitness objectives related to reprogramming ribosomal translation. Chapter 6 identifies high fitness RNA pseudoknot geometries capable of inducing ribosomal frameshift, while Chapter 7 takes an unbiased approach to evolving sequence and structural elements that promote stop codon readthrough

Columbia University Academic Commons

BMC Genomics

Author
Publication venue
Publication date
Field of study

Background: Deep sequencing makes it possible to observe low-frequency viral variants and sub-populations with greater accuracy and sensitivity than ever before. Existing platforms can be used to multiplex a large number of samples; however, analysis of the resulting data is complex and involves separating barcoded samples and various read manipulation processes ending in final assembly. Many assembly tools were designed with larger genomes and higher fidelity polymerases in mind and do not perform well with reads derived from highly variable viral genomes. Reference-based assemblers may leave gaps in viral assemblies while de novo assemblers may struggle to assemble unique genomes. Results: The IRMA (iterative refinement meta-assembler) pipeline solves the problem of viral variation by the iterative optimization of read gathering and assembly. As with all reference-based assembly, reads are included in assembly when they match consensus template sets; however, IRMA provides for on-the-fly reference editing, correction, and optional elongation without the need for additional reference selection. This increases both read depth and breadth. IRMA also focuses on quality control, error correction, indel reporting, variant calling and variant phasing. In fact, IRMA's ability to detect and phase minor variants is one of its most distinguishing features. We have built modules for influenza and ebolavirus. We demonstrate usage and provide calibration data from mixture experiments. Methods for variant calling, phasing, and error estimation/correction have been redesigned to meet the needs of viral genomic sequencing. Conclusion: IRMA provides a robust next-generation sequencing assembly solution that is adapted to the needs and characteristics of viral genomes. The software solves issues related to the genetic diversity of viruses while providing customized variant calling, phasing, and quality control. 
IRMA is freely available for non-commercial use on Linux and Mac OS X and has been parallelized for high-throughput computing. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3030-6) contains supplementary material, which is available to authorized users. 2016-09-05T00:00:00Z 27595578 PMC501193
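To make the idea of consensus and minor-variant calling concrete, the following toy sketch counts bases per column of already-aligned reads. It is emphatically not IRMA's algorithm: the function name `call_variants`, the gap symbol `-`, the 0-based positions and the plain frequency threshold are assumptions for illustration only, and a real caller would also model sequencing error and read quality.

```python
from collections import Counter


def call_variants(aligned_reads, min_frequency=0.05):
    """Return (consensus, minor_variants) from equal-length aligned reads.

    minor_variants lists (position, base, frequency) for non-consensus bases
    whose frequency is at or above min_frequency; positions are 0-based.
    """
    consensus = []
    minor_variants = []
    for position, column in enumerate(zip(*aligned_reads)):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            consensus.append("N")  # no coverage at this position
            continue
        depth = sum(counts.values())
        # Sort by descending count, then alphabetically, so ties are deterministic.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        consensus.append(ranked[0][0])
        for base, count in ranked[1:]:
            frequency = count / depth
            if frequency >= min_frequency:
                minor_variants.append((position, base, frequency))
    return "".join(consensus), minor_variants


reads = ["ACGT", "ACGT", "ACGT", "ATGT"]
consensus, variants = call_variants(reads, min_frequency=0.2)
```

Here one of four reads carries a T at position 1, so the sketch reports a minor variant at 25% alongside the consensus ACGT.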

CDC Stacks

Exploring the phylodynamics, genetic reassortment and RNA secondary structure formation patterns of orthomyxoviruses by comparative sequence analysis

Author: Nindo Fredrick Nzabanyi
Publication venue: Department of Clinical Laboratory Sciences
Publication date: 30/04/2020
Field of study

RNA viruses are among the most virulent microorganisms that threaten the health of humans and livestock. Among the most socio-economically important of the known RNA viruses are those found in the family Orthomyxoviridae. In this era of rapid low-cost genome sequencing and advancements in computational biology techniques, many previously difficult research questions relating to the molecular epidemiology and evolutionary dynamics of these viruses can now be answered with ease. Using sequence data together with associated meta-data, in chapter two of this dissertation I tested the hypothesis that the Influenza A/H1N1 2009 pandemic virus was introduced multiple times into Africa, and subsequently dispersed heterogeneously across the continent. I further tested to what degree factors such as road distances and air travel distances impacted the observed pattern of spread of this virus in Africa using a generalised linear model-based approach. The results suggested that there were multiple simultaneous introductions of 2009 pandemic A/H1N1 into Africa, and that geographical distance and human mobility through air travel played an important role in its dissemination. In chapter three, I set out to test two hypotheses: (1) that there is no difference in the frequency of reassortments among the segments that constitute influenza virus genomes; and (2) that there is epochal temporal reassortment among influenza viruses and that all geographical regions are equally likely sources of epidemiologically important influenza virus reassortant lineages. The findings suggested that surface segments are more frequently exchanged than internal genes and that North America/Asia, Oceania, and Asia could be the most likely source locations for reassortant Influenza A, B and C virus lineages respectively. 
In chapter four of this thesis, I explored the formation of RNA secondary structures within the genomes of orthomyxoviruses belonging to five genera: Influenza A, B and C, Infectious Salmon Anaemia Virus and Thogotovirus using in silico RNA folding predictions and additional molecular evolution and phylogenetic tests to show that structured regions may be biologically functional. The presence of some conserved structures across the five genera is likely a reflection of the biological importance of these structures, warranting further investigation regarding their role in the evolution and possible development of antiviral resistance. The studies herein demonstrate that pathogen genomics-based analytical approaches are useful both for understanding the mechanisms that drive the evolution and spread of rapidly evolving viral pathogens such as orthomyxoviruses, and for illuminating how these approaches could be leveraged to improve the management of these pathogens

Cape Town University OpenUCT

Phylodynamic Patterns in Pathogen Ecology and Evolution.

Author: Zinder Daniel
Publication venue
Publication date: 01/01/2015
Field of study

The rapid evolution of viral pathogens requires us to consider epidemiological, ecological and evolutionary processes as coupled together and occurring at the same timescale. Rotavirus and influenza account for high levels of morbidity and mortality worldwide and are two important examples of such dynamics. In this work, I investigate the different evolutionary and ecological processes that shape the antigenic structure and phylogenetic characteristics of these two viruses. In the first part of my work, I use a theoretical model of influenza A/H3N2 to identify the relative importance of antigenic novelty, competition between lineages, and changes in the susceptibility of the host population to circulating strains in determining the evolutionary and epidemiological trajectory of the virus. I develop this model further to correspond with patterns of immunity and infection observed in rotavirus, and investigate how reassortment, the swapping of gene segments between viruses, influences the formation and replacement of rotavirus genotypes through immune mediated processes. In the second part of my work, I use a tool (SeasMig), which I developed, to infer alternative stochastically generated migration and mutation events along phylogenetic trees in a Bayesian manner. Using SeasMig, I first show how the seasonality of A/H3N2 influenza incidence corresponds to rates of immigration and emigration of the virus. Subsequently, I tease out the different evolutionary and ecological processes, which drive changes in the US rotavirus population following onset of routine vaccination. My work has implications for identifying likely evolutionary mechanisms, which may lead to reduced vaccine efficacy, and for vaccine strain selection. PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113494/1/dzinder_1.pd

Deep Blue Documents at the University of Michigan

Gain-of-Function Experiments With Bacteriophage Lambda Uncover Residues Under Diversifying Selection in Nature

Author: Johnson Daniel T.
Maddamsetti Rohan
Marks Debora S.
Meyer Justin R.
Petrie Katherine L.
Spielman Stephanie J.
Publication venue: ODU Digital Commons
Publication date: 01/01/2018
Field of study

Viral gain-of-function mutations frequently evolve during laboratory experiments. Whether the specific mutations that evolve in the lab also evolve in nature and whether they have the same impact on evolution in the real world is unknown. We studied a model virus, bacteriophage λ, that repeatedly evolves to exploit a new host receptor under typical laboratory conditions. Here, we demonstrate that two residues of λ’s J protein are required for the new function. In natural λ variants, these amino acid sites are highly diverse and evolve at high rates. Insertions and deletions at these locations are associated with phylogenetic patterns indicative of ecological diversification. Our results show that viral evolution in the laboratory mirrors that in nature and that laboratory experiments can be coupled with protein sequence analyses to identify the causes of viral evolution in the real world. Furthermore, our results provide evidence for widespread host-shift evolution in lambdoid viruses

Old Dominion University

Transmission dynamics of SARS-CoV-2 within-host diversity in two major hospital outbreaks in South Africa

Author: Chimukangara Benjamin
de Oliveira Tulio
Fish Maryam
Fonseca Vagner
Gazy Inbal
Giandhari Jennifer
Kanzi Aquillah M
Khanyile Khulekani
Kiran Anmol M
Lessells Richard
Martin Darren P
Nelson Chase W
Ngcapu Sinaye
Pillay Sureshnee
San James E
Singh Lavanya
Smidt Werner
Tegally Houriiyah
Wilkinson Eduan
Publication venue: 'Oxford University Press (OUP)'
Publication date: 21/04/2021
Field of study

University of Liverpool Repository

Gain-of-Function Experiments With Bacteriophage Lambda Uncover Residues Under Diversifying Selection in Nature

Author: Johnson Daniel T.
Maddamsetti Rohan
Marks Debora S.
Meyer Justin R.
Petrie Katherine L.
Spielman Stephanie J.
Publication venue: ODU Digital Commons
Publication date: 01/01/2018
Field of study

Rowan University

Old Dominion University

Computational Methods for Assessment and Prediction of Viral Evolutionary and Epidemiological Dynamics

Author: Mohebbi Fatemeh
Publication venue: ScholarWorks @ Georgia State University
Publication date: 31/08/2023
Field of study

The ability to comprehend the dynamics of viral transmission and evolution, even to a limited extent, can significantly enhance our capacity to predict and control the spread of infectious diseases. An example of such significance is COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this dissertation, I propose computational models that present more precise and comprehensive approaches to viral outbreak investigation and epidemiology, providing invaluable insights into the transmission dynamics of, and potential interventions against, infectious diseases by facilitating the timely detection of viral variants. The first model is a mathematical framework based on population dynamics for the calculation of a numerical measure of the fitness of SARS-CoV-2 subtypes. The second model I propose here is a transmissibility estimation method based on a Bayesian approach to calculate the most likely fitness landscape for SARS-CoV-2 using a generalized logistic sub-epidemic model. Using the proposed model, I estimate the epistatic interaction networks of the spike protein in SARS-CoV-2. Based on the community structure of these epistatic networks, I propose a computational framework that predicts emerging haplotypes of SARS-CoV-2 with altered transmissibility. The last method proposed in this dissertation is a maximum likelihood framework that integrates phylogenetic and random graph models to accurately infer transmission networks without requiring case-specific data
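The abstract does not spell out its generalized logistic sub-epidemic model, so the sketch below only illustrates one commonly used form of generalized logistic growth, dC/dt = r * C^p * (1 - C/K), where C is cumulative cases, r a growth rate, p a "deceleration" exponent and K the final size. The equation, the parameter values and the forward-Euler integration are all assumptions made for illustration and may differ from the dissertation's formulation.

```python
def generalized_logistic_curve(r, p, K, c0, days, dt=0.01):
    """Integrate dC/dt = r * C**p * (1 - C/K) with forward Euler.

    Returns cumulative counts at day 0, 1, ..., days. All parameter values
    used with this function are hypothetical.
    """
    steps_per_day = round(1 / dt)
    c = c0
    curve = [c]
    for _ in range(days):
        for _ in range(steps_per_day):
            c += dt * r * (c ** p) * (1 - c / K)
        curve.append(c)
    return curve


curve = generalized_logistic_curve(r=0.5, p=0.9, K=1000.0, c0=1.0, days=60)
daily_incidence = [later - earlier for earlier, later in zip(curve, curve[1:])]
```

With p below 1 the early growth is sub-exponential, and the curve saturates as C approaches K; a sub-epidemic approach would typically sum several such curves with different parameters.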

ScholarWorks @ Georgia State University

Molecular epidemiology of acute respiratory virus infections

Author: Dyrdak Robert
Publication venue: 'European Centre for Disease Prevention and Control (ECDC)'
Publication date: 02/06/2023
Field of study

Acute respiratory virus infections are very common but can also cause severe disease. In my thesis, I have analysed the molecular epidemiology of acute respiratory virus infections caused by enterovirus D68 and coronaviruses. In Paper I, we used real-time PCR and Sanger sequencing to analyse the outbreak of enterovirus D68 in Stockholm in 2016. We found that the outbreak was caused by the subclade B3, and we also described three patients with neurological manifestations. The virus sequences were closely related to concurrent sequences from North America. In Paper II, we developed an assay for whole-genome sequencing of enterovirus D68 on a next-generation platform. By using the assay on the samples from the 2016 outbreak, we found that the outbreak was caused by multiple independent introductions of the virus. We also estimated the time to the most recent common ancestor for the subclades B1 and B3 to 2009. In Paper III, we used the whole-genome sequencing assay in a European multicentre study of enterovirus D68 circulation in the 2018 season. We also included sequences in public repositories. We found that the viruses in 2018 belonged to subclades A2 and B3 and that sequences in subclade B3 originated from the circulation in 2016. We also found that enterovirus D68 underwent rapid geographic mixing and that residues on the surface of the virus particle had an elevated substitution rate of amino acids. Hence, we proposed asymptomatic reinfections of adults to explain both rapid geographical dispersal and selective pressure on the surface residues. In Paper IV, we analysed stored results from routine clinical diagnostics for the four common cold coronaviruses. The data contained the results from September 2009 to April 2020. At the species level, we found a pattern of alternating biennial circulation, and we also found the circulation of Betacoronaviruses to peak earlier than that of Alphacoronaviruses. 
In Paper V, we investigated Sweden’s first SARS-CoV-2 pandemic wave in 2020. We analysed stored respiratory samples with real-time PCR for SARS-CoV-2 and found that community transmissions started earlier than previously appreciated. We also sequenced stored SARS-CoV-2-positive samples. To these sequences, we added information from contact tracing records and combined them with data from public repositories. Among cases exposed abroad, we mainly found clades 20B and 20A, whereas clade 20C dominated domestic infections. Furthermore, we found the proportion of clade 20C to be correlated with the cumulative number of deaths due to COVID-19. We interpreted this as early undetected introductions of clade 20C having had a significant impact on the further course of the pandemic in Sweden

Publications from Karolinska Institutet