260 research outputs found

    Methods for Viral Intra-Host and Inter-Host Data Analysis for Next-Generation Sequencing Technologies

    The deep coverage offered by next-generation sequencing (NGS) technology has facilitated the reconstruction of intra-host RNA viral populations at an unprecedented level of detail. However, NGS data require sophisticated analysis that can handle millions of error-prone short reads. This dissertation first reviews the challenges and methods of viral genomic data analysis in the NGS era. Second, it presents CliqueSNV, a software tool for inferring viral quasispecies by extracting pairs of statistically linked mutations from noisy reads, which effectively reduces sequencing noise and enables the identification of minority haplotypes with frequencies below the sequencing error rate. Finally, the dissertation describes the algorithms VOICE and MinDistB for inferring relatedness between viral samples and identifying transmission clusters and sources of infection.

    Bioinformatics tools for analysing viral genomic data

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

    A high-quality genome and comparison of short- versus long-read transcriptome of the palaearctic duck Aythya fuligula (tufted duck)

    Background: The tufted duck is a non-model organism that experiences high mortality in highly pathogenic avian influenza outbreaks. It belongs to the same bird family (Anatidae) as the mallard, one of the best-studied natural hosts of low-pathogenic avian influenza viruses. Studies in non-model bird species are crucial to disentangle the role of the host response in avian influenza virus infection in the natural reservoir. Such an endeavour requires a high-quality genome assembly and transcriptome. Findings: This study presents the first high-quality, chromosome-level reference genome assembly of the tufted duck using the Vertebrate Genomes Project pipeline. We sequenced RNA (complementary DNA) from brain, ileum, lung, ovary, spleen, and testis using Illumina short-read and Pacific Biosciences long-read sequencing platforms, and used these data for annotation. We found 34 autosomes plus Z and W sex chromosomes in the curated genome assembly, with 99.6% of the sequence assigned to chromosomes. Functional annotation revealed 14,099 protein-coding genes that generate 111,934 transcripts, which implies a mean of 7.9 isoforms per gene. We also identified 246 small RNA families. Conclusions: This annotated genome contributes to continuing research into the host response in avian influenza virus infections in a natural reservoir. Our findings from a comparison between short-read and long-read reference transcriptomics contribute to a deeper understanding of these competing options. In this study, both technologies complemented each other. We expect this annotation to be a foundation for further comparative and evolutionary genomic studies, including many waterfowl relatives with differing susceptibilities to avian influenza viruses.

    Next-generation sequencing: an eye-opener for the surveillance of antiviral resistance in influenza

    Next-generation sequencing (NGS) can enable a more effective response to a wide range of communicable disease threats, such as influenza, which is one of the leading causes of human morbidity and mortality worldwide. After vaccination, antivirals are the second line of defense against influenza. The use of currently available antivirals can lead to antiviral resistance mutations across the entire influenza genome. Therefore, methods to detect these mutations should be developed and implemented. In this Opinion, we assess how NGS could be implemented to detect drug resistance mutations in clinical influenza virus isolates.

    Recent advances in inferring viral diversity from high-throughput sequencing data

    Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging due to short, error-prone reads. To account for the uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity estimation and global haplotype reconstruction. Challenges posed by aligning reads, as well as the impact of reference biases on diversity estimates, are also discussed. In addition, we address some of the experimental approaches designed to improve the biological signal-to-noise ratio. In the future, computational methods for the analysis of heterogeneous virus populations are likely to continue being complemented by technological developments. ISSN:0168-170

    Algorithms for Analysis of Heterogeneous Cancer and Viral Populations Using High-Throughput Sequencing Data

    Next-generation sequencing (NGS) technologies have advanced by giant leaps in recent years. Short-read samples reach millions of reads, and the number of samples has grown enormously in the wake of the COVID-19 pandemic. These data can expose essential aspects of disease transmission and development and reveal the key to its treatment. At the same time, single-cell sequencing has progressed from dozens to tens of thousands of cells per sample. These technological advances bring new challenges for computational biology and require the development of scalable, robust methods to deal with a wide range of problems, from epidemiology to cancer studies. The first part of this work focuses on processing virus NGS data. It proposes algorithms that facilitate the initial data analysis steps by filtering genetically related sequencing data, and a tool for investigating intra-host virus diversity, which is vital for biomedical research and epidemiology. The second part addresses single-cell data in cancer studies. It develops evolutionary cancer models involving new quantitative parameters of cancer subclones to better understand the underlying processes of cancer development.

    Algorithms for analysis of next-generation viral sequencing data

    RNA viruses mutate at extremely high rates, forming an intra-host viral population of closely related variants, which allows them to evade the host’s immune system and makes them particularly dangerous. Viral outbreaks pose a significant threat to public health. Progress in sequencing technologies has made it possible to identify and sample intra-host viral populations at great depth. Consequently, the contribution of sequencing technologies to molecular surveillance of viral outbreaks is becoming more and more substantial. Genome sequencing of viral populations reveals similarities between samples, makes it possible to measure viral genetic distance, and facilitates outbreak identification and isolation. Computational methods can be used to infer transmission characteristics from sequencing data. However, due to the specifics of next-generation sequencing (NGS) approaches and the limited availability of viral data, existing methods lack accuracy and efficiency. In this dissertation, I present novel, flexible methods that allow tackling crucial epidemiological problems, such as identification of transmission clusters, sources of infection, and transmission direction.

    Nanopore-Based Metagenomic Sequencing in Respiratory Tract Infection: A Developing Diagnostic Platform

    Respiratory tract infection (RTI) remains a significant cause of morbidity and mortality across the globe. The optimal management of RTI relies upon timely pathogen identification via evaluation of respiratory samples, a process which utilises traditional culture-based methods to identify offending microorganisms. This process can be slow and often prolongs the use of broad-spectrum antimicrobial therapy, whilst also delaying the introduction of targeted therapy as a result. Nanopore sequencing (NPS) of respiratory samples has recently emerged as a potential diagnostic tool in RTI. NPS can identify pathogens and antimicrobial resistance profiles with greater speed and efficiency than traditional sputum culture-based methods. Increased speed to pathogen identification can improve antimicrobial stewardship by reducing the use of broad-spectrum antibiotic therapy, as well as improving overall clinical outcomes. This new technology is becoming more affordable and accessible, with some NPS platforms requiring minimal sample preparation and laboratory infrastructure. However, questions regarding clinical utility and how best to implement NPS technology within RTI diagnostic pathways remain unanswered. In this review, we introduce NPS as a technology and as a diagnostic tool in RTI in various settings, before discussing the advantages and limitations of NPS, and finally what the future might hold for NPS platforms in RTI diagnostics.
