29,923 research outputs found

    Challenges in RNA virus bioinformatics

    Motivation: Computer-assisted studies of the structure, function and evolution of viruses remain a neglected area of research. The attention of bioinformaticians to this interesting and challenging field is far from commensurate with its medical and biotechnological importance. It is telling that out of >200 talks held at ISMB 2013, the largest international bioinformatics conference, only one presentation explicitly dealt with viruses. In contrast to many broad, established and well-organized bioinformatics communities (e.g. structural genomics, ontologies, next-generation sequencing, expression analysis), research groups focusing on viruses can probably be counted on the fingers of two hands. Results: The purpose of this review is to increase awareness among bioinformatics researchers about the pressing needs and unsolved problems of computational virology. We focus primarily on RNA viruses, which pose problems to many standard bioinformatics analyses owing to their compact genome organization, fast mutation rate and low evolutionary conservation. We provide an overview of tools and algorithms for handling viral sequencing data, detecting functionally important RNA structures, classifying viral proteins into families and investigating the origin and evolution of viruses. Supplementary information: Supplementary data are available at Bioinformatics online. The references for this article can be found in the Supplementary Material.

    Viral pathogen discovery.

    Viral pathogen discovery is of critical importance to clinical microbiology, infectious diseases, and public health. Genomic approaches for pathogen discovery, including consensus polymerase chain reaction (PCR), microarrays, and unbiased next-generation sequencing (NGS), have the capacity to comprehensively identify novel microbes present in clinical samples. Although numerous challenges remain to be addressed, including the bioinformatics analysis and interpretation of large datasets, these technologies have been successful in rapidly identifying emerging outbreak threats, screening vaccines and other biological products for microbial contamination, and discovering novel viruses associated with both acute and chronic illnesses. Downstream studies such as genome assembly, epidemiologic screening, and a culture system or animal model of infection are necessary to establish an association of a candidate pathogen with disease.

    Bioinformatics tools for analysing viral genomic data

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

    Creation of functional viruses from non-functional cDNA clones obtained from an RNA virus population by the use of ancestral reconstruction

    RNA viruses have the highest known mutation rates. Consequently, it is likely that a high proportion of individual RNA virus genomes, isolated from an infected host, will contain lethal mutations and be non-functional. This is problematic if the aim is to clone and investigate high-fitness, functional cDNAs and may also pose problems for sequence-based analysis of viral evolution. To address these challenges we have performed a study of the evolution of classical swine fever virus (CSFV) using deep sequencing and analysis of 84 full-length cDNA clones, each representing individual genomes from a moderately virulent isolate. In addition to serving here as a model for RNA viruses generally, CSFV has high socioeconomic importance and remains a threat to animal welfare and pig production. We find that the majority of the investigated genomes are non-functional and only 12% produced infectious RNA transcripts. Full-length sequencing of cDNA clones and deep sequencing of the parental population identified substitutions important for the observed phenotypes. The investigated cDNA clones were furthermore used as the basis for inferring the sequence of functional viruses. Since each unique clone must necessarily be the descendant of a functional ancestor, we hypothesized that it should be possible to produce functional clones by reconstructing ancestral sequences. To test this we used phylogenetic methods to infer two ancestral sequences, which were then reconstructed as cDNA clones. Viruses rescued from the reconstructed cDNAs were tested in cell culture and pigs. Both reconstructed ancestral genomes proved functional, and displayed distinct phenotypes in vitro and in vivo. We suggest that reconstruction of ancestral viruses is a useful tool for experimental and computational investigations of virulence and viral evolution. Importantly, ancestral reconstruction can be done even on the basis of a set of sequences that all correspond to non-functional variants.

    Tissue Tropism in Host Transcriptional Response to Members of the Bovine Respiratory Disease Complex.

    Bovine respiratory disease (BRD) is the most common infectious disease of beef and dairy cattle and is characterized by a complex infectious etiology that includes a variety of viral and bacterial pathogens. We examined the global changes in mRNA abundance in healthy lung and lung lesions and in the lymphoid tissues bronchial lymph node, retropharyngeal lymph node, nasopharyngeal lymph node and pharyngeal tonsil collected at the peak of clinical disease from beef cattle experimentally challenged with either bovine respiratory syncytial virus, infectious bovine rhinotracheitis, bovine viral diarrhea virus, Mannheimia haemolytica or Mycoplasma bovis. We identified signatures of tissue-specific transcriptional responses indicative of tropism in the coordination of the host's immune tissue responses to viral or bacterial infection. Furthermore, our study shows that this tissue tropism in host transcriptional response to BRD pathogens results in the activation of different networks of response genes. The differential crosstalk among genes expressed in lymphoid tissues was predicted to be orchestrated by specific immune genes that act as 'key players' within expression networks. The results of this study serve as a basis for the development of innovative therapeutic strategies and for the selection of cattle with enhanced resistance to BRD.

    Unbiased Metagenomic Sequencing for Pediatric Meningitis in Bangladesh Reveals Neuroinvasive Chikungunya Virus Outbreak and Other Unrealized Pathogens.

    The burden of meningitis in low- and middle-income countries remains significant, but the infectious causes remain largely unknown, impeding institution of evidence-based treatment and prevention decisions. We conducted a validation and application study of unbiased metagenomic next-generation sequencing (mNGS) to elucidate etiologies of meningitis in Bangladesh. This RNA mNGS study was performed on cerebrospinal fluid (CSF) specimens from patients admitted to the largest pediatric hospital, a World Health Organization sentinel site, with known neurologic infections (n = 36), with idiopathic meningitis (n = 25), and with no infection (n = 30), and six environmental samples, collected between 2012 and 2018. We used the IDseq bioinformatics pipeline and machine learning to identify potentially pathogenic microbes, which we then confirmed orthogonally and followed up through phone/home visits. In samples with known etiology and without infections, there was 83% concordance between mNGS and conventional testing. In idiopathic cases, mNGS identified a potential bacterial or viral etiology in 40%. There were three instances of neuroinvasive Chikungunya virus (CHIKV), whose genomes were >99% identical to each other and to a Bangladeshi strain only previously recognized to cause febrile illness in 2017. CHIKV-specific qPCR of all remaining stored CSF samples from children who presented with idiopathic meningitis in 2017 (n = 472) revealed 17 additional CHIKV meningitis cases, exposing an unrecognized meningitis outbreak. Orthogonal molecular confirmation, case-based clinical data, and patient follow-up substantiated the findings. Case-control CSF mNGS surveys can complement conventional diagnostic methods to identify etiologies of meningitis, conduct surveillance, and predict outbreaks. The improved patient- and population-level data can inform evidence-based policy decisions. IMPORTANCE: Globally, there are an estimated 10.6 million cases of meningitis and 288,000 deaths every year, with the vast majority occurring in low- and middle-income countries. In addition, many survivors suffer from long-term neurological sequelae. Most laboratories assay only for common bacterial etiologies using culture and directed PCR, and the majority of meningitis cases lack microbiological diagnoses, impeding institution of evidence-based treatment and prevention strategies. We report here the results of a validation and application study of using unbiased metagenomic sequencing to determine etiologies of idiopathic (of unknown cause) cases. This included CSF from patients with known neurologic infections, with idiopathic meningitis, and without infection admitted to the largest children's hospital of Bangladesh, as well as environmental samples. Using mNGS and machine learning, we identified and confirmed an etiology (viral or bacterial) in 40% of idiopathic cases. We detected three instances of Chikungunya virus (CHIKV) that were >99% identical to each other and to a strain previously recognized to cause systemic illness only in 2017. CHIKV qPCR of all remaining stored 472 CSF samples from children who presented with idiopathic meningitis in 2017 at the same hospital uncovered an unrecognized CHIKV meningitis outbreak. CSF mNGS can complement conventional diagnostic methods to identify etiologies of meningitis, and the improved patient- and population-level data can inform better policy decisions.

    Elucidating the phylodynamics of endemic rabies virus in eastern Africa using whole-genome sequencing

    Many of the pathogens perceived to pose the greatest risk to humans are viral zoonoses, responsible for a range of emerging and endemic infectious diseases. Phylogeography is a useful tool to understand the processes that give rise to spatial patterns and drive dynamics in virus populations. Increasingly, whole-genome information is being used to uncover these patterns, but the limits of phylogenetic resolution that can be achieved with this are unclear. Here, whole-genome variation was used to uncover fine-scale population structure in endemic canine rabies virus circulating in Tanzania. This is the first whole-genome population study of rabies virus and the first comprehensive phylogenetic analysis of rabies virus in East Africa, providing important insights into rabies transmission in an endemic system. In addition, sub-continental scale patterns of population structure were identified using partial gene data and used to determine population structure at larger spatial scales in Africa. While rabies virus has a defined spatial structure at large scales, increasingly frequent levels of admixture were observed at regional and local levels. Discrete phylogeographic analysis revealed long-distance dispersal within Tanzania, which could be attributed to human-mediated movement, and we found evidence of multiple persistent, co-circulating lineages at a very local scale in a single district, despite on-going mass dog vaccination campaigns. This may reflect the wider endemic circulation of these lineages over several decades alongside increased admixture due to human-mediated introductions. These data indicate that successful rabies control in Tanzania could be established at a national level, since most dispersal appears to be restricted within the confines of country borders, but some coordination with neighbouring countries may be required to limit transboundary movements. Evidence of complex patterns of rabies circulation within Tanzania necessitates the use of whole-genome sequencing to delineate finer scale population structure that can guide interventions, such as the spatial scale and design of dog vaccination campaigns and dog movement controls to achieve and maintain freedom from disease.

    Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. Importance: To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available.