
    Evolutionary history of endogenous Human Herpesvirus 6 reflects human migration out of Africa

    Human herpesvirus 6A and 6B (HHV-6) can integrate into the germline, and as a result, ∼70 million people harbor the genome of one of these viruses in every cell of their body. Until now, it has been largely unknown whether 1) these integrations are ancient, 2) they still occur, and 3) circulating virus strains differ from integrated ones. Here, we used next-generation sequencing and mining of public human genome data sets to generate the largest and most diverse collection of circulating and integrated HHV-6 genomes studied to date. In genomes of geographically dispersed, only distantly related people, we identified clades of integrated viruses that originated from a single ancestral event, confirming this with fluorescent in situ hybridization to directly observe the integration locus. In contrast to HHV-6B, circulating and integrated HHV-6A sequences form distinct clades, arguing against ongoing integration of circulating HHV-6A or “reactivation” of integrated HHV-6A. Taken together, our study provides the first comprehensive picture of the evolution of HHV-6 and reveals that integration of heritable HHV-6 has occurred since the time of, if not before, human migrations out of Africa.

    Streaming histogram sketching for rapid microbiome analytics

    Background: The growth in publicly available microbiome data in recent years has yielded an invaluable resource for genomic research, allowing for the design of new studies, augmentation of novel datasets and reanalysis of published works. This vast amount of microbiome data, as well as the widespread proliferation of microbiome research and the looming era of clinical metagenomics, means there is an urgent need to develop analytics that can process huge amounts of data in a short amount of time. To address this need, we propose a new method for the compact representation of microbiome sequencing data using similarity-preserving sketches of streaming k-mer spectra. These sketches allow for dissimilarity estimation, rapid microbiome catalogue searching and classification of microbiome samples in near real time. Results: We apply streaming histogram sketching to microbiome samples as a form of dimensionality reduction, creating a compressed ‘histosketch’ that can efficiently represent microbiome k-mer spectra. Using public microbiome datasets, we show that histosketches can be clustered by sample type using the pairwise Jaccard similarity estimation, consequently allowing for rapid microbiome similarity searches via a locality sensitive hashing indexing scheme. Furthermore, we use a ‘real life’ example to show that histosketches can train machine learning classifiers to accurately label microbiome samples. Specifically, using a collection of 108 novel microbiome samples from a cohort of premature neonates, we trained and tested a random forest classifier that could accurately predict whether the neonate had received antibiotic treatment (97% accuracy, 96% precision) and could subsequently be used to classify microbiome data streams in less than 3 s. Conclusions: Our method offers a new approach to rapidly process microbiome data streams, allowing samples to be rapidly clustered, indexed and classified. We also provide our implementation, Histosketching Using Little K-mers (HULK), which can histosketch a typical 2 GB microbiome in 50 s on a standard laptop using four cores, with the sketch occupying 3000 bytes of disk space.
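The idea of comparing samples through their k-mer spectra can be illustrated with a minimal sketch. This is not HULK's implementation: it computes the exact weighted Jaccard similarity between two full k-mer count tables, which is the quantity a similarity-preserving sketch would approximate in far less space. The function names, the toy sequences and the choice of k = 3 are assumptions made only for illustration.

```python
from collections import Counter


def kmer_spectrum(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in a sequence (hypothetical helper)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def weighted_jaccard(a: Counter, b: Counter) -> float:
    """Exact weighted Jaccard similarity: sum of minima over sum of maxima."""
    keys = a.keys() | b.keys()
    numerator = sum(min(a[key], b[key]) for key in keys)
    denominator = sum(max(a[key], b[key]) for key in keys)
    # Two empty spectra are treated as identical.
    return numerator / denominator if denominator else 1.0


if __name__ == "__main__":
    # Toy reads standing in for two microbiome samples (illustrative only).
    sample_a = kmer_spectrum("ACGTACGT")
    sample_b = kmer_spectrum("ACGTTCGT")
    print(f"weighted Jaccard: {weighted_jaccard(sample_a, sample_b):.3f}")
```

A sketching method replaces the full count tables with a small fixed-size summary so that the same comparison can be estimated on streaming data without holding every k-mer in memory.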

    A Comparative Study of Human TLR 7/8 Stimulatory Trimer Compositions in Influenza A Viral Genomes

    Background: Variation in the genomes of single-stranded RNA viruses affects their infectivity and pathogenicity in two ways. First, viral genome sequence variations lead to changes in viral protein sequences and activities. Second, viral genome sequence variation produces diversity at the level of nucleotide composition and diversity in the interactions between viral RNAs and host toll-like receptors (TLRs). A viral genome-typing method based on this type of diversity has not yet been established. Methodology/Principal Findings: In this study, we propose a novel genomic trait called the “TLR stimulatory trimer composition” (TSTC) and two quantitative indicators, Score S and Score N, named “TLR stimulatory scores” (TSS). Using the complete genome sequences of 10,994 influenza A viruses (IAV) and 251 influenza B viruses, we show that TSTC analysis reveals the diversity of Score S and Score N among the IAVs isolated from various hosts. In addition, we show that low values of Score S are correlated with high pathogenicity and pandemic potential in IAVs. Finally, we use Score S and Score N to construct a logistic regression model to recognize IAV strains that are highly pathogenic or have high pandemic potential. Conclusions/Significance: Results from the TSTC analysis indicate that there are large differences between human and avian IAV genomes (except for segment 3), as illustrated by Score S. Moreover, segments 1, 2, 3 and 4 may be majo

    Crystallographic reconstruction study of the effects of finish rolling temperature on the variant selection during bainite transformation in C-Mn high-strength steels

    The effect of finish rolling temperature (FRT) on the austenite-to-bainite phase transformation is quantitatively investigated in high-strength C-Mn steels. In particular, the present study aims to clarify the respective contributions of the conditioning during the hot rolling and the variant selection (VS) during the phase transformation to the inherited texture. To this end, an alternative crystallographic reconstruction procedure, which can be directly applied to experimental electron backscatter diffraction (EBSD) mappings, is developed by combining the best features of the existing models: the orientation relationship (OR) refinement, the local pixel-by-pixel analysis and the nuclei identification and spreading strategy. The applicability of this method is demonstrated on both quenching and partitioning (Q&P) and as-quenched lath-martensite steels. The results obtained on the C-Mn steels confirm that the sample finish rolled at the lowest temperature (829 °C) exhibits the sharpest transformation texture. It is shown that this sharp texture is exclusively due to a strong VS from parent brass {110}, S {213} and Goss {110} grains, whereas the VS from the copper {112} grains is insensitive to the FRT. In addition, a statistical VS analysis proves that the habit planes of the selected variants do not systematically correspond to the predicted active slip planes using the Taylor model. In contrast, a correlation between the Bain group to which the selected variants belong and the FRT is clearly revealed, regardless of the parent orientation. These results are discussed in terms of polygranular accommodation mechanisms, especially in view of the observed development in the hot-rolled samples of high-angle grain boundaries with misorientation axes between and

    Genomic Characterization and High Prevalence of Bocaviruses in Swine

    Using random PCR amplification followed by plasmid subcloning and DNA sequencing, we detected bocavirus-related sequences in 9 out of 17 porcine stool samples. Using primer walking, we sequenced the nearly complete genomes of two highly divergent bocaviruses, which we provisionally named porcine bocavirus 1 isolate H18 (PBoV1-H18) and porcine bocavirus 2 isolate A6 (PBoV2-A6) and which differed by 51.8% in their NS1 protein. Phylogenetic analysis indicated that PBoV1-H18 was very closely related to a ∼2 kb central region of a porcine bocavirus-like virus (PBo-LikeV) from Sweden described in 2009. PBoV2-A6 was very closely related to the porcine bocavirus genomes PBoV1 and PBoV2 from China described in 2010. Among 340 fecal samples collected from asymptomatic swine of different ages in five Chinese provinces, the prevalences of PBoV1-H18 and PBoV2-A6 related viruses were 45–75% and 55–70% respectively, with 30–47% of pigs co-infected. PBoV1-H18 related strains were highly conserved, while PBoV2-A6 related strains were more diverse, grouping into two genotypes corresponding to the previously described PBoV1 and PBoV2. Together with the recently described partial bocavirus genomes labeled V6 and V7, a total of three major porcine bocavirus clades have therefore been described to date. Further studies will be required to elucidate the possible pathogenic impact of these diverse bocaviruses either alone or in combination with other porcine viruses.

    Virus Identification in Unknown Tropical Febrile Illness Cases Using Deep Sequencing

    Dengue virus is an emerging infectious agent that infects an estimated 50–100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness.

    Cross-Species Transmission of a Novel Adenovirus Associated with a Fulminant Pneumonia Outbreak in a New World Monkey Colony

    Adenoviruses are DNA viruses that naturally infect many vertebrates, including humans and monkeys, and cause a wide range of clinical illnesses in humans. Infection from individual strains has conventionally been thought to be species-specific. Here we applied the Virochip, a pan-viral microarray, to identify a novel adenovirus (TMAdV, titi monkey adenovirus) as the cause of a deadly outbreak in a closed colony of New World monkeys (titi monkeys; Callicebus cupreus) at the California National Primate Research Center (CNPRC). Among 65 titi monkeys housed in a building, 23 (34%) developed upper respiratory symptoms that progressed to fulminant pneumonia and hepatitis, and 19 of 23 monkeys, or 83% of those infected, died or were humanely euthanized. Whole-genome sequencing of TMAdV revealed that this adenovirus is a new species and highly divergent, sharing <57% pairwise nucleotide identity with other adenoviruses. Cultivation of TMAdV was successful in a human A549 lung adenocarcinoma cell line, but not in primary or established monkey kidney cells. At the onset of the outbreak, the researcher in closest contact with the monkeys developed an acute respiratory illness, with symptoms persisting for 4 weeks, and had a convalescent serum sample seropositive for TMAdV. A clinically ill family member, despite having no contact with the CNPRC, also tested positive, and screening of a set of 81 random adult blood donors from the Western United States detected TMAdV-specific neutralizing antibodies in 2 individuals (2/81, or 2.5%). These findings raise the possibility of zoonotic infection by TMAdV and human-to-human transmission of the virus in the population. Given the unusually high case fatality rate from the outbreak (83%), it is unlikely that titi monkeys are the native host species for TMAdV, and the natural reservoir of the virus is still unknown. The discovery of TMAdV, a novel adenovirus with the capacity to infect both monkeys and humans, suggests that adenoviruses should be monitored closely as potential causes of cross-species outbreaks.