44 research outputs found

    Bayesian mixture models for metagenomic community profiling

    Metagenomics can be defined as the study of DNA sequences from environmental or community samples. This is a rapidly progressing field and application ideas that seemed outlandish a few years ago are now routine and familiar. Metagenomics’ scope is broad and includes the analysis of a diverse set of samples such as environmental or clinical samples. Human tissues are in essence metagenomic samples due to the presence of microorganisms, such as bacteria, viruses and fungi, in both healthy and diseased individuals. Deep sequencing of clinical samples is now an established tool for pathogen detection, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, particularly for viruses. The research presented in this thesis focuses on using Bayesian mixture model techniques to produce taxonomic profiles for metagenomic data. A novel Bayesian mixture model framework for resolving complex metagenomic mixtures is introduced, called metaMix. The use of parallel Markov chain Monte Carlo (MCMC) for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture. The improved accuracy of metaMix compared to relevant methods is demonstrated, particularly for profiling complex communities consisting of several related species. metaMix was designed specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection. However, the principles are generally applicable to all types of metagenomic mixtures. metaMix is implemented as a user-friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix

    Novel Human Embryonic Stem Cell Regulators Identified by Conserved and Distinct CpG Island Methylation State

    Human embryonic stem cells (hESCs) undergo epigenetic changes in vitro which may compromise function, so an epigenetic pluripotency "signature" would be invaluable for line validation. We assessed Cytosine-phosphate-Guanine Island (CGI) methylation in hESCs by genomic DNA hybridisation to a CGI array, and saw substantial variation in CGI methylation between lines. Comparison of hESC CGI methylation profiles to corresponding somatic tissue data and hESC mRNA expression profiles identified a conserved hESC-specific methylation pattern associated with expressed genes. Transcriptional repressors and activators were over-represented amongst genes whose associated CGIs were methylated or unmethylated specifically in hESCs, respectively. Knockdown of candidate transcriptional regulators (HMGA1, GLIS2, PFDN5) induced differentiation in hESCs, whereas ectopic expression in fibroblasts modulated iPSC colony formation. Chromatin immunoprecipitation confirmed interaction between the candidates and the core pluripotency transcription factor network. We thus identify novel pluripotency genes on the basis of a conserved and distinct epigenetic configuration in human stem cells

    Mixed cytomegalovirus genotypes in HIV-positive mothers show compartmentalization and distinct patterns of transmission to infants.

    Cytomegalovirus (CMV) is the commonest cause of congenital infection and particularly so among infants born to HIV-infected women. Studies of congenital CMV infection (cCMVi) pathogenesis are complicated by the presence of multiple infecting maternal CMV strains, especially in HIV-positive women, and the large, recombinant CMV genome. Using newly developed tools to reconstruct CMV haplotypes, we demonstrate anatomic CMV compartmentalization in five HIV-infected mothers and identify the possibility of congenitally transmitted genotypes in three of their infants. A single CMV strain was transmitted in each congenitally infected case, and all were closely related to those that predominate in the cognate maternal cervix. Compared to non-transmitted strains, these congenitally transmitted CMV strains showed statistically significant similarities in 19 genes associated with tissue tropism and immunomodulation. In all infants, incident superinfections with distinct strains from breast milk were captured during follow-up. The results represent potentially important new insights into the virologic determinants of early CMV infection

    Use of Whole-genome Sequencing of Adenovirus in Immunocompromised Paediatric Patients to Identify Nosocomial Transmission and Mixed-genotype Infection

    Background: Adenoviruses are significant pathogens for the immunocompromised, arising from primary infection or reinfection. Serotyping is insufficient to support nosocomial transmission investigations. We investigate whether whole-genome sequencing (WGS) provides clinically relevant information on transmission among patients in a paediatric tertiary hospital. Methods: We developed a target-enriched adenovirus WGS technique for clinical samples and retrospectively sequenced 107 adenovirus-positive residual diagnostic samples, including viraemias (>5×10^4 copies/ml), from 37 patients collected January 2011 - March 2016. WGS was used to determine genotype and for phylogenetic analysis. Results: Adenovirus sequences were recovered from 105/107 samples. Full genome sequences were recovered from all 20 non-species C samples and from 36/85 species C viruses, with partial genome sequences recovered from the rest. Whole genome phylogenetic analysis suggested linkage of three genotype A31 cases and uncovered an unsuspected epidemiological link to an A31 infection first detected on the same ward four years earlier. In nine samples from one patient who died we identified a mixed genotype adenovirus infection. Conclusions: Adenovirus WGS from clinical samples is possible and useful for genotyping and molecular epidemiology. WGS identified likely nosocomial transmission with greater resolution than conventional genotyping, and distinguished between adenovirus disease due to single or multiple genotypes.

    Epstein-Barr virus (EBV) deletions as biomarkers of response to treatment of chronic active EBV

    Chronic active Epstein–Barr virus (CAEBV) disease is a rare condition characterised by persistent EBV infection in previously healthy individuals. Defective EBV genomes were found in East Asian patients with CAEBV. In the present study, we sequenced 14 blood EBV samples from three UK patients with CAEBV, comparing the results with saliva CAEBV samples and other conditions. In CAEBV, we observed EBV deletions in blood, some of which may disrupt viral replication, but not in saliva. Deletions were lost over time after successful treatment. These findings are compatible with CAEBV being associated with the evolution and persistence of EBV+ haematological clones that are lost on successful treatment.

    Deep sequencing reveals persistence of cell-associated mumps vaccine virus in chronic encephalitis.

    Routine childhood vaccination against measles, mumps and rubella has virtually abolished virus-related morbidity and mortality. Notwithstanding this, we describe here devastating neurological complications associated with the detection of live-attenuated mumps virus Jeryl Lynn (MuV(JL5)) in the brain of a child who had undergone successful allogeneic transplantation for severe combined immunodeficiency (SCID). This is the first confirmed report of MuV(JL5) associated with chronic encephalitis and highlights the need to exclude immunodeficient individuals from immunisation with live-attenuated vaccines. The diagnosis was only possible by deep sequencing of the brain biopsy. Sequence comparison of the vaccine batch to the MuV(JL5) isolated from brain identified biased hypermutation, particularly in the matrix gene, similar to those found in measles from cases of SSPE. The findings provide unique insights into the pathogenesis of paramyxovirus brain infections

    Bioinformatics challenges and potentialities in studying extreme environments

    Cold environments are populated by organisms able to counteract the deleterious effects of low temperature by diverse adaptive strategies, including the production of ice-binding proteins (IBPs) that inhibit the growth of ice crystals inside and outside cells. We describe the properties of such a protein (EfcIBP) identified in the metagenome of an Antarctic biological consortium composed of the ciliate Euplotes focardii and psychrophilic non-cultured bacteria. Recombinant EfcIBP can resist freezing without any conformational damage and is moderately heat stable, with a midpoint temperature of 66.4 °C. Tested for its effects on ice, EfcIBP shows an unusual combination of properties not reported in other bacterial IBPs. First, it is one of the best-performing IBPs described to date in the inhibition of ice recrystallization, with effective concentrations in the nanomolar range. Moreover, EfcIBP has thermal hysteresis activity (0.53 °C at 50 μM) and it can stop a crystal from growing when held at a constant temperature within the thermal hysteresis gap. EfcIBP protects purified proteins and bacterial cells from freezing damage when exposed to challenging temperatures. EfcIBP also possesses a potential N-terminal signal sequence for protein transport and a DUF3494 domain that is common to secreted IBPs. These features lead us to hypothesize that the protein is either anchored at the outer cell surface or concentrated around cells to provide survival advantage to the whole cell consortium.

    Astrovirus VA1/HMO-C: An Increasingly Recognized Neurotropic Pathogen in Immunocompromised Patients.

    An 18-month-old boy developed encephalopathy, for which extensive investigation failed to identify an etiology, 6 weeks after stem cell transplant. To exclude a potential infectious cause, we performed high-throughput RNA sequencing on brain biopsy