Metatranscriptomic profiles of Eastern subterranean termites, Reticulitermes flavipes (Kollar) fed on second generation feedstocks
Background: Second generation lignocellulosic feedstocks are being considered as an alternative to first generation biofuels that are derived from grain starches and sugars. However, the current pre-treatment methods for second generation biofuel production are inefficient and expensive due to the recalcitrant nature of lignocellulose. In this study, we used the lower termite Reticulitermes flavipes (Kollar), as a model to identify potential pretreatment genes/enzymes specifically adapted for use against agricultural feedstocks. Results: Metatranscriptomic profiling was performed on worker termite guts after feeding on corn stover (CS), soybean residue (SR), or 98% pure cellulose (paper) to identify (i) microbial community, (ii) pathway level and (iii) gene-level responses. Microbial community profiles after CS and SR feeding were different from the paper feeding profile, and protist symbiont abundance decreased significantly in termites feeding on SR and CS relative to paper. Functional profiles after CS feeding were similar to paper and SR; whereas paper and SR showed different profiles. Amino acid and carbohydrate metabolism pathways were downregulated in termites feeding on SR relative to paper and CS. Gene expression analyses showed more significant down regulation of genes after SR feeding relative to paper and CS. Stereotypical lignocellulase genes/enzymes were not differentially expressed, but rather were among the most abundant/constitutively-expressed genes. Conclusions: These results suggest that the effect of CS and SR feeding on termite gut lignocellulase composition is minimal and thus, the most abundantly expressed enzymes appear to encode the best candidate catalysts for use in saccharification of these and related second-generation feedstocks. 
Further, based on these findings, we hypothesize that the most abundantly expressed lignocellulases, rather than those that are differentially expressed, have the best potential as pretreatment enzymes for CS and SR feedstocks. © 2015 Rajarapu et al
Viral forensic genomics reveals the relatedness of classic herpes simplex virus strains KOS, KOS63, and KOS79
Herpes simplex virus 1 (HSV-1) is a widespread global pathogen, of which the strain KOS is one of the most extensively studied. Previous sequence studies revealed that KOS does not cluster with other strains of North American geographic origin, but instead clusters with Asian strains. We sequenced a historical isolate of the original KOS strain, called KOS63, along with a separately isolated strain attributed to the same source individual, termed KOS79. Genomic analyses revealed that KOS63 closely resembled other recently sequenced isolates of KOS and was of Asian origin, but that KOS79 was a genetically unrelated strain that clustered in genetic distance analyses with HSV-1 strains of North American/European origin. These data suggest that the human source of KOS63 and KOS79 could have been infected with two genetically unrelated strains of disparate geographic origins. A PCR RFLP test was developed for rapid identification of these strains
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
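As a rough illustration of what the quoted accuracy implies, the short calculation below multiplies the reported sequence length by the reported error rate. It is only a back-of-the-envelope sketch: both inputs are the approximate figures from the abstract, and the variable names are illustrative.

```python
# Figures taken from the abstract; both are approximate.
genome_bases = 2.85e9      # nucleotides in Build 35
error_rate = 1 / 100_000   # about 1 error event per 100,000 bases

# Expected number of residual error events across the finished sequence.
expected_errors = genome_bases * error_rate
print(f"{expected_errors:,.0f} expected error events")
```

On these figures the finished sequence would still carry on the order of tens of thousands of residual errors, spread across 2.85 billion bases.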
Bioinformatics of the Hessian fly
The Hessian fly, Mayetiola destructor (Say), is a major pest affecting wheat-growing regions worldwide, and is annually responsible for significant financial loss. All deleterious effects of the insect on wheat are due to a biological reprogramming of the infested plant that allows the insect's survival. Artificially disrupting this interaction would protect wheat from pest damage and provide a new form of resistance to combat the diminishing effectiveness of currently deployed resistance (R) genes. RNA interference (RNAi) is a useful reverse genetics tool for studying such insect virulence pathways, but requires a systemic phenotype, which is not found in all species. In an effort to correlate the systemic RNAi phenotype with a genetic basis, we have aggregated and compared RNAi related genes across five species. While most of the micro RNA (miRNA) and transcriptional silencing pathway genes were highly conserved across species, the small interfering RNA (siRNA) pathway genes showed increased relative variability. In particular, the Piwi/Argonaute/Zwille (PAZ) domain of Dcr2 had the least amount of sequence similarity of any domain between species surveyed, with a trend of increased conservation by those species with amenable systemic RNAi. Furthermore, the Dicer dsRNA-binding fold domain of Dcr2 was absent from M. destructor, possibly indicating functional incompetence. M. destructor also had the highest degree of RNAi gene expansion of all insect species surveyed, as measured by number of gene duplications observed, suggesting that such pathway expansions do not correlate with efficacy of systemic RNAi. Pathways peripheral to dsRNA uptake and RNAi, including endocytosis, phagocytosis, tunneling nanotubes (TNTs), microvesicles (MVs), and secretory exosomes, were uniformly intact across all insects considered, implying their conservation. Finally, over 100 novel miRNA precursors were identified in the M. destructor genome through highly stringent computational means, which may serve as a reference for future M. destructor small RNAseq research. Taken together, the annotations and associated trends reported here may help describe the disparity of systemic RNAi phenotypes between insect species
Rapid Genome Assembly and Comparison Decode Intrastrain Variation in Human Alphaherpesviruses
Herpes simplex virus (HSV) is a widespread pathogen that causes epithelial lesions with recurrent disease that manifests over a lifetime. The lifelong aspect of infection results from latent viral infection of neurons, a reservoir from which the virus reactivates periodically. Recent work has demonstrated the breadth of genetic variation in globally distributed HSV strains. However, the amount of variation or capacity for mutation within one strain has not been well studied. Here we developed and applied a streamlined new approach for assembly and comparison of large DNA viral genomes such as HSV-1. This viral genome assembly (VirGA) workflow incorporates a combination of de novo assembly, alignment, and annotation strategies to automate the generation of draft genomes for large viruses. We applied this approach to quantify the amount of variation between clonal derivatives of a common parental virus stock. In addition, we examined the genetic basis for syncytial plaque phenotypes displayed by a subset of these strains. In each of the syncytial strains, we found an identical DNA change, affecting one residue in the gB (UL27) fusion protein. Since these identical mutations could have appeared after extensive in vitro passaging, we applied the VirGA sequencing and comparison approach to two clinical HSV-1 strains isolated from the same patient. One of these strains was syncytial upon first culturing; its sequence revealed the same gB mutation. These data provide insight into the extent and origin of genome-wide intrastrain HSV-1 variation and present useful methods for expansion to in vivo patient infection studies
A Personalized Clinical-Decision Tool to Improve the Diagnostic Accuracy of Myelodysplastic Syndromes
Background
While histo- and cytomorphological examinations are central to the diagnosis of myelodysplastic syndromes (MDS), significant inter-observer variability exists. The diagnosis can be challenging in pancytopenic patients (pts) without evidence of dysplasia and is contingent on observer expertise.
We developed and externally validated a geno-clinical model that uses mutational data and peripheral blood counts/clinical variables to distinguish MDS from other myeloid malignancies.
Methods
Clinical and genomic data, including commercially available next-generation sequencing panels, were obtained for patients (pts) treated at the Cleveland Clinic (CC; 652 pts), Munich Leukemia Laboratory (MLL; 1509 pts), and the University of Pavia in Italy (UP; 536 pts). All patients carried a diagnosis of MDS, chronic myelomonocytic leukemia (CMML), MDS/myeloproliferative neoplasm overlap (MDS/MPN), myeloproliferative neoplasm (MPN; either polycythemia vera, essential thrombocythemia, or myelofibrosis), clonal cytopenia of undetermined significance (CCUS), or idiopathic cytopenia of undetermined significance (ICUS). All diagnoses were established with bone marrow aspiration and according to World Health Organization 2017 criteria.
The training cohort included data from CC and UP and was randomly divided into learner (80%) and test (20%) cohorts. The final model was independently validated in the MLL cohort.
A machine learning algorithm was used to build the model; multiple extraction algorithms were used to extract genomic/clinical variables on both the cohort and individual levels. Performance was evaluated according to the area under the curve of the receiver operating characteristic (ROC-AUC) and accuracy matrices.
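The random 80/20 division of the training cohort described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how the split was drawn (for example, whether it was stratified by diagnosis), and the function name, patient identifiers, and seed are hypothetical.

```python
import random

def split_learner_test(patient_ids, test_fraction=0.2, seed=0):
    """Randomly partition patient IDs into learner and test lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    # The first n_test shuffled IDs form the test cohort, the rest the learner cohort.
    return ids[n_test:], ids[:n_test]

# Cohort sizes (652 CC + 536 UP) come from the abstract; the IDs are made up.
training_ids = [f"CC-{i}" for i in range(652)] + [f"UP-{i}" for i in range(536)]
learner, test = split_learner_test(training_ids)
print(len(learner), len(test))
```

Keeping the MLL patients out of this split entirely is what makes the later validation on that cohort independent.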
Results
Among the 2697 pts included from all sites, the median age was 70 years [36 - 86]. Median hemoglobin (Hb) was 10.4 g/dl [6.9 - 15.7], median platelet count (PLT) was 132 k/dL [14 - 722], median WBC count was 5.3 k/dL [1.4 - 49.9], median ANC was 2.8 k/dL [0.3 - 27.7], median monocyte count was 0.3 k/dL [0 - 9.9], median lymphocyte count (ALC) was 1.1 k/dL [0.1 - 5.4], and median peripheral blast percentage was 0% [0 - 8]. The most commonly mutated genes in all patients were (list top 5 genes); among pts with MDS, they were SF3B1 (27%), TET2 (25%), ASXL1 (19%), SRSF2 (16%), and DNMT3A (11%); among patients with MDS-MPN/CMML, TET2 (46%), ASXL1 (34%), SRSF2 (29%), RUNX1 (13%), and CBL (12%); among patients with MPNs, JAK2 (64%), ASXL1 (27%), TET2 (14%), DNMT3A (8%), and U2AF1 (7%); and among patients with CCUS, TET2 (41%), DNMT3A (27%), ASXL1 (19%), SRSF2 (17%), and ZRSR2 (10%).
The most important features for model predictions (ranked from the most to the least important) included: number of mutations detected/sample, peripheral blast percentage, AMC, JAK2 status, Hb, basophil count, age, eosinophil count, ALC, WBC, EZH2 mutation status, ANC, mutation status of KRAS and SF3B1, platelets, and gender. The final model achieved an average AUROC of 0.95 (95% CI 0.93-0.96) when applied to the test cohort and 0.93 (95% CI 0.91 - 0.94) when it was applied to the MLL cohort.
The model also provides the top differential diagnoses and individual-level explanations of how features influence a putative diagnosis (Figure 1b).
Conclusions
We developed and externally validated a highly accurate and interpretable model that can distinguish MDS from other myeloid malignancies using clinical and mutational data from a large international cohort. The model can provide personalized interpretations of its outcome and can aid physicians and hematopathologists in recognizing MDS with high accuracy when encountering pts with pancytopenia and with a suspected diagnosis of MDS.
Disclosures
Sekeres: Pfizer: Consultancy, Membership on an entity's Board of Directors or advisory committees; Takeda/Millennium: Consultancy, Membership on an entity's Board of Directors or advisory committees; BMS: Consultancy, Membership on an entity's Board of Directors or advisory committees. Mukherjee: Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Partnership for Health Analytic Research, LLC (PHAR, LLC): Honoraria; Bristol Myers Squibb: Honoraria; Celgene: Consultancy, Honoraria, Research Funding; Aplastic Anemia and MDS International Foundation: Honoraria; Celgene/Acceleron: Membership on an entity's Board of Directors or advisory committees; EUSA Pharma: Consultancy. Gerds: Sierra Oncology: Research Funding; Imago Biosciences: Research Funding; Apexx Oncology: Consultancy; Celgene: Consultancy, Research Funding; Incyte Corporation: Consultancy, Research Funding; Roche/Genentech: Research Funding; CTI Biopharma: Consultancy, Research Funding; AstraZeneca/MedImmune: Consultancy; Gilead Sciences: Research Funding; Pfizer: Research Funding. Maciejewski: Alexion, BMS: Speakers Bureau; Novartis, Roche: Consultancy, Honoraria. Nazha: Jazz: Research Funding; Incyte: Speakers Bureau; Novartis: Speakers Bureau; MEI: Other: Data monitoring Committee
Geno-Clinical Model for the Diagnosis of Bone Marrow Myeloid Neoplasms
Background
Myelodysplastic syndromes (MDS) and other myeloid neoplasms are mainly diagnosed based on morphological changes in the bone marrow. Diagnosis can be challenging in patients (pts) with pancytopenia with minimal dysplasia, and is subject to inter-observer variability, with up to 40% disagreement in diagnosis (Zhang, ASH 2018). Somatic mutations can be identified in all myeloid neoplasms, but no gene or set of genes is diagnostic for each disease phenotype.
We developed a geno-clinical model that uses mutational data, peripheral blood values, and clinical variables to distinguish among several bone marrow disorders that include: MDS, idiopathic cytopenia of undetermined significance (ICUS), clonal cytopenia of undetermined significance (CCUS), MDS/myeloproliferative neoplasm (MPN) overlaps including chronic myelomonocytic leukemia (CMML), and MPNs such as polycythemia vera (PV), essential thrombocythemia (ET), and myelofibrosis (PMF).
Methods
We combined genomic and clinical data from 2471 pts treated at our institution (684) and the Munich Leukemia Laboratory (1787). Pts were diagnosed with MDS, ICUS, CCUS, CMML, MDS/MPN, PV, ET, and PMF according to 2016 WHO criteria. Diagnoses were confirmed by independent hematopathologists not associated with the study. A panel of 60 genes commonly mutated in myeloid malignancies was included. The cohort was randomly divided into learner (80%) and validation (20%) cohorts. Machine learning algorithms were applied to predict the phenotype. Feature extraction algorithms were used to extract genomic/clinical variables that impacted the algorithm decision and to visualize the impact of each variable on phenotype. Prediction performance was evaluated according to the area under the curve of the receiver operating characteristic (ROC-AUC).
Results
Of 2471 pts, 1306 had MDS, 223 had ICUS, 107 had CCUS, 478 had CMML, 89 had MDS/MPN, 79 had PV, 90 had ET, and 99 had PMF. The median age for the entire cohort was 71 years (range, 9-102); 38% were female. The median white blood cell count (WBC) was 3.2x10^9/L (range, 0.00-179), absolute monocyte count (AMC) 0.21x10^9/L (range, 0-96), absolute lymphocyte count (ALC) 0.88x10^9/L (range, 0-357), absolute neutrophil count (ANC) 0.60x10^9/L (range, 0-170), and hemoglobin (Hgb) 10.50 g/dL (range, 3.9-24.0).
The most commonly mutated genes in all pts were: TET2 (28%), ASXL1 (23%), SF3B1 (15%). In MDS, they were: TET2 (26%), SF3B1 (24%), ASXL1 (21%). In CCUS: TET2 (46%), SRSF2 (24%), ASXL1 (23%). In CMML: TET2 (51%), ASXL1 (43%), SRSF2 (25%). In MDS/MPN: SF3B1 (39%), JAK2 (37%), TET2 (20%). In PV: JAK2 (94%), TET2 (22%), DNMT3A (8%). In ET: JAK2 (44%), TET2 (13%), DNMT3A (8%). In PMF: JAK2 (67%), ASXL1 (43%), SRSF2 (17%).
A total of 71 genomic/clinical variables were evaluated. Feature extraction algorithms were used to identify the variables with the most significant impacts on prediction. The top variables are shown in Figure 1. Overall, the most important variables were: age, AMC, ANC, Hgb, Plt, ALC, total number of mutations, JAK2, ASXL1, TET2, U2AF1, SRSF2, SF3B1, BCOR, EZH2, and DNMT3A. The top variables differed for each disease (see Figure 1).
When the model was applied to the validation cohort, AUC performance was as follows (a perfect predictor has an AUC of 1, and an AUC ≥ 0.90 is generally considered excellent): MDS: 0.95 +/- 0.04, ICUS: 0.96 +/- 0.05, CCUS: 0.95 +/- 0.05, CMML: 0.95 +/- 0.05, MDS/MPN: 0.95 +/- 0.05, PV: 0.95 +/- 0.05, ET: 0.96 +/- 0.05, PMF: 0.95 +/- 0.05. When the analysis was restricted to MDS, ICUS, and CCUS, the AUC remained high, 0.95 +/- 0.4. The model can also provide personalized explanations of the variables supporting the prediction and the impact of each variable on the outcome (Figure).
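Per-diagnosis AUC values like those above are commonly obtained by scoring each class against all the others. The sketch below shows one way to do that, assuming a one-vs-rest scheme (the abstract does not state how the per-class AUCs were computed). The function names are hypothetical, and the toy probabilities and labels are invented for illustration; they are not study data.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(prob_rows, true_labels, classes):
    """prob_rows[i][c] is the predicted probability of class c for patient i."""
    return {
        c: roc_auc([row[c] for row in prob_rows], [t == c for t in true_labels])
        for c in classes
    }

# Toy example with three of the diagnoses named in the abstract.
classes = ["MDS", "CCUS", "PV"]
true_labels = ["MDS", "MDS", "CCUS", "PV", "CCUS"]
prob_rows = [
    {"MDS": 0.8, "CCUS": 0.1, "PV": 0.1},
    {"MDS": 0.4, "CCUS": 0.5, "PV": 0.1},
    {"MDS": 0.3, "CCUS": 0.6, "PV": 0.1},
    {"MDS": 0.1, "CCUS": 0.2, "PV": 0.7},
    {"MDS": 0.5, "CCUS": 0.4, "PV": 0.1},
]
aucs = one_vs_rest_auc(prob_rows, true_labels, classes)
print(aucs)
```

The pairwise-comparison form is quadratic in cohort size, which is fine for a sketch; a rank-based implementation would be preferable for thousands of patients.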
Conclusions
We propose a new approach using interpretable, individualized modeling to predict myeloid neoplasm phenotypes based on genomic and clinical data without bone marrow biopsy data. This approach can aid clinicians and hematopathologists when encountering pts with cytopenias and suspicion for these disorders. The model also provides feature attributions that allow for quantitative understanding of the complex interplay among genotypes, clinical variables, and phenotypes. A web application to facilitate the translation of this model into the clinic is under development and will be presented at the meeting.
Figure 1
Disclosures
Meggendorfer: MLL Munich Leukemia Laboratory: Employment. Sekeres: Syros: Membership on an entity's Board of Directors or advisory committees; Celgene: Membership on an entity's Board of Directors or advisory committees; Millennium: Membership on an entity's Board of Directors or advisory committees. Walter: MLL Munich Leukemia Laboratory: Employment. Hutter: MLL Munich Leukemia Laboratory: Employment. Savona: Incyte Corporation: Membership on an entity's Board of Directors or advisory committees, Research Funding; Karyopharm Therapeutics: Consultancy, Equity Ownership, Membership on an entity's Board of Directors or advisory committees; Selvita: Membership on an entity's Board of Directors or advisory committees; Sunesis: Research Funding; TG Therapeutics: Membership on an entity's Board of Directors or advisory committees, Research Funding; Takeda: Membership on an entity's Board of Directors or advisory committees, Research Funding; AbbVie: Membership on an entity's Board of Directors or advisory committees; Boehringer Ingelheim: Patents & Royalties; Celgene Corporation: Membership on an entity's Board of Directors or advisory committees. Gerds: Incyte: Consultancy, Research Funding; Roche: Research Funding; Imago Biosciences: Research Funding; CTI Biopharma: Consultancy, Research Funding; Pfizer: Consultancy; Celgene Corporation: Consultancy, Research Funding; Sierra Oncology: Research Funding. Mukherjee: Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Projects in Knowledge: Honoraria; Celgene Corporation: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Partnership for Health Analytic Research, LLC (PHAR, LLC): Consultancy; McGraw Hill Hematology Oncology Board Review: Other: Editor; Pfizer: Honoraria; Bristol-Myers Squibb: Speakers Bureau; Takeda: Membership on an entity's Board of Directors or advisory committees. Komrokji: JAZZ: Speakers Bureau; Agios: Consultancy; Incyte: Consultancy; DSI: Consultancy; Pfizer: Consultancy; Celgene: Consultancy; JAZZ: Consultancy; Novartis: Speakers Bureau. Haferlach: MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Maciejewski: Alexion: Consultancy; Novartis: Consultancy. Nazha: Tolero, Karyopharma: Honoraria; MEI: Other: Data monitoring Committee; Novartis: Speakers Bureau; Jazz Pharmaceutical: Research Funding; Incyte: Speakers Bureau; Daiichi Sankyo: Consultancy; Abbvie: Consultancy
A geno-clinical decision model for the diagnosis of myelodysplastic syndromes
Abstract
The differential diagnosis of myeloid malignancies is challenging and subject to interobserver variability. We used clinical and next-generation sequencing (NGS) data to develop a machine learning model for the diagnosis of myeloid malignancies independent of bone marrow biopsy data based on a 3-institution, international cohort of patients. The model achieves high performance, with model interpretations indicating that it relies on factors similar to those used by clinicians. In addition, we describe associations between NGS findings and clinically important phenotypes and introduce the use of machine learning algorithms to elucidate clinicogenomic relationships
Genetic Determinants for Enzymatic Digestion of Lignocellulosic Biomass Are Independent of Those for Lignin Abundance in a Maize Recombinant Inbred Population
Biotechnological approaches to reduce or modify lignin in biomass crops are predicated on the assumption that it is the principal determinant of the recalcitrance of biomass to enzymatic digestion for biofuels production. We defined quantitative trait loci (QTL) in the Intermated B73 × Mo17 recombinant inbred maize (Zea mays) population using pyrolysis molecular-beam mass spectrometry to establish stem lignin content and an enzymatic hydrolysis assay to measure glucose and xylose yield. Among five multiyear QTL for lignin abundance, two for 4-vinylphenol abundance, and four for glucose and/or xylose yield, not a single QTL for aromatic abundance and sugar yield was shared. A genome-wide association study for lignin abundance and sugar yield of the 282-member maize association panel provided candidate genes in the 11 QTL of the B73 and Mo17 parents but showed that many other alleles impacting these traits exist among this broader pool of maize genetic diversity. B73 and Mo17 genotypes exhibited large differences in gene expression in developing stem tissues independent of allelic variation. Combining these complementary genetic approaches provides a narrowed list of candidate genes. A cluster of SCARECROW-LIKE9 and SCARECROW-LIKE14 transcription factor genes provides exceptionally strong candidate genes emerging from the genome-wide association study. In addition to these and genes associated with cell wall metabolism, candidates include several other transcription factors associated with vascularization and fiber formation and components of cellular signaling pathways. These results provide new insights and strategies beyond the modification of lignin to enhance yields of biofuels from genetically modified biomass