A Machine Learning Model of Response to Hypomethylating Agents in Myelodysplastic Syndromes
Hypomethylating agents (HMAs) prolong survival and improve cytopenias in individuals with higher-risk myelodysplastic syndrome (MDS). Only 30-40% of patients, however, respond to HMAs, and responses may not occur for more than 6 months after HMA initiation. We developed a model to more rapidly assess HMA response by analyzing early changes in patients’ blood counts. Three institutions’ data were used to develop a model that assessed patients’ response to therapy 90 days after initiation using serial blood counts. The model was developed with a training cohort of 424 patients from 2 institutions and validated on an independent cohort of 90 patients. The final model achieved an area under the receiver operating characteristic curve (AUROC) of 0.79 in the train/test group and 0.84 in the validation group. The model provides cohort-wide and individual-level explanations for model predictions, and model certainty can be interrogated to gauge the reliability of a given prediction.
Personalized Transcriptomic Analyses Identify Unique Signatures That Correlate with Genomic Subtypes in Acute Myeloid Leukemia (AML) Using Explainable Artificial Intelligence
Background
Multi-omic analysis can identify unique signatures that correlate with cancer subtypes. While clinically meaningful molecular subtypes of AML have been defined based on the status of single genes such as NPM1 and FLT3, such categories remain heterogeneous and further work is needed to characterize their genetic and transcriptomic diversity on a truly individualized basis. Further, patients (pts) with NPM1+/FLT3-ITD- AML have a better overall survival compared to pts with NPM1-/FLT3-ITD+, suggesting that these pts could have different transcriptomic signatures that impact phenotype, pathophysiology, and outcomes. Many current transcriptome analytic techniques use clustering analysis to aggregate samples and look at relationships on a cohort-wide basis to build transcriptomic signatures that correlate with phenotype or outcome. Such approaches can obscure the heterogeneity of gene expression among pts with the same signatures.
In this study, we took advantage of state-of-the-art machine learning algorithms to identify unique transcriptomic signatures that correlate with AML genomic phenotype.
Methods
Genomic (whole exome sequencing and targeted deep sequencing) and transcriptomic data from 451 AML pts included in the Beat AML study (publicly available data) were used to build transcriptomic signatures that are specific for AML pts with NPM1+/FLT3-ITD+ compared to NPM1+/FLT3-ITD- and NPM1-/FLT3-ITD-. We chose these AML phenotypes because they have been described extensively and they correlate with clinical outcomes.
Results
A total of 242 patients (54%) had NPM1-/FLT3-, 35 (8%) were NPM1+/FLT3-, and 47 (10%) were NPM1+/FLT3+.
Our algorithm identified 20 genes that are highly specific for NPM1/FLT3-ITD phenotype: HOXB-AS3, SCRN1, LMX1B, PCBD1, DNAJC15, HOXA3, NPTXq, RP11-1055B8, ABDH128, HOXB8, SOCS2, HOXB3, HOXB9, MIR503HG, FAM221B, NRP1, NDUFAF3, MEG3, CCDC136, and HIST1H2BC. Interestingly, several of those genes were overexpressed or underexpressed in specific phenotypes. For example, SCRN1, LMX1B, RP11-1055B8, ABDH128, HOXB8, MIR503HG, and NRP1 are only overexpressed or underexpressed in pts with NPM1-/FLT3-, while PCBD1, NDUFAF3, and FAM221B are overexpressed or underexpressed in pts with NPM1+/FLT3+. These genes affect several important pathways that regulate cell differentiation, proliferation, mitochondrial oxidative phosphorylation, histone modification, and lipid metabolism. All these genes had previously been reported as having altered expression in genomic studies of AML, confirming our approach's ability to identify biologically meaningful relationships. Further, our algorithm can provide a personalized explanation of overexpressed and underexpressed genes specific for a given patient, thus identifying targetable pathways for each pt. Figure 1 below shows three pts with the same genotype (NPM1+/FLT3-ITD+) who demonstrate different transcriptomic patterns of overexpression or underexpression that affect different biological pathways.
Conclusions
We describe the use of a state-of-the-art explainable machine learning approach to define transcriptomic signatures that are specific for individual pts. In addition to correctly distinguishing AML subtype based on specific transcriptomic signatures, our model was able to accurately identify upregulated and downregulated genes that affect several important biological pathways in AML and can summarize these pathways at an individual level. Such an approach can be used to provide personalized treatment options that can target the activated pathways at an individual level.
Disclosures
Mukherjee: Partnership for Health Analytic Research, LLC (PHAR, LLC): Honoraria; Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; EUSA Pharma: Consultancy; Celgene/Acceleron: Membership on an entity's Board of Directors or advisory committees; Bristol Myers Squib: Honoraria; Aplastic Anemia and MDS International Foundation: Honoraria; Celgene: Consultancy, Honoraria, Research Funding. Maciejewski:Alexion, BMS: Speakers Bureau; Novartis, Roche: Consultancy, Honoraria. Sekeres:BMS: Consultancy; Takeda/Millenium: Consultancy; Pfizer: Consultancy. Nazha:Jazz: Research Funding; Incyte: Speakers Bureau; Novartis: Speakers Bureau; MEI: Other: Data monitoring Committee
Multicenter Validation of a Personalized Model to Predict Hypomethylating Agent Response in Myelodysplastic Syndromes (MDS)
Background
While hypomethylating agents (HMAs) can improve cytopenias and even survival for MDS patients (pts), only 30-40% of pts respond to HMAs. Predicting response or resistance to therapy can improve pt outcomes, decrease cost and toxicities, and suggest alternative therapies when response is unlikely. No clinical or molecular model can reliably predict response or resistance to HMAs.
We developed and validated a model to provide personalized predictions of response or resistance to HMAs during 12 weeks of treatment by monitoring changes in blood counts during therapy.
Methods
MDS pts who were treated with HMAs (azacitidine or decitabine) at Cleveland Clinic (314 pts) or the Moffitt Cancer Center (100 pts) and had their CBCs with differential monitored every 1-2 weeks in the first 12 weeks of therapy comprised the training cohort. The final model was externally validated in 80 MDS pts treated with HMAs at Sunnybrook hospital. Responses were defined per 2006 IWG criteria, and pts with complete response (CR), marrow CR, partial response (PR), or hematologic improvement (HI) were considered responders.
Time series analysis (analysis of serial changes in blood count parameters) using machine learning technology was used to develop the model, analogous to the voice recognition algorithms behind assistants such as Apple's Siri and Amazon's Alexa, in which the sequence of words allows these algorithms to understand sentences. Changes in blood counts, and the patterns of these changes during HMA therapy, can similarly predict response or resistance to treatment. The area under the curve (AUC) was used to evaluate the performance of the final model. A feature importance algorithm was used to define the variables that most impacted the algorithm's decision for a given pt.
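As an illustration of the AUC metric named above, the following is a minimal sketch that computes the area under the ROC curve from the rank-sum (Mann-Whitney U) identity. It is not the authors' code; the function name auroc and the labels and scores below are hypothetical values chosen only for the example.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one responder and one non-responder")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical data: 1 = responder, 0 = non-responder; scores are predicted probabilities.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2]
print(auroc(labels, scores))
```

In this toy example, 8 of the 9 responder/non-responder pairs are ranked correctly, so the AUC is 8/9.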
Results
For 494 included pts from all cohorts, the median age was 72 years (range: 40-94), 145 (29%) were female. Pts' IPSS-R scores at the time of treatment were: very low 4%; low 21%; intermediate 24%; high 21%; and very high 22%. Responses included: 56 (11%) complete remission (CR), 17 (3%) marrow CR, 6 (3%) partial remission (PR), and 143 (29%) hematologic improvement (HI).
When trained exclusively on serial CBC values (adding other clinical or molecular values did not improve the model's performance), the model achieved an AUC of 0.82 in a cross-validated train/test schema and a similar AUC of 0.78 when it was applied to the Sunnybrook cohort.
Feature importance algorithms identified improvements in hemoglobin from baseline between days 21-30 of therapy, improvement in platelets between days 51 and 60, changes in monocyte % between days 41 and 50, and changes in MCV and RDW between days 31 and 60 as predictors of response, Figure 1a. The model also can provide a personalized heatmap that summarizes the variables that impacted the response or resistance to HMAs and are specific for a given pt, Figure 1b, 1c.
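The day windows mentioned above (for example, days 21-30 or 51-60) suggest features of the form "change from baseline within a window". The sketch below shows one way such features could be derived from serial blood counts. It is an assumption for illustration only: the abstract does not describe its feature engineering, and the function name, the 10-day window width, and the hemoglobin values are hypothetical.

```python
from statistics import mean


def windowed_change_from_baseline(days, values, window=10, horizon=90):
    """Mean change from the day-0 (baseline) value in consecutive day windows.

    Returns a dict mapping (start_day, end_day) to the mean change,
    or None when no blood draw fell inside that window.
    """
    baseline = values[days.index(0)]  # assumes a draw exists on day 0
    features = {}
    for start in range(1, horizon + 1, window):
        end = start + window - 1
        changes = [v - baseline for d, v in zip(days, values) if start <= d <= end]
        features[(start, end)] = mean(changes) if changes else None
    return features


# Hypothetical hemoglobin values (g/dL) for a single patient.
days = [0, 7, 14, 21, 28, 42, 56]
hgb = [8.0, 8.5, 7.5, 9.0, 10.0, 10.5, 11.0]
features = windowed_change_from_baseline(days, hgb)
print(features[(21, 30)])  # mean of +1.0 and +2.0
```

Windows with no draw are kept as None rather than imputed, so a downstream model can decide how to handle missing visits.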
Conclusions
We developed and externally validated a personalized prediction model that uses changes in blood counts during the initial 3 cycles of HMA therapy and can predict response or resistance to treatment with high accuracy. The model can provide personalized explanations of the variables that inform a given outcome. It can be used to develop novel clinical trial designs in which pts who are predicted not to respond within 3 cycles of HMA therapy can receive an investigational agent in addition to continuing HMA or change treatment entirely, whereas patients who are predicted to respond continue to receive HMA monotherapy.
Disclosures
Sallman: Agios, Bristol Myers Squibb, Celyad Oncology, Incyte, Intellia Therapeutics, Kite Pharma, Novartis, Syndax: Consultancy; Celgene, Jazz Pharma: Research Funding. Buckstein:Celgene: Research Funding; Takeda: Research Funding; Celgene: Honoraria; Astex: Honoraria; Novartis: Honoraria. Brunner:Forty Seven, Inc: Consultancy; Biogen: Consultancy; Acceleron Pharma Inc.: Consultancy; Jazz Pharma: Consultancy; Novartis: Consultancy, Research Funding; Takeda: Consultancy, Research Funding; Xcenda: Consultancy; GSK: Research Funding; Janssen: Research Funding; Astra Zeneca: Research Funding; Celgene/BMS: Consultancy, Research Funding. Mukherjee:Celgene/Acceleron: Membership on an entity's Board of Directors or advisory committees; Aplastic Anemia and MDS International Foundation: Honoraria; Celgene: Consultancy, Honoraria, Research Funding; Bristol Myers Squib: Honoraria; Partnership for Health Analytic Research, LLC (PHAR, LLC): Honoraria; Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; EUSA Pharma: Consultancy. Komrokji:Abbvie: Honoraria; Agios: Speakers Bureau; BMS: Honoraria, Speakers Bureau; Jazz: Honoraria, Speakers Bureau; Incyte: Honoraria; Acceleron: Honoraria; Geron: Honoraria; Novartis: Honoraria. Maciejewski:Novartis, Roche: Consultancy, Honoraria; Alexion, BMS: Speakers Bureau. Sekeres:BMS: Consultancy; Pfizer: Consultancy; Takeda/Millenium: Consultancy. Nazha:Jazz: Research Funding; Incyte: Speakers Bureau; Novartis: Speakers Bureau; MEI: Other: Data monitoring Committee
Predicting Response to Hypomethylating Agents in Patients with Myelodysplastic Syndromes (MDS) Using Artificial Intelligence (AI)
Introduction
While the hypomethylating agents (HMAs) azacitidine (AZA) and decitabine (DAC) improve cytopenias and prolong survival in MDS patients (pts), response is not guaranteed. Timely identification of non-responders could prevent prolonged exposure to ineffective therapy, thereby reducing toxicities and costs. Currently no widely accepted clinical or genomic models exist to predict response or resistance to HMAs.
We developed a clinical model to predict response or resistance to HMA after 90 days of initiating therapy based on changes in blood counts using time series analysis technology similar to the kind used in Apple's Siri or Google Assistant. In the setting of voice recognition, the sequence and context of words determines the meaning of a sentence; similarly, we hypothesized that the pattern of changes in MDS pts' blood counts would predict response or resistance early during treatment.
Methods
We screened a cohort of 107 pts with MDS (per 2016 WHO criteria) who received HMAs at our institution between February 2005 and July 2013 and had regular CBCs drawn during treatment. Mutations from a panel of 60 genes commonly mutated in myeloid malignancy were included. Responses were assessed after 6 months of therapy per International Working Group (IWG) 2006 criteria. Pts were divided randomly into training (80%) and validation (20%) cohorts. To address the potential for bias due to a small sample size, an oversampling algorithm was used to cluster similar pts based on their CBC data, Revised International Prognostic Scoring System (IPSS-R) score, and % bone marrow blasts at the time of diagnosis. CBC data from the first 90 days of treatment were fed into a deep neural network (recurrent neural network) and decision tree algorithms, which were trained to predict whether pts would achieve a response (defined as complete remission (CR), partial remission (PR), or hematologic improvement (HI)). Area under the curve (AUC) was used to assess model performance. Important features that impact the algorithm's predictions were extracted and plotted.
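The random 80%/20% division described above can be sketched as follows. This is a minimal illustration, not the study's procedure: the seed, the rounding rule, and the patient identifiers are assumptions, and the abstract does not report the exact cohort sizes that resulted.

```python
import random


def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle patient IDs reproducibly and split them into training and validation lists."""
    rng = random.Random(seed)  # local generator so the global random state is untouched
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers for a cohort of 107 patients.
patient_ids = [f"pt{i:03d}" for i in range(107)]
train, validation = split_patients(patient_ids)
print(len(train), len(validation))
```

Splitting by patient rather than by individual blood draw keeps all serial CBCs from one patient in the same cohort, which avoids leaking a patient's data between training and validation.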
Results
20747 unique data points were used, including CBC, clinical and genomic data. Among 107 pts, 61 (57.0%) received AZA only, 19 (17.8%) DAC only, 4 (3.7%) received both DAC and AZA, and 23 (21.5%) received HMA with an additional agent. Median age was 69 years (range: 37-100 years), and 27 (26.4%) were female. Forty pts (37.4%) were very low/low risk, 32 (29.9%) intermediate, 19 (17.8%) high, and 16 (14.9%) very high risk per IPSS-R. Responses included 23 (22.5%) CR, 2 (1.9%) marrow CR, 4 (3.9%) PR, and 20 (19.6%) HI. The most commonly mutated genes were ASXL1 (17.6%), TET2 (16.7%), SRSF2 (15.7%), SF3B1 (11.8%), RUNX1 (10.8%), STAG2(10.8%), and DNMT3A (10.8%). The median number of mutations per sample was 1 (range, 0-11), and 40 pts (39.2%) had > 3 mutations per sample.
When trained using absolute values and changes in CBC values, the model's AUC was 0.95 in the training cohort and 0.83 in the validation cohort. When the cohort was oversampled to 1000 pts, the validation cohort AUC increased to 0.89. Feature extraction algorithms identified increases in MCV and RDW during weeks 2-8 of treatment, increased proportion of lymphocytes, decreased proportion of monocytes, and increased platelet counts during weeks 6-8 as factors favoring response to HMA. The model provides personalized, patient-specific predictions that correlate with blood counts (Figure 1).
Conclusions
We describe a machine learning model that monitors changes in blood counts during therapy with HMA to predict response or resistance to HMA in MDS pts. Such a model can be used to develop novel trial designs wherein pts predicted to not respond after 90 days of HMA treatment could be assigned to an investigational agent. Conversely, it would help inform the decision to continue HMA therapy in pts predicted to respond. Increasing sample size with oversampling dramatically increased model accuracy; a larger cohort of pts treated at different institutions is currently under development.
Disclosures
Sekeres: Millenium: Membership on an entity's Board of Directors or advisory committees; Syros: Membership on an entity's Board of Directors or advisory committees; Celgene: Membership on an entity's Board of Directors or advisory committees. Mukherjee:Partnership for Health Analytic Research, LLC (PHAR, LLC): Consultancy; Takeda: Membership on an entity's Board of Directors or advisory committees; Celgene Corporation: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Projects in Knowledge: Honoraria; Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Pfizer: Honoraria; McGraw Hill Hematology Oncology Board Review: Other: Editor; Bristol-Myers Squibb: Speakers Bureau. Advani:Glycomimetics: Consultancy, Research Funding; Kite Pharmaceuticals: Consultancy; Amgen: Research Funding; Pfizer: Honoraria, Research Funding; Macrogenics: Research Funding; Abbvie: Research Funding. Maciejewski:Alexion: Consultancy; Novartis: Consultancy. Nazha:Novartis: Speakers Bureau; Tolero, Karyopharma: Honoraria; Abbvie: Consultancy; Jazz Pharmacutical: Research Funding; Incyte: Speakers Bureau; Daiichi Sankyo: Consultancy; MEI: Other: Data monitoring Committee
A Personalized Prediction Model to Risk Stratify Patients with Acute Myeloid Leukemia (AML) Using Artificial Intelligence
Background
AML is a heterogeneous clonal disorder that is characterized by the accumulation of complex genomic alterations that affect disease biology and outcomes. Despite significant advances in our understanding of the impact of these mutations on overall survival (OS), established AML risk stratification guidelines are based primarily on cytogenetic analyses and a limited number of genes, don't take into account the complexity and the interaction between these mutations, and how particular constellations of genomic and clinical risk factors affect patient (pt) outcome.
We developed a novel prognostic model that incorporates clinical, cytogenetic, and mutational data to determine personalized outcomes specific to a particular pt.
Method
A total of 792,779 genomic and clinical data points from 3,421 pts were analyzed. The cohort comprised five independent datasets: 443 pts from the Beat AML Master Trial (Tyner et al, Nature, 2018), 855 pts from Cleveland Clinic, 414 pts from Munich Leukemia Laboratory (MLL), 1,509 pts from the German-Austrian Study Group (Papaemmanuil et al, NEJM, 2016), and 200 pts from The Cancer Genome Atlas (NEJM, 2013). A panel of 44 gene mutations commonly implicated in AML was used in the analysis, along with numerous cytogenetic and clinical variables such as age, white blood cell count (WBC) at diagnosis, and AML subtype (primary vs. secondary vs. therapy-related). A machine learning algorithm capable of accounting for survival (XGBOOST) was used to build the new model, in which clinical and molecular variables were randomly selected for inclusion in determining OS. Feature extraction algorithms were used to isolate the most important variables that impacted decision making within the model. The algorithm can also plot the important features that are specific for a given pt and show the impact of each feature on the outcome (positive vs. negative). The C-index was used to evaluate the accuracy of the new model compared to the 2017 ELN risk classification.
Results
The median age of the cohort was 56 years (range, 18-100); 1,122 pts (32.8%) had favorable risk cytogenetics per ELN criteria, 956 (27.9%) intermediate (INT), and 1,343 (39.3%) adverse. The most commonly mutated genes were: NPM1 (24%), FLT3 (23%), DNMT3A (20%), NRAS (13%), IDH2 (11%), RUNX1 (10%) and TET2 (10%). Mutations occurred in different frequencies in each cytogenetic risk group. The most commonly mutated genes in the favorable risk group were: NRAS (30%), KIT (23%), FLT3 (17%), and KRAS (8%). The most commonly mutated genes in the INT risk group were: NPM1 (28%), FLT3 (26%), DNMT3A (22%), IDH2 (12%), TET2 (11%), NRAS (11%), and RUNX1 (11%). The most commonly mutated genes in pts with adverse cytogenetics included: TP53 (34%), DNMT3A (13%), NRAS (11%), RUNX1 (10%), PTPN11 (8%), IDH2 (7%), U2AF1 (6%) and FLT3 (6%). All genomic-clinical variables were included in the machine learning algorithm. Variable importance analyses (the most important variables that contributed to the outcome) and multiple backward elimination analyses (identifying the least number of variables that can provide the least error rate) identified the following variables that impacted OS: age, transplant (yes vs. no), WBC, bone marrow blast %, cytogenetics, ASXL1, CEBPA, DNMT3A, FLT3, KDM6A, KIT, KRAS, NPM1, NRAS, PHF6, PTPN11, RUNX1, TET2, and TP53. The clinical and mutational variables that impacted each pt outcome can be visualized in a highly personalized manner, Figure 1.
The C-index for the new model was 0.80, which significantly outperformed ELN classification (0.59). When the new model was applied to each of the five patient cohorts, the C-indices remained high and were as follows: Beat AML (0.81), Cleveland Clinic (0.85), MLL (0.83), Papaemmanuil E, et al (0.79), and TCGA (0.80).
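For readers unfamiliar with the C-index reported above, the sketch below computes Harrell's concordance index for right-censored survival data. It is offered as an illustration under stated assumptions: the abstract does not say which C-index variant was used, and the survival times, event flags, and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the patient
    with the shorter observed survival has the higher predicted risk.

    A pair (i, j) is comparable when patient i had an observed event (events[i] == 1)
    and times[i] < times[j]. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: survival in months, 1 = death observed, 0 = censored.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.2]
print(concordance_index(times, events, risk_scores))
```

Here four pairs are comparable and three are ordered correctly, giving 0.75. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.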
Conclusions
Genomic alterations have a differential impact on OS in each cytogenetic risk group, highlighting the complexity of incorporating these mutations into risk stratification. A personalized prediction model based on clinical-genomic data can accurately provide survival unique to each individual pt and can significantly outperform ELN classifications or any currently available models. To ease the translation of this model into the clinic, a web application is currently under development and will be publicly available for use.
Disclosures
Meggendorfer: MLL Munich Leukemia Laboratory: Employment. Mukherjee:Bristol-Myers Squibb: Speakers Bureau; Takeda: Membership on an entity's Board of Directors or advisory committees; Pfizer: Honoraria; Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Projects in Knowledge: Honoraria; Celgene Corporation: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Partnership for Health Analytic Research, LLC (PHAR, LLC): Consultancy; McGraw Hill Hematology Oncology Board Review: Other: Editor. Walter:MLL Munich Leukemia Laboratory: Employment. Hutter:MLL Munich Leukemia Laboratory: Employment. Maciejewski:Novartis: Consultancy; Alexion: Consultancy. Haferlach:MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Sekeres:Millenium: Membership on an entity's Board of Directors or advisory committees; Syros: Membership on an entity's Board of Directors or advisory committees; Celgene: Membership on an entity's Board of Directors or advisory committees. Haferlach:MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Nazha:Jazz Pharmacutical: Research Funding; Novartis: Speakers Bureau; Incyte: Speakers Bureau; Tolero, Karyopharma: Honoraria; Abbvie: Consultancy; Daiichi Sankyo: Consultancy; MEI: Other: Data monitoring Committee
A Personalized Clinical-Decision Tool to Improve the Diagnostic Accuracy of Myelodysplastic Syndromes
Background
While histo- and cytomorphological examinations are central to the diagnosis of myelodysplastic syndromes (MDS), significant inter-observer variability exists. The diagnosis can be challenging in pancytopenic patients (pts) without evidence of dysplasia and is contingent on observer expertise.
We developed and externally validated a geno-clinical model that uses mutational data and peripheral blood counts/clinical variables to distinguish MDS from other myeloid malignancies.
Methods
Clinical and genomic data, including commercially available next-generation sequencing panels, were obtained for patients (pts) treated at the Cleveland Clinic (CC; 652 pts), Munich Leukemia Laboratory (MLL; 1509 pts), and the University of Pavia in Italy (UP; 536 pts). All pts carried a diagnosis of MDS, chronic myelomonocytic leukemia (CMML), MDS/myeloproliferative neoplasm overlap (MDS/MPN), myeloproliferative neoplasm (MPN; either polycythemia vera, essential thrombocythemia, or myelofibrosis), clonal cytopenia of undetermined significance (CCUS), or idiopathic cytopenia of undetermined significance (ICUS). All diagnoses were established with bone marrow aspiration and according to World Health Organization 2017 criteria.
The training cohort included data from CC and UP and was randomly divided into learner (80%) and test (20%) cohorts. The final model was independently validated in the MLL cohort.
A machine learning algorithm was used to build the model; multiple extraction algorithms were used to extract genomic/clinical variables on both the cohort and individual levels. Performance was evaluated according to the area under the curve of the receiver operating characteristic (ROC-AUC) and accuracy matrices.
Results
Among the 2697 pts included from all sites, the median age was 70 years [36 - 86]. Median hemoglobin (Hb) was 10.4 g/dL [6.9 - 15.7], median platelet count (PLT) was 132 k/dL [14 - 722], median WBC count was 5.3 k/dL [1.4 - 49.9], median ANC was 2.8 k/dL [0.3 - 27.7], median monocyte count was 0.3 k/dL [0 - 9.9], median lymphocyte count (ALC) was 1.1 k/dL [0.1 - 5.4], and median peripheral blast percentage was 0% [0 - 8]. The most commonly mutated genes in all patients were (list top 5 genes) and among pts with MDS were SF3B1 (27%), TET2 (25%), ASXL1 (19%), SRSF2 (16%), and DNMT3A (11%); among pts with MDS-MPN/CMML, the most commonly mutated genes were TET2 (46%), ASXL1 (34%), SRSF2 (29%), RUNX1 (13%), and CBL (12%); among pts with MPNs, they were JAK2 (64%), ASXL1 (27%), TET2 (14%), DNMT3A (8%), and U2AF1 (7%); among pts with CCUS, they were TET2 (41%), DNMT3A (27%), ASXL1 (19%), SRSF2 (17%), and ZRSR2 (10%).
The most important features for model predictions (ranked from the most to the least important) included: number of mutations detected/sample, peripheral blast percentage, AMC, JAK2 status, Hb, basophil count, age, eosinophil count, ALC, WBC, EZH2 mutation status, ANC, mutation status of KRAS and SF3B1, platelets, and gender. The final model achieved an average AUROC of 0.95 (95% CI 0.93-0.96) when applied to the test cohort and 0.93 (95% CI 0.91 - 0.94) when it was applied to the MLL cohort.
The model also provides individual-level explanations for predictions, providing top differential diagnoses and individual-level explanations of how features influence a putative diagnosis (Figure 1b).
Conclusions
We developed and externally validated a highly accurate and interpretable model that can distinguish MDS from other myeloid malignancies using clinical and mutational data from a large international cohort. The model can provide personalized interpretations of its outcome and can aid physicians and hematopathologists in recognizing MDS with high accuracy when encountering pts with pancytopenia and with a suspected diagnosis of MDS.
Disclosures
Sekeres: Pfizer: Consultancy, Membership on an entity's Board of Directors or advisory committees; Takeda/Millenium: Consultancy, Membership on an entity's Board of Directors or advisory committees; BMS: Consultancy, Membership on an entity's Board of Directors or advisory committees. Mukherjee:Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Partnership for Health Analytic Research, LLC (PHAR, LLC): Honoraria; Bristol Myers Squib: Honoraria; Celgene: Consultancy, Honoraria, Research Funding; Aplastic Anemia and MDS International Foundation: Honoraria; Celgene/Acceleron: Membership on an entity's Board of Directors or advisory committees; EUSA Pharma: Consultancy. Gerds:Sierra Oncology: Research Funding; Imago Biosciences: Research Funding; Apexx Oncology: Consultancy; Celgene: Consultancy, Research Funding; Incyte Corporation: Consultancy, Research Funding; Roche/Genentech: Research Funding; CTI Biopharma: Consultancy, Research Funding; AstraZeneca/MedImmune: Consultancy; Gilead Sciences: Research Funding; Pfizer: Research Funding. Maciejewski:Alexion, BMS: Speakers Bureau; Novartis, Roche: Consultancy, Honoraria. Nazha:Jazz: Research Funding; Incyte: Speakers Bureau; Novartis: Speakers Bureau; MEI: Other: Data monitoring Committee
Geno-Clinical Model for the Diagnosis of Bone Marrow Myeloid Neoplasms
Background
Myelodysplastic syndromes (MDS) and other myeloid neoplasms are mainly diagnosed based on morphological changes in the bone marrow. Diagnosis can be challenging in patients (pts) with pancytopenia with minimal dysplasia, and is subject to inter-observer variability, with up to 40% disagreement in diagnosis (Zhang, ASH 2018). Somatic mutations can be identified in all myeloid neoplasms, but no gene or set of genes are diagnostic for each disease phenotype.
We developed a geno-clinical model that uses mutational data, peripheral blood values, and clinical variables to distinguish among several bone marrow disorders that include: MDS, idiopathic cytopenia of undetermined significance (ICUS), clonal cytopenia of undetermined significance (CCUS), MDS/myeloproliferative neoplasm (MPN) overlaps including chronic myelomonocytic leukemia (CMML), and MPNs such as polycythemia vera (PV), essential thrombocythemia (ET), and myelofibrosis (PMF).
Methods
We combined genomic and clinical data from 2471 pts treated at our institution (684) and the Munich Leukemia Laboratory (1787). Pts were diagnosed with MDS, ICUS, CCUS, CMML, MDS/MPN, PV, ET, and PMF according to 2016 WHO criteria. Diagnoses were confirmed by independent hematopathologists not associated with the study. A panel of 60 genes commonly mutated in myeloid malignancies was included. The cohort was randomly divided into learner (80%) and validation (20%) cohorts. Machine learning algorithms were applied to predict the phenotype. Feature extraction algorithms were used to extract genomic/clinical variables that impacted the algorithm decision and to visualize the impact of each variable on phenotype. Prediction performance was evaluated according to the area under the curve of the receiver operator characteristic (ROC-AUC).
Results
Of 2471 pts, 1306 had MDS, 223 had ICUS, 107 had CCUS, 478 had CMML, 89 had MDS/MPN, 79 had PV, 90 had ET, and 99 had PMF. The median age for the entire cohort was 71 years (range, 9-102); 38% were female. The median white blood cell count (WBC) was 3.2x10^9/L (range, 0.00-179), absolute monocyte count (AMC) 0.21x10^9/L (range, 0-96), absolute lymphocyte count (ALC) 0.88x10^9/L (range, 0-357), absolute neutrophil count (ANC) 0.60x10^9/L (range, 0-170), and hemoglobin (Hgb) 10.50 g/dL (range, 3.9-24.0).
The most commonly mutated genes in all pts were: TET2 (28%), ASXL1 (23%), SF3B1 (15%). In MDS, they were: TET2 (26%), SF3B1 (24%), ASXL1 (21%). In CCUS: TET2 (46%), SRSF2 (24%), ASXL1 (23%). In CMML: TET2 (51%), ASXL1 (43%), SRSF2 (25%). In MDS/MPN: SF3B1 (39%), JAK2 (37%), TET2 (20%). In PV: JAK2 (94%), TET2 (22%), DNMT3A (8%). In ET: JAK2 (44%), TET2 (13%), DNMT3A (8%). In PMF: JAK2 (67%), ASXL1 (43%), SRSF2 (17%).
71 genomic/clinical variables were evaluated. Feature extraction algorithms were used to identify the variables with the most significant impacts on prediction. The top variables are shown in Figure 1. Overall, the most important variables were: age, AMC, ANC, Hgb, Plt, ALC, total number of mutations, JAK2, ASXL1, TET2, U2AF1, SRSF2, SF3B1, BCOR, EZH2, and DNMT3A. The top variables for each disease were different, see Figure.
When applying the model to the validation cohort, AUC performance was as follows (a perfect predictor has an AUC of 1, and AUC ≥ 0.90 are generally considered excellent): MDS: 0.95 +/- 0.04, ICUS: 0.96 +/- 0.05, CCUS: 0.95 +/- 0.05, CMML: 0.95 +/- 0.05, MDS/MPN: 0.95 +/- 0.05, PV: 0.95 +/- 0.05, ET: 0.96 +/- 0.05, PMF: 0.95 +/- 0.05. When the analysis was restricted to MDS, ICUS, and CCUS, the AUC remained high, 0.95 +/- 0.4. The model can also provide personalized explanations of the variables supporting the prediction and the impact of each variable on the outcome (Figure).
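Per-diagnosis AUCs like those listed above are commonly obtained with a one-vs-rest scheme, in which each class is scored against all others. The sketch below illustrates that scheme. It is an assumption for illustration, since the abstract does not state how the per-class AUCs were computed, and the labels and predicted probabilities are hypothetical.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count as half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def one_vs_rest_auc(true_labels, class_probs):
    """Return {class: AUC}, treating each class in turn as positive and all others as negative.

    class_probs is a list of dicts mapping class name to predicted probability.
    """
    result = {}
    for cls in sorted(set(true_labels)):
        pos = [probs[cls] for y, probs in zip(true_labels, class_probs) if y == cls]
        neg = [probs[cls] for y, probs in zip(true_labels, class_probs) if y != cls]
        result[cls] = pairwise_auc(pos, neg)
    return result


# Hypothetical diagnoses and model outputs for four patients.
true_labels = ["MDS", "MDS", "CCUS", "ICUS"]
class_probs = [
    {"MDS": 0.7, "CCUS": 0.2, "ICUS": 0.1},
    {"MDS": 0.4, "CCUS": 0.5, "ICUS": 0.1},
    {"MDS": 0.5, "CCUS": 0.3, "ICUS": 0.2},
    {"MDS": 0.1, "CCUS": 0.2, "ICUS": 0.7},
]
aucs = one_vs_rest_auc(true_labels, class_probs)
print(aucs)
```

The pairwise count used here is quadratic in cohort size, which is fine for a small example but would usually be replaced by a rank-based computation on larger cohorts.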
Conclusions
We propose a new approach using interpretable, individualized modeling to predict myeloid neoplasm phenotypes based on genomic and clinical data without bone marrow biopsy data. This approach can aid clinicians and hematopathologists when encountering pts with cytopenias and suspicion for these disorders. The model also provides feature attributions that allow for quantitative understanding of the complex interplay among genotypes, clinical variables, and phenotypes. A web application to facilitate the translation of this model into the clinic is under development and will be presented at the meeting.
Figure 1
Disclosures
Meggendorfer: MLL Munich Leukemia Laboratory: Employment. Sekeres:Syros: Membership on an entity's Board of Directors or advisory committees; Celgene: Membership on an entity's Board of Directors or advisory committees; Millenium: Membership on an entity's Board of Directors or advisory committees. Walter:MLL Munich Leukemia Laboratory: Employment. Hutter:MLL Munich Leukemia Laboratory: Employment. Savona:Incyte Corporation: Membership on an entity's Board of Directors or advisory committees, Research Funding; Karyopharm Therapeutics: Consultancy, Equity Ownership, Membership on an entity's Board of Directors or advisory committees; Selvita: Membership on an entity's Board of Directors or advisory committees; Sunesis: Research Funding; TG Therapeutics: Membership on an entity's Board of Directors or advisory committees, Research Funding; Takeda: Membership on an entity's Board of Directors or advisory committees, Research Funding; AbbVie: Membership on an entity's Board of Directors or advisory committees; Boehringer Ingelheim: Patents & Royalties; Celgene Corporation: Membership on an entity's Board of Directors or advisory committees. Gerds:Incyte: Consultancy, Research Funding; Roche: Research Funding; Imago Biosciences: Research Funding; CTI Biopharma: Consultancy, Research Funding; Pfizer: Consultancy; Celgene Corporation: Consultancy, Research Funding; Sierra Oncology: Research Funding. Mukherjee:Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Projects in Knowledge: Honoraria; Celgene Corporation: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Partnership for Health Analytic Research, LLC (PHAR, LLC): Consultancy; McGraw Hill Hematology Oncology Board Review: Other: Editor; Pfizer: Honoraria; Bristol-Myers Squibb: Speakers Bureau; Takeda: Membership on an entity's Board of Directors or advisory committees. 
Komrokji:JAZZ: Speakers Bureau; Agios: Consultancy; Incyte: Consultancy; DSI: Consultancy; pfizer: Consultancy; celgene: Consultancy; JAZZ: Consultancy; Novartis: Speakers Bureau. Haferlach:MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Maciejewski:Alexion: Consultancy; Novartis: Consultancy. Haferlach:MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Nazha:Tolero, Karyopharma: Honoraria; MEI: Other: Data monitoring Committee; Novartis: Speakers Bureau; Jazz Pharmacutical: Research Funding; Incyte: Speakers Bureau; Daiichi Sankyo: Consultancy; Abbvie: Consultancy
A geno-clinical decision model for the diagnosis of myelodysplastic syndromes
Abstract
The differential diagnosis of myeloid malignancies is challenging and subject to interobserver variability. We used clinical and next-generation sequencing (NGS) data from a 3-institution, international cohort of patients to develop a machine learning model for the diagnosis of myeloid malignancies independent of bone marrow biopsy data. The model achieves high performance, with model interpretations indicating that it relies on factors similar to those used by clinicians. In addition, we describe associations between NGS findings and clinically important phenotypes and introduce the use of machine learning algorithms to elucidate clinicogenomic relationships.