Artificial Intelligence Analysis of Gene Expression Data Predicted the Prognosis of Patients with Diffuse Large B-Cell Lymphoma

Abstract

OBJECTIVE: We aimed to identify new biomarkers in Diffuse Large B-cell Lymphoma (DLBCL) using a deep learning technique. METHODS AND RESULTS: A multilayer perceptron (MLP) analysis was performed on the GSE10846 series, divided into discovery (n = 100) and validation (n = 414) sets. The top 25 gene-probes from a total of 54,614 were selected based on their normalized importance for outcome prediction (dead/alive). Gene Set Enrichment Analysis (GSEA) confirmed their association with unfavorable prognosis. In the validation set, by univariate Cox regression analysis, high expression of ARHGAP19, MESD, WDCP, DIP2A, CACNA1B, TNFAIP8, POLR3H, ENO3, SERPINB8, SZRD1, KIF23 and GGA3 was associated with poor outcome, and high expression of SFTPC, ZSCAN12, LPXN and METTL21A with favorable outcome. A multivariate analysis confirmed MESD, TNFAIP8 and ENO3 as risk factors and ZSCAN12 and LPXN as protective factors. Using a risk score formula, the 25 genes identified two groups of patients with different survival, independent of the cell-of-origin molecular classification (5-year OS, low vs. high risk: 65% vs. 24%, respectively; Hazard Risk = 3.2, P < 0.000001). Finally, correlation with known DLBCL markers showed that concurrent high expression of MYC, BCL2 and ENO3 was associated with the worst outcome. CONCLUSION: By artificial intelligence we identified a set of genes with prognostic relevance.
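The abstract does not state the risk score formula itself. As a rough illustration only, the sketch below assumes one common construction: a linear score that sums each gene's expression weighted by a regression coefficient, with patients split into low- and high-risk groups at the cohort median. The expression values, the coefficients, and the function names `risk_scores` and `split_by_median` are all hypothetical; only the signs follow the abstract (positive for the risk factors MESD and ENO3, negative for the protective factor LPXN).

```python
import numpy as np

def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum over genes of coefficient * expression."""
    expression = np.asarray(expression, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    return expression @ coefficients

def split_by_median(scores):
    """Label a patient 'high' if the score is above the cohort median, else 'low'."""
    scores = np.asarray(scores, dtype=float)
    cutoff = np.median(scores)
    return np.where(scores > cutoff, "high", "low")

# Toy data, invented for illustration: 4 patients x 3 genes.
genes = ["MESD", "ENO3", "LPXN"]
expression = np.array([
    [1.0, 2.0, 0.5],
    [3.0, 1.0, 2.0],
    [0.5, 0.5, 3.0],
    [2.0, 3.0, 1.0],
])
# Hypothetical coefficients, not values from the study.
coefficients = np.array([0.8, 0.5, -0.6])

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)

for patient, (score, group) in enumerate(zip(scores, groups), start=1):
    print(f"patient {patient}: score={score:.2f}, group={group}")
```

In an actual analysis the coefficients would come from a fitted survival model, and the resulting groups would then be compared with Kaplan-Meier curves and a Cox model; the median cutoff is one conventional choice among several.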
