
Applications of Machine Learning in Cancer Prediction and Prognosis

By Joseph A. Cruz and David S. Wishart


Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allow computers to “learn” from past examples and to detect hard-to-discern patterns in large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently, machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on “older” technologies such as artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15–25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is helping to improve our basic understanding of cancer development and progression.

Topics: Review
Publisher: Libertas Academica
Provided by: PubMed Central
