Estimating An Optimal Backpropagation Algorithm for Training An ANN with the EGFR Exon 19 Nucleotide Sequence: An Electronic Diagnostic Basis for Non–Small Cell Lung Cancer (NSCLC)
One of the most common forms of medical malpractice globally is an error in diagnosis. An improper
diagnosis occurs when a doctor fails to identify a disease or reports a disease when the patient is actually
healthy. A disease that is commonly misdiagnosed is lung cancer. This cancer type is a major health problem
internationally because it is responsible for 15% of all cancer diagnoses and 29% of all cancer deaths. The two
major sub-types of lung cancer are small cell lung cancer (about 13%) and non-small cell lung cancer
(NSCLC, about 87%). The chance of surviving lung cancer depends on its correct diagnosis and/or the stage at
the time it is diagnosed. However, recent studies have identified somatic mutations in the epidermal growth
factor receptor (EGFR) gene in a subset of non-small cell lung cancer (NSCLC) tumors. These mutations occur
in the tyrosine kinase domain of the gene. The most predominant of the mutations in all NSCLC patients
examined is a deletion mutation in exon 19, which accounts for approximately 90% of the EGFR-activating
mutations. This makes the EGFR genomic sequence a good candidate for implementing an electronic diagnostic
system for NSCLC. In this study, which aimed to estimate an optimum backpropagation training algorithm for a
genome-based artificial neural network (ANN) system for NSCLC diagnosis, the nucleotide sequences of EGFR exon 19
from a noncancerous cell were used to train an ANN. Several ANN backpropagation training
algorithms were tested in MATLAB R2008a to obtain an optimal algorithm for training the network. Of the nine
different algorithms tested, we achieved the best performance (i.e., the least mean square error) with the
minimum number of epochs (training iterations) and the shortest training time using the Levenberg-Marquardt algorithm.
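
For readers unfamiliar with the Levenberg-Marquardt update, the following is a minimal Python sketch of the idea, not the MATLAB R2008a setup used in the study. It assumes a single sigmoid neuron with two parameters and synthetic, noise-free targets; the names `lm_train`, `mu`, `goal`, and `max_epochs` are hypothetical and chosen only for illustration. Each epoch solves (J^T J + mu I) delta = J^T e for the parameter step, lowers the damping term mu after a step that reduces the mean square error, and raises it otherwise.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse(w, b, xs, ts):
    """Mean square error of a single sigmoid neuron y = sigmoid(w*x + b)."""
    return sum((sigmoid(w * x + b) - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def lm_train(xs, ts, w=0.0, b=0.0, mu=0.01, goal=1e-12, max_epochs=100):
    """Levenberg-Marquardt training of a two-parameter neuron (illustrative)."""
    err = mse(w, b, xs, ts)
    epochs = 0
    while err > goal and epochs < max_epochs:
        epochs += 1
        # Accumulate J^T J (entries a11, a12, a22) and J^T e (entries g1, g2).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, t in zip(xs, ts):
            y = sigmoid(w * x + b)
            r = y - t                 # residual e_i
            d = y * (1.0 - y)         # derivative of the sigmoid
            jw, jb = d * x, d         # Jacobian row: [de_i/dw, de_i/db]
            a11 += jw * jw
            a12 += jw * jb
            a22 += jb * jb
            g1 += jw * r
            g2 += jb * r
        while True:
            # Solve the damped 2x2 system (J^T J + mu*I) * delta = J^T e.
            det = (a11 + mu) * (a22 + mu) - a12 * a12
            dw = ((a22 + mu) * g1 - a12 * g2) / det
            db = ((a11 + mu) * g2 - a12 * g1) / det
            new_err = mse(w - dw, b - db, xs, ts)
            if new_err < err:
                # Step accepted: keep it and trust the Gauss-Newton model more.
                w, b, err = w - dw, b - db, new_err
                mu = max(mu / 10.0, 1e-12)
                break
            # Step rejected: increase damping (closer to gradient descent).
            mu *= 10.0
            if mu > 1e10:
                return w, b, err, epochs
    return w, b, err, epochs


# Synthetic data (an assumption for this sketch, not data from the study).
xs = [-2.0 + 0.5 * i for i in range(9)]
ts = [sigmoid(1.5 * x - 0.5) for x in xs]
w, b, err, epochs = lm_train(xs, ts)
print(w, b, err, epochs)
```

The returned error and epoch count correspond to the two quantities the comparison above relies on (mean square error and number of training iterations); a real network would have many more parameters and would need a general linear solver in place of the closed-form 2x2 solution.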