The canine lymphoma blood test measures the levels of two biomarkers, the
acute phase proteins C-Reactive Protein (CRP) and Haptoglobin. This test can be
used for diagnosis, for screening, and for remission monitoring. We
analyze clinical data, test various machine learning methods and select the
best approach to these problems. Three families of methods, decision trees, kNN
(including advanced and adaptive kNN) and probability density evaluation with
radial basis functions, are used for classification and risk estimation.
Several pre-processing approaches were implemented and compared. The best of
them are used to create the diagnostic system. For differential diagnosis,
the best solution gives a sensitivity of 83.5% and a specificity of 77%
(using three input features: CRP, Haptoglobin and a standard
clinical symptom). For the screening task, the decision tree method provides
the best result, with a sensitivity of 81.4% and a specificity of >99%
(using the same input features). If the clinical symptom
(lymphadenopathy) is treated as unknown, then a decision tree using only CRP and
Haptoglobin gives a sensitivity of 69% and a specificity of 83.5%. The lymphoma risk
evaluation problem is formulated and solved. The best models are selected as
the system for computational lymphoma diagnosis and for evaluation of the risk
of lymphoma. These methods are implemented in dedicated web-accessible
software and are applied to the problem of monitoring dogs with lymphoma after
treatment. The system detects recurrence of lymphoma up to two months prior to
the appearance of clinical signs. The risk map visualisation provides a friendly
tool for explanatory data analysis.

Comment: 24 pages, 86 references in the bibliography. Significantly extended
version with review of lymphoma biomarkers and data mining methods (Three new
sections are added: 1.1. Biomarkers for canine lymphoma, 1.2. Acute phase
proteins as lymphoma biomarkers and 3.1. Data mining methods for biomarker
cancer diagnosis. Flowcharts of data analysis are included as supplementary
material (20 pages