Pulmonologists-Level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach
Lung cancer (LC) remains the primary cause of cancer-related mortality,
largely due to late-stage diagnoses. Effective strategies for early detection
are therefore of paramount importance. In recent years, machine learning (ML)
has demonstrated considerable potential in healthcare by facilitating the
detection of various diseases. In this retrospective development and validation
study, we developed an ML model based on dynamic ensemble selection (DES) for
LC detection. The model leverages standard blood sample analysis and smoking
history data from a large population at risk in Denmark. The study includes all
patients examined on suspicion of LC in the Region of Southern Denmark from
2009 to 2018. We validated and compared the predictions by the DES model with
diagnoses provided by five pulmonologists. Among the 38,944 patients, 9,940 had
complete data, of which 2,505 (25\%) had LC. The DES model achieved an area
under the ROC curve of 0.77$\pm$0.01, sensitivity of 76.2\%$\pm$2.4\%,
specificity of 63.8\%$\pm$2.3\%, positive predictive value of 41.6\%$\pm$1.2\%,
and F\textsubscript{1}-score of 53.8\%$\pm$1.1\%. The DES model outperformed
all five pulmonologists, achieving a sensitivity 9\% higher than their average.
The model identified smoking status, age, total calcium levels, neutrophil
count, and lactate dehydrogenase as the most important factors for the
detection of LC. The results highlight the successful application of the ML
approach in detecting LC, surpassing pulmonologists' performance. Incorporating
clinical and laboratory data in future risk assessment models can improve
decision-making and facilitate timely referrals.
Comment: 9 pages, 4 figures
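The metrics reported in the abstract are linked by standard identities, so they can be cross-checked against each other. The sketch below is illustrative only and is not the authors' evaluation code: it recomputes the positive predictive value from the reported sensitivity, specificity, and LC prevalence (2,505 of 9,940), and the F1-score from the reported positive predictive value and sensitivity. The function names are hypothetical, and small differences from the reported figures are to be expected if the published values are averages over validation runs, which is an assumption here.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence                   # fraction of all patients: true positives
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # fraction of all patients: false positives
    return true_pos / (true_pos + false_pos)


def f1_from_ppv_and_sensitivity(ppv, sensitivity):
    """F1-score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates taken from the abstract.
prevalence = 2505 / 9940   # patients with LC among those with complete data
sensitivity = 0.762
specificity = 0.638
reported_ppv = 0.416

implied_ppv = ppv_from_rates(sensitivity, specificity, prevalence)
implied_f1 = f1_from_ppv_and_sensitivity(reported_ppv, sensitivity)

print(f"implied PPV: {implied_ppv:.3f}")
print(f"implied F1:  {implied_f1:.3f}")
```

The implied positive predictive value comes out at about 0.415 and the implied F1-score at about 0.538, both within the reported uncertainty of the published figures.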