Machine learning methods for automated classification of tumors with papillary thyroid carcinoma-like nuclei: A quantitative analysis

Abstract

In thyroid gland tumor classification, differentiating between samples with and without “papillary thyroid carcinoma-like” nuclei is a difficult task with high inter-observer variability among pathologists. There is therefore increasing interest in machine learning approaches that provide pathologists with real-time decision support. In this paper, we optimize and quantitatively compare two automated machine learning methods for thyroid gland tumor classification on two datasets, to assist pathologists in decision-making regarding these methods and their parameters. The first method is a feature-based classification originating from conventional image processing; it consists of cell nucleus segmentation, feature extraction, and subsequent thyroid gland tumor classification with different classifiers. The second method is a deep learning-based classification that classifies the input images directly with a convolutional neural network, without the need for cell nucleus segmentation. On the Tharun and Thompson dataset, the feature-based classification achieves an accuracy of 89.7% (Cohen’s Kappa 0.79), compared to 89.1% (Cohen’s Kappa 0.78) for the deep learning-based classification. On the Nikiforov dataset, the feature-based classification achieves an accuracy of 83.5% (Cohen’s Kappa 0.46), compared to 77.4% (Cohen’s Kappa 0.35) for the deep learning-based classification. Thus, both automated thyroid tumor classification methods can reach the classification level of an expert pathologist. To our knowledge, this is the first study comparing feature-based and deep learning-based classification regarding their ability to classify samples with and without papillary thyroid carcinoma-like nuclei on two large-scale datasets.
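
The results above are reported as accuracy together with Cohen’s Kappa, which corrects the observed agreement for the agreement expected by chance. The following is a minimal sketch of how both values can be computed from a confusion matrix. The function name `accuracy_and_kappa` and the two confusion matrices are hypothetical and chosen only for illustration; they are not taken from either dataset, and this is not the evaluation code used in the study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples with true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal (the accuracy).
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of the marginal class frequencies,
    # summed over all classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Illustrative (made-up) counts: rows are true classes, columns are predictions.
balanced = [[45, 5],
            [5, 45]]
imbalanced = [[85, 5],
              [5, 5]]

for name, matrix in [("balanced", balanced), ("imbalanced", imbalanced)]:
    acc, kappa = accuracy_and_kappa(matrix)
    print(f"{name}: accuracy={acc:.3f}, kappa={kappa:.3f}")
```

In this sketch both matrices have the same accuracy, but Kappa is considerably lower for the imbalanced one, because most of its agreement could already be expected by chance. This is one general way in which a high accuracy can coincide with a moderate Kappa; whether it explains the values reported for the Nikiforov dataset is not stated in the abstract.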
