Automated slice classification is clinically relevant since it can be
incorporated into medical image segmentation workflows as a preprocessing step
that would flag slices with a higher probability of containing tumors, thereby
directing physicians' attention to the important slices. In this work, we train
a ResNet-18 network to classify axial slices of lymphoma PET/CT images
(collected from two institutions) according to whether the slice intersects a
tumor in the 3D image (positive slice) or does not (negative slice).
Various instances of the network were trained on 2D axial datasets
created in different ways: (i) slice-level split and (ii) patient-level split;
inputs of different types were used: (i) only PET slices and (ii) concatenated
PET and CT slices; and different training strategies were employed: (i)
center-aware (CAW) and (ii) center-agnostic (CAG). Model performances were
compared using the area under the receiver operating characteristic curve
(AUROC), the area under the precision-recall curve (AUPRC), and various
binary classification metrics. We observe and describe a performance
overestimation in the case of slice-level split as compared to the
patient-level split training. The model trained using patient-level split data
with the network input containing only PET slices in the CAG training regime
was the best-performing and best-generalizing model on a majority of metrics.
We additionally compared our models more closely using sensitivity on the
positive slices from their respective test sets.

Comment: 10 pages, 6 figures, 2 tables
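
The distinction between the slice-level and patient-level splits described above can be illustrated with a minimal sketch. It is not the authors' code: the record layout (`patient_id`, `slice_idx`, `label`), the function names, the 20% test fraction, and the toy data are all assumptions made for illustration. The sketch shows why a slice-level split can overestimate performance: slices from the same patient end up in both the training and test sets.

```python
import random


def slice_level_split(slices, test_frac=0.2, seed=0):
    """Shuffle individual slices, ignoring which patient they belong to."""
    rng = random.Random(seed)
    shuffled = list(slices)
    rng.shuffle(shuffled)
    n_test = int(round(test_frac * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def patient_level_split(slices, test_frac=0.2, seed=0):
    """Assign whole patients to either train or test, never both."""
    rng = random.Random(seed)
    patients = sorted({s["patient_id"] for s in slices})
    rng.shuffle(patients)
    n_test = max(1, int(round(test_frac * len(patients))))
    test_patients = set(patients[:n_test])
    train = [s for s in slices if s["patient_id"] not in test_patients]
    test = [s for s in slices if s["patient_id"] in test_patients]
    return train, test


def shared_patients(train, test):
    """Patients that contribute slices to both sets (a source of leakage)."""
    return {s["patient_id"] for s in train} & {s["patient_id"] for s in test}


# Hypothetical toy data: 10 patients with 20 axial slices each.
toy = [
    {"patient_id": f"P{p:02d}", "slice_idx": z, "label": int(z % 5 == 0)}
    for p in range(10)
    for z in range(20)
]

sl_train, sl_test = slice_level_split(toy)
pl_train, pl_test = patient_level_split(toy)

print("slice-level shared patients:  ", len(shared_patients(sl_train, sl_test)))
print("patient-level shared patients:", len(shared_patients(pl_train, pl_test)))
```

Because adjacent axial slices from one patient are highly correlated, a model evaluated on a slice-level test set has effectively seen near-duplicates of its test inputs during training, whereas the patient-level split keeps the test patients entirely unseen.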