Part-prototype models are explainable-by-design image classifiers and a
promising alternative to black-box AI. This paper explores the applicability
and potential of interpretable machine learning, in particular PIP-Net, for
automated diagnosis support on real-world medical imaging data. PIP-Net learns
human-understandable prototypical image parts, and we evaluate its accuracy and
interpretability for fracture detection and skin cancer diagnosis. We find that
PIP-Net's decision-making process is in line with medical classification
standards, despite being trained with only image-level class labels. Because of
PIP-Net's unsupervised pretraining of prototypes, data quality problems such as
undesired text in an X-ray or labelling errors can be easily identified.
Additionally, we are the first to show that humans can manually correct the
reasoning of PIP-Net by directly disabling undesired prototypes. We conclude
that part-prototype models are promising for medical applications due to their
interpretability and potential for advanced model debugging.