Acute Lymphoblastic Leukemia (ALL) is one of the most common types of
childhood blood cancer. Starting treatment quickly is critical to saving the
patient's life, so early diagnosis is essential. Examining blood smear images
is one of the methods that expert physicians use to diagnose the disease.
Deep learning-based
methods have advanced significantly in recent years and now have numerous
medical applications. ALL diagnosis is no exception, and several machine
learning-based methods have been proposed for it. Previous methods report high
diagnostic accuracy, but we show that accuracy alone is not sufficient, because
a model can achieve it by taking shortcuts rather than making meaningful
decisions. This issue arises from the small size of medical training datasets.
To address this, we constrained
our model to follow a pipeline inspired by how experts work. We also show that,
because a judgement based on a single image is insufficient, the problem must
be recast as a multiple-instance learning problem to achieve a practical
result. Our model is the first to address this problem in a multiple-instance
learning setup. We introduce a novel pipeline for
diagnosing ALL that approximates the process used by hematologists, is
sensitive to disease biomarkers, and achieves an accuracy of 96.15%, an
F1-score of 94.24%, a sensitivity of 97.56%, and a specificity of 90.91% on ALL
IDB 1. We further evaluated our method on an out-of-distribution dataset, a
challenging test on which it achieved acceptable performance. Notably, our
model was trained on a relatively small dataset, which highlights the potential
of our approach for other medical applications where data availability is
limited.
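
As a rough illustration of the multiple-instance formulation and the reported metrics, the sketch below treats each patient as a bag of per-image scores, pools them into one patient-level decision, and computes accuracy, F1-score, sensitivity, and specificity from the resulting predictions. It is a sketch under stated assumptions, not the pipeline proposed here: max pooling, the 0.5 threshold, the function names `bag_probability` and `binary_metrics`, and the toy scores are all assumptions chosen for illustration.

```python
from typing import Dict, Sequence


def bag_probability(instance_probs: Sequence[float]) -> float:
    """Pool per-image probabilities into one patient-level probability.

    Assumption: max pooling, i.e. the standard multiple-instance rule that a
    bag is positive if at least one of its instances is positive.
    """
    if not instance_probs:
        raise ValueError("a bag must contain at least one image score")
    return max(instance_probs)


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, F1, sensitivity, and specificity for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Toy data only (hypothetical scores, not results from this work):
# each inner list holds the per-image scores for one patient.
bags = [[0.1, 0.2, 0.9], [0.05, 0.1], [0.3, 0.7], [0.2, 0.6]]
labels = [1, 0, 1, 0]
threshold = 0.5  # assumed decision threshold
predictions = [int(bag_probability(bag) >= threshold) for bag in bags]
print(binary_metrics(labels, predictions))
```

Max pooling is only one possible aggregation rule. Mean pooling or a learned attention weighting over images are common alternatives, and they trade sensitivity to a single abnormal cell against robustness to a single noisy image.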