In the era of open science, public datasets, together with common experimental
protocols, help in designing and validating data science
algorithms; they also ease reproducibility and fair comparison
between methods. Many datasets for image segmentation are available, each
presenting its own challenges; however, very few exist for radiotherapy
planning. This paper presents a new dataset dedicated to the
segmentation of organs at risk (OARs) in the thorax, i.e. the organs
surrounding the tumour that must be spared from irradiation during
radiotherapy. This dataset is called SegTHOR (Segmentation of THoracic Organs
at Risk). In this dataset, the OARs are the heart, the trachea, the aorta and
the esophagus, which have varying spatial and appearance characteristics. The
dataset includes 60 3D CT scans, divided into a training set of 40 and a test
set of 20 patients, where the OARs have been contoured manually by an
experienced radiotherapist. Along with the dataset, we present some baseline
results, obtained using both the original state-of-the-art U-Net architecture
and a simplified version. We investigate different configurations of this
baseline architecture that will serve as a point of comparison for future studies on the
SegTHOR dataset. Preliminary results show that there is room for improvement,
especially for the smallest organs.

Comment: Submitted to a journal in December 201