The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions
Training of neural networks for automated diagnosis of pigmented skin lesions
is hampered by the small size and lack of diversity of available datasets of
dermatoscopic images. We tackle this problem by releasing the HAM10000 ("Human
Against Machine with 10000 training images") dataset. We collected
dermatoscopic images from different populations acquired and stored by
different modalities. Given this diversity, we had to apply different
acquisition and cleaning methods and developed semi-automatic workflows
utilizing specifically trained neural networks. The final dataset consists of
10015 dermatoscopic images, which are released as a training set for academic
machine learning purposes and are publicly available through the ISIC archive.
This benchmark dataset can be used for machine learning and for comparisons
with human experts. Cases include a representative collection of all important
diagnostic categories in the realm of pigmented lesions. More than 50% of
lesions have been confirmed by pathology, while the ground truth for the rest
of the cases was either follow-up, expert consensus, or confirmation by in-vivo
confocal microscopy.