Finding a Suitable Class Distribution for Building Histological Images Data Sets Used in Deep Model Training - the Case of Cancer Detection

Abstract

The class distribution of a training data set is an important factor that influences the performance of a deep learning-based system. Understanding the optimal class distribution is therefore crucial when building a new training set that may be costly to annotate. This is the case for histological images used in cancer diagnosis, where image annotation requires domain experts. In this paper, we tackle the problem of finding the optimal class distribution of a training set in order to train an optimal model that detects cancer in histological images. We formulate several hypotheses, which are then tested in scores of experiments with hundreds of trials. The experiments have been designed to account for both segmentation and cla

    Last time updated on 19/05/2022