Finding a Suitable Class Distribution for Building Histological Images Data Sets Used in Deep Model Training - the Case of Cancer Detection

Brousset, Pierre; Cussat-Blanc, Sylvain; Franchet, Camille; Gaspard, Margot; Ionescu, Radu Tudor; Luga, Hervé; Mothe, Josiane; Reshma, Ismat Ara

Authors: Pierre Brousset, Sylvain Cussat-Blanc, Camille Franchet, Margot Gaspard, Radu Tudor Ionescu, Hervé Luga, Josiane Mothe, Ismat Ara Reshma
Publication date: 1 January 2022
Publisher: Springer Verlag

Abstract

The class distribution of a training data set is an important factor which influences the performance of a deep learning-based system. Understanding the optimal class distribution is therefore crucial when building a new training set which may be costly to annotate. This is the case for histological images used in cancer diagnosis, where image annotation requires domain experts. In this paper, we tackle the problem of finding the optimal class distribution of a training set to be able to train an optimal model that detects cancer in histological images. We formulate several hypotheses which are then tested in scores of experiments with hundreds of trials. The experiments have been designed to account for both segmentation and cla

Available Versions

HAL-Inserm

oai:HAL:hal-03604324v1

Last time updated on 19/05/2022