The recent surge in performance for image analysis of digitised pathology
slides can largely be attributed to advances in deep learning. Deep models
can be used to initially localise various structures in the tissue and hence
facilitate the extraction of interpretable features for biomarker discovery.
However, these models are typically trained for a single task and therefore
scale poorly as the number of tasks to support grows. Supervised deep learning
models are also data hungry and need large amounts of training data to perform
well. In this paper,
we present a multi-task learning approach for segmentation and classification
of nuclei, glands, lumen and different tissue regions that leverages data from
multiple independent data sources. While ensuring that all tasks are aligned to
the same tissue type and resolution, we enable simultaneous prediction with a
single network. As a result of feature sharing, we also show that the learned
representation can be used to improve downstream tasks, including nuclear
classification and signet ring cell detection. As part of this work, we use a
large dataset consisting of over 600K objects for segmentation and 440K patches
for classification and make the data publicly available. We use our approach to
process the colorectal subset of TCGA, consisting of 599 whole-slide images,
localising 377 million nuclei, 900K glands and 2.1 million lumen. We make this
resource available to remove a major barrier in the development of explainable
models for computational pathology.