Histopathological images provide the definitive source of cancer diagnosis,
containing information used by pathologists to identify and subclassify
malignant disease, and to guide therapeutic choices. These images contain vast
amounts of information, much of which is currently unavailable to human
interpretation. Supervised deep learning approaches have been powerful for
classification tasks, but they are inherently limited by the cost and quality
of annotations. Therefore, we developed Histomorphological Phenotype Learning,
an unsupervised methodology that requires no annotations and operates via the
self-discovery of discriminatory image features in small image tiles. Tiles are
grouped into morphologically similar clusters which appear to represent
recurrent modes of tumor growth emerging under natural selection. These
clusters have distinct features which can be identified using orthogonal
methods. In lung cancer tissues, we show that they align closely with
patient outcomes, with histopathologically recognised tumor types and growth
patterns, and with transcriptomic measures of immunophenotype.
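
The tile-then-cluster idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names are hypothetical, simple intensity statistics stand in for the self-discovered image features, and a plain k-means with farthest-point initialisation stands in for whatever clustering the method actually uses. The only point is the overall flow: cut an image into small tiles, describe each tile with a feature vector, group similar tiles, and summarise the image by the fraction of its tiles in each cluster.

```python
from collections import Counter


def split_into_tiles(image, tile_size):
    """Cut a 2D grayscale image (list of rows) into non-overlapping square tiles."""
    height, width = len(image), len(image[0])
    tiles = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            rows = image[top:top + tile_size]
            tiles.append([row[left:left + tile_size] for row in rows])
    return tiles


def tile_features(tile):
    """Placeholder features (mean and standard deviation of intensity).

    ASSUMPTION: a real pipeline would use learned features here.
    """
    pixels = [p for row in tile for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return (mean, variance ** 0.5)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start: first point, then the point farthest from all chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=20):
    """Plain k-means (a stand-in clustering step); returns one label per point."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def cluster_composition(labels, k):
    """Fraction of tiles assigned to each cluster."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in range(k)]


# Synthetic 8x8 "image": dark left half, bright right half.
image = [[10] * 4 + [200] * 4 for _ in range(8)]
tiles = split_into_tiles(image, tile_size=4)
features = [tile_features(t) for t in tiles]
labels = kmeans(features, k=2)
composition = cluster_composition(labels, k=2)
print(labels, composition)
```

In this toy case the two dark tiles and the two bright tiles fall into separate clusters, and the composition vector is the kind of per-sample summary that could then be compared against outcomes or other annotations.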