Artificial intelligence for breast cancer precision pathology
Breast cancer is the most common cancer type in women globally but is associated with a
continuous decline in mortality rates. The improved prognosis can be partially attributed to
effective treatments developed for subgroups of patients. However, it remains
challenging to optimise treatment plans for each individual. To improve disease outcome and
to decrease the burden associated with unnecessary treatment and adverse drug effects, the
current thesis aimed to develop artificial intelligence based tools to improve individualised
medicine for breast cancer patients.
In study I, we developed a deep learning based model (DeepGrade) to stratify patients at
intermediate risk. The model was optimised with haematoxylin and eosin
(HE) stained whole slide images (WSIs) of grade 1 and 3 tumours and applied to stratify
grade 2 tumours into grade 1-like (DG2-low) and grade 3-like (DG2-high) subgroups. The
efficacy of the DeepGrade model was validated using recurrence free survival where the
dichotomised groups exhibited an adjusted hazard ratio (HR) of 2.94 (95% confidence interval
[CI] 1.24-6.97, P = 0.015). The observation was further confirmed in the external test cohort
with an adjusted HR of 1.91 (95% CI: 1.11-3.29, P = 0.019).
In study II, we investigated whether deep learning models were capable of predicting gene
expression levels using the morphological patterns from tumours. We optimised convolutional
neural networks (CNNs) to predict mRNA expression for 17,695 genes using HE stained WSIs
from the training set. An initial evaluation on the validation set showed that a significant
correlation between the RNA-seq measurements and model predictions was observed for
52.75% of the genes. The models were further tested in the internal and external test sets.
In addition, we compared the models' efficacy in predicting RNA-seq based proliferation scores.
Lastly, the ability of the optimised CNNs to capture spatial gene expression variation was
evaluated and confirmed using spatial transcriptomics profiling.
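
The per-gene evaluation in study II can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient or significance test was used, so the Pearson coefficient, the one-sided permutation test, the 0.05 threshold, and all function and gene names below are assumptions made for illustration only. No multiple-testing correction is applied here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(measured, predicted, n_perm=2000, seed=0):
    """One-sided p-value for a positive correlation, obtained by shuffling predictions."""
    rng = random.Random(seed)
    observed = pearson(measured, predicted)
    shuffled = list(predicted)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(measured, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def fraction_significant(measured_by_gene, predicted_by_gene, alpha=0.05):
    """Share of genes whose predictions correlate significantly with measurements."""
    genes = list(measured_by_gene)
    n_sig = sum(
        permutation_p_value(measured_by_gene[g], predicted_by_gene[g]) < alpha
        for g in genes
    )
    return n_sig / len(genes)


# Hypothetical toy data: one value per patient, for two made-up genes.
measured = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "GENE_B": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
predicted = {
    "GENE_A": [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.2],  # tracks the measurement
    "GENE_B": [4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0],  # uninformative constant
}
print(fraction_significant(measured, predicted))
```

In this toy setting only the first gene passes the test; in practice the same calculation would be repeated across all genes on the validation set.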
In study III, we investigated the relationship between intra-tumour gene expression
heterogeneity and patient survival outcomes. Deep learning models optimised in study II
were applied to generate spatial gene expression predictions for the PAM50 gene panel. A set
of 11 texture based features and one slide average gene expression feature per gene were
extracted as input to train a Cox proportional hazards regression model with elastic net
regularisation to predict patient risk of recurrence. Through nested cross-validation, the model
dichotomised the training cohort into low and high risk groups with an adjusted HR of 2.1
(95% CI: 1.30-3.30, P = 0.002). The model was further validated on two external cohorts.
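
To make the feature extraction step in study III more concrete, the sketch below summarises a grid of tile-level predicted expression values for one gene into a slide average plus a few simple heterogeneity measures. The abstract does not name the 11 texture features, so the variance, histogram entropy, and neighbour contrast used here are stand-ins chosen for illustration, and `slide_features` is a hypothetical helper, not part of the thesis code. The resulting per-gene feature vectors would then feed a survival model such as the regularised Cox regression described above.

```python
import math


def slide_features(expression_map, n_bins=4):
    """Summarise a 2-D grid of tile-level predicted expression for one gene.

    expression_map: list of rows, each a list of floats (one value per tile).
    Returns a dict with a slide average and three illustrative heterogeneity
    features. These are assumptions, not the features used in the thesis.
    """
    values = [v for row in expression_map for v in row]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n

    # Histogram entropy (bits) over n_bins equal-width bins.
    lo, hi = min(values), max(values)
    if hi == lo:
        entropy = 0.0
    else:
        counts = [0] * n_bins
        for v in values:
            idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
            counts[idx] += 1
        entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)

    # Mean absolute difference between horizontally adjacent tiles.
    diffs = [
        abs(row[i] - row[i + 1])
        for row in expression_map
        for i in range(len(row) - 1)
    ]
    contrast = sum(diffs) / len(diffs) if diffs else 0.0

    return {
        "mean": mean,
        "variance": variance,
        "entropy": entropy,
        "contrast": contrast,
    }


# Hypothetical 2x2 maps: one homogeneous tumour, one heterogeneous tumour.
homogeneous = [[2.0, 2.0], [2.0, 2.0]]
heterogeneous = [[0.0, 4.0], [4.0, 0.0]]
print(slide_features(homogeneous))
print(slide_features(heterogeneous))
```

Both toy maps share the same slide average, yet differ in every heterogeneity feature, which is the kind of information a slide-level mean alone would miss.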
In study IV, we investigated the agreement between Stratipath Breast, the
modified, commercialised version of the DeepGrade model developed in study I, and the Prosigna® test.
Both tests seek to stratify patients into groups with distinct prognoses. The outputs from Stratipath Breast
comprise a risk score and a two-level risk stratification whereas the outputs from Prosigna®
include the risk of recurrence score and a three-tier risk stratification. By comparing the number
of patients assigned to ‘low’ or ‘high’ risk groups, we found an overall moderate agreement
(76.09%) between the two tests. The risk scores from the two tests were also well
correlated (Spearman's rho = 0.59, P = 1.16E-08). In addition, a good correlation was observed
between the risk score from each test and the Ki67 index. The comparison was also carried out
in the subgroup of patients with grade 2 tumours, where similar but slightly lower correlations
were found.
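
The two comparison metrics reported for study IV can be sketched as follows. The abstract does not specify how the intermediate tier of the three-tier test was handled when computing agreement; the sketch assumes that only patients labelled 'low' or 'high' by both tests are compared, and this is an assumption rather than the documented procedure. The function names and toy data are hypothetical.

```python
import math


def percent_agreement(two_level, three_tier):
    """Percentage of identical labels among patients labelled 'low' or 'high' by both tests.

    Assumption: patients in any other category (e.g. 'intermediate') are excluded.
    """
    pairs = [
        (a, b)
        for a, b in zip(two_level, three_tier)
        if a in ("low", "high") and b in ("low", "high")
    ]
    if not pairs:
        raise ValueError("no patients with a low/high label from both tests")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data for five patients.
test_a_group = ["low", "high", "high", "low", "high"]
test_b_group = ["low", "high", "intermediate", "high", "high"]
test_a_score = [0.10, 0.80, 0.65, 0.30, 0.90]
test_b_score = [12.0, 70.0, 45.0, 61.0, 88.0]

print(percent_agreement(test_a_group, test_b_group))
print(spearman_rho(test_a_score, test_b_score))
```

The same functions could be applied to the grade 2 subgroup by filtering the input lists before calling them.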