Breast cancer is one of the most common causes of death in women, and yet is one
of the more 'curable' cancers if caught early. Since its inception in 1987, the Breast
Screening Programme has been the principal tool in the National Health Service's fight
to reduce the number of cancer-related deaths in the UK.
Breast screening using mammography is widely viewed as the most effective way of
detecting early breast cancer, with the UK population of women over the age of 50
being invited to a screening session every three years. However, national shortages
of clinical staff willing to enter and remain in this field mean that the NHS Breast
Screening Programme is severely understaffed.
This thesis discusses one way in which technology can assist in the screening programme;
specifically, the use of a computer-aided cancer detection system. Here, we will present
the design and analysis of a sequence of experiments used to develop and evaluate such
a system. PROMAM (PROmpting for MAMmography) involved the scanning and
digitising of mammograms, and the subsequent analysis of the digital image by a series
of algorithms.
Initial evaluation was done to ensure that the algorithms were performing satisfactorily
at a technical level before being introduced into a clinical setting. Two large experiments
with the algorithms were designed and evaluated:
1. offering radiologists three levels of algorithm prompting and, as a control, an
unprompted level, on samples of mammographic films, with outcomes being their
recall rate and subjective views at each prompting level,
2. a pre-clinical experiment, conducted under semi-clinical conditions, where two
readers would see a batch of films seeded with higher than normal numbers of
cancers, with readers allocated randomly to prompted and unprompted views of
films.
The first experiment was designed using a Graeco-Latin Square, with three 'nuisance'
variables and the treatment factor of prompting level (no prompts, and low, medium and
high levels of prompting). Four radiologists read at each level of prompting once, on
different sets of films. One of the more interesting results was that the recall rate did not
increase as the prompting rate rose, contrary to prior expectations. Most of the differences
seen between the prompting rates could be explained as radiologist differences.
Once these were taken into account, the level of prompting had little effect. Additionally,
although the time taken to read a set of films increased as the prompting rate
increased (as would be expected), it was only an increase of 26% from the unprompted
set to the set with the highest number of prompts. Observational data suggested that
the lowest level of prompting was not maintaining the interest of the radiologist, thus
leading them to neglect the prompts.
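
As an illustration of the kind of design described above, the following is a minimal sketch of a 4 x 4 Graeco-Latin square. It is not the layout used in the thesis: the construction (two orthogonal Latin squares built from arithmetic in GF(4)) and the mapping of rows to radiologists, columns to reading sessions, the first symbol to prompting level and the second symbol to film set are all assumptions made for the example.

```python
# Multiplication by the element "2" in GF(4), with elements coded 0..3.
# Addition in GF(4) is bitwise XOR on these codes.
MUL2 = [0, 2, 3, 1]


def graeco_latin_4():
    """Return a 4x4 Graeco-Latin square as a grid of (latin, greek) pairs."""
    latin = [[i ^ j for j in range(4)] for i in range(4)]        # 1*i + j
    greek = [[MUL2[i] ^ j for j in range(4)] for i in range(4)]  # 2*i + j
    return [[(latin[i][j], greek[i][j]) for j in range(4)] for i in range(4)]


def is_graeco_latin(square):
    """Check the Latin property of both symbol sets and their orthogonality."""
    n = len(square)
    symbols = set(range(n))
    for k in (0, 1):
        for i in range(n):
            if {square[i][j][k] for j in range(n)} != symbols:  # row i
                return False
            if {square[j][i][k] for j in range(n)} != symbols:  # column i
                return False
    # Orthogonality: every (latin, greek) pair occurs exactly once.
    return len({cell for row in square for cell in row}) == n * n


if __name__ == "__main__":
    # Hypothetical labels: rows = radiologists, columns = sessions.
    levels = ["none", "low", "medium", "high"]
    for r, row in enumerate(graeco_latin_4()):
        cells = [f"{levels[a]}/set{b + 1}" for a, b in row]
        print(f"radiologist {r + 1}: " + ", ".join(cells))
```

Under this assumed mapping, each radiologist meets each prompting level and each film set exactly once, and each prompting level is paired with each film set exactly once across the whole design.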
The following experiment moved the system a step closer to a true clinical demonstration
of the efficacy of PROMAM, being conducted under semi-clinical conditions. Using
a method of minimisation, the number of cancers each radiologist viewed as first reader,
second reader, prompted or unprompted was balanced. Preliminary exploratory analysis
indicated that the recall rate declined with the introduction of the prompting
system, but more detailed analysis indicated that much of this difference was due to
a radiologist effect. Although cancer detection was slightly lower with the prompting
system, examination of the 11 cancers missed by the prompted radiologist showed that
six of these had been correctly prompted by the algorithms. This demonstrated scope
to improve the cancer detection rate by nearly 5%.
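
The balancing of cancers across readers by minimisation can be sketched as a greedy allocation. This is an illustrative simplification rather than the procedure used in the thesis: it assumes that each cancer film is read by two different radiologists, one prompted and one unprompted, and that each new film is given the allocation that leaves the smallest total imbalance in the running counts, with ties broken at random. The function names, the imbalance score and the reader labels are all hypothetical.

```python
import itertools
import random

CATEGORIES = ("first", "second", "prompted", "unprompted")


def imbalance(counts):
    """Sum over categories of (max - min) count across radiologists."""
    return sum(
        max(c[cat] for c in counts.values()) - min(c[cat] for c in counts.values())
        for cat in CATEGORIES
    )


def allocate(readers, n_films, seed=0):
    """Greedily allocate cancer films to (first, second, prompted) readers."""
    rng = random.Random(seed)
    counts = {r: dict.fromkeys(CATEGORIES, 0) for r in readers}
    schedule = []
    for _ in range(n_films):
        candidates = []
        for first, second in itertools.permutations(readers, 2):
            for prompted in (first, second):
                unprompted = second if prompted == first else first
                updates = [(first, "first"), (second, "second"),
                           (prompted, "prompted"), (unprompted, "unprompted")]
                # Score the candidate by applying it temporarily.
                for r, cat in updates:
                    counts[r][cat] += 1
                score = imbalance(counts)
                for r, cat in updates:
                    counts[r][cat] -= 1
                candidates.append((score, updates, (first, second, prompted)))
        best = min(c[0] for c in candidates)
        _, updates, choice = rng.choice([c for c in candidates if c[0] == best])
        for r, cat in updates:
            counts[r][cat] += 1
        schedule.append(choice)
    return schedule, counts


if __name__ == "__main__":
    schedule, counts = allocate(["A", "B", "C", "D"], n_films=20)
    for reader, c in counts.items():
        print(reader, c)
```

Published minimisation schemes usually weight the factors and assign the minimising allocation with a probability below one; the sketch keeps only the core idea of choosing the allocation that minimises imbalance.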
These experiments determined the 'production' version of the prompting system. A
design to evaluate the system in a sample of 100,000 women in six centres was produced,
but due to circumstances beyond the project team's control, it was not possible to take
this work to the stage of a full 'trial' of the system. The design concept can, however,
apply to the evaluation of any similar prompting system. The recommended design is
therefore presented, together with an analysis of data from a simulated application of
this design.
This simulation has allowed recommendations to be made on the most appropriate ways
to analyse the extensive and complicated dataset that will be obtained. In particular,
it identified technical problems that can arise from the application of one candidate
analytical method, and provided an explanation for the failure observed.
It is quite clear from the evidence presented in this thesis that there is much scope
for improvement in the cancer detection rate by the use of a prompting system, without
a corresponding loss in specificity. With the shortage of radiologists and radiographers,
and the increasing demand placed on the Breast Screening Programme,
technology could play a beneficial role in screening for breast cancer in the coming
years.