Clinical practice should be based on the best available evidence. Ideally such
evidence is obtained through rigorously conducted, purpose-designed clinical studies
such as randomised controlled trials and prospective cohort studies. However,
gathering information in this way requires a massive effort, can be prohibitively
expensive, is time-consuming, and may not always be ethical or practicable. When
answers are needed urgently and purpose-designed prospective studies are not
feasible, retrospective healthcare data may offer the best evidence there is. But can
we rely on analysis of such data to give us meaningful answers?
This thesis addresses this question through analysis of repeated psychological
symptom screening data that were routinely collected from over 20,000 outpatients
who attended selected oncology clinics in Scotland. Linked to patients’ oncology
records, these data offer a unique opportunity to study the progress of distress
symptoms on an unprecedented scale in this population. However, the limitations of
such routinely collected observational healthcare data are many. We approach the
analysis within a missing data context and develop a Bayesian model in WinBUGS
to estimate the posterior predictive distribution for the incomplete longitudinal
response and covariate data under both Missing At Random and Missing Not At
Random mechanisms. We then use this model to generate multiply imputed datasets for
further frequentist analysis.
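As a minimal illustration of that final step, the sketch below combines estimates from several imputed datasets using Rubin's rules, the conventional way to pool multiply imputed analyses. It is an assumption that the thesis pools its results in this way; the function name `pool_rubin` and the example numbers are hypothetical and are not taken from the thesis.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules (illustrative sketch).

    estimates: point estimates of one parameter, one per imputed dataset.
    variances: the corresponding squared standard errors.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)          # average of the point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from three imputed datasets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to the missing data into the final standard error; analysing a single imputed dataset would omit it.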
In addition to the routinely collected screening data, we present a purpose-designed,
prospective cohort study of distress symptoms in the same cancer
outpatient population. This study collected distress outcome scores from enrolled
patients at regular intervals and with very little missing data. Consequently, it
contained many of the features that were lacking in the routinely collected screening
data and provided a useful contrast, offering an insight into how the screening data
might have looked had they not been subject to these limitations. We evaluate the extent to which it
was possible to reproduce the clinical study results with the analysis of the
observational screening data.
Lastly, using the modelling strategy previously developed, we analyse the abundant
screening data to estimate the prevalence of depression in a cancer outpatient
population and the associations with demographic and clinical characteristics,
thereby addressing important clinical research questions that have not been
adequately studied elsewhere. The thesis concludes that analysis with observational
healthcare data can potentially be advanced considerably with the use of flexible and
innovative modelling techniques now made practicable with modern computing
power.