Bowel cancer registry data made whole: filling in the blanks through imputation in Northern Ireland

Abstract

In healthcare, cost-effectiveness analysis (CEA) compares alternative strategies on their consequences and costs, so that healthcare resources can be allocated to benefit public health. CEA modelling combines costs, quality-of-life utilities and survival analysis. Survival analysis can project the lifetime of a simulated individual from the available data, so survival data are vital within CEA.

Supplementary data relating to the outputs published in the Pathways to a Cancer Diagnosis report [1] were requested from the Northern Ireland Cancer Registry (NICR) to inform the colorectal cancer (CRC) natural history component of a larger CEA model. The proportion of individuals diagnosed with CRC was presented by route, stage, sex and age, together with the proportions of individuals alive 3, 6 and 12 months after diagnosis. Some values were missing from the data to protect patient identity. If < 10 individuals were diagnosed with CRC for a specified route, age group, stage and sex, the data were omitted. Likewise, if < 3 individuals died within 3, 6 or 12 months of diagnosis, the data were omitted. Most missing-data problems are addressed with Rubin’s multiple imputation methods [2]. However, this approach can give biased results when data are missing not at random, as opposed to missing at random or missing completely at random; other approaches were therefore required.

Three approaches were developed to impute the missing values. The first approach randomly generated values based on the reason the data were originally omitted. The second and third approaches used the NICR’s publicly available 1-year and 5-year net survival rates (NSRs) for CRC, categorised by age, sex and stage; these, however, do not incorporate the same routes found in [1]. The second approach considered the lowest NSRs based on route, stage and age. The third approach randomly generated values within the range of possible NSRs, using both the normal and uniform distributions.
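As a rough illustration of the first and third approaches described above, the Python sketch below draws a suppressed count from the values allowed by the suppression rule, and draws a survival rate from within a range of possible NSRs under either a uniform or a normal distribution. It is not the authors' implementation. All function names are hypothetical, and several details are assumptions not stated in the abstract: that a suppressed count may be zero, that the normal distribution is centred on the midpoint of the range with a standard deviation of one sixth of its width, and that draws falling outside the range are rejected and redrawn.

```python
import random


def impute_suppressed_count(threshold, rng):
    """Approach 1 (sketch): a cell suppressed because its count was below
    `threshold` is assumed to hold an integer from 0 to threshold - 1."""
    return rng.randint(0, threshold - 1)


def impute_rate_uniform(low, high, rng):
    """Approach 3, uniform variant (sketch): every rate in [low, high]
    is treated as equally likely."""
    return rng.uniform(low, high)


def impute_rate_normal(low, high, rng, max_tries=1000):
    """Approach 3, normal variant (sketch): draw around the midpoint of
    the range and redraw until the value lies inside [low, high].

    The midpoint mean and the (high - low) / 6 standard deviation are
    illustrative assumptions, not values taken from the abstract.
    """
    mean = (low + high) / 2
    sd = (high - low) / 6
    for _ in range(max_tries):
        value = rng.gauss(mean, sd)
        if low <= value <= high:
            return value
    return mean  # fallback if no draw was accepted


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the draws are reproducible
    # Hypothetical inputs: thresholds from the suppression rules, and a
    # made-up range of possible 1-year NSRs for one age/sex/stage group.
    print(impute_suppressed_count(10, rng))      # diagnosed count below 10
    print(impute_suppressed_count(3, rng))       # deaths below 3
    print(impute_rate_uniform(0.60, 0.85, rng))
    print(impute_rate_normal(0.60, 0.85, rng))
```

Repeating the draws over many seeds and comparing the imputed 12-month survival with the published 1-year NSRs would be one way to compare the variants, in the spirit of the comparison the abstract reports.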
The 5-year NSRs from NICR were used to estimate the proportions of individuals alive after 5 years, to better inform and extend survival within the CEA model. After comparing all imputation approaches with the true NICR 1-year NSRs, the most appropriate choice was the third approach, using the normal distribution. Using this approach, we can illustrate the lifetime of an individual within the CEA model and produce more plausible results.

References

1. Bannon F, Harbinson A, Mayock M, McKenna H. Pathways to a Cancer Diagnosis: Monitoring variation in the patient journey across Northern Ireland 2012 to 2016.
2. Rubin DB. Multiple imputations in sample surveys - a phenomenological Bayesian approach to nonresponse. American Statistical Association. 1978;1:20–34.
