Interobserver reproducibility of the PRECISE scoring system for prostate MRI on active surveillance: results from a two-centre pilot study

Abstract

OBJECTIVES: We aimed to determine the interobserver reproducibility of the Prostate Cancer Radiological Estimation of Change in Sequential Evaluation (PRECISE) criteria for magnetic resonance imaging (MRI) in patients on active surveillance (AS) for prostate cancer (PCa) at two academic centres.

METHODS: The PRECISE criteria score the likelihood of clinically significant change over time on a 1-to-5 scale, where 1 or 2 implies regression of a previously visible lesion, 3 denotes stability and 4 or 5 indicates radiological progression. We performed a retrospective analysis of 80 patients (40 from each centre) on AS with biopsy-confirmed low- or intermediate-risk PCa (i.e. Gleason score ≤ 3 + 4 and prostate-specific antigen ≤ 20 ng/ml) and ≥ 2 prostate MR scans. Two blinded radiologists reported all scans independently and scored the likelihood of radiological change (PRECISE score) from the second scan onwards. Cohen's κ coefficients and percent agreement were computed.

RESULTS: When each PRECISE score was considered separately, agreement was substantial at both the per-patient and the per-scan level (κ = 0.71 and 0.61; percent agreement = 79% and 81%, respectively). Agreement was higher (κ = 0.83 and 0.67; percent agreement = 90% and 91%, respectively) when the PRECISE scores were grouped according to the absence/presence of radiological progression (PRECISE 1-3 vs 4-5). Inter-reader agreement was higher for the scans performed at University College London (UCL) than for those performed at the other centre (κ = 0.81 vs 0.55 at the per-patient level and κ = 0.70 vs 0.48 at the per-scan level). The discrepancies between institutions were less evident for percent agreement (80% vs 78% per patient and 86% vs 75% per scan).

CONCLUSIONS: Expert radiologists achieved substantial reproducibility with the PRECISE scoring system, especially when the scores were pooled according to the absence/presence of radiological progression (PRECISE 1-3 vs 4-5).
KEY POINTS:
• Inter-reader agreement between two experienced prostate radiologists using the PRECISE criteria was substantial.
• The agreement was higher when the PRECISE scores were grouped according to the absence/presence of radiological progression (i.e. PRECISE 1-3 vs PRECISE 4 and 5).
• Higher inter-reader agreement was observed for the scans performed at UCL, but the discrepancies between institutions were less evident for percent agreement.
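
For readers unfamiliar with the two agreement statistics reported here, the following is a minimal Python sketch of unweighted Cohen's κ and percent agreement, computed both on the full 1-to-5 PRECISE scale and after grouping into PRECISE 1-3 vs 4-5. It is an illustration under stated assumptions, not the study's analysis code: the abstract does not say which software or which κ variant (weighted or unweighted) was used, the function names are hypothetical, and the reader scores are invented for demonstration and are not study data.

```python
from collections import Counter


def percent_agreement(reader_a, reader_b):
    """Fraction of cases on which the two readers gave the same score."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must score the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(reader_a, reader_b))
    return matches / len(reader_a)


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(reader_a)
    p_o = percent_agreement(reader_a, reader_b)  # observed agreement
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    # Agreement expected by chance, from each reader's marginal frequencies.
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both readers use one category only")
    return (p_o - p_e) / (1 - p_e)


def progression(scores):
    """Group PRECISE scores into no progression (1-3) vs progression (4-5)."""
    return [score >= 4 for score in scores]


# Invented example scores for ten follow-up scans (NOT data from the study).
reader_1 = [3, 3, 4, 2, 5, 3, 4, 3, 1, 3]
reader_2 = [3, 3, 4, 3, 4, 3, 4, 3, 2, 3]

print("Per-score:  kappa = %.2f, agreement = %.0f%%" % (
    cohen_kappa(reader_1, reader_2),
    100 * percent_agreement(reader_1, reader_2)))
print("Grouped:    kappa = %.2f, agreement = %.0f%%" % (
    cohen_kappa(progression(reader_1), progression(reader_2)),
    100 * percent_agreement(progression(reader_1), progression(reader_2))))
```

In this invented example the readers disagree only within the 1-3 and 4-5 groups, so both statistics rise after grouping. Because κ corrects for chance agreement using each reader's score distribution while percent agreement does not, the two measures can diverge when some scores are much more common than others.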
