From Springer Nature via Jisc Publications Router
History: received 2020-02-26, rev-recd 2020-03-28, accepted 2020-05-14, registration 2020-05-14, pub-electronic 2020-06-01, online 2020-06-01, pub-print 2020-11
Publication status: Published
Funder: Cancer Research UK; Grant(s): C147/A18083, C147/A25254, C19221/A22746
Funder: Manchester Biomedical Research Centre; doi: http://dx.doi.org/10.13039/100014653
Abstract:
Objective: To investigate the effects of Image Biomarker Standardisation Initiative (IBSI) compliance, harmonisation of calculation settings and platform version on the statistical reliability of radiomic features and their corresponding ability to predict clinical outcome.
Methods: The statistical reliability of radiomic features was assessed retrospectively in three clinical datasets (patient numbers: 108 head and neck cancer, 37 small-cell lung cancer, 47 non-small-cell lung cancer). Features were calculated using four platforms (PyRadiomics, LIFEx, CERR and IBEX). PyRadiomics, LIFEx and CERR are IBSI-compliant, whereas IBEX is not. The effects of IBSI compliance, user-defined calculation settings and platform version were assessed by calculating intraclass correlation coefficients (ICCs) and confidence intervals. The influence of platform choice on the relationship between radiomic biomarkers and survival was evaluated using univariable Cox regression in the largest dataset.
Results: The reliability of radiomic features calculated by the different software platforms was only excellent (ICC > 0.9) for 4/17 radiomic features when comparing all four platforms. Reliability improved to ICC > 0.9 for 15/17 radiomic features when analysis was restricted to the three IBSI-compliant platforms. Failure to harmonise calculation settings resulted in poor reliability, even across the IBSI-compliant platforms. Software platform version also had a marked effect on feature reliability in CERR and LIFEx.
Features identified as having a significant relationship to survival varied between platforms, as did the direction of hazard ratios.
Conclusion: IBSI compliance, user-defined calculation settings and choice of platform version all influence the statistical reliability and corresponding performance of prognostic models in radiomics.
Key Points:
• Reliability of radiomic features varies between feature calculation platforms and with choice of software version.
• Image Biomarker Standardisation Initiative (IBSI) compliance improves reliability of radiomic features across platforms, but only when calculation settings are harmonised.
• IBSI compliance, user-defined calculation settings and choice of platform version collectively affect the prognostic value of features.
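
Illustrative sketch: the abstract reports reliability as intraclass correlation coefficients, with ICC > 0.9 treated as excellent, but does not state which ICC form was used. The Python below is a minimal sketch that assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1). The function name `icc_2_1` and the values in `feature_values` are hypothetical and are not taken from the study.

```python
from statistics import fmean

def icc_2_1(ratings):
    """ICC(2,1) for a table of n subjects (rows) by k platforms (columns).

    Assumption: two-way random effects, absolute agreement, single measure.
    The source abstract does not specify the ICC form it used.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = fmean([v for row in ratings for v in row])
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean([row[j] for row in ratings]) for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)              # between-subject mean square
    ms_cols = ss_cols / (k - 1)              # between-platform mean square
    ms_err = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical values of one feature for five patients on three platforms.
feature_values = [
    [1.00, 1.02, 0.99],
    [2.10, 2.08, 2.12],
    [3.05, 3.00, 3.03],
    [4.20, 4.18, 4.22],
    [5.00, 5.03, 4.98],
]
icc = icc_2_1(feature_values)
print(f"ICC(2,1) = {icc:.3f}; excellent: {icc > 0.9}")
```

Because this form measures absolute agreement, a platform that returns systematically shifted values lowers the ICC even if it ranks patients identically. That makes it one plausible choice for comparing feature values across software platforms.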