Predicting seizure outcome after epilepsy surgery: do we need more complex models, larger samples, or better data?

Adler, Sophie; Baldeweg, Torsten; Booth, John; Caballero, Ana Perez; Chari, Aswin; Cooray, Gerald; Cross, J Helen; Das, Krishna B; Eltze, Christin; Eriksson, Maria H; Martin Sanfilippo, Patricia; McTague, Amy; Menzies, Lara; Moeller, Friederike; Piper, Rory J; Ripart, Mathilde; Tisdall, Martin M; Wagstyl, Konrad; Whitaker, Kirstie J

Publication date: 2 May 2023
Publisher: 'Royal College of Obstetricians & Gynaecologists (RCOG)'

Abstract

OBJECTIVE: The accurate prediction of seizure freedom after epilepsy surgery remains challenging. We investigated whether 1) training more complex models, 2) recruiting larger sample sizes, or 3) using data-driven selection of clinical predictors would improve our ability to predict post-operative seizure outcome using clinical features. We also conducted the first substantial external validation of a machine learning model trained to predict post-operative seizure outcome.

METHODS: We performed a retrospective cohort study of 797 children who had undergone resective or disconnective epilepsy surgery at a tertiary center. We extracted patient information from medical records and trained three models (a logistic regression, a multilayer perceptron, and an XGBoost model) to predict one-year post-operative seizure outcome on our dataset. We evaluated the performance of a recently published XGBoost model on the same patients. We further investigated the impact of sample size on model performance, using learning curve analysis to estimate performance at samples up to N = 2,000. Finally, we examined the impact of predictor selection on model performance.

RESULTS: Our logistic regression achieved an accuracy of 72% (95% CI = 68-75%, AUC = 0.72), while our multilayer perceptron and XGBoost both achieved accuracies of 71% (multilayer perceptron: 95% CI = 67-74%, AUC = 0.70; own XGBoost: 95% CI = 68-75%, AUC = 0.70). There was no significant difference in performance between our three models (all P > 0.4), and they all performed better than the external XGBoost, which achieved an accuracy of 63% on our data (95% CI = 59-67%, AUC = 0.62; P = 0.005 vs. logistic regression, P = 0.01 vs. multilayer perceptron, P = 0.01 vs. own XGBoost). All models showed improved performance with increasing sample size, but limited improvements beyond our current sample. The best model performance was achieved with data-driven feature selection.

SIGNIFICANCE: We show that neither the deployment of complex machine learning models nor the assembly of thousands of patients alone is likely to generate significant improvements in our ability to predict post-operative seizure freedom. We instead propose that improved feature selection alongside collaboration, data standardization, and model sharing is required to advance the field.
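
To give a feel for why the reported accuracy intervals span roughly three percentage points either side, the sketch below computes a Wilson score interval for a proportion of 0.72 observed on 797 patients. This is an illustration only: the abstract does not state how the authors derived their confidence intervals, so the Wilson method, the function name `wilson_interval`, and the default `z = 1.96` are all assumptions made here.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: this is NOT necessarily the interval method used in the
    study; it is a standard textbook choice used here for illustration.
    p_hat: observed proportion (e.g. classification accuracy), in [0, 1]
    n:     number of independent observations (e.g. patients)
    z:     normal quantile (1.96 gives an approximate 95% interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    denom = 1.0 + z**2 / n
    centre = (p_hat + z**2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z**2 / (4.0 * n**2)) / denom
    return centre - half_width, centre + half_width


# Accuracy and cohort size as reported in the abstract for the logistic regression.
low, high = wilson_interval(0.72, 797)
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval comes out at roughly 69% to 75%, close to but not identical with the reported 68-75%, which is unsurprising given that the published accuracy is rounded and the authors' interval method may differ. The half-width shrinks roughly with the square root of the sample size, so quadrupling the cohort would only halve it.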

Available Versions

UCL Discovery

oai:eprints.ucl.ac.uk.OAI2:101...

Last time updated on 11/05/2023